Cisco UCS Director and the PowerShell Agent – Part 5

In this blog post, we finally work our way up to what I set out to accomplish with a combination of PowerShell, Cisco UCS Director's northbound APIs, and the concept of parallel processing with workflows in UCS Director.

The Assumptions and Caveats

First, let’s talk about the assumptions for this PowerShell script and UCS Director workflow. My current network architecture models itself after one of Cisco’s validated designs for data centers. This design has a network spine (provided by the Nexus 7000 switch line) and network leaves (provided by the Nexus 5000 switch line). As part of the configuration of this design, Cisco FabricPath is enabled across the switching lines.

With FabricPath in place, creating a simple L2 VLAN on each switch is a straightforward process that lends itself to taking simple variables (mostly which switch the operation needs to be performed on). The only real difference that forces me to maintain two separate VLAN creation scripts comes from the Nexus 7000 family, where I have to specify the VDC (virtual device context) before running the small script that creates the VLAN and enables FabricPath mode on it.

Also, as part of my architecture and the current usage of these switches, not all of the L2 VLANs were created with Cisco UCS Director. In response to this, I created two different workflows. We’ll call them: Nexus List Add and Nexus List Rollback Setup. Nexus List Add is used when the L2 VLAN is detected to NOT already exist on the switch. It runs a block of NX-OS CLI that creates the VLAN based on the passed VLAN name and VLAN ID variables, enables FabricPath mode, and then saves the running-config to startup (sketched below). As for Nexus List Rollback Setup, instead of going through and trying to create a VLAN that already exists on the switch, we register a rollback task with UCS Director (for consistent VLAN removal across the entire FabricPath domain) and set ownership of the VLAN in UCS Director for the customer in question. This forces UCS Director to know about the VLAN and which customer owns it.
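For illustration, here is a minimal sketch of the CLI block that Nexus List Add pushes, wrapped in a PowerShell here-string; the variable names are my own placeholders:

$vlan_cli = @"
vlan $vlan_id
  name $vlan_name
  mode fabricpath
exit
copy running-config startup-config
"@
# On the Nexus 7000s, the script first has to enter the correct VDC
# (switchto vdc <vdc name>) before this block can be applied.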

One last caveat concerns the PowerShell environment on my Cisco PowerShell Agent servers. I’ll admit that I’ve been rather lazy since originally deploying them, and the PowerShell version on them is still 4.0. This causes issues with the large JSON responses returned by some of the UCS Director northbound APIs: version 4.0 enforces a maximum response size when using the ConvertFrom-Json cmdlet. I was forced to use some borrowed code that adjusts the JSON serializer settings for my runspace and then creates a custom PowerShell object. Unfortunately, this process adds a lot of overhead. I’ve recently found that upgrading my environment to PowerShell 5.0 makes the issue go away, letting me get away from the custom code used to create the PoSH object.
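The borrowed workaround looked something along these lines; a minimal sketch that drops down to the .NET JavaScriptSerializer and raises its size limit instead of calling ConvertFrom-Json:

Add-Type -AssemblyName System.Web.Extensions
$serializer = New-Object System.Web.Script.Serialization.JavaScriptSerializer
$serializer.MaxJsonLength = [int]::MaxValue   # Lift the default size cap that bites on large responses
$response_object = $serializer.DeserializeObject($raw_json)   # $raw_json holds the raw API response string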

The Code

The code is available at the following GitHub location: My Example Code

Declared Functions

I wrote two specific functions in my PoSH code to be reusable throughout this script: Get-UCSDURIs and Connect-NestedPSSession. For reference, I’ve previously blogged about why I needed to create Connect-NestedPSSession (Using Invoke-WebRequest with the Cisco PowerShell Agent). Get-UCSDURIs was created so that I could go through my entire switching stack and generate all the specific URIs for calling the two Nexus workflows, filling in all the variables. Since I have 2 Nexus 7000s and 10 Nexus 5000s in this configuration, I need to generate a combination of 12 URIs to send to the UCS Director northbound API, userAPISubmitWorkflowServiceRequest.

In Get-UCSDURIs, I also do a quick lookup of the workflow inputs using the northbound API userAPIGetWorkflowInputs. The reason is that even if the input names on two workflows are the same, UCS Director appends a specific number to each variable name to make it unique (example below, from one of the JSON returns on Nexus List Add).

Screen Shot 2017-03-14 at 2.31.23 PM

A total of three parameters are passed to the API when executing userAPISubmitWorkflowServiceRequest.  The first, param0, is a string holding the name of the workflow you wish to execute.  The second is a lengthy string passed as param1 in the userAPISubmitWorkflowServiceRequest URI; most of the code in Get-UCSDURIs focuses on creating this parameter.  Since all this information goes into the URI (there is no request body), I could not create it as a JSON literal.  I had to build it as one large string object, which is why the code looks the way it does, with multiple uses of double quotes and backticks.  Lastly, we send a UCS Director service request ID (or SRID) as param2.  In my case, I usually send an SRID of -1, which means each workflow call is independent of the others and does not register rollback tasks with a parent workflow.  I handle rollbacks in a different way later, since I also want to use parallel processing when removing everything that I created.  An illustrative URI follows.
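Something like this (hostname, input values, and the numbered input names are all made up; the opData layout is from my reading of the REST API guide, so check it against your version):

# Hypothetical inputs:
$vlan_name = "CUST-100"
$vlan_id = "100"
$workflow = "Nexus List Add"
# Workflow inputs go into param1 as a JSON-shaped list built as one big string;
# note the backtick-escaped double quotes mentioned above:
$inputs = "{`"list`":[{`"name`":`"vlanName_1159`",`"value`":`"$vlan_name`"},{`"name`":`"vlanID_1160`",`"value`":`"$vlan_id`"}]}"
$uri = "https://ucsd.example.com/app/api/rest?formatType=json" +
       "&opName=userAPISubmitWorkflowServiceRequest" +
       "&opData={param0:`"$workflow`",param1:$inputs,param2:-1}"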

The Main Script Body

I start the main script body by passing in multiple arguments from UCS Director.  Many of these are specific to my environment, but the gist is that I need to pass in which of my sites this VLAN needs to be created at, which VLAN ID is being requested, what the requested VLAN name is going to be, which UCS Director group I want to assign ownership to, and which parent UCSD SR ID I want to assign the rollback tasks to (per the prior paragraph, this is almost always -1, but I wanted the function to be usable for anything else too, not just this specific use case).

I’m also passing in specifics for the PowerShell Agent at the site in question (the PS Agent) and the username and password I want to use to initiate the nested remote PowerShell session using Connect-NestedPSSession.

With UCS Director and the northbound APIs, there is no Basic authorization header to work with.  UCS Director creates an API key for each login, accessible through the GUI.  To be able to use that key, you need to store it in a hashtable in which one of the keys is labeled X-Cloupia-Request-Key.  I do this by creating an empty hashtable first, then using the Add method on that object to insert the key/value pair into the hashtable.  For the most part, this is the only thing required for the northbound APIs I’m using in this script.
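In code, that’s just this (the key string below is a made-up example; copy yours from the UCS Director GUI):

$headers = @{}   # Empty hashtable for the request headers
$headers.Add("X-Cloupia-Request-Key", "A1B2C3D4E5F60718293A4B5C6D7E8F90")   # Example value only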

After setting the header, I have everything I need to start making northbound API calls to UCS Director.  Before initiating any specific workflow, I need to go through a couple of lists to check what needs to be worked on and to determine which subsequent workflow to call (Nexus List Add or Nexus List Rollback Setup).  I use the northbound API, userAPIGetTabularReport, to get these lists of networking equipment (MANAGED-NETWORK-ELEMENTS-T52) and of the VLANs that have already been pulled from the switch inventories into the UCS Director database (VLANS-T52).

After running through these APIs and parsing the responses (and filtering down to very specific information), we begin cross-checking whether the VLAN exists on each piece of equipment.  Depending upon whether it exists AND the model type of the equipment being checked, the switch information gets placed into one of four lists, each labeled with either a 5k or 7k string and whether it’s a rollback or not (sketched below).  These four lists are then processed and the URIs are generated using the Get-UCSDURIs function.  Lastly, all four URI string returns are smashed together into one large, comma-separated list which should contain, depending upon the site, either 10 or 12 URIs to process.
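A minimal sketch of that four-way sort, with all variable, property, and model-string names assumed for illustration:

$n5k_add = @(); $n5k_rollback = @(); $n7k_add = @(); $n7k_rollback = @()
foreach ($switch in $switch_list) {
    # Does UCS Director's VLAN inventory already show this VLAN on this switch?
    $vlan_exists = $vlan_list | Where-Object { $_.Device -eq $switch.Name -and $_.VlanId -eq $vlan_id }
    if ($switch.Model -match "N7K") {
        if ($vlan_exists) { $n7k_rollback += $switch } else { $n7k_add += $switch }
    }
    else {
        if ($vlan_exists) { $n5k_rollback += $switch } else { $n5k_add += $switch }
    }
}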

We start the execution process by taking our URI list and initiating each call with a foreach loop.  I store each returned SR ID in a hashtable so I can keep an eye on the workflows I just initiated.  To monitor each of these requests, we utilize the northbound API userAPIGetWorkflowStatus, which returns a numerical code telling us the status of the workflow.  If the status code indicates the workflow has reached some sort of stop, I remove the SR ID from the hashtable.  I also append the SR ID to a string called $sr_list, which stores all the workflow SR IDs for rollback purposes; I return this list to UCS Director as well.  Once all the SR IDs have been removed from the hashtable, my while loop exits and the script finishes.  The polling pattern is sketched below.
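Here’s a condensed sketch of that submit-and-poll pattern. The response parsing (serviceResult) and the set of terminal status codes are assumptions from memory of the REST API guide, so verify them against your UCSD version:

$sr_list = ""
$active = @{}   # SR ID -> URI for workflows still in flight
foreach ($uri in $uri_list) {
    $submit = Invoke-WebRequest -Uri $uri -Headers $headers -UseBasicParsing
    $sr_id = ($submit.Content | ConvertFrom-Json).serviceResult   # Assumed response shape; the new SR ID rides in serviceResult
    $active[$sr_id] = $uri
    $sr_list += "$sr_id,"
}
$terminal_codes = @(2, 3, 4)   # Assumed "reached some sort of stop" codes; verify against the REST API guide
while ($active.Count -gt 0) {
    foreach ($sr_id in @($active.Keys)) {
        $status_uri = "https://ucsd.example.com/app/api/rest?formatType=json&opName=userAPIGetWorkflowStatus&opData={param0:$sr_id}"
        $status = ((Invoke-WebRequest -Uri $status_uri -Headers $headers -UseBasicParsing).Content | ConvertFrom-Json).serviceResult
        if ($terminal_codes -contains $status) { $active.Remove($sr_id) }
    }
    Start-Sleep -Seconds 10   # Breathe between polling passes
}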

The assumption is that all my API calls have completed (without critical stop) and that a Layer 2 VLAN has been pushed out, in a parallel processed way.

Parallel Processing

I haven’t mentioned much about this concept up to this point, having focused on the code instead.  The reason I wanted to perform this in parallel is that each switch acts independently of the others.  While FabricPath requires you to create a VLAN on all the switches for traffic purposes, there’s no major error checking that forces you to have it all created at once.

Taking this idea, you can see that instead of waiting through a long sequential chain (switch1 -> switch2 -> … -> switch10), I can initiate all 10 requests at the same time.  This is what I mean by parallel processing.  Before I created this script, my entire process for creating an L2 VLAN was taking upwards of 10-12 minutes.  In a cloud world, this seemed extremely high, especially for something as simple as an L2 VLAN.

After this implementation, I have execution times as low as three minutes at one site (it has fewer VLANs to parse through) and as high as five minutes at another (due to 12 switches with 1500 VLANs on them; roughly 175,000 lines of raw JSON information).  Oh, and I forgot to mention that I also ADDED new functionality to the overall workflow (these efficiencies were just one part of the major fixes requested).  I added automatic creation of vSphere distributed portgroups and their addition to a customer’s network policy in their virtual data center.  I also added the ability to go through UCS Central, find any customer service profiles, and add the VLAN to the vNICs of each of those service profiles.  Included in all of this was also permissions granting to our VMware hosted environment for those distributed portgroups.  So, I added some major automation AND drastically cut the execution time.

Conclusion

I hope that you either read through the entire series or at least got something new and useful out of this blog series.  I was amazed at just how little was being published, from either Cisco SMEs or the community, with regards to PowerShell and Cisco UCS Director.  I hope that you, the reader, have realized just how powerful the PowerShell capabilities can be and how easily you can extend systems that have no direct support in Cisco UCS Director.


Cisco UCS Director and the PowerShell Agent – Part 4

In this blog post, we will be going over some advanced use cases of PowerShell with Cisco UCS Director and the PowerShell Agent.  We will go over a scenario in which we need multiple bits of data returned from a PowerShell script, and how that can be handled with some custom tasks that parse XML data and return the results as UCS Director macros/variables.

Real World Use Case

In my own environment, I had one great use case that made me start leveraging more data returns from PowerShell scripts.  In my lab and in production, we run Cisco UCS Central to provide a centralized repository for many Cisco UCS specific constructs (like MAC pools, service profile templates, vNIC templates, etc).  As we grew to multiple data centers, we started to worry about major overlap problems with pools in each Cisco UCS Manager instance, so we decided to start using UCS Central to provide and divide these entities up from a global perspective.

Cisco UCS Director had included a certain number of tasks that Cisco themselves had authored.  Unfortunately, as with many out-of-the-box task implementations in UCS Director, they didn’t quite fit everything we needed to perform our specific processes when it came to building UCS devices, either for our own virtualization environments or for bare metal for our customers.

My main use case came from a limitation in the code of the task for cloning a global service profile template.  After upgrading UCS Central from 1.2 to 1.3, this task started to return the value of “undefined” for values like the MAC addresses assigned to our NICs or the HBA WWPN information.  We found that a delay now had to occur to properly pull this information from the cloned service profile and return it to UCS Director.

PowerShell Saves the Day

As with most of the Cisco UCS Director out-of-the-box tasks, you are unable to see the JavaScript/CloupiaScript code within.  This made it impossible to resolve the issue through the existing task (although a TAC case was logged about the issue).  We resorted to recreating the main functionality in PowerShell using the Cisco UCS Central module (available in the Cisco UCS PowerTool Suite).

The Code

A caveat before we continue: this code was written well over a year ago, so some of it may have changed drastically as the UCS Central PowerShell module went through revisions from its earlier iterations.  Also, you are going to notice that I hard-coded the password to send to the Connect-UcsCentral cmdlet.  Ask any security person about this practice and you’ll likely get hit with some random object ranging in size from an eraser to a city bus.

The Script

Import-Module CiscoUcsCentralPs

$ucsc_org = ($args[0].Split(";"))[1]   # Passing a UCS Director variable and getting the Central Org DN from it
$ucsc_account = ($args[0].Split(";"))[0]  # Passing a UCS Director variable and getting the UCS Central account (if multiples)
$ucsc_gspt = ($args[1].Split(";"))[2]  # Passing a UCS Director variable and getting the Global Service Profile Template DN from it
$customer_id = $args[2]  # Passing a string for usage in creating the name of the service profile
$device_sid = $args[3]  # Passing a string for usage in creating the name of the service profile

$ucsc_username = "*Insert UserName to authenticate to Central*"
$ucsc_password = ConvertTo-SecureString -String "*Password for Central account*" -AsPlainText -Force
$ucsc_credential = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $ucsc_username, $ucsc_password
$ucsc_conn = Connect-UcsCentral -Name "*Insert UCS Central FQDN/IP*" -Credential $ucsc_credential   # -Name takes the UCS Central host to connect to

$gsp_name = $customer_id + "-" + $device_sid    # Create combined global service profile name
$new_gsp = Get-UcsCentralServiceProfile -Dn $ucsc_gspt | Add-UcsCentralServiceProfileFromTemplate -NamePrefix $gsp_name -Count 1 -DestinationOrg (Get-UcsCentralOrg -Dn $ucsc_org) | Rename-UcsCentralServiceProfile -NewName $gsp_name   # Create GSP from template and rename to remove "1" from end of name

Start-Sleep 15   # Sleep for 15 seconds to allow for UCS Central to process global pool values into GSP
$new_gsp = Get-UcsCentralServiceProfile -Name $new_gsp.Name   # Reload the service profile

$ucsd = @{}   # Create our hashtable to store values

# Create the hashtable values for the various parts of the global service profile to be used by later UCS Director tasks

$ucsd["VNIC1_MAC"] = ($new_gsp | Get-UcsCentralVnic -Name ESX_Mgmt_A).Addr   # MAC for Mgmt NIC/PXE Boot NIC, named ESX_Mgmt_A
$ucsd["VNIC2_MAC"] = ($new_gsp | Get-UcsCentralVnic -Name ESX_Mgmt_B).Addr   # Secondary MAC for Mgmt NIC, named ESX_Mgmt_B
$ucsd["VHBA1_WWPN"] = ($new_gsp | Get-UcsCentralvHBA -Name vHBA1).Addr   # WWPN of vHBA1, used for zoning, named vHBA1
$ucsd["VHBA2_WWPN"] = ($new_gsp | Get-UcsCentralvHBA -Name vHBA2).Addr   # WWPN for vHBA2, used for zoning, named vHBA2
$ucsd["VHBA1_WWN"] = ($new_gsp | Get-UcsCentralvHBA -Name vHBA1).NodeAddr + ":" + ($new_gsp | Get-UcsCentralvHBA -Name vHBA1).Addr  # WWN used for EMC initiator creation for vHBA1
$ucsd["VHBA2_WWN"] = ($new_gsp | Get-UcsCentralvHBA -Name vHBA2).NodeAddr + ":" + ($new_gsp | Get-UcsCentralvHBA -Name vHBA2).Addr  # WWN used for EMC initiator creation for vHBA2
$ucsd["ServiceProfileIdentity"] =  $ucsc_account + ";" + $ucsc_org + ";" + $new_gsp.Dn   # UCS Central Service Profile Identity, in UCS Director variable format

return $ucsd   # Return hashtable to UCS Director for processing with custom task

From the beginning, you’ll notice that we must import the modules we wish to use.  The PowerShell Agent does not have full access to things like Windows profiles or scripts that would load these into the runtime environment for us.  You must declare all the modules you wish to use (and they must be installed on the device in question) in every PowerShell script you want the PowerShell Agent to interact with!

Our next block of code is bringing in arguments in which we sent to the script in question.  At the end of my last blog post, I explained how we can use the system array of $args to pass arguments from Cisco UCS Director to our PowerShell scripts.  From the code, I’m passing in a total of four arguments, but I’m creating five PowerShell variables (all strings) from those arguments.

Now, some object types in Cisco UCS Director are formatted in certain ways.  Take $args[0], which I’m sending to this PowerShell script: from the use of the Split function and how the string is split, you can tell that it’s semicolon-delimited.  The format of the string (which I believe specifies how UCS Director sees UCS Central organization objects) looks like this:  1;org-root/org-XXXXX.  UCS Central organization objects appear this way to specify the registered UCS Central instance ID in Director (the “1” in this example) and the Cisco distinguished name (or DN) of the organization in UCS Central.  So, from one argument, we can get two variables for use by this script.

After the argument section of the script, we create a PSCredential object and use it to log into UCS Central.  Next comes a line of code specific to my organization’s naming convention for the service profile name.  Follow that up with our second UCS Central specific cmdlet, Get-UcsCentralServiceProfile.  From here, we have examples of how objects can be passed between the Cisco UCS Central cmdlets using the PowerShell pipeline: this command gets the global service profile template (passed in as an argument), clones a global service profile from it, and renames the result to our custom naming convention.

Now, the code that fixed the issues we were having with the Cisco UCS Director out-of-the-box task is the next couple of lines.  Start-Sleep lets you put a hard wait into the execution of the script so background processes can occur.  Once we’ve waited 15 seconds, we re-read the profile information we just created.  This changed all the values that were listed as “undefined” to their proper values.

Returning the Information

The last part of this code focuses on a PowerShell object called a hash table.  A hash table is an object containing many PowerShell dictionary entries.  A dictionary entry is made up of two properties:  a key (a term, in dictionary speak) and a value (a definition, in dictionary speak).  Using this knowledge, we can use a hash table to store multiple pieces of UCS Central global service profile information and hand it all back to UCS Director for parsing and processing.

You’ll see in the code that we first declare our hash table.  From there, we can declare our keys and start storing values in the table.  You’ll notice that some of the values I chose to put in this hash table are address values from the vNICs or vHBAs in the service profile.  Lastly, I also created a value, under the key ServiceProfileIdentity, that will be returned to UCS Director.  The value is semicolon-delimited and in the format that UCS Director expects for a UCS Central Service Profile Identity object.

Lastly, we tell the script to return the contents of the hash table.  At this point, the PowerShell Agent will create the XML response and list all the hash table contents within it.  We then need some XML parsing on the Cisco UCS Director side to store these values as macros for other parts of our UCS Director workflows.
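To give a feel for what the parsing task receives, the response wraps the hash table in XML roughly like this (a hand-written approximation based on the parsing code below, not a captured response; note how keys and values alternate):

<Objects>
  <Object>
    <Property>VNIC1_MAC</Property>
    <Property>00:25:B5:00:0A:01</Property>
    <Property>VNIC2_MAC</Property>
    <Property>00:25:B5:00:0B:01</Property>
    <!-- ...remaining keys and values alternate the same way... -->
  </Object>
</Objects>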

Parsing the Response

Ages ago, I found a great example out on the UCS Director Community site (UCS Director Community Site).  This laid the foundation for how to parse through the response and get out the variables I needed to create macros for usage by UCS Director.

I downloaded the workflow example at the above URL and imported it into my UCS Director environment.  When I did this, it automatically imported the custom task for parsing.  I cloned a copy of that task and started making edits to make it my own.  This can all be found by navigating (as a system administrator) in UCS Director to Policies > Orchestration and then clicking on the Custom Workflow Tasks tab.

We can start to edit this custom task by selecting the cloned task and clicking on the Edit button in the bar above it.  I usually skip the Custom Task Information section and proceed to the next section, Custom Task Inputs.  In this section, you can see the following:

Screen Shot 2017-03-01 at 4.21.09 PM

The input expected is going to be called xml.  We will be passing the output from the Execute PowerShell task to this input.

Moving along to the next screen, this is where the customization begins.  Knowing what we have for our key value names coming from our PowerShell hash table, we can create outputs based on those very names.  Here’s what I put for my values:

Screen Shot 2017-03-01 at 4.23.29 PM

Everything coming out of this task is going to be a generic text value, except ServiceProfileIdentity.  The reason for this is that my workflow is going to have a task requiring it to be sent this object type to be able to perform the task against it.

We skip past the controller section, as we are not going to be performing any marshalling on this task.  That leads us to this script:

importPackage(com.cloupia.lib.util);
importPackage(java.util);

var xml = input.xml;

// Try and parse the ... section
var objects_xml = XMLUtil.getValue("Objects", xml);

// Parse the objects list now (should also be a single section):
object_list = XMLUtil.getTag("Object", objects_xml.get(0));

// Parse the object_list to get properties:
property_list = XMLUtil.getTag("Property", object_list.get(0));

// PowerShell returns arrays weirdly to UCSD, alternating rows of keys/values
// Like this:
//   ip
//   192.168.100.1
//   server_name
//   New Server
//
// Store output in a HashMap:
var variable_map = new HashMap();

// Store previous keys in buffer:
var key_buffer = "";

// Loop through all values taking even as keys and odd as values:
for (i = 0; i < property_list.size(); i++) {
    // Remove XML tags (can't seem to coax the XML library to do this for me!)
    property_list.set(i, property_list.get(i).replaceAll("<[^>]*>", ""));
    // Keys
    if ((i % 2) == 0) {
        key_buffer = property_list.get(i);
    }
    // Values
    else {
        variable_map.put(key_buffer, property_list.get(i));
    }
}

// Match desired output to HashMap fields:
output.VNIC1_MAC = variable_map.get("VNIC1_MAC");
output.VNIC2_MAC = variable_map.get("VNIC2_MAC");
output.VHBA1_WWPN = variable_map.get("VHBA1_WWPN");
output.VHBA2_WWPN = variable_map.get("VHBA2_WWPN");
output.VHBA1_WWN = variable_map.get("VHBA1_WWN");
output.VHBA2_WWN = variable_map.get("VHBA2_WWN");
output.ServiceProfileIdentity = variable_map.get("ServiceProfileIdentity");

The section to focus on for our outputs is the last block of lines.  The code parses through the XML return, builds a JavaScript version of a hash table (called a HashMap), and then pulls the values out by key.  By assigning those values to the output variables in the script, we complete the last step needed to surface that information as UCS Director macros and, thus, pass it on to other parts of our workflow!

You can see here that I can take the MAC of my primary NIC for OS deployment and assign it as the NIC to use for PXE job creation on the Bare Metal Agent server by passing in the macro that I created:

Screen Shot 2017-03-01 at 4.32.19 PM

To Be Continued

The last part of this blog series will take the PowerShell Agent to another level by showing how you can use the PowerShell Agent service to perform northbound API calls to other systems OR even to UCS Director itself.  I’ll show examples of recent enhancements I made to my workflows to enable parallel processing and gain massive efficiencies in the overall execution time of some of the tasks within my datacenter.


Observations on Blame Cultures and the S3 Outage

One would think that this was scripted, the way it happened, but I can assure you, that was not the case.  I recently finished reading a book (a really good book on blame cultures; I highly suggest picking up a copy:  Here).  The day after I finished it, my tech social media feeds were aflame with mentions of problems with AWS (specifically in the S3 service in the US-East-1 region).  Much has been said about the need for proper application architecture using cloud building blocks, and much reflection is happening on whether the cost of that resiliency is worth avoiding a significant outage.  I fully expect that there are plenty of discussions happening within organizations about these very factors.

I found myself not necessarily focused on the incident itself.  I was more interested, strangely enough, on any sort of public post-mortem that would be brought forth.  Having read many DevOps books recently, the concept of a public post-mortem is not new to me, but I can guess that for many private organizations, this could seem like a foreign concept.  When an incident occurs, many in the organization just want the incident to go away.  There’s an extreme negative connotation associated with incident and incident management in many organizations.  To me, post-mortems give me great insight into how an organization treats blame.

Recently, I’ve been doing quite a bit of research into how organizations, specifically IT organizations, deal with blame.  Now, in Amazon’s case, they’ve listed “human error” as a contributing cause of the outage.  What comes after this, in the post-mortem, goes to show how Amazon handles blame internally.  The two quotes below, taken from the post-mortem (available here:  https://aws.amazon.com/message/41926/), are telling in how this was handled internally:

At 9:37AM PST, an authorized S3 team member using an established playbook executed a command which was intended to remove a small number of servers for one of the S3 subsystems that is used by the S3 billing process. Unfortunately, one of the inputs to the command was entered incorrectly and a larger set of servers was removed than intended.

We are making several changes as a result of this operational event. While removal of capacity is a key operational practice, in this instance, the tool used allowed too much capacity to be removed too quickly. We have modified this tool to remove capacity more slowly and added safeguards to prevent capacity from being removed when it will take any subsystem below its minimum required capacity level. This will prevent an incorrect input from triggering a similar event in the future. We are also auditing our other operational tools to ensure we have similar safety checks.

I’ve put in bold my key terminology of this event.  Notice that outside of one mention of the authorized S3 team member, every other mention has something to do with the tools used to perform that action or with the process that would have helped to prevent the issue.  In this case, the root cause is NOT the operator that entered the command; it was the process that led to the input and the associated actions the system took based on the runbook operation.

So, why the long-winded breakdown of the S3 post-mortem?  It got me thinking about all the organizations I’ve worked for in the past, and made me realize that when it comes to any sort of employment change, especially one that involves on-call duty or primary production system ownership, I’ve got a perfect question to ask a potential employer: ask them about their last major internal incident.  While you might not get a full post-mortem, especially if the organization doesn’t believe in the benefit of such a document, key information about the incident and its handling should become immediately apparent.  If the incident involved human error, ask how the operator who performed the action was treated.

Unfortunately, in many IT organizations, the prevailing thought is that a root cause can easily be established: the operator was incapable of performing their role, and immediate termination is a typical reaction to the event.  If not immediate termination, you can rest assured that the organization will forever assign a hidden asterisk to your HR file, and the incident will always be held against you.  Either way, this sort of thought process ends up causing more harm to the organization long term.  Sure, you think you removed the “bad apple” from the mix, but there’s going to be collateral damage in the ranks of those who still must deal with the imperfect technical systems that need their “care and feeding” to function optimally.

Honestly, if this is the sort of response you get from a potential employer, I would end the interview right there and have no more discussions with that organization.  Based on their response to the incident, you can easily see that:

  • The organization has no real sense of or appreciation for the fact that the technical systems IT staff work with on a day-to-day basis are extremely complex
  • Those systems are also designed such that updating or changing them is considered a mandatory operational requirement
  • When change occurs, you can never guarantee the desired outcome 100% of the time. Failure is inevitable.  All you can do is mitigate the damage failure can do to the system in question.
  • Reacting to the incident by putting the entire root cause on the operator is a knee-jerk reaction and precludes you from ever getting to the root cause(s) of your incident
  • Levying a punishment of termination on the operator in question causes a ripple effect through the rest of the staff. The staff are now less likely to accurately report incident information, out of fear (for their employment, of being branded a “bad apple”).  This leads to obscured root causes, which ultimately leads to more failures in the system.

Are you sure you want to work for an organization that prides itself on “just enough analysis” and breeds a culture of self-interest and self-preservation?  No, me neither.  Culture matters in an organization and to those seeking opportunities in it.  It’s best to figure out what the culture really is before realizing you made a major mistake working for an organization that loves to play the name/blame/shame game.


Cisco UCS Director and the PowerShell Agent – Part 3

In this blog post, we will be discussing how we can use the Cisco UCS Director macros (also known as variables) from either workflow inputs or from task outputs.  We will also show how these macros can be passed as arguments to our PowerShell scripts through the PowerShell Agent.

UCS Director Macros

Cisco UCS Director uses a variable system to make the inputs or outputs of tasks usable in subsequent tasks.  Based on the Cisco documentation, they call these macro variables, or “macros” for short.  Not only can these macros come from workflow tasks, but there is also a slew of system macros available.  The orchestration within UCS Director allows you to use these macros not only for workflow inputs and task outputs, but also for many virtual machine level annotations.

When you create a workflow, you can define the inputs that you wish to either be entered by the person running the workflow or defined as admin level inputs that require no manual entry.  You can access this functionality by navigating an admin level session in UCS Director to Policies > Orchestration.  From the Workflows tab, you can click on the +Add Workflow button to create a brand-new workflow.

screen-shot-2017-02-20-at-2-54-12-pm

 

Above is the first screen given to you.  From here, you can set some of the workflow settings, like name, description, and context.  You can also select some default behaviors of the workflow, like whether you want to set default email notifications to the initiating user of the workflow.  For the sake of this example, we’ll just fill out the bare basics (Workflow Name, Folder to place the Workflow).  Keep in mind that a workflow name CANNOT be duplicated, regardless of folder placement.  You will need unique names for all workflows!  Click on Next to advance.

screen-shot-2017-02-20-at-2-58-00-pm

This next screen is where the workflow input magic happens.  By clicking on the + button below the “Associate to Activity” section, you will begin the process of adding a new workflow input to the workflow.

screen-shot-2017-02-20-at-2-59-15-pm

At a bare minimum, the only thing required for UCS Director to enable an input is to give it a label (extremely important for later!) and the input type.  The input type comes in handy when dealing with task inputs that require the input to be in a preformatted type.  We will show examples of this later.  For this post, we will just show creating a Generic Text Input type.

I put in a label of Test and clicked on the Select button.  From there, a listing of all input types is available to be searched through.  In the upper right hand corner of that screen, enter in “generic” and it will filter the listing and look like this:

screen-shot-2017-02-20-at-3-01-14-pm

Clicking on the checkbox for Generic Text Input will highlight the option.  If we click on the Select button, we will see that the workflow input should look like this:

screen-shot-2017-02-20-at-3-03-11-pm

You’ll notice there are a couple of other checkboxes now available.  The first is the Multiline/MultiValue Input checkbox.  This option will allow you to comma-separate multiple inputs, which can be extremely useful when a task can process multiple values as a workflow input.  Otherwise, you can process this list in a Start…End loop in the UCS Director Workflow Designer.  We will get into looping in a workflow in a future blog post.

The last option available is the Admin Input checkbox.  By checking this box, the admin can either select the object from the UCS Director database or enter a hardcoded value for this variable that cannot be changed.

If neither of these checkboxes is selected, the person executing the workflow will be presented with fields in which they must enter their own text string.  Clicking on the Submit button will place this macro into the screen.  You can then finish the workflow creation by clicking Next through the User Outputs screen (this comes in handy once you start implementing the concept of Parent/Child workflows, covered in later blog posts).  Lastly, click on Submit to save the workflow.

Using the Macro in a Workflow Task

Now that we’ve registered a workflow input, we can pass it to any task that accepts a Generic Text Input type for input.  Open the Workflow Designer on this new workflow we’ve created.  Let’s drag a test task over to show this.  Since I like the Execute PowerShell Command task, I’ve dragged that over and have begun filling out the task and advanced the screens to the User Input Mapping section.

screen-shot-2017-02-20-at-3-11-13-pm

In this example, we can see that the PowerShell Agent field takes an input type of Generic Text Input.  You can click on the Map to User Input checkbox, and the User Input drop-down will show all Generic Text Input macros available from either workflow inputs or other task outputs.  Since we have no other task outputs right now, the only macro to choose from is our previously created Test macro.

We can also use this macro as an inline macro for a text field.  If we click on Next, we can advance to the Task Inputs screen.  You can put the value inline by referencing the macro in the following format:  ${<macro name>}

In this case, we will place ${Test} into one of the fields.

screen-shot-2017-02-20-at-3-15-55-pm

The Label field will now automatically use whatever value is input by the workflow executor.

Passing Macro Information to PowerShell

Now that we’ve shown that you can put the macro value into inline fields, we can use this for passing arguments into PowerShell scripts.  From this same task, let’s say there is a PowerShell script called “HelloWorld.ps1” and I need to pass the Test macro to it for processing.  In the Command/Script field, I would put the following:

screen-shot-2017-02-20-at-3-19-06-pm

This is a very primitive way to pass arguments to a PowerShell script.  Inside the script, you can easily grab this string with a single command using the $args array.  You could do it like so:

screen-shot-2017-02-20-at-3-22-39-pm
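In text form, the two screenshots above boil down to something like this (the script path is hypothetical):

# Command/Script field in the Execute PowerShell Command task:
#   C:\Scripts\HelloWorld.ps1 ${Test}
# Inside HelloWorld.ps1, positional arguments land in the $args array:
$test_value = $args[0]   # Receives whatever the ${Test} macro resolved to
Write-Output "Hello, $test_value"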

You can pass many more macros this way; just remember the position in which you put each argument.  From there, you can take the information in those macros and use any of the PowerShell options at your disposal.

To Be Continued…

In the next blog post, we will explore some more advanced techniques using PowerShell.  One of the use cases I’ve found to be highly unique is returning multiple values from a PowerShell script and storing them in multiple task outputs for future usage, using PowerShell hash tables and UCS Director CloupiaScript XML parsing in the form of a Custom Task.


Cisco UCS Director and the PowerShell Agent – Part 2

In this blog post, we will be discussing how to utilize the Cisco PowerShell Agent and the provided Cisco UCS Director task, Execute PowerShell Command.  We will also go over what it takes to parse the response of this task and retrieve information to be used as Cisco UCS Director variables by other tasks in our workflow.

Execute PowerShell Command

First things first: we need to create a new workflow to begin using this task.  You can easily navigate to the workflow designer using the menu bars while logged in as an administrator.  Navigate to the following location:  Policies > Orchestration.  Make sure you are on the Workflows tab.  Create a new workflow from the menu options in these screens.

Once you’ve created the workflow, enter the Workflow Designer.  Along the left side of the window, you should be able to see what sort of tasks are available to be placed in the designer.  In the text entry field near the top, go ahead and enter the word “PowerShell”.  You will find the Cisco created task under the Cloupia Tasks > General Tasks folder.  Click on the task and drag it to the designer layout portion of the screen.  Once you’ve done that, double click on the task to begin editing the task.

You can proceed right through the User Input Mapping section, as we don’t have any inputs to assign to the task’s required values.  Proceed to the “Task Inputs” section of the task edit process.  You should see something like this:

screen-shot-2017-02-10-at-3-42-29-pm

As you can see, I have already entered in some of the values for this task.  This looks very much like what we entered in the last blog post (Cisco UCS Director and the PowerShell Agent – Part 1).  The only major difference is the PowerShell Agent selection box, which is populated with the different PowerShell Agents we’ve registered with UCS Director.

One of the other major differences is that the screen has a rather lengthy scrollbar.  Using it, we can see some other entries that can be made.  For instance, you can perform a rollback of this task in the form of calling another script.  This comes in handy for cleaning up whatever was added or changed in your environment.  As a good example, if you use PowerShell to perform operations in Cisco UCS Manager, when you roll back the workflow and remove those services, you need to undo the changes you made: if you create a service profile and associate it with a blade server, you’d want to disassociate and delete that service profile when the service is no longer necessary.

screen-shot-2017-02-10-at-3-45-46-pm

Other key parts of the task inputs include these last options:

screen-shot-2017-02-10-at-3-50-03-pm

The task lets you specify how you want to handle the task output.  Until recently, the only output format available was XML; with the release of UCS Director 6.0, the option to return the output in JSON format was introduced.  The Depth option comes in handy for the JSON format.  The last component is the Maximum Wait Time, which determines how long UCS Director keeps an eye on this task before it automatically stops checking on it.  Before setting up this task, it’s highly recommended to measure how long you expect the script to take and account for some extra time.

Lastly, pay attention to this final output variable:

screen-shot-2017-02-10-at-3-53-03-pm

When it comes to parsing the output of the script, this is the value we need to pass to a parsing task to retrieve information for other UCS Director tasks in our workflow.  Note that this comes back as UCS Director’s implementation of a generic text input object.

Parsing the Response

As of UCS Director 6.0, a new Cisco-created task called Parse PowerShell Output is included in UCS Director.  This task is relatively decent at retrieving simple values from the returned text and creating a single UCS Director variable.  To work with it, drag the task into the Workflow Designer.  Upon reaching the User Input Mapping section of the task, we need to map its input to the output of our Execute PowerShell Command task.  You should be able to find it in the drop-down menu when you select that you want to map this object to user input.  It should look something like this:

screen-shot-2017-02-10-at-3-59-31-pm

In the output section, you’ll see the following values that should be available after processing the text we are giving to this task:

screen-shot-2017-02-10-at-4-01-45-pm

These variables will store parsed information from our PowerShell script and allow for us to use these values as inputs into other UCS Director tasks.

Caveats

If you’ve worked with PowerShell, you can easily see that this task only handles a single set of key/value pairs.  If you are attempting to return many pieces of information, this is going to be a problem.  This is where some custom task authoring comes in handy.  I would highly suggest the examples on the UCS Director communities site (UCSD Workflow INDEX).  Armed with some of these older workflows, you can go through the CloupiaScript/JavaScript code to see how the XML return can be parsed and all values returned, especially if you are returning a PowerShell hashtable.

To be continued…

In the next blog post, we continue the discussion of how to send arguments to your PowerShell scripts…


Cisco UCS Director and the PowerShell Agent – Part 1

If you’ve installed Cisco UCS Director before, you know that there exists a small component that can be installed onto a Windows device to allow remote execution of PowerShell scripts.  These scripts can be harnessed in ways that add automation and orchestration functionality to UCS Director where native integration may not already exist.  This series of posts explains what the use cases for the PowerShell Agent are, what the PowerShell Agent does, and how you can utilize PowerShell for some advanced techniques within the Cisco UCS Director platform.

Installation and Configuration

 I will not go into the full details of the base install and configuration of the agent onto a Windows server in your UCS Director environment.  I will make sure to mention, however, that you should remember these key details:

  1. Cisco UCS Director and the PowerShell Agent services’ default communication port is 43981/tcp
  2. An access key is created by the main UCS Director services that will be used for secure communication on that port. Make sure to copy that key from the main UCS Director interface and enter it into the PowerShell Agent configuration on the Windows server where the Agent was installed
  3. Consider the third-party modules you plan on installing and check their support matrices for the maximum versions of the .NET Framework and PowerShell you can install on the Agent. I would recommend going as high as you can to start.  (Special note:  at the time of writing this post, PowerShell 6.0 was still in alpha and I would not consider that version ready for production use, especially due to many issues I’ve personally had with opening remote PowerShell sessions.)
  4. You will likely have to configure WinRM on any device you plan to have the Agent server communicate with. By default, most WinRM configurations have an empty TrustedHosts list, which disallows every host (see the example after this list)
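A common pattern, assuming a non-domain (non-Kerberos) setup; the hostnames are made up, and both commands need an elevated PowerShell session:

# On each target device, make sure remoting is enabled:
Enable-PSRemoting -Force
# On the Agent server itself, trust the target for non-Kerberos connections:
Set-Item WSMan:\localhost\Client\TrustedHosts -Value "target01.example.com" -Concatenate -Force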

The Flow of Communication

One of the misconceptions about the PowerShell Agent (or at least what I thought for the longest time) was that, by default, the PowerShell Agent processes all the PowerShell requests locally.  This proved to be quite incorrect.  The PowerShell Agent initiates a remote PowerShell session (PSSession) to whatever device you point it at, provided the device is covered by the WinRM configuration and reachable on the default WinRM port.

Breaking through this wall was key to start understanding exactly how to troubleshoot some potential problems with the PowerShell Agent and some key PowerShell cmdlets.  You can read more about that in another blog post I wrote here:  Using Invoke-WebRequest with the Cisco PowerShell Agent

This idea comes in handy, especially when certain PowerShell modules may not function correctly unless on a host with certain Windows features.  During many of the test sessions I’ve opened with various devices in my lab, I came across a unique case in which Windows Active Directory cmdlets could not be executed unless run from a device that had a specific Windows Active Directory role associated with it.

Testing Communication

You can easily test communication with a PowerShell Agent in the UCS Director interface.  As an administrator of UCS Director, navigate to the following location:  Administration > Virtual Accounts.  Select the PowerShell Agents tab.  You should see the PowerShell Agent you registered with your Director instance.  If you select it, two new task options appear in the bar above the list.  Select Test Connection if you need to test simple network communication to the device.  Select Execute Command if you would like to initiate communication with the PowerShell Agent service and get information back from PowerShell to UCS Director.

screen-shot-2017-02-10-at-11-32-25-am

Above is the screen you are presented with.  You must provide these five values whenever communicating with the PowerShell Agent.  As most of them are self-explanatory, I won’t go into basic details, but I will state that there are some nuances to the Commands/Script field.  There are certain character sequences that you may have problems with, like “/” and “$”.  The reason is that the PowerShell Agent is essentially running an Invoke-Command cmdlet within the remote PSSession, and there are nuances in sending special characters to that cmdlet that must be taken into consideration, or your syntax will be off when the command is remotely executed.

What I’m going to do here is show a quick example of how information is returned from the PowerShell Agent.  In my lab, I have a device called UCSD-PowerShell, on which I’m going to run the simple cmdlet Get-Host.  This screen shows what I filled out:

screen-shot-2017-02-10-at-12-51-22-pm

After clicking on the Execute button, I am told that my command completed successfully.

screen-shot-2017-02-10-at-12-52-09-pm

If I scroll down, I can see some formatted output of the response:

screen-shot-2017-02-10-at-12-52-52-pm

This appears to be the object information I would get from a Get-Host, with some of the PowerShell session information sprinkled in there.

To Be Continued

 In the next post in this series, we will explore using these building blocks and creating a small UCS Director workflow that uses the Execute PowerShell Command task and what we can do with the response to use returned PowerShell object information in other UCS Director workflow tasks.


Using Invoke-WebRequest with the Cisco PowerShell Agent

I’ve had a major initiative at the Day Job to overhaul some of our existing Cisco UCS Director workflows, to squeeze out more efficiency and reduce the potential critical stopping points in them as we’ve continued to evolve our technical processes.  While Cisco UCS Director has some great visual tools for mapping out workflows, the tasks within are sometimes very rigid, and the way task flow happens tends to force a single-task focus.  This doesn’t bode well when trying to reduce the overall execution time of a workflow, as you end up working in a pattern in which tasks can’t be executed independently of each other.  As an example, to execute the fourth task, which has no dependency upon tasks one through three, you have to wait for tasks one through three to execute.  In my opinion, as long as I understand how my workflow is going to function, this is highly inefficient.

To try to curb this issue, I researched the idea of parallel processing with UCS Director.  There are some examples using the native JavaScript implementation of the tool (UCSD Parallel Workflow Execution Example), but I was much more interested in flexing my muscle with PowerShell, as that’s my language of choice.  Cisco ships a tool for remote execution of PowerShell code in the form of an agent that can be installed on a Windows device and added to the configuration of Cisco UCS Director.  From there, you can specify the account information and the script/code block you wish to execute.  A response is sent back to UCS Director, and you can use some XML parsing techniques on it, which can be very handy if you need variables back for other parts of workflows.

To make this happen, I realized we would need to execute some of Cisco UCS Director’s REST APIs to launch workflows from within my script.  In PowerShell, this usually means pulling out the Invoke-WebRequest cmdlet.  In the case of this cmdlet and Cisco UCS Director, you typically need three things to call the REST APIs:  the URI, the header (including your X-Cloupia-Request-Key name/value pair in the form of a hashtable), and the method type.
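Put together, the call I was attempting looks roughly like this (the hostname, key, and opName/opData values are placeholders):

$headers = @{ "X-Cloupia-Request-Key" = "*API key from the UCSD GUI*" }
$uri = "https://ucsd.example.com/app/api/rest?formatType=json&opName=userAPIGetWorkflowStatus&opData={param0:1234}"
$response = Invoke-WebRequest -Uri $uri -Headers $headers -Method Get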

Unfortunately, this didn’t work as easily as advertised.  When I started to trace exactly what the Cisco PowerShell Agent does, I found that the service really does nothing more than create a remote PowerShell session to the target you specify.  In my case, I tend to point the PSA at itself, as I have my modules and scripts easily accessible from that device.  When trying to execute an Invoke-WebRequest cmdlet through this created session, I received the following error:

screen-shot-2017-01-18-at-2-28-30-pm

Looking through the Cisco PowerShell Agent log file, we find that error 3 is a pretty generic error; any sort of PowerShell error will trigger the PSA to report it.  The log file includes some of the error message, so I was able to find that the specific error was “Object reference not set to an instance of an object.”  Anyone who’s done enough PowerShell authoring knows this response well: typically, one of your arguments is either null or of the wrong type.  So, I decided to try a couple of troubleshooting techniques to see what the issue was with Invoke-WebRequest.  I first tried to nest the call in a try..catch sequence so I could potentially get a look at the error in question.  Unfortunately, the same problem occurred, and I was not presented with any sort of error.
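The try..catch attempt looked roughly like this, reusing $uri and $headers from the earlier sketch; the catch block simply never fired inside the PSA-created session:

try {
    $response = Invoke-WebRequest -Uri $uri -Headers $headers -Method Get
    return $response.Content
}
catch {
    return $_.Exception.Message   # Expected the null-reference text here; nothing ever came back
}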

I felt this was very odd, as it meant that something was happening at the level of the remote PowerShell session the Cisco PSA creates.  Armed with the idea that the error message might mean something was up with my arguments to the cmdlet, I decided to look into some of the Invoke-WebRequest parameters.  I found the -UseBasicParsing parameter and decided to give that a whirl.  As you can see by the results below, it worked.

screen-shot-2017-01-18-at-2-32-26-pm

Now, UseBasicParsing isn’t a required parameter of the cmdlet.  I wanted to make sure that I could actually catch an error message within this remote PowerShell session, so I found another parameter, -Proxy, and fed a dummy domain name and port to it.  This was the response.

screen-shot-2017-01-18-at-2-34-30-pm

Now that’s more like it!  That’s the type of error object I was expecting.  For this test, I only supplied the Proxy parameter and did not apply the UseBasicParsing parameter.  At this point, I was really starting to think something was going on between the Cisco PSA and my cmdlet.  To rule out any remote PowerShell session issues, I wrote a quick script (below) that created a new remote PowerShell session (to the same server) and tried launching it through the Cisco PSA, again without the UseBasicParsing parameter.

Script:


$username = "*username*"
$the_password = ConvertTo-SecureString -String "*password*" -AsPlainText -Force
$the_cred = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $username, $the_password
$the_session = New-PSSession -ComputerName *PSA IP/FQDN* -Credential $the_cred   # Nested session back to the PSA server itself
$response = Invoke-Command -Session $the_session -ScriptBlock { Invoke-WebRequest -Uri *HTTP/HTTPS URI* }   # Note: no UseBasicParsing
$the_session | Disconnect-PSSession | Remove-PSSession   # Clean up the nested session
return $response

Result:

 

screen-shot-2017-01-18-at-2-40-00-pm

From the response, I got what I needed: a Content property, along with the properties of the remote PSSession that was created to get this response.

I have passed this information along to the Cisco UCS Director people, and (long story) once I get my support contract renewed, I may run this through Cisco TAC to see if it can be logged as a potential defect (due to my belief that Invoke-WebRequest isn’t being handled correctly by the PSA).

So, a forewarning for those trying to use PowerShell and Invoke-WebRequest (and, to some degree, Invoke-RestMethod): be wary of some weird issues with the session, and remember to either use the UseBasicParsing parameter on the cmdlet OR resort to nesting remote PowerShell sessions.  In my case, I stuck with the nested remote PowerShell session.  I’ll provide an update when/if I get this case to TAC.
