Imagine, if you will, that you are part of a server operations team. As a member of that team, you are expected to keep the multiple layers of each server up to date. When you have only a handful of servers, this isn’t a monumental task. However, as the business you work for grows, the server farm grows larger and larger. Your laissez-faire approach to the upkeep of said servers quickly consumes what limited time you have. Unfortunately, your vendor of choice (or, heaven forbid, multiple vendors) chooses to stick with a proprietary set of technologies and tools to perform these needed upgrades. First the scale of the task got out of control, and now the tools have become cumbersome as well. You are really in a bind, and no matter how much screaming you do, it is not going to get better. Or is it?
The Way We’ve Always Done It
For decades, server maintainers have had the unfortunate pleasure of being presented with IPMI (Intelligent Platform Management Interface) as their primary interface for out-of-band management of their servers. This has led to the rise of the BMC (Baseboard Management Controller) in many of the servers we see in our data centers today. If you’ve ever connected to a Dell DRAC/iDRAC, HPE iLO, IBM Remote Supervisor Adapter, or Cisco IMC device, you’ve interacted with servers at this level.
Now, the problem with these systems wasn’t IPMI itself. Standards are generally a good thing (well, unless you have a bad standard to start from). The problem was that each of the companies listed above did its own interpretation and implementation of the standard. The approach Dell EMC used differed greatly from that of competitors in the very same space, like Cisco, HPE, or Lenovo. For each server brand, then, there was a completely different set of tools for working with IPMI on that device. If you have a large data center with multiple vendors, the last thing you ever look forward to is MORE TOOLS to manage it!
Enough is Enough
Somewhere along the line, I believe the server vendors realized that their own proprietary methods were causing entirely too much strife in their customer bases. Around 2015, the DMTF (Distributed Management Task Force), with considerable help from chairpersons from Dell, began creating and ratifying a new standard called Redfish. This standard was meant to provide a common RESTful API that could be used to interface with any vendor’s server and perform many of the rudimentary tasks that had become so proprietary. Personally, I had heard of Redfish and the adoption of that standard only recently. However, I was unaware of the history of the standard and how influential Dell (and Dell EMC) has been in shaping it.
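To give a feel for what a common RESTful API buys you, here is a minimal Python sketch of asking a Redfish service which systems it manages. The `/redfish/v1/Systems` collection, the `Members` and `@odata.id` fields, and the `Manufacturer`, `Model`, `BiosVersion`, and `PowerState` properties come from the Redfish standard. Everything else (the `make_http_fetcher` and `list_systems` names, the use of HTTP Basic authentication, the host and credentials) is my own assumption for illustration, not anything Dell EMC presented, and a real BMC may expose only some of these properties.

```python
import base64
import json
import urllib.request
from typing import Callable, Dict, List

# A fetcher takes a Redfish path such as "/redfish/v1/Systems"
# and returns the parsed JSON body as a dict.
Fetcher = Callable[[str], dict]


def make_http_fetcher(host: str, username: str, password: str) -> Fetcher:
    """Build a fetcher that talks to a BMC over HTTPS with Basic auth.

    Hypothetical helper: host and credentials are placeholders. Many BMCs
    ship with self-signed certificates, so certificate verification may
    fail until the BMC's certificate is trusted.
    """
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    headers = {"Authorization": f"Basic {token}", "Accept": "application/json"}

    def fetch(path: str) -> dict:
        request = urllib.request.Request(f"https://{host}{path}", headers=headers)
        with urllib.request.urlopen(request, timeout=10) as response:
            return json.load(response)

    return fetch


def list_systems(fetch: Fetcher) -> List[Dict[str, str]]:
    """Walk the Systems collection and summarize each member."""
    collection = fetch("/redfish/v1/Systems")
    summaries = []
    for member in collection.get("Members", []):
        # Each member is a link; follow it to get the full resource.
        system = fetch(member["@odata.id"])
        summaries.append({
            "id": system.get("Id", ""),
            "manufacturer": system.get("Manufacturer", ""),
            "model": system.get("Model", ""),
            "bios_version": system.get("BiosVersion", ""),
            "power_state": system.get("PowerState", ""),
        })
    return summaries
```

Against a real server you would call something like `list_systems(make_http_fetcher("bmc.example.com", "user", "secret"))`, where all three arguments are placeholders. The point is that the same few lines should work whether the box came from Dell EMC, HPE, Lenovo, or Cisco, which is exactly what the vendor-specific IPMI tooling never offered.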
While I was attending Tech Field Day 16 recently, a very important question was asked of Dell EMC. Why did this take so long to become a reality? Honestly, this question is likely very complex to answer. Let’s be frank about all vendors here. All vendors LOVE their unique ways of approaching complex problems. Many of them pride themselves on their intellectual property. There’s a level of inventiveness and creativity to some of the vendor approaches for using the IPMI “standard”. Ultimately, though, what a vendor wants has to yield to where its users are trending. The users spoke, and they wanted fewer nerd knobs and more shared experiences from vendor to vendor.
Meltdown and Spectre
As if server technicians weren’t already under the gun trying to keep their growing server farms up to date, along came a double whammy. There’s no need to go into the details of these two vulnerabilities here. What matters is what they mean for a server operations staff in a large enterprise environment: firmware updates, and many variants of them.
Now, not every large enterprise had the wherewithal to keep up with the necessary patching before these vulnerabilities came to light, but this forced everyone to get up to speed on their processes and procedures for updating all their servers. Any “set it and forget it” stance on firmware quickly went up in flames and, hopefully, will never be heard from again. Many of these organizations finally came face to face with a cold, hard fact: updating firmware across a large server farm is the absolute worst of the worst!
So Long and Thanks for All the Fish?
Now, from a personal perspective, I have vivid recollections of having to roll multiple firmware updates across server farms numbering in the thousands of devices. It was not uncommon for my team and me to spend inordinate amounts of time just working with firmware updating tools that felt half-baked and required a great deal of handholding to perform their documented task. Many hours of productivity were lost, and it felt as if we were drowning in firmware updates in that environment. It’s very unfortunate that it took this long for the Redfish API standard to appear.
Now, if there is a bright side to the development of the Redfish API standard, it’s that it’s going to have siblings. Dell EMC is continuing to work with the DMTF to drive development of other API standards for the data center. Keep an eye out, as you might see APIs coming for shared storage (“Swordfish”), network switches, power, HVAC, and security systems.
While these new standards may not set the world alight from a technical perspective, they are something to pay attention to. Complexity at scale is what turns a rudimentary operation into a monumental nightmare. Anything, and I mean anything, is better than the vendor-specific implementations we have on these platforms today. Kudos to Dell (and now Dell EMC) for continuing the drive toward common APIs to lessen this pain.