Extending Chef’s reach: “Managed Nodes” for External Entities.

Note: this post is very technical and relates to detailed Chef design patterns used by Crowbar. I apologize in advance for the post’s opacity. Just unleash your inner DevOps geek and read on. I promise you’ll find some gems.

At the Opscode Community Summit, Dell’s primary focus was creating an “External Entity” or “Managed Node” model. Matt Ray prefers the term “managed node,” so I’ll defer to that name for now. This model is needed for Crowbar to manage system components that cannot run an agent, such as a network switch, blade chassis, IP power distribution unit (PDU), or SAN array. The concept for a managed node is that an instance of the chef-client agent acts as a delegate for the external entity. We’ve been reluctant to call it a “proxy” because that term is so overloaded.

My Crowbar vision is to manage an end-to-end cloud application life-cycle. This starts with power and network connections, moves through hardware RAID and BIOS, continues up to the services installed on each node, and ultimately reaches the applications installed in VMs on those nodes.

Our design goal is that you can control a managed node with the same Chef semantics that we already use. For example, adding a Network proposal role to the Switch managed node will force the agent to update its configuration during the next chef-client run. During the run, the managed node will see that the network proposal has several VLANs configured in its attributes. The node will then update the actual switch entity to match the attributes.
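
To make that concrete, here is a minimal sketch of what such a role might look like, written in the Chef role Ruby DSL. The role name, run list, and attribute layout are illustrative assumptions, not the actual schema used by Crowbar’s network barclamp.

```ruby
# roles/network-switch-managed.rb -- a sketch, not Crowbar's actual network barclamp schema.
name "network-switch-managed"
description "Applies the network proposal's VLAN configuration to a switch managed node"
run_list "recipe[network::switch]"
default_attributes(
  "network" => {
    "vlans" => {
      "666" => { "name" => "storage", "tagged_ports" => ["Gi1/0/1", "Gi1/0/2"] },
      "667" => { "name" => "admin",   "tagged_ports" => ["Gi1/0/3"] }
    }
  }
)
```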

Design Considerations

There are five key aspects of our managed node design. They are configuration, discovery, location, relationships, and sequence. Let’s explore each in detail.

A managed node’s configuration is different from a service or actuator pattern. The core concept of a node in Chef is that the node owns its configuration. You make changes to the node’s configuration, and it’s the node’s job to manage its state to maintain that configuration. In a service pattern, the consumer manages specific requests directly. At the summit (with apologies to Bill Clinton), I described Chef configuration as telling a node what it “is,” while a service provides verbs that change a node. The critical difference is that a node is expected to maintain its configuration as its composition changes (e.g., the node is now connected for VLAN 666), while a service responds to specific change requests (the node adds a tag for VLAN 666). Our goal is to maintain Chef’s configuration management concept for the external entities.
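
As a sketch of the “is” model, the managed node’s chef-client run would read the declared VLANs from its attributes and converge the switch to match, adding and removing VLANs as needed. The `switch_session` helper below is hypothetical, standing in for whatever CLI, SNMP, or vendor API library a real agent would use.

```ruby
# Recipe fragment (sketch): converge the switch to the VLANs declared in this managed
# node's attributes. `switch_session` and its methods are hypothetical stand-ins for
# whatever CLI/SNMP/vendor API library a real agent would use.
declared = node["network"]["vlans"].keys.map(&:to_i)  # what the switch *should* have
actual   = switch_session.list_vlans                  # what the switch has right now

(declared - actual).each { |vlan| switch_session.create_vlan(vlan) }  # add missing VLANs
(actual - declared).each { |vlan| switch_session.delete_vlan(vlan) }  # drop stale VLANs
```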

Managed nodes also have a resource discovery concept that must align with the current ohai discovery model. Like a regular node, the managed node’s data attributes reflect the state of the managed entity; consequently, we’d expect a blade chassis managed node to enumerate the blades it contains. This creates an expectation that the managed node appears to be “root” for the entity that it represents. We are also assuming that the Chef server can be trusted with the shareable discovered data. There may be cases where these assumptions do not hold, but we are making them for now.
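
For illustration, discovery could be expressed with the Ohai plugin DSL, with the managed node’s agent populating attributes from the chassis instead of from local hardware. The plugin name, attribute layout, and static sample data below are assumptions, not a real Crowbar plugin.

```ruby
# Ohai plugin sketch: make a blade chassis managed node enumerate its blades as node
# attributes, the way ohai reports CPUs or NICs for a regular node. The plugin name,
# attribute layout, and static sample data are assumptions for illustration only.
Ohai.plugin(:BladeChassis) do
  provides "blades"

  collect_data(:default) do
    blades Mash.new
    # A real plugin would query the chassis management controller (IPMI, SNMP,
    # vendor API); a static entry stands in for that here.
    blades["slot_1"] = { "present" => true, "service_tag" => "EXAMPLE1" }
  end
end
```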

Another essential element of managed nodes is that their agent location matters, because the external resource generally has restricted access. There are several examples of this requirement. Switch configuration may require a serial connection from a specific node. Blade chassis, SAN, and PDU management ports are restricted to specific networks. This means that the managed node agents must run from a specific location. That location is not important to the Chef server or to the nodes’ actions against the managed node; however, it’s critical for the system when starting the managed node agent. While it’s possible for managed node agents to run on nodes outside the overall Chef infrastructure, our use cases make it more likely that they will run as independent processes on regular nodes. This means we’ll have to add some relationship information for managed nodes, and perhaps a barclamp to install and manage them.

All of our use cases for managed nodes have a direct physical linkage between the managed node and server nodes. For a switch, it’s the ports connected. For a chassis, it’s the blades installed. For a SAN, it’s the LUNs exposed. These links imply a hierarchical graph that is not currently modeled in Chef data; in fact, it’s completely missing and difficult to maintain. At this time, it’s not clear how we or Opscode will address this. My current expectation is that we’ll use yet more roles to capture the relationships and add some hierarchical UI elements into Crowbar to help visualize it. We’ll also need to comprehend node types, because “managed node” is too generic in our UI context.
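
As a sketch of the “yet more roles” idea, a per-switch role could carry the physical link data in its attributes. The role name and attribute layout are assumptions; Crowbar may end up modeling this differently.

```ruby
# roles/switch-sw01-links.rb -- sketch of capturing physical links as role data.
# The role name and attribute layout are assumptions; Crowbar may model this differently.
name "switch-sw01-links"
description "Records which node NICs are cabled to which ports on switch sw01"
default_attributes(
  "managed_node" => {
    "links" => {
      "node1.example.com" => { "nic" => "eth0", "switch_port" => "Gi1/0/1" },
      "node2.example.com" => { "nic" => "eth0", "switch_port" => "Gi1/0/2" }
    }
  }
)
```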

Finally, we have to consider the sequence of actions between managed nodes and nodes. In all of our use cases, the steps to bring up a node require orchestration with the managed node. Specifically, there needs to be a hand-off between the managed node and the node. For example, installing an application that uses VLANs does not work until the switch has created the VLAN. The same challenges exist for LUNs on a SAN and for blades in a chassis. Crowbar provides orchestration that we can leverage, assuming we can declare the linkages.
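
To make the hand-off concrete, here is a minimal sketch of a guard that a node-side recipe could use to confirm the switch managed node has already created the VLAN before proceeding. The node name and attribute path are illustrative; in practice, Crowbar’s orchestration would sequence the chef-client runs rather than relying on a failing guard.

```ruby
# Recipe sketch for the node side of the hand-off. Node name and attribute path are
# illustrative; Crowbar's orchestration would normally sequence the runs instead of
# depending on a failing guard like this one.
switch = search(:node, "name:switch-sw01").first
vlans  = (switch && switch["network"] && switch["network"]["vlans"]) || {}

unless vlans.key?("666")
  raise "VLAN 666 has not been provisioned on switch-sw01 yet; " \
        "re-run after the managed node has converged"
end
```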

For now, a hack to get started…

We’ve started on a workable hack for managed nodes. This involves running multiple chef-clients on the admin server, each with its own path and process. We’ll also have to add yet more roles to comprehend the relationships between the managed nodes and the things connected to them. Watch the crowbar listserv for details!
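
One way this hack might look (all paths and names here are illustrative): give each managed node its own client.rb and client key under a separate directory on the admin server, then start an independent chef-client process against each config.

```ruby
# /etc/chef/managed/switch-sw01/client.rb -- one config per managed node (sketch).
# Paths, node name, and server URL are illustrative.
chef_server_url "https://admin.example.com"
node_name       "switch-sw01"
client_key      "/etc/chef/managed/switch-sw01/client.pem"
validation_key  "/etc/chef/validation.pem"
log_location    "/var/log/chef/switch-sw01.log"

# Each managed node's agent then runs as its own process, for example:
#   chef-client -c /etc/chef/managed/switch-sw01/client.rb -i 900
```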

Extra Credit

Notes on the Opscode wiki from the Crowbar & Managed Node sessions

CAP Chasm: why clouds say “no SANk you” to SANs

My personal bias against SANs in cloud architectures is well documented; however, I am in the minority at my employer (Dell), and few enterprise IT shops share my view. In his recent post about the CAP theorem, Dave McCrory has persuaded me to look beyond their failure to bask in my flawless reasoning. Apparently, this crazy CAP thing explains why some people love SANs (enterprise) and others don’t (clouds).

The deal with CAP is that you can only have two of Consistency, Availability, and Partition Tolerance. Since everyone wants Availability, the choice is really between Consistency and Partition Tolerance. Seeking Availability, you’ve got two approaches:

  1. Legacy applications tried to eliminate faults to achieve Consistency with physically redundant scale-up designs.
  2. Cloud applications assume faults to achieve Partition Tolerance with logically redundant scale-out designs.

According to CAP, legacy and cloud approaches are so fundamentally different that they create a “CAP Chasm” in which the very infrastructure fabric needed to deploy these applications is different.

As a cloud geek, I consider a CA approach much too limited because of its inherent cost and scale constraints. My first-hand experience is that our customers and partners share my view: they have embraced AP patterns. These patterns make more efficient use of resources, dictate simpler infrastructure layouts, scale like hormone-crazed rabbits at a carrot farm, and can be deployed on less expensive commodity hardware.

As a CAP theorem enlightened IT professional, I can finally accept that there are other intellectually valid infrastructure models. 

See Mom?  I can play nicely with others after all.

Rethinking Storage

Or “UNthinking SANs”

Back in 2001, I was co-founder of a start-up building the first Internet virtualized cloud. Dual-CPU 1U pizza-box servers were brand new, and we were ready to build out an 8-node, 64-VM cloud! It was going to be a dream: all that RAM and CPU just begging to be oversubscribed. It was enough to make Turing weep for joy.

Unfortunately, all those VMs needed lots and lots of storage.

Never fear, EMC was more than happy to quote us a lovely SAN with plenty of redundant HBAs and interconnected fabric switches. It was all so shiny and cool, yet totally unscalable and obscenely expensive. Yes, unscalable, because that nascent 8-node cloud was already at the port limit for the solution! Yes, expensive, because that $50,000 hardware solution would have needed a $1,000,000 storage solution!

The funny part is that even after learning all that, we still wanted to buy the SAN.  It was just that cool.

We never bought that SAN, but we did buy a very workable NAS device.  Then it was my job to change (“pragmatic-ize”) our architecture so that our cloud management did not require expensive shiny objects.

Our ultimate solution used the NAS for master images that were accessed by many nodes. These requests were mainly reads and were easy to optimize. Writes went to differencing disks kept on local storage, which scaled very well. In some systems, we were able to keep the masters local and save bandwidth. This same strategy could easily be applied in current “stateless” VM deployments.

Some of the SANless benefits are:

  • Lower cost
  • Simplicity of networking and management
  • Nearer to linear scale out
  • Improved I/O throughput
  • Better fault tolerance (storage faults are isolated to individual nodes)

Of course, there are costs:

  • More spindles means more energy use (depending on drive selection and other factors)
  • Lack of centralized data management
  • Potentially wasted space because each system carries excess capacity
  • The need to synchronize data stored in multiple locations

These are real costs; however, I believe the data management problems are unsolved issues for SAN deployments too. Data proliferation is simply hidden inside the VMs.

Today, I observe many different SAN-focused architectures and cringe. These same solutions could be much simpler, more scalable, and dramatically more affordable with minimal (or even no) changes. If you’re serious about deploying a cloud based on commodity systems, then you seriously need to re-evaluate your storage.