Are VMs becoming El Caminos? Containers & Metal provide new choices for DevOps

I released “VMS ARE DEAD” this post two weeks ago on DevOps.com.  My point here is that Ops Automation (aka DevOps) is FINALLY growing beyond Cloud APIs and VMs.  This creates a much richer ecosystem of deployment targets instead of having to shoehorn every workload into the same platform.

In 2010, it looked as if visualization had won. We expected all servers to virtualize workloads and the primary question was which cloud infrastructure manager would dominate. Now in 2015, the picture is not as clear. I’m seeing a trend that threatens the “virtualize all things” battle cry.

IMG_20150301_170558985Really, it’s two intersecting trends: metal is getting cheaper and easier while container orchestration is advancing on rockets. If metal can truck around the heavy stable workloads while containers zip around like sports cars, that leaves VMs as a strange hybrid in the middle.

What’s the middle? It’s the El Camino, that notorious discontinued half car, half pick-up truck.

The explosion of interest in containerized workloads (I know, they’ve been around for a long time but Docker made them sexy somehow) has been creating secondary wave of container orchestration. Five years ago, I called that Platform as a Service (PaaS) but this new generation looks more like a CI/CD pipeline plus DevOps platform than our original PaaS concepts. These emerging pipelines obfuscate the operational environment differently than virtualized infrastructure (let’s call it IaaS). The platforms do not care about servers or application tiers, their semantic is about connecting services together. It’s a different deployment paradigm that’s more about SOA than resource reservation.

On the other side, we’ve been working hard to make physical ops more automated using the same DevOps tool chains. To complicate matters, the physics of silicon has meant that we’ve gone from scale up to scale out. Modern applications are so massive that they are going to exceed any single system so economics drives us to lots and lots of small, inexpensive servers. If you factor in the operational complexity and cost of hypervisors/clouds, an small actual dedicated server is a cost-effective substitute for a comparable virtual machine.

I’ll repeat that: a small dedicated server is a cost-effective substitute for a comparable virtual machine.

I am not speaking against virtualize servers or clouds. They have a critical role in data center operations; however, I hear from operators who are rethinking the idea that all servers will be virtualized and moving towards a more heterogeneous view of their data center. Once where they have a fleet of trucks, sports cars and El Caminos.

Of course, I’d be disingenuous if I neglected to point out that trucks are used to transport cars too. At some point, everything is metal.

Want more metal friendly reading?  See Packet CEO Zac Smith’s thinking on this topic.

Need a physical ops baseline? Crowbar continues to uniquely fill gap

Robots Everywhere!I’ve been watching to see if other open “bare metal” projects would morph to match the system-level capabilities that we proved in Crowbar v1 and honed in the re-architecture of OpenCrowbar.  The answer appears to be that Crowbar simply takes a broader approach to solving the physical ops repeatably problem.

Crowbar Architect Victor Lowther says “What makes Crowbar a better tool than Cobbler, Razor, or Foreman is that Crowbar has an orchestration engine that can be used to safely and repeatably deploy complex workloads across large numbers of machines. This is different from (and better than, IMO) just being able to hand responsibility off to Chef/Puppet/Salt, because we can manage the entire lifecycle of a machine where Cobbler, Razor and Chef cannot, we can describe how we want workloads configured at a more abstract level than Foreman can, and we do it all using the same API and UI.”

Since we started with a vision of an integrated system to address the “apply-rinse-repeat” cycle; it’s no surprise that Crowbar remains the only open platform that’s managed to crack the complete physical deployment life-cycle.

The Crowbar team realized that it’s not just about automation setting values: physical ops requires orchestration to make sure the values are set in the correct sequence on the appropriate control surface including DNS, DHCP, PXE, Monitoring, et cetera.  Unlike architectures for aaS platforms, the heterogeneous nature of the physical control planes requires a different approach.

We’ve seen that making more and more complex kickstart scripts or golden images is not a sustainable solution.  There is simply too much hardware variation and dependency thrash for operators to collaborate with those tools.  Instead, we’ve found that decomposing the provisioning operations into functional layers with orchestration is much more multi-site repeatable.

Accepting that physical ops (discovered infrastructure) is fundamentally different from cloud ops (created infrastructure) has been critical to architecting platforms that were resilient enough for the heterogeneous infrastructure of data centers.

If we want to start cleaning up physical ops, we need to stop looking at operating system provisioning in isolation and start looking at the full server bring up as just a part of a broader system operation that includes networking, management and operational integration.