Immutable Infrastructure Delivery on Metal : See RackN at Data Center World



The RackN team is heading to San Antonio, TX next week for Data Center World, March 12 – 15. Our co-founder/CEO Rob Hirschfeld is giving a talk on immutable infrastructure for bare metal in the data center (see session information below).

We are interested in meeting and talking with fellow technologists. Contact us this week so we can setup times to meet at the event. If you are able to attend Rob’s session be sure to let him know you saw it here on the RackN blog.

RackN Session

March 12 at 2:10pm
Room 206AM
Session IT7
Tracks: Cloud Services, Direct Access

Operate your Data Center like a Public Cloud with Immutable Infrastructure

The pressure on IT departments to deliver services to internal customers is considerably higher today as public cloud vendors are able to operate on a massive scale, forcing CIOs to challenge their own staff to raise the bar in data center operation. Of course, enterprise IT departments don’t have the large staff of an AWS or Azure; however, the fundamental process running those public clouds is now available for consumption in the enterprise. This process is called “immutable infrastructure” and allows servers to be deployed 100% ready to run without any need for remote configuration of access. It’s called immutable because the servers are deployed from images produced by CI/CD process and destroyed after use instead of being reconfigured. It’s a container and cloud pattern that has finally made it to physical. In this talk, we’ll cover the specific process and its advantages over traditional server configuration.

Hidden costs of Cloud? No surprises, it’s still about complexity = people cost

Last week, Forbes and ZDnet posted articles discussing the cost of various cloud (451 source material behind wall) full of dollar per hour costs analysis.  Their analysis talks about private infrastructure being an order of magnitude cheaper (yes, cheaper) to own than public cloud; however, the open source price advantages offered by OpenStack are swallowed by added cost of finding skilled operators and its lack of maturity.

At the end of the day, operational concerns are the differential factor.

The Magic 8 Cube

The Magic 8 Cube

These articles get tied down into trying to normalize clouds to $/vm/hour analysis and buried the lead that the operational decisions about what contributes to cloud operational costs.   I explored this a while back in my “magic 8 cube” series about six added management variations between public and private clouds.

In most cases, operations decisions is not just about cost – they factor in flexibility, stability and organizational readiness.  From that perspective, the additional costs of public clouds and well-known stacks (VMware) are easily justified for smaller operations.  Using alternatives means paying higher salaries and finding talent that requires larger scale to justify.

Operational complexity is a material cost that strongly detracts from new platforms (yes, OpenStack – we need to address this!)

Unfortunately, it’s hard for people building platforms to perceive the complexity experienced by people outside their community.  We need to make sure that stability and operability are top line features because complexity adds a very real cost because it comes directly back to cost of operation.

In my thinking, the winners will be solutions that reduce BOTH cost and complexity.  I’ve talked about that in the past and see the trend accelerating as more and more companies invest in ops automation.

How DefCore is going to change your world: three advisory cases

The first release of the DefCore Core Capabilities Matrix (DCCM) was revealed at the Atlanta summit.  At the Summit, Joshua and I had a session which examined what this means for the various members of the OpenStack community.   This rather lengthy post reviews the same advisory material.

DefCore sets base requirements by defining 1) capabilities, 2) code and 3) must-pass tests for all OpenStack products. This definition uses community resources and involvement to drive interoperability by creating the minimum standards for products labeled “OpenStack.”

As a refresher, there are three uses of the OpenStack mark:

  • Community: The non-commercial use of the word OpenStack by the OpenStack community to describe themselves and their activities. (like community tweets, meetups and blog posts)
  • Code: The non-commercial use of the word OpenStack to refer to components of the OpenStack framework integrated release (as in OpenStack Compute Project Nova)
  • Commerce: The commercial use of the word OpenStack to refer to products and services as governed by the OpenStack trademark policy. This is where DefCore is focused.

In the DefCore/Commerce use, properly licensed vendors have three basic obligations to meet:is_it_openstack_graphic

  1. Pass the required Refstack tests for the capabilities matrix in the version of OpenStack that they use. Vendors are expected (not required) to share their results.
  2. Run and include the “designated sections” of code for the OpenStack components that you include.
  3. Other basic obligations in their license agreement like being a currently paid up corporate sponsor or foundation member, etc.

If they meet these conditions, vendors can use the OpenStack mark in their product names and descriptions.

Enough preamble!  Let’s see the three Advisory Cases

MANDATORY DISCLAIMER: These conditions apply to fictional public, private and client use cases.  Any resemblence to actual companies is a function of the need to describe real use-cases.  These cases are advisory for illustration use only and are not to be considered definitive guidenance because DefCore is still evolving.

Public Cloud: Service Provider “BananaCloud”

A popular public cloud operator, BananaCloud has been offering OpenStack-based IaaS since the Diablo release. However, they don’t use the Keystone component. Since they also offer traditional colocation and managed services, they have an existing identity management system that they use. They made a similar choice for Horizon in favor of their own cloud portal.


  1. They use Nova a custom scheduler and pass all the Nova tests. This is the simplest case since they use code and pass the tests.
  2. In the Havana DCCM, the Keystone capabilities are a must pass test; however, there are no designated sections of code for Keystone. So BananaCloud must implement a Keystone-compatible API on their IaaS environment (an effort they had underway already) that will pass Refstack, and they’re good to go.
  3. There are no must pass tests for Horizon so they have no requirements to include those features or code. They can still be OpenStack without Horizon.
  4. There are no must pass tests for Trove so they have no brand requirements to include those features or code so it’s not a brand issue; however, by using Trove and promoting its use, they increase the likelihood of its capabilities becoming must pass features.

BananaCloud also offers some advanced OpenStack capabilities, including Marconi and Trove. Since there are no must pass capabilities from these components in the Havana DCCM, it has no impact on their offering additional services. DefCore defines the minimum requirements and encourages vendors to share their full test results of additional capabilities because that is how OpenStack identifies new must pass candidates.

Note: The DefCore DCCM is advisory for the Havana release, so if BananaCloud is late getting their Keystone-compatibility work done there won’t be any commercial impact. But it will be a binding part of the trademark license agreement by the Juno release, which is only 6 months away.

Private Cloud: SpRocket Small-Business OpenStack Software

SpRocket is a new OpenStack software vendor, specializing in selling a Windows-powered version of OpenStack with tight integration to Sharepoint and AzurePack. In their feature set, they only need part of Nova and provide an alternative object storage to Swift that implements a version of the Swift API. They do use Heat as part of their implementation to set up applications back ended by Sharepoint and AzurePack.

  1. sprocketFor Nova, they already use the code and have already implemented all required capabilities except for the key-store. To comply with the DefCore requirement, they must enable the key-store capability.
  2. While their implementation of Swift passes the tests, We are still working to resolve the final disposition of Swift so there are several possible outcomes:
    1. If Swift is 0% designated then they are OK (that’s illustrated here)
    2. If Swift is 100% designated then they cannot claim to be OpenStack.
    3. If Swift is partially designated then they have to adapt their deploy to include the required code.
  3. Their use of Heat is encouraged since it is an integrated project; however, there are no required capabilities and does not influence their ability to use the mark.
  4. They use the trunk version of Windows HyperV drivers which are not designated and have no specific tests.

Ecosystem Client: “Mist” OpenStack-consuming Client Library

Mist is a client library for load+kt programmers working on applications using the OpenStack APIs. While it’s an open source project, there are many commercial applications that use the library for their applications. Unlike a “pure” OpenStack program, it also supports other Cloud APIs.

Since the Mist library does not ship or implement the OpenStack code base, the DefCore process does not apply to their effort; however, there are several important intersections with Mist and OpenStack and Core.

  • First, it is very important for the DefCore process that Mist map their use of the OpenStack APIs to the capabilities matrix. They are asked to help with this process because they are the best group to answer the “works with clients” criteria.
  • Second, if there are APIs used by Mist that are not currently tested then the OpenStack community should work with the Mist community to close those test gaps.
  • Third, if Mist relies on an API that is not must-pass they are encouraged to help identify those capabilities as core candidates in the community.

Notes from 2011 Cloud Connect Event Day 2 (#ccevent)

With the OpenStack launch behind me, I have some time to attend the Cloud Connect Event.  I missed all the DevOps sessions, but was getting to geek out on the NoSQL & Big Data sessions.   I jumped to the private cloud track (based on Twitter traffic) and was rewarded for the shift.

I’m surprised at how much focus this cloud conference is dedicated to private cloud.  At other cloud conferences I’ve attended, the focus has been on learning how to use the cloud (specifically the public cloud).  This is the first cloud show I’ve attended that has so much emphasis, dialog and vendor feeding around private.  This was a suits & slacks show with few jeans, t-shirts, and pony tails.  Perhaps private cloud is where the $$$ is being spent now?

It definitely feels like using cloud has become assumed, but the best practices and tools are just emerging.

The twitter #ccevent stream is interesting but temporal.  I’m posting my raw (spelling optional) notes (below the more tag) because there is a lot of great content from the show to support and extend the twitter stream.  I’ll try to italicize some of the better lines.

Continue reading

Cloud Gravity & Shards

This post is the final post laying out a rethinking of how we view user and buyer motivations for public and private clouds.

In part 1, I laid out the “magic cube” that showed a more discrete technological breakdown of cloud deployments (see that for the MSH, MDH, MDO, UDO key).  In part 2, I piled higher and deeper business vectors onto the cube showing that the cost value of the vertices was not linear.  The costs were so unequal that they pulled our nice isometric cube into a cone.

The Cloud Gravity Well

To help make sense of cloud gravity, I’m adding a qualitative measure of friction.

Friction represents the cloud consumer’s willingness to adopt the requirements of our cloud vertices.  I commonly hear people say they are not willing to put sensitive data “in the cloud” or they are worried about a “lack of security.”  These practical concerns create significant friction against cloud adoption; meanwhile, being able to just “throw up” servers (yuck!) and avoiding IT restrictions make it easy (low friction) to use clouds.

Historically, it was easy to plot friction vs. cost.  There was a nice linear trend where providers simply lowered cost to overcome friction.  This has been fueling the current cloud boom.

The magic cube analysis shows another dynamic emerging because of competing drivers from management and isolation.  The dramatic saving from outsource management are inhibited by the high friction for giving up data protection, isolation, control, and performance minimums.  I believe that my figure, while pretty, dramatically understates the friction gap between dedicated and shared hosting.  This tension creates a non-linear trend in which substantial customer traction will follow the more expensive offerings.  In fact, it may be impossible to overcome this friction with pricing pressure.

I believe this analysis shows that there’s a significant market opportunity for clouds that have dedicated resources yet are still managed and hosted by a trusted 3rd party.  On the other hand, this gravity well could turn out to be a black hole money pit.  Like all cloud revolutions, the timid need not apply.

Post Script: Like any marketing trend, there must be a name.  These clouds are not “private” in the conventional sense and I cringe at using “hybrid” for anything anymore.  If current public clouds are like hotels (or hostels) then these clouds are more like condos or managed community McMansions.  I think calling them “cloud shards” is very crisp, but my marketing crystal ball says “try again.”  Suggestions?

Cloud Business Vectors

In part 1 of this series, I laid out the “magic cube” that describes 8 combinations for cloud deployment.  The cube provides a finer grain understanding than “public” vs. “private” clouds because we can now clearly see that there are multiple technology axis that create “private IT” that can be differentiated. 

 The axis are: (detailed explanation)

  • X. Location: Hosted vs. On-site
  • Y. Isolation: Shared vs. Dedicated
  • Z. Management: Managed vs. Unmanaged

Cloud Cost Model

In this section, we take off our technologist pocket protectors and pick up our finance abacus.  The next level of understanding for the magic cube is to translate the technology axis into business vectors.  The business vectors are:

X. Capitalization:  OpEx vs. CapEx.  On the surface, this determines who has ownership of resource, but the deeper issue is if the resource is an investment (capital) or consumable (operations).   Unless you’re talking about a co-lo cage, hosting models will be consumable because the host is leveraging (in the financial sense) their capital into operating revenue.

Y. Commitment: PayGo vs. Fixed.  Like a cell phone plan, you can pay for what you use (Pay-as-you-go) or lock-in to a fixed contract.  Fixed is generally pays a premium based on volume even though the per unit cost may be lower.  In my thinking, the fixed contract may include dedicated resource guarantees and additional privacy.

Z. Management: Insource vs. Outsource.  Don’t over think this vector, but remember I not talking about off shoring!  If you are directly paying people to manage your cloud then you’ve insourced management.  If the host provides services, process or automation that reduces hiring requirements then you’re outsourcing it IT.  It’s critical to realize that you can’t employee fractional people.  There are fundamental cloud skillsets and tools that must be provided to operate a cloud (including, not limited to DevOps).


If you were willing to do some cerebral calisthenics about these vectors then you realized that they are not equal cost weights.  Let’s look at them from least to most.

  1. The commitment vector is very easy to traverse this vector in either direction.  It’s well established human behavior that we’ll pay more for to be more predictable, especially if that means we get more control or privacy.  If I had had a dollar for everyone who swoons over cloud bursting I’d go buy that personal jet pack.
  2. The capitalization vector has is part of the driver to cloud as companies (and individuals) seek to avoid buying servers up front.  It also helps that clouds let you buy factional servers and “throw away” servers that you don’t need.  While these OpEx aspects of cloud are nice, servers are really not that expensive to lease or idle.  Frankly, it’s the deployment and management of those assets that drives the TCO way up for CapEx clouds, but that’s not this vector so move along.
  3. The management vector is the silverback gorilla standing in the corner of our magic cube.  Acquiring and maintaining the operations expertise is a significant ongoing expense.  In many cases, companies simply cannot afford to adequately cover the needed skills and this severely limits their ability to use technology competitively.  Hosts are much better positioned to manage cloud infrastructure because they enjoy economies of scale distributed between multiple customers.  This vector is heavily one directional – once you fly without that critical Ops employee in favor of a host doing the work, it is unlikely you’ll hire that role.

The unequal cost weights pull our cube out of shape.  They create a strong customer pull away from the self-managed & CapEx vertices and towards outsourced & OpEx.  I think of this distortion as a cloud gravity well that pulls customers down from private into public clouds. 

That’s enough for today.  You’ll have to wait for the gravity well analysis in part 3.