CAP Chasm: why clouds say “no SANank you” to SANs

My personal bias against SANs in cloud architectures is well documented; however, I am in the minority at my employer (Dell) and few enterprise IT shops share my view.  In his recent post about the CAP theorem, Dave McCrory has persuaded me to look beyond their failure to bask in my flawless reasoning.  Apparently, this crazy CAP thing explains why some people love SANs (enterprise) and others don’t (clouds).

The deal with CAP is that you can only have two of Consistency, Availability, and Partition Tolerance.  Since everyone wants Availability, the choice is really between Consistency and Partition Tolerance.  Seeking Availability, you’ve got two approaches:

  1. Legacy applications tried to eliminate faults to achieve Consistency with physically redundant scale-up designs.
  2. Cloud applications assume faults to achieve Partition Tolerance with logically redundant scale-out designs.

According to CAP, legacy and cloud approaches are so fundamentally different that they create a “CAP Chasm” in which the very infrastructure fabric needed to deploy these applications is different.

As a cloud geek, I consider the inherent cost and scale limitations of a CA approach much too restrictive.  My firsthand experience is that our customers and partners share my view: they have embraced AP patterns.  These patterns make more efficient use of resources, dictate simpler infrastructure layout, scale like hormone-crazed rabbits at a carrot farm, and can be deployed on less expensive commodity hardware.

As a CAP theorem enlightened IT professional, I can finally accept that there are other intellectually valid infrastructure models. 

See Mom?  I can play nicely with others after all.

PaaS, much ado about network services

There’s a surprising amount of hair pulling regarding IaaS vs PaaS.  People in the industry get into shouting matches about this topic as if it mattered more than Lindsay Lohan’s journey through rehab.

The cold hard reality is that while pundits are busy writing XaaS white papers, developers are off just writing software.  We are writing software that fits within cloud environments (weak SLA, small VMs), saves money (hosted data instead of data in VMs), and changes quickly (interpreted languages).  We’re doing it using an expanding toolkit of networked components like databases, object stores, shared caches, message queues, etc.

Using network components in an application architecture is about as novel as building houses made of bricks.  So, what makes cloud architectures any better or different?

Nothing!  There is no difference if you buy VMs, install services, and wire together your application in its own little cloud bubble.  If I wanted to bait trolls, I’d call that an IaaS deployment.

However, there’s an emerging economic driver to leverage lower cost and more elastic infrastructure by using services provided by hosts rather than standing them up in a VM.  These services replace dedicated infrastructure with managed network attached services and they have become a key differentiator for all the cloud vendors:

  • At Google App Engine, they include Bigtable, Queues, Memcache, etc
  • At Microsoft Azure, they include SQL Azure, Azure Storage, AppFabric, etc
  • At Amazon AWS, they include S3, SimpleDB, RDS (MySQL), Queue & Notify, etc

Using these services allows developers to focus on the business problems we are solving instead of building out infrastructure to run our applications.  We also save money because consuming an elastic managed network service is less expensive (and more consumption based) than standing up dedicated VMs to operate the services.

Ultimately, an application can be written as stateless code (really “externalized state” is a more accurate description) that relies on these services for persistence.  If a host were to dynamically instantiate instances of that code based on incoming requests, then my application resource requirements would become strictly consumption based.  I would describe that as true cloud architecture.

On a bold day, I would even consider an environment that enforced or offered that architecture to be a platform.  Some may even dare to describe that as a PaaS; however, I think it’s a mistake to look to the service offering for the definition when it’s driven by the application designers’ decisions to use network services.

While we argue about PaaS vs IaaS, developers are just doing what they need.  Today they may stand up their own services and tomorrow they may incorporate 3rd party managed services.  The choice is not a binary switch, a layer cake, or a holy war.

The choice is about finding the most cost-effective and scalable resource model.

Introducing BravoDelta: Erlang BDD based on Cucumber

I highly recommend Armstrong's Programming Erlang

I ❤ Erlang.  I learned about Erlang while simultaneously playing w/ CouchDB (written in Erlang) and reading Seibel’s excellent Coders At Work interview of Erlang creator Joe Armstrong.  Erlang takes me back to my Lisp and Prolog days – it’s interesting, powerful and elegant.  Even better, it’s performant, time tested and proven.

To whet my Erlang skills, I decided to port one of the most essential development tools I’ve used: Cucumber BDD.  I think that using BDD is one of the most critical success criteria for a project that wants to move quickly and respond to customers.  If you’d like to see Cucumber in action, check out my WhatTheBus project.  A Cucumber test is written in “simple English” and looks like this:

Scenario: Web Page1
    When I go to the home page.
    Then I should see "Districts".

To run Bravo Delta, you’ll need Erlang installed on your system.  You may also want to set up the WhatTheBus project because the initial drop uses that RoR web site as its target.  I’ve uploaded the code onto GitHub project BravoDelta (code contributions welcome).

NOTE: This is a functional core – it is not expected to be a complete Cuke replacement at this point!

The code base consists of the following files:

  • bdd.erl (main code file, start using bdd:test("scenario").)
  • bdd.config (config file)
  • bdd_webrat.erl (standard steps that are used by many web page tests)
  • bravodelta.erl (sample custom steps, must match feature file name)
  • bravodelta.feature (BDD test file)
  • bdd_utils.erl (utilities called by bdd & webrat)
  • bdd_selftest.erl (unit tests for utils – interesting pattern for selftest in this file!)
  • bdd_selftest.data (data for unit tests)

Erlang makes parsing the feature file very easy.  Unlike Cucumber, there is no RegEx craziness because Erlang has groovy pattern matching.  Basically, each step decomposes into a single line starting with Given, When, or Then.  The code is designed so that developers can easily add custom steps and there are pre-built steps for common web tasks in the “webrat” step file.  A step processor looks like this in Erlang:

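%% "When" step: request the home page using the test configuration.
step(Config, _Given, {step_when, _N, ["I go to the home page"]}) ->
    bdd_utils:http_get(Config, []);
%% "Then" step: search the earlier result for the quoted text.
step(_Config, Result, {step_then, _N, ["I should see", Text]}) ->
    bdd_utils:html_search(Text, Result).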

The steps are called by an Erlang recursive routine in BDD for each scenario in the feature file.  Explaining that code will have to wait for a future post.
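
As an illustration of the decomposition described above, here is a minimal sketch of how a feature line could be turned into the step tuples that the step processor matches on.  This is not the code in bdd.erl: the module name step_sketch, the function names, and the step_given and unknown tags are assumptions made for the example.

```erlang
-module(step_sketch).
-export([parse/2]).

%% Hypothetical helper: turn one feature-file line into
%% {StepType, LineNumber, Tokens} using plain pattern matching.
parse(Line, N) ->
    case string:trim(Line) of
        "Given " ++ Rest -> {step_given, N, tokens(Rest)};
        "When " ++ Rest  -> {step_when, N, tokens(Rest)};
        "Then " ++ Rest  -> {step_then, N, tokens(Rest)};
        _Other           -> {unknown, N, [Line]}
    end.

%% Split on double quotes so quoted values become their own tokens,
%% then drop surrounding whitespace, a trailing period, and empty pieces.
tokens(Text) ->
    Parts = string:split(Text, "\"", all),
    Trimmed = [string:trim(string:trim(P), trailing, ".") || P <- Parts],
    [T || T <- Trimmed, T =/= ""].
```

With this sketch, step_sketch:parse("    Then I should see \"Districts\".", 3) returns {step_then, 3, ["I should see", "Districts"]}, which is the shape that the second step clause above expects.  No regular expressions are needed; the "Given " ++ Rest style patterns do the work.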

The objective for Bravo Delta is to demonstrate simple Erlang concepts.  I wanted to make sure that the framework was easy to extend and could grow over time.  My experience with Erlang is that my code generally gets smaller and more powerful as I iterate.  For example, moving from three types of steps (given_, when_, then_) to a single step type with atoms for type resulted in a 20% code reduction.

I hope to use it for future BDD projects and grow its capability because it is fast and simple to extend – I found Cucumber to be very slow.  It should also be possible to execute features in parallel with minimal changes to this code base.  That makes Bravo Delta very well suited to large projects with lots of tests and automated build systems.

Rethinking the “private cloud” as revealed by the Magic 8 Cube

The Magic 8 Cube

This is the first part of 3 posts that look into the real future for “private clouds.”

This concept is something that was initially developed with Greg Althaus, my colleague at Dell, and then further refined in discussions with our broader team.  It grew from my frustration with the widely referenced predictions by the Gartner Group of a private cloud explosion.  Their prognostication did not ring true to me because the economics of “public cloud” are so compelling that going private seems to be like fighting your way out of a black hole.

We’ll get to the gravity well (post 3 of 3) in due time.  For now, we need to look into the all-knowing magic 8 cube.

Our breakthrough was seeing cloud hosting as a 3 dimensional problem.  We realized that we could cover all the practical cloud scenarios with these 8 cases, shown in the picture (right).

Here are the axes:

  1. X: Hosted vs. On-site – where are the servers running?  On-site means that they are running at your facility or in a co-lo cage that is basically an extended extraterritorial boundary of your company.
  2. Y: Shared vs. Dedicated – are other people mixing with your solution?  Shared means that your bytes are secretly nuzzling up to someone else’s bytes because you’re using a multi-tenant infrastructure.
  3. Z: Managed vs. Unmanaged – are your Ops people (if you have any) able to access the infrastructure that runs your applications?  Unmanaged means that you’re responsible for keeping the system operating.

With 3 axes, we have an 8 point cube.

  1. MSH – a PaaS offering in which every aspect of your application is managed and controlled.  GAE or Heroku.
  2. MSO – remember when people used to buy a mainframe and then lease off-hours extra cycles back to kids like Bill Gates?  That’s pretty much what this model means.
  3. MDH – a “mini-cloud” run by a cloud provider but dedicated to just one customer.  Dr. Evil thinks this costs one milllllllllion dollars.
  4. MDO – a cloud appliance.  You install the hardware but someone else does all the management for you.
  5. USH – IaaS.  I think that Amazon EC2 is providing USH.  It may be a service, but you’ve got to do a lot of Ops work to make your application successful.
  6. USO – OpenStack or other open source cloud DIY frameworks let a hosting provider create a shared, hosted model if they have the Ops chops to run it.
  7. UDH – Co Lo.
  8. UDO – The mythical “private cloud.”  Mine, mine, all mine.
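
The eight labels fall out mechanically from three two-way choices (2 x 2 x 2 = 8).  A minimal Erlang sketch, assuming a hypothetical module named cube that is not part of any project mentioned here, enumerates them in the same order as the list above:

```erlang
-module(cube).
-export([cases/0, describe/1]).

%% Each label is one letter per axis, in the order
%% Managed/Unmanaged, Shared/Dedicated, Hosted/On-site.
cases() ->
    [[Z, Y, X] || Z <- "MU", Y <- "SD", X <- "HO"].

%% Expand a three-letter label into readable atoms.
describe([Z, Y, X]) ->
    {axis(Z), axis(Y), axis(X)}.

axis($M) -> managed;
axis($U) -> unmanaged;
axis($S) -> shared;
axis($D) -> dedicated;
axis($H) -> hosted;
axis($O) -> on_site.
```

Calling cube:cases() returns ["MSH","MSO","MDH","MDO","USH","USO","UDH","UDO"], and cube:describe("UDO") returns {unmanaged, dedicated, on_site}.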

In thinking this over, we realized that cloud customers were not likely to jump randomly around this cube.  If they were using MSH then they may want to consider MDH or MSO.  It seemed unlikely that they would go directly from MSH to UDO as Mr. Bittman suggests; however, the market is clearly willing to move directly from UDO to MSH.

We had a good old-fashioned mystery on our hands… the answer will have to wait until my next post.

Alert the villagers, it’s Frankencloud!

I’m growing more and more concerned about the preponderance of Frankencloud offerings that I see being foisted onto the marketplace (no, my employer, Dell, is not guiltless).  Frankenclouds are “cloud solutions” that are created by using duct tape, twine, wishful marketing brochures, and at least 4 marginally cloud enabled products.

The official Frankencloud recipe goes like this:

  • Take 1 product that includes server virtualization (substitutions to VMware at your own risk)
  • Take 1 product that does storage virtualization (substitutions to SAN at your own risk)
  • Take 1 product that does network virtualization (substitutions to VLANs at your own risk)
  • Take 1 product that does IT orchestration (your guess is as good as any)
  • Take 1 product that does IT monitoring
  • Take 1 product that does Virtualization monitoring
  • Recommended: an unlimited Pizza budget for your IT Ops team

Combine the ingredients at high voltage in a climate conditioned environment.  Stir in seriously large amounts of consulting services, training, and Red Bull.  At the end of this process, you will have your very own Frankencloud!

Frankenclouds are notoriously difficult to maintain because each part has its own version life cycle.  More critically, they also lack a brain.

Unfortunately, there are few alternatives to the Frankencloud today.  I think that the alternatives will rewrite the rules that Ops uses to create clouds.  Here are the rules that I think help drive a wooden stake through the heart of the Frankencloud (yeah, I mixed monsters):

  • do not assume that server virtualization == cloud
  • simple, simple and simpler than that
  • focus on applications (need to write more about DevOps)
  • start with networking, not computation
  • assume that software containers are replaced, not upgraded

What do you think we can do to defeat Frankenclouds?

Shared Nothing Virtual Cluster

A while back (2004), Dave McCrory and I patented an interesting curiosity that we called the Shared Nothing Virtual Cluster.  Basically, the idea is to use OS RAID 1 on a VM but to have the VHDs split between physical hosts.  If the host died, the VM could be restarted on the second host using the RAID mirror.

It was an interesting idea, but seemed less than ideal because everyone was running to SAN storage and falling madly (insanely?) in love with vMotion.

Now that we’re looking towards clouds that go beyond SAN scale, the idea of mixing DAS and NAS to create instant redundancy for VMs may suddenly have more value.

Of course, Surgient owns the patent now…

If Apple is Disney then is the iPad Miley Cyrus?

Or Is Apple’s walled garden more like Disney World

With the iPad frenzy, I’ve been hearing a lot about Apple’s success with its walled garden approach.  I objected to their proprietary closed stance on principle for a long time.  When I finally caved in, I came to understand something fundamentally true about consumers: predictability matters to the mainstream.

This is really no surprise.   Walt Disney figured this out with his amusement parks a long time ago.

Disney World is the ultimate walled garden.  They relentlessly control every mote of our experience in their parks and my family loves it.  We happily pay a premium for the experience because we know that going to Disney World will be smooth and our fun is assured.

However, we less willingly pay a second price for our Disney experience; it’s homogeneous and bland.  It lacks the spontaneity and vibe of the Austin City Limits music festivals.  At festivals, the content is raw and fresh and things can go wonderfully wrong.  You may be delighted by Vampire Weekend when you’d planned to see Bob Dylan.

And so, Apple provides the quality control and censorship to Disney-ify our smart phones and tablets.  They’ve created a safe place to show off their impressive innovations.  They’ve created a limited market where they can control the spotlights.  In this way, Apple reminds me of how Disney manipulates its media outlets to create multi-talent superstars like Miley Cyrus.  They craft personas for their actors and ensure that they can sing, dance, and act.  This maximizes the appeal for Disney’s platform but blocks out other talented singers, dancers, and actors.

Way back when Britney Spears was a Disney property, there was room left for other (better, truer) singers like Avril Lavigne.  Today, the sanitized Miley Cyrus talent trifecta effectively blots out the sun.

So far, the iPhone has been a platform for innovation.  Please ignore the fact that developers had to buy Apple computers to write applications for it.  Please ignore the fact that developers must pass through Apple’s QA and censors.  Please ignore the fact that you must purchase an Apple device.  Please ignore the fact that you can only purchase applications through the iTunes store.  They are a platform trifecta with hardware, software, and distribution.  This is the price that you pay to ride on Space Mountain: you must enter Apple’s iPark.

I’m hearing about some interesting new products emerging that will challenge Apple’s technology; however, I’m not sure if consumers are ready to leave the park and go to the festival.  I hope they are.

Disclaimer: I am a Dell employee.  We have products (based on Android) that compete with Apple’s smart phones and tablets.

Speaking at RedHat Summit / JBoss World 2010

I’ve been enlisted by my employer, Dell, to speak about cloud software architecture at JBoss World 2010 in Boston the week of June 21st.

My talk will expand on the “RAIN” posts that I’ve written before with some practical examples of how we are using Joyent to create applications using these models.

Here’s the abstract:

The need for hyper-scale and the lack of SLAs on public clouds has forced architects to stripe their applications across multiple servers. Similar to disk RAID striping, application striping creates redundancy using an array of inexpensive nodes (RAIN). This technique enables applications to have dramatic performance bursts while improving fault tolerance and reducing costs.

In this session, Rob will review how to use JBoss Enterprise Middleware to create a RAIN configuration using technologies available through the Dell Cloud Solution for Web Applications and on Joyent public cloud hosting. He will review the essential role of the virtual load balancer using Zeus ZXTM. Rob will also show specific architectures that can be implemented quickly and explain how ZXTM can deliver scale-out ready SQL read-write splitting without recoding.

Java makes strange bedfellows of VMware and Google

I was thinking about Sci-Tech’s story about VMware and Google. I’ve been watching and wondering how giants VMware and Google will dance to the music of Java (now an Oracle asset). VMware’s Spring and Groovy seem like a natural fit with Google’s AppEngine. However, neither owns the Java platform, yet both are banking big on it becoming the major development language. It puts them into the interesting position of having to evangelize Java together.

If they can marshal their shared interests then this combination could be a potent counterpoint to Microsoft’s .NET. They could provide the corporate support and lift that Sun did not. Or they could just create more confusion and dilution for an already fragmented platform.

6/29 update: after the JBoss World show, I need to add RedHat to the list of Java supporters. Starting to take on an AntiMS feeling.

Putting on my Dell hat, accelerating these platforms helps our customers and our industry.