Unknown's avatar

About Rob H

A Baltimore transplant to Austin, Rob thinks about ways of building scale infrastructure for the clouds using Agile processes. He sat on the OpenStack Foundation board for four years. He co-founded RackN enable software that creates hyperscale converged infrastructure.

Exploding the Cloud Storage Banana

Storage Banana shows how cloud persistence is functionally diverse and optimized

Internally, my group (specifically Dave McCrory & Greg Althaus) has been kicking around some new ways of expressing clouds in an effort to help reconcile Dell’s traditional and cloud focused businesses.  We’ve found it challenging to translate CAP theorem and

externalized application state into more enterprise-ready concepts.

Our latest effort led to a pleasantly succinct explanation of why cloud storage is different than enterprise storage.  Ultimately, it’s a matter of control and optimization.  Cloud persistence (Cache, Queue, Tables, Objects) is functionally diverse in order to optimize for price and performance while enterprise storage (SAN, NAS, SQL) is control and centralization driven.  Unfortunately for enterprises, the data genie is out of the Pandora’s box with respect to architectures that drive much lower cost and higher performance.

The background on this irresistible transformation begins with seeing storage as a spectrum of services as per the table below.

Enterprise:

Consistent

 

Block (SAN) iSCSI, Infiband:

Amazon EBS, EqualLogic, EMC Symmeterix

File (NAS) NFS, CIFS:

NetApp, PowerVault, EMC Clariion

Database (ACID) MS SQL, Oracle 11g, MySQL, Postgres
Cloud:

Distributed

Partitioned

Object DX/Caringo, OpenStack Swift, EMC Atmos
Map/Reduce Hadoop DFS
Key Value Cassandra, CouchDB, Riak, Reddis, Mongo
Queue (Bus) RabbitMQ, ActiveMQ, ZeroMQ, OpenMQ, Celery
Cloud:

Transitory

 

Messaging AMPQ, MSMQ (.NET)
Shared RAM MemCache, Tokyo Cabinet

From this table, I approximated the relative price and performance for each component in the storage spectrum.

The result was the “cloud storage banana” graph.  In this graph, enterprise storage is clustered in the “compromise” quadrant where there’s a high price for relatively low performance.  The cloud persistence refuses to be clustered at all.  To save cost and enable distributed data, applications will use cheap but slow object storage.  This drives the need for high speed RAM based cache and distributed buses. These approaches are required when developers build fault tolerance at the application level.

Enterprises have enjoyed the false luxury of perceived hardware reliability.  Where these assumptions are removed, applications are freed to scale more gracefully and consider resource cost in their consumption plans.

When we compare the enterprise Pandora’s box storage to the cloud persistence banana, a more general pattern emerges.  The cloud persistence pattern represents a fragmentation of monolithic, IT controlled services into a more functional driven architecture.  In this case, we see desire for speed, distribution and cost forcing change to application design patterns.

We also see similar dispersion patterns driving changes in compute and networking conventions.

So next time your corporate IT refuses to deploy Rabbit MQ or MemCacheD, just remember my mother’s sage advice for cloud architects: “time flies like an arrow, fruit flies like an banana.”

OpenStack videos peek into cloud shakers

Barton George (Dell’s cloud evangalist and cloud shouter) has posted videos from the OpenStack conference last week:

McCrory lays out VMware vision

Props are due to Dave McCrory for his fine investigative work reading the VMware cloudy tea leaves.  Over the weekend, he posted a series of articles about VMware’s Open PaaS and VMforce offerings.  This is a significant write-up based on information gleaned from their public code check-ins that he validated with them after the fact.

I have not had time to digest it yet – check back later for actual commentary.

OpenStack Day 2 Aspiration: Dreaming & Breathing

Between partnering meetings, I bounced through biz and tech sessions during Day 2 of the OpenStack conference (day 1 notes).   After my impression summary, I’m including some succinct impressions, pictures, and copies of presentations by my Dell team-mates Greg Althaus & Brent Douglas.

Clouds on the road to Bexar
My overwhelming impression is a healthy tension between aspirational* and practical discussions.  The community appetite for big broad and bodacious features is understandably high: cloud seems on track as a solution for IT problems but there are is still an impedance mismatch between current apps and cloud capabilities.
As service providers ASPire to address these issues, some OpenStack blue print discussions tended to digress towards more forward-looking or long-term designs.  However, watching the crowd, there was also a quietly heads down and pragmatic audience ready to act and implement.  For this action focused group, delivering working a cloud was the top priority.  The Rackers and Nebuliziers have product to deploy and will not be distracted from the immediate concerns of living, breathing shippable code.
I find the tension between dreaming aspiration (cloud futures) and breathing aspiration (cloud delivery) necessary to the vitality of OpenStack.
[Day 3 update, these coders are holding the floor.  People who are coding have moved into the front seats of the fishbowl and the process is working very nicely.]
Specific Comments (sorry, not linking everything):
  • Cloud networking is a mess and there is substantial opportunity for innovation here.  Nicira was making an impression talking about how Open vSwitch and OpenFlow could address this at the edge switches.  interesting,  but messy.
  • I was happy with our (Dell’s) presentations: real clouds today (Bexas111010DataCenterChanges) and what to deploy on (Bexar111010OpenStackOnDCS).
  • SheepDog was presented as a way to handle block storage.  Not an iSCSI solution, works directly w/ KVM.  Strikes me as too limiting – I’d rather see just using iSCSI.  We talked about GlusterFS or Ceph (NewDream).  This area needs a lot of work to catch up with Amazon EBS.  Unfortunately, persisting data on VM “local” disks is still the dominate paradigm.
  • Discussions about how to scale drifted towards aspirational.
  • Scalr did a side presentation about automating failover.
  • Discussion about migration from Eucalyptus to OpenStack got side tracked with aspirations for a “hot” migration.  Ultimately, the differences between network was a problem.  The practical issue is discovering the meta data – host info not entirely available from the API.
  • Talked about an API for cloud networking.  This blue print was heavily attended and messy.  The possible network topologies present too many challenges to describe easily.  Fundamentally, there seems consensus that the API should have a very very simple concept of connecting VM end points to a logical segment.  That approach leverages the accepted (but out dated) VLAN semantic, but implementation will have to be topology aware.  ouch!
  • Day 3 topic Live migration: Big crowd arguing with bated breath about this.  The summary “show us how to do it without shared storage THEN we’ll talk about the API.”
Executive Tweet:  #OpenStack getting to down business.  Big dreams.  Real problems.  Delivering Code.
 
Note: I nominate Aspirational for 2010 buzzword of the year.

Greg PresentingBig Crowd on Day 1

OpenStack Bexar Design Summit Day 1

Yesterday, Dell sent me to be part of our OpenStack vanguard for the design summit.  The conference is fascinating and productive for the content of the sessions and even more interesting for the hallway meetings.

It’s obvious looking at the board composition that RackSpace and NASA Nova are driving  most of the development; however, the is palpable community interest and enthusiasm.  Participants and contributors showed up in force at this event.

RackSpace and NASA leadership provides critical momentum for the community.  Code is the smallest part of their contribution, their commitment to run the code at scale in production is the magic rocket fuel powering OpenStack. I’ve had many conversations with partners and prospects planning to follow RackSpace into production with a 3-6 month lag.

Beyond that primary conference arc, my impressions:

  • Core vendors like Citrix, Dell, Canonical are signing up to do primary work for the code base.  They are taking ownership for their own components in the stack.
  • Universally, people comment about the speed of progress and amount of code being generated.  Did I mention that there is a lot of code being written.
  • Networking is still a major challenge.  OpenStack (with Citrix’s Xen support) is driving Open vSwitchas a replacement for iptables management.
  • IPv6 gets lackadaisical treatment in the US, but is urgent in Japan/Asia where their core infrastructure is ALREADY IPv6.  Their frustration to get attention here should be a canary in the cloud mine (but is not).  They proposed a gateway model where VMs have dual addresses: IPv4 gets NATed while IPv6 is a pass-through. Seems to me that the going IPv6 internal is the real solution.
  • Cloud bursting is still too fuzzy a thing to talk about in a big group.  The session about it covered so many use-cases that we did not accomplish anything.  Some people wanted to talk about cloud API proxy while others (myself included) wanted to talk about managing apps between clouds.  My $0.02 is that vendors like RightScale solve the API proxy issue so it’s the networking issues that need focus.  We need to get back to the use-cases!

Executive Tweet: #openstack: Partners & Code = great progress.  Networking = needs more love

Other notes:

CAP Chasm: why clouds say “no SANank you” to SANs

My personal bias against SANs in cloud architectures is well documented; however, I am in the minority at my employer (Dell) and few enterprise IT shops share my view.  In his recent post about CAP theorem, Dave McCrory has persuaded me to look beyond their failure to bask in my flawless reasoning.  Apparently, this crazy CAP thing explains why some people loves SANs (enterprise) and others don’t (clouds).

The deal with CAP is that you can only have two of Consistency, Availability, or Partitioning Tolerance.  Since everyone wants Availablity, the choice is really between Consitency or Partitioning.  Seeking Availability you’ve got two approaches:

  1. Legacy applications tried to eliminate faults to achieve Consistency with physically redundant scale up designs. 
  2. Cloud applications assume faults to achieve Partitioning Tolerance with logically redundant scale out design.

According to CAP, Legacy and cloud approaches are so fundamentally different that they create a “CAP Chasm” in which the very infrastructure fabric needed to deploy these applications is different.

As a cloud geek, I consider the inherent cost and scale limitations of a CA approach much too limited.   My first hand experience is that our customers and partners share my view: they have embraced AP patterns.  These patterns make more efficient use of resources, dictate simpler infrastructure layout, scale like hormone-crazed rabbits at a carrot farm, and can be deployed on less expensive commodity hardware.

As a CAP theorem enlightened IT professional, I can finally accept that there are other intellectually valid infrastructure models. 

See Mom?  I can play nicely with others after all.

VM != Cloud! Comparision draws ire, misses point

Having the requirement benefit of working with both Dave McCrory and Joyent on a daily basis at Dell, I cannot resist weighing in on the blog pong between them.

Dave’s post comparing VM pricing prompted Joyent to blog that VMs are not the only measure of cloud.

While I completely agree that clouds are not all about VMs, I think that Joyent is too limited in their definition of cloud in their reply.  We’re seeing an emergence of services as the differentiator between clouds.

Looking at Amazon, Azure, and Google, the clear way to reduce cloud spend is to migrate applications to consume their services (SQL, Storage, Bus, etc).

If cloud users are primarily concerned about price per hour (which I’m not convinced is the case) then they have real motivation to migrate from purely VM (or SmartMachine(tm) ) based applications to ones that use services.

Cloud Culture Clash Creates Opportunities

In my opinion, one of the biggest challenges facing companies like Dell, my employer, is how to help package and deliver this thing called cloud into the market.  I recently had the opportunity to watch and listen to customers try to digest the concept of PaaS.

While not surprising, the technology professionals in the room split into across four major cultural camps: enterprise vs. start-up and dev vs. ops.  Because I have a passing infatuation with pastel cloud shaped quadrant graphs, I was able to analyze the camps for some interesting insights.

The camps are:

  1. Imperialists:  These enterprise type developers are responsible for adapting their existing business to meet the market.  They prefer process oriented tools like Microsoft .Net and Java that have proven scale and supportability.
  2. MacGyvers: These startup type developers are under the gun to create marketable solutions before their cash runs out.  They prefer tools that adapt quick, minimize development time and community extensions.
  3. Crown Jewels: These enterprise type IT workers have to keep the email and critical systems humming.  When they screw up everyone notices.  They prefer systems where they can maintain control, visibility, or (better) both.
  4. Legos: These start-up type operations jugglers are required to be nimble and responsive with shoestring budgets.   They prefer systems that they can change and adapt quickly.  They welcome automation as long as they can maintain control, visibility, or (better) both.

This graph is deceiving because it underplays the psychological break caused by willingness to take risks.  This break creates a cloud culture chasm. 

On one side, the reliable Imperialists want will mount a Royal Navy flotilla to protect the Crown Jewels in a massive show of strength.  They are concerned about the security and reliability of cloud technologies.

On the other side, the MacGyvers are working against a ticking time bomb to build a stealth helicopter from Legos they recovered from Happy Meals™.  They are concerned about getting out of their current jam to compile another day.

Normally Imperialists simply ignore the MacGyvers or run down the slow ones like yesterday’s flotsam.  The cloud is changing that dynamic because it’s proving to be a dramatic force multiplier in several ways:

  1. Lower cost of entry – the latest cloud options (e.g. GAE) do not charge anything unless you generate traffic.  The only barrier to entry is an idea and time.
  2. Rapid scale – companies can fund growth incrementally based on success while also being able to grow dramatically with minimal advanced planning.
  3. Faster pace of innovation – new platforms, architectures and community development has accelerated development.  Shared infrastructure means less work on back office and more time on revenue focused innovation.
  4. Easier access to customers – social media and piggy backing on huge SaaS companies like Facebook, Google or SalesForce bring customers to new companies’ front doors.  This means less work on marketing and sales and more time on revenue focused innovation.

The bottom line is that the cloud is allowing the MacGyvers to be faster, stronger, and more innovative than ever before.  And we can expect them to be spending even less time polishing the brass in the back office because current SaaS companies are working hard to help make them faster and more innovative.

For example, Facebook is highly incented for 3rd party applications to be innovative and popular not only because they get a part of the take, but because it increases the market strength of their own SaaS application.

So the opportunity for Imperialists is to find a way for employee and empower the MacGyvers.  This is not just a matter of buying a box of Legos: the strategy requires tolerating enabling embracing a culture of revenue focused innovation that eliminates process drag.  My vision does not suggest a full replacement because the Imperialists are process specialists.  The goal is to incubate and encapsulate cloud technologies and cultures.

So our challenge is more than picking up cloud technologies, it’s understanding the cloud communities and cultures that we are enabling.