Exploding the Cloud Storage Banana

Storage Banana shows how cloud persistence is functionally diverse and optimized

Internally, my group (specifically Dave McCrory & Greg Althaus) has been kicking around some new ways of expressing clouds in an effort to help reconcile Dell’s traditional and cloud focused businesses.  We’ve found it challenging to translate CAP theorem and

externalized application state into more enterprise-ready concepts.

Our latest effort led to a pleasantly succinct explanation of why cloud storage is different than enterprise storage.  Ultimately, it’s a matter of control and optimization.  Cloud persistence (Cache, Queue, Tables, Objects) is functionally diverse in order to optimize for price and performance while enterprise storage (SAN, NAS, SQL) is control and centralization driven.  Unfortunately for enterprises, the data genie is out of the Pandora’s box with respect to architectures that drive much lower cost and higher performance.

The background on this irresistible transformation begins with seeing storage as a spectrum of services as per the table below.

Enterprise:

Consistent

 

Block (SAN) iSCSI, Infiband:

Amazon EBS, EqualLogic, EMC Symmeterix

File (NAS) NFS, CIFS:

NetApp, PowerVault, EMC Clariion

Database (ACID) MS SQL, Oracle 11g, MySQL, Postgres
Cloud:

Distributed

Partitioned

Object DX/Caringo, OpenStack Swift, EMC Atmos
Map/Reduce Hadoop DFS
Key Value Cassandra, CouchDB, Riak, Reddis, Mongo
Queue (Bus) RabbitMQ, ActiveMQ, ZeroMQ, OpenMQ, Celery
Cloud:

Transitory

 

Messaging AMPQ, MSMQ (.NET)
Shared RAM MemCache, Tokyo Cabinet

From this table, I approximated the relative price and performance for each component in the storage spectrum.

The result was the “cloud storage banana” graph.  In this graph, enterprise storage is clustered in the “compromise” quadrant where there’s a high price for relatively low performance.  The cloud persistence refuses to be clustered at all.  To save cost and enable distributed data, applications will use cheap but slow object storage.  This drives the need for high speed RAM based cache and distributed buses. These approaches are required when developers build fault tolerance at the application level.

Enterprises have enjoyed the false luxury of perceived hardware reliability.  Where these assumptions are removed, applications are freed to scale more gracefully and consider resource cost in their consumption plans.

When we compare the enterprise Pandora’s box storage to the cloud persistence banana, a more general pattern emerges.  The cloud persistence pattern represents a fragmentation of monolithic, IT controlled services into a more functional driven architecture.  In this case, we see desire for speed, distribution and cost forcing change to application design patterns.

We also see similar dispersion patterns driving changes in compute and networking conventions.

So next time your corporate IT refuses to deploy Rabbit MQ or MemCacheD, just remember my mother’s sage advice for cloud architects: “time flies like an arrow, fruit flies like an banana.”

Ready to Fail

Or How Monte Python taught me to program

Sometimes you learn the most from boring conference calls.  In this case, I was listening to a deployment that was so painfully reference-example super-redundant by-the-book that I could have completed the presenter’s sentences.  Except that he kept complaining about the cost.  It turns out that our typical failure-proofed belt-and-suspenders infrastructure is really, really expensive.

Shouldn’t our applications be Monte Python’s Black Knight yelling “It’s just a flesh wound!  Come back and fight!”   Instead, we’ve grown to tolerate princess applications that throw a tantrum of over skim milk instead of organic soy in their mochaito.

Making an application failure-ready requires a mindset change.  It means taking of our architecture space suit and donning our welding helmet.

Fragility is often born from complexity and complexity is the compounded interest from system design assumptions.

Let’s consider a transactional SQL database.  I love relational databases.  Really, I do.  Just typing SELECT * FROM or LEFT OUTER JOIN gives me XKCD-like goose bumps.  Unfortunately, they are as fragile as Cinderella’s glass slippers.  The whole concept of relational databases requires a complex web of sophisticated data integrity we’ve been able to take for granted.  The web requires intricate locking mechanisms that make data replication tricky.  We could take it for granted because our operations people have built up super-complex triple-redundant infrastructure so that we did not have to consider what happens when the database can’t perform its magic.

What is the real cost for that magic?

I’m learning about CouchDB.  It’s not a relational database, it a distributed JSON document warehouse with smart indexing.  And compared some of the fine grained features of SQL, it’s an arc welder.   The data in CouchDB is loosely structured (JSON!) and relationships are ad hoc.  The system doesn’t care (let alone enforce) that if you’ve maintained referential integrity within the document – it just wants to make sure that the documents are stored, replicated, and indexed.   The goodness here is that CouchDB allows you to distribute your data broadly so that it can be local and redundant.  Even better, weak structure allows you to evolve your schema agilely (look for a future post on this topic).

If you’re cringing about lack referential integrity then get over it – every SQL backed application I ever wrote required RI double-checking anyway!

If you’re cringing about possible dirty reads or race conditions then get over it – every SQL backed application I ever wrote required collision protection too!

I’m not pitching CouchDB (or similar) is a SQL replacement.   I’m holding it up as an example of a pragmatic approach to failure-ready design.   I’m asking you to think about the hidden complexity and consequential fragility that you may blindly inherit.

So cut off my arms and legs – I can still spit on your shoes.