Notes from 2011 Cloud Connect Event Day 2 (#ccevent)

With the OpenStack launch behind me, I have some time to attend the Cloud Connect Event.  I missed all the DevOps sessions, but got to geek out on the NoSQL & Big Data sessions.  I then jumped to the private cloud track (based on Twitter traffic) and was rewarded for the shift.

I’m surprised at how much of this cloud conference is dedicated to private cloud.  At other cloud conferences I’ve attended, the focus has been on learning how to use the cloud (specifically the public cloud).  This is the first cloud show I’ve attended that has so much emphasis, dialog and vendor feeding around private cloud.  This was a suits & slacks show with few jeans, t-shirts, and pony tails.  Perhaps private cloud is where the $$$ is being spent now?

It definitely feels like using cloud has become assumed, but the best practices and tools are just emerging.

The twitter #ccevent stream is interesting but temporal.  I’m posting my raw (spelling optional) notes (below the more tag) because there is a lot of great content from the show to support and extend the twitter stream.  I’ll try to italicize some of the better lines.

Keynote

Reddit Presentation about Data Architecture

  • they like memcache
  • they use cassandra
    • has very good properties for a cloud app (distributed, no single failure, sharding, etc)
  • they interject funny things into their presentations – works, keeps attention
  • using pig for the log processor
  • using ganglia to track their performance
  • were using PostgreSQL, but mainly as KV store. moving away (“thing database”)
  • tips:
    • queues are your friend
    • EBS S3 not always performance consistent, needed to add protection (software RAID)
    • had trouble w/ memcachedb – using md5 keys made it hard to rebalance, not for high workloads
    • don’t use a single EBS – RAID them
      • they found it to be more consistent than either single EBS or local disk
    • londiste is flexible, but does not handle errors very well (corrupts the slaves!)
    • db transactions are needed if you have data in multiple tables
    • cassandra is not perfect, especially at edge (no data loss bugs)
  • speaker @jedberg
  • moving to queues as much as possible
    • using Rabbit – live version 2
    • data moves to cache & queue at the same time, that lets users get data before the storage system had to catch up
    • basically, they are using queues + cache to front end their
  • S3 is “basically perfect” – they have some issues w/ EBS on performance
  • they run 100% on the cloud “the only equipment our company owns is laptops”
  • had been using Solr for search, but that did not work
    • Solr hit a wall and stopped working
    • they outsourced it to a company that does it w/ Lucene
  • tried Amazon simpleDB a while back – it could not scale to their needs (at the time)
    • they are reluctant to lock in to an amazon specific service – impacts their community
  • they keep team small
    • have a tendency to make choices that create technical debt so they can keep moving
    • would like more automation
    • looking to expand team
  • devops tools:
    • switching over to puppet for base config, chef for detailed config
    • this is a place where they would like to grow
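
The "cache & queue at the same time" pattern from the Reddit talk can be sketched roughly as below. This is a minimal illustration only, not Reddit's code: the class name `QueueBackedStore`, the in-memory dicts standing in for memcache and the database, and the use of Python's `queue.Queue` in place of RabbitMQ are all assumptions made for the example.

```python
import queue
import threading


class QueueBackedStore:
    """Hypothetical sketch: writes go to a cache and a queue together."""

    def __init__(self):
        self.cache = {}               # stands in for memcache
        self.store = {}               # stands in for the slow persistent DB
        self.pending = queue.Queue()  # stands in for the message queue

    def write(self, key, value):
        # The value is readable from the cache immediately; persistence
        # happens later, whenever the worker gets to the queued item.
        self.cache[key] = value
        self.pending.put((key, value))

    def read(self, key):
        # Serve from the cache first, fall back to the persistent store.
        if key in self.cache:
            return self.cache[key]
        return self.store.get(key)

    def drain(self):
        # Worker loop: move queued writes into the store at its own pace.
        while True:
            item = self.pending.get()
            if item is None:          # sentinel tells the worker to stop
                self.pending.task_done()
                break
            key, value = item
            self.store[key] = value
            self.pending.task_done()


def demo():
    db = QueueBackedStore()
    db.write("comment:1", "first!")
    seen_by_user = db.read("comment:1")       # worker has not run yet
    persisted_yet = "comment:1" in db.store   # still only cached + queued
    worker = threading.Thread(target=db.drain, daemon=True)
    worker.start()
    db.pending.join()                         # let storage catch up
    db.pending.put(None)
    worker.join()
    return seen_by_user, persisted_yet, db.store["comment:1"]
```

Calling `demo()` shows the point of the pattern: the user can read the new value before the storage layer has written it.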

NoSQL & Big Data Session

  • Voldemort – very large hashtable (inspired by Amazon Dynamo)
    • main goals: scalability & availability
    • modular design
  • Mongo was interested in stronger consistency
    • conflict resolution is difficult to maintain
    • big table
  • Cassandra
    • eventual consistency
    • finding – vast majority of corporate data does not need to be transactionally consistent
    • tried to combine best of big table & dynamo
    • fully written in Java (and NEW >2007 code)
    • compact and modern because of green field
  • Rob’s note: where’s CouchDB ? What about Riak? (hello – marketing???)
  • Reddit’s comments
    • Cassandra’s data model is similar to what they were using, would keep using
  • Unstructured, Distributed DB is NoNEW – DNS uses it, search, etc
  • Audience comment: using NoSQL, you have to spend time writing consistency checkers – is this systemic?
  • Once you are thinking about non-transactional KV DBs, then you’ve got the hard part done.
    • SQL is a mentality
  • … left session, was more like NoSQL 101
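
For readers wondering what the "consistency checker" from the audience comment might look like, here is a very small sketch. It assumes each replica can be dumped into a plain Python dict with string keys and hashable values; the function name `find_inconsistencies` and the `MISSING` marker are made up for this example and do not come from any of the databases discussed.

```python
MISSING = "<missing>"  # hypothetical marker for a key a replica lacks


def find_inconsistencies(replicas):
    """Return {key: {replica_name: value}} for keys whose replicas disagree.

    `replicas` maps a replica name to a dict holding that replica's data.
    Values must be hashable so they can be compared through a set.
    """
    all_keys = set()
    for data in replicas.values():
        all_keys.update(data)

    report = {}
    for key in sorted(all_keys):
        seen = {name: data.get(key, MISSING) for name, data in replicas.items()}
        if len(set(seen.values())) > 1:  # more than one distinct value
            report[key] = seen
    return report


def demo():
    replicas = {
        "a": {"k1": "v1", "k2": "v2"},
        "b": {"k1": "v1", "k2": "stale"},
        "c": {"k1": "v1"},
    }
    return find_inconsistencies(replicas)
```

A real checker would also have to decide which value wins and repair the others, which is the hard part the panel was pointing at.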

Private Cloud (question format)

  • Cloud migration is disruptive, creates culture challenges
    • seek out the people questioning the status quo
    • John Treadway – customer wanted “best cloud product”, did not understand what they wanted to accomplish
    • My cloud is not the stack & management: the cloud is the storage, compute, people, network, security, etc
  • How do you judge the ROI of your cloud?
    • Regulation requires private cloud for some
    • Some of the migration requires rewrite (such as vertical scale DBs) and that keeps things in private cloud (hybrid)
    • eventually, everything will be at public
    • No company will go 100% in any direction (blended environment) – we are in transition with five models, 3 of which are new.
    • You have to be practical about how quickly you can influence culture change
  • Discussion about security
    • panel believed that Amazon security is better than any individual company
    • Amazon has a “huge attack surface” and there is some idea that you are more secure hiding in the “haystack”
    • I think it’s unclear about where people expect attacks to come from.

Open vs Closed Source in Private Cloud

  • Panel w/ nimbula, cloud.com, openstack (@bpiatt, rackspace), cisco.
  • Moderator: Randy Bias – cloud scaling
  • Release cycles are speeding up for public software & clouds (Amazon had 50 major releases in 2010)
  • Question – can enterprise software keep up w/ release cycles
    • it’s the culture of the company that drives the feature set & frequency, not the tech
    • everyone will have middle of the road features, it’s the edges that are interesting
    • there is a distinction between open source and openness
      • openness is…? transparency, APIs, extensibility, ability to integrate
  • Question – is open source required to get full benefit
    • no – there are customers who want a bundle/integrated (turnkey) solution w/ an API
    • yes – there are customers who want control and ownership
    • the tech needs to be widely adopted so you can get experts; that is the key to success
    • I asked how they factor in ops into the mix of software & hardware
      • cloud is very much about process
      • cloud is an operations model – what’s different is how you put in the automation around it
      • clouds are moving to make operations easier and easier to adopt that model
      • the focus (so far) has been the app developer – now the physical infrastructure has to do this too
  • Question – lock-in? Does this limit growth?
    • This is all over the place – seems cultural
    • Cisco? If relationship & tech work then it has ROI, but that can change
      • reputation is a factor
      • openness is important
      • automation & integration are important
      • they believe they can provide converged elements that are easier to automate
      • not trying to do an end-to-end stack because no 2 data centers are alike
    • the longer you work w/ the vendor, the higher the switching cost
      • vendors know this, risk is vendor using it against you
      • cloud fabrics can have high switching costs
      • having multiple vendors or open source can be a defense of this
  • Question – is issue open source or something else?
    • no good answers “it depends”

Ceph Storage

  • very focused on Ceph – not a general session at all
  • as systems grow, they tend to lose performance because they are not optimized / self optimized
  • challenge is to distribute data w/o a lookup table because the table is hard to scale
  • traditional storage clusters (NAS, SAN, RAID, etc) are mainly PASSIVE devices
  • did a good job of going into the whys of their architecture
  • ceph uses a dynamic tree so that load is distributed and can be rebalanced
    • cluster can respond when it notices a hot spot or overload
  • subtree recursive accounting – can walk tree quickly
  • nearly fell asleep…. jumped to Hybrid cloud but missed the fireworks
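
To make the "distribute data w/o a lookup table" idea concrete, here is a small sketch of computed placement. It uses rendezvous (highest-random-weight) hashing as a stand-in, not Ceph's own placement algorithm (CRUSH), which is considerably more elaborate. The node names and the `place` function are hypothetical.

```python
import hashlib


def _score(object_name, node):
    # Deterministic pseudo-random weight for an (object, node) pair.
    digest = hashlib.sha256(f"{node}:{object_name}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def place(object_name, nodes, replicas=2):
    """Pick `replicas` nodes for an object purely by computation.

    Every client that knows the node list computes the same answer,
    so there is no central table to look up or to scale.
    """
    ranked = sorted(nodes, key=lambda node: _score(object_name, node),
                    reverse=True)
    return ranked[:replicas]


def moved_objects(object_names, old_nodes, new_nodes, replicas=2):
    # Count objects whose placement changes when the node list changes.
    return sum(
        1 for name in object_names
        if place(name, old_nodes, replicas) != place(name, new_nodes, replicas)
    )
```

With this scheme, adding a node only moves the objects that now rank the new node among their top choices; everything else stays where it was, which is the property that makes rebalancing cheap.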

Hybrid Clouds – transition or here to stay

  • Panel seems to believe that everything is moving to public eventually
  • long tail (Mainframes) still around, but still a tail.
  • build your own is for >10,000 machines
  • using public cloud allows for “late binding decisions” about where and when to position capacity
