Opscode Summit Recap – taking Chef & DevOps to a whole new level

Opscode Summit Agenda created by open space

I have to say that last week’s Opscode Community Summit was one of the most productive summits that I have attended. Their use of the open-space meeting format proved to be highly effective for a team of motivated people to self-organize and talk about critical topics. I especially liked the agenda negotiations (see picture for an agenda snapshot) because everyone worked to adjust session times and locations based on what other sessions were being offered. Of course, it also helped to have an unbelievable level of Chef expertise on tap.

Overall

Overall, I found the summit to be a very valuable two days; consequently, I feel some need to pay it forward with a good summary. Part of the goal was for the community to document their sessions on the event wiki (which I have done).

The roadmap sessions were of particular interest to me. In short, Opscode is converging the code bases of its three Chef products (hosted, private and open). The primary changes will be moving from CouchDB to a SQL-based DB and moving the API calls away from Merb/Ruby to Erlang. They are also improving search so that we can make more fine-tuned requests that perform better and return less extraneous data.

I had a lot of great conversations. Some of the companies represented included: Monster, Oracle, HP, DTO, Opscode (of course), InfoChimps, Reactor8, and Rackspace. There were many others – overall >100 people attended!

Crowbar & Chef

Greg Althaus and I attended for Dell with a Crowbar-specific agenda, so my notes reflect the fact that I spent 80% of my time on sessions related to features we need and on explaining what we have done with Chef.

Observations related to Crowbar’s use of Chef

  1. There is a class of “orchestration” products that have objectives similar to Crowbar’s. The ones that I remember are Cluster Chef, Run Deck, and Domino.
  2. Crowbar uses Chef in a way that is different from users who have a single application to deploy. We use roles and databags to store configuration that other users inject into their recipes. This is due to the fact that we are trying to create generic recipes that can be applied to many installations.
  3. Our heavy use of roles enables something of a cookbook service pattern. We found that this was confusing to many Chef users who rely on the UI and knife. It works for us because all of these interactions are automated by Crowbar.
  4. We picked up some smart security ideas that we’ll incorporate into future versions.

Managed Nodes / External Entities

Our primary focus was creating an “External Entity” or “Managed Node” model. Matt Ray prefers the term “managed node” so I’ll defer to that name for now. This model is needed for Crowbar to manage system components that cannot run an agent, such as a network switch, blade chassis, IP power distribution unit (PDU), or SAN array. The concept for a managed node is that there is an instance of the chef-client agent that can act as a delegate for the external entity. I had so much to say about that part of the session that I’m posting it as its own topic shortly.

Ready to Fail

Or How Monty Python taught me to program

Sometimes you learn the most from boring conference calls.  In this case, I was listening to a deployment that was so painfully reference-example super-redundant by-the-book that I could have completed the presenter’s sentences.  Except that he kept complaining about the cost.  It turns out that our typical failure-proofed belt-and-suspenders infrastructure is really, really expensive.

Shouldn’t our applications be Monty Python’s Black Knight yelling “It’s just a flesh wound!  Come back and fight!”   Instead, we’ve grown to tolerate princess applications that throw a tantrum over skim milk instead of organic soy in their mochaito.

Making an application failure-ready requires a mindset change.  It means taking off our architecture space suit and donning our welding helmet.

Fragility is often born from complexity and complexity is the compounded interest from system design assumptions.

Let’s consider a transactional SQL database.  I love relational databases.  Really, I do.  Just typing SELECT * FROM or LEFT OUTER JOIN gives me XKCD-like goose bumps.  Unfortunately, they are as fragile as Cinderella’s glass slippers.  The whole concept of relational databases requires a complex web of sophisticated data integrity we’ve been able to take for granted.  The web requires intricate locking mechanisms that make data replication tricky.  We could take it for granted because our operations people have built up super-complex triple-redundant infrastructure so that we did not have to consider what happens when the database can’t perform its magic.

What is the real cost for that magic?

I’m learning about CouchDB.  It’s not a relational database; it’s a distributed JSON document warehouse with smart indexing.  And compared to some of the fine-grained features of SQL, it’s an arc welder.   The data in CouchDB is loosely structured (JSON!) and relationships are ad hoc.  The system doesn’t care (let alone enforce) whether you’ve maintained referential integrity within the document – it just wants to make sure that the documents are stored, replicated, and indexed.   The goodness here is that CouchDB allows you to distribute your data broadly so that it can be local and redundant.  Even better, weak structure allows you to evolve your schema agilely (look for a future post on this topic).

If you’re cringing about the lack of referential integrity then get over it – every SQL-backed application I ever wrote required RI double-checking anyway!
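To make that double-checking concrete, here is a minimal sketch of the kind of application-side check I mean. It assumes documents are plain dictionaries keyed by ID; the `customer_id` field, the sample documents, and the function name are all made up for illustration and are not part of any CouchDB API.

```python
def find_dangling_references(docs, ref_field="customer_id"):
    """Return (doc_id, missing_target) pairs for references that do not resolve.

    `docs` maps document IDs to JSON-like dicts. `ref_field` is a hypothetical
    field name holding the ID of another document.
    """
    known_ids = set(docs)
    dangling = []
    for doc_id, doc in docs.items():
        target = doc.get(ref_field)
        # Documents without the field are fine; only unresolved IDs are reported.
        if target is not None and target not in known_ids:
            dangling.append((doc_id, target))
    return dangling


# Hypothetical sample data: order:2 points at a customer that does not exist.
docs = {
    "customer:1": {"type": "customer", "name": "Arthur"},
    "order:1": {"type": "order", "customer_id": "customer:1"},
    "order:2": {"type": "order", "customer_id": "customer:9"},
}

for doc_id, missing in find_dangling_references(docs):
    print(f"{doc_id} references missing document {missing}")
```

The point is not that this replaces a foreign key constraint; it is that the application decides what to do with a broken reference (skip it, repair it, report it) instead of falling over.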

If you’re cringing about possible dirty reads or race conditions then get over it – every SQL-backed application I ever wrote required collision protection too!
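Collision protection can be sketched the same way. CouchDB rejects an update that does not carry the document’s current revision; the toy in-memory store below imitates that idea so the pattern is visible without a server. Everything here (`RevisionedStore`, `ConflictError`, `update_with_retry`, integer revisions) is a simplification I invented for illustration, not the real CouchDB client API.

```python
class ConflictError(Exception):
    """Raised when a write is based on a stale revision."""


class RevisionedStore:
    """Toy in-memory document store with optimistic revision checks."""

    def __init__(self):
        self._docs = {}  # doc_id -> (revision, body)

    def get(self, doc_id):
        rev, body = self._docs[doc_id]
        return rev, dict(body)  # hand back a copy so callers cannot mutate the store

    def put(self, doc_id, body, expected_rev=None):
        current = self._docs.get(doc_id)
        current_rev = current[0] if current else None
        # The write only succeeds if the caller saw the latest revision.
        if current_rev != expected_rev:
            raise ConflictError(f"{doc_id}: expected rev {expected_rev}, found {current_rev}")
        new_rev = (current_rev or 0) + 1
        self._docs[doc_id] = (new_rev, dict(body))
        return new_rev


def update_with_retry(store, doc_id, change, attempts=3):
    """Re-read, re-apply `change`, and retry when another writer got there first."""
    for _ in range(attempts):
        rev, body = store.get(doc_id)
        change(body)
        try:
            return store.put(doc_id, body, expected_rev=rev)
        except ConflictError:
            continue
    raise ConflictError(f"{doc_id}: gave up after {attempts} attempts")


def lose_a_limb(body):
    body["limbs"] -= 1


store = RevisionedStore()
store.put("knight", {"limbs": 4})
update_with_retry(store, "knight", lose_a_limb)
print(store.get("knight"))
```

A failure-ready application treats the conflict as a normal event to retry or merge, not as an exception that takes the whole request down.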

I’m not pitching CouchDB (or similar) as a SQL replacement.   I’m holding it up as an example of a pragmatic approach to failure-ready design.   I’m asking you to think about the hidden complexity and consequential fragility that you may blindly inherit.

So cut off my arms and legs – I can still spit on your shoes.