Crowbar 1.2 released: includes OpenStack Diablo Final

With the holiday rush, I neglected to post about Monday’s Crowbar v1.2 release (ISO here)!

The core focus for this release was to support the OpenStack Diablo Final bits (which my employer, Dell, includes as part of the “Dell OpenStack Powered Cloud Solution”); however, we added a lot of other capability as we continue to iterate on Crowbar.

I’m proud of our team’s efforts on this release, on both features and quality.  I’m equally delighted about the Crowbar community engagement via the Crowbar list server.  Crowbar is not hardware or operating system specific, so it’s encouraging to hear about deployments on other gear and see the community helping us port to new operating system versions.

We are driving more and more content to Crowbar’s GitHub as we work to improve community visibility for Crowbar.  As such, I’ve been regularly updating the Crowbar Roadmap.  I’m also trying to make videos for Crowbar training (suggestions welcome!).  Please check back for updates about upcoming plans and sprint activity.

Crowbar Added Features in v1.2:

  • Central feature was OpenStack Diablo Final barclamps (tag “openstack-os-build”)
  • Improved barclamp packaging
  • Added concepts for “meta” barclamps that are suites of other barclamps
  • Proposal queue and ordering
  • New UI states for nodes & barclamps (LED spinner!)
  • Install includes self-testing
  • Service monitoring (bluepill)

Looking forward

Dell has a long list of pending Hadoop and OpenStack deployments using these bits, so you can expect to see updates and patches matching our field experiences.  We are very sensitive to community input and want to make Crowbar the best way to deliver a sustainable, repeatable reference deployment of OpenStack, Hadoop, and other cloud technologies.

Extending Chef’s reach: “Managed Nodes” for External Entities.

Note: this post is very technical and relates to detailed Chef design patterns used by Crowbar. I apologize in advance for the post’s opacity. Just unleash your inner DevOps geek and read on. I promise you’ll find some gems.

At the Opscode Community Summit, Dell’s primary focus was creating an “External Entity” or “Managed Node” model. Matt Ray prefers the term “managed node” so I’ll defer to that name for now. This model is needed for Crowbar to manage system components that cannot run an agent such as a network switch, blade chassis, IP power distribution unit (PDU), and a SAN array. The concept for a managed node is that there is an instance of the chef-client agent that can act as a delegate for the external entity. We’ve been reluctant to call it a “proxy” because that term is so overloaded.

My Crowbar vision is to manage an end-to-end cloud application life-cycle. This starts from power and network connections to hardware RAID and BIOS then up to the services that are installed on the node and ultimately reaches up to applications installed in VMs on those nodes.

Our design goal is that you can control a managed node with the same Chef semantics that we already use. For example, adding a Network proposal role to the Switch managed node will force the agent to update its configuration during the next chef-client run. During the run, the managed node will see that the network proposal has several VLANs configured in its attributes. The node will then update the actual switch entity to match the attributes.
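To make the VLAN example concrete, here is a minimal Python sketch of what one managed node run might do. Everything in it is an assumption for illustration: `FakeSwitch` stands in for whatever interface really talks to the switch, and the attribute layout (`network` / `vlans`) is hypothetical rather than Crowbar's actual schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FakeSwitch:
    """Hypothetical stand-in for a real switch's management interface."""

    vlans: set[int] = field(default_factory=set)

    def create_vlan(self, vlan_id: int) -> None:
        self.vlans.add(vlan_id)

    def delete_vlan(self, vlan_id: int) -> None:
        self.vlans.discard(vlan_id)


def converge_vlans(switch: FakeSwitch, desired: set[int]) -> dict[str, list[int]]:
    """Make the switch match the desired VLAN set and report what changed."""
    to_add = sorted(desired - switch.vlans)
    to_remove = sorted(switch.vlans - desired)
    for vlan_id in to_add:
        switch.create_vlan(vlan_id)
    for vlan_id in to_remove:
        switch.delete_vlan(vlan_id)
    return {"added": to_add, "removed": to_remove}


# Attributes as they might arrive from a network proposal role (illustrative only).
node_attributes = {"network": {"vlans": [100, 200, 666]}}
switch = FakeSwitch(vlans={100, 300})
changes = converge_vlans(switch, set(node_attributes["network"]["vlans"]))
print(changes)
```

Running the same function again with the same attributes changes nothing, which is the behavior we want from a node that owns its configuration.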

Design Considerations

There are five key aspects of our managed node design. They are configuration, discovery, location, relationships, and sequence. Let’s explore each in detail.

A managed node’s configuration is different from a service or actuator pattern. The core concept of a node in Chef is that the node owns the configuration. You make changes to the node’s configuration and it’s the node’s job to manage its state to maintain that configuration. In a service pattern, the consumer manages specific requests directly. At the summit (with apologies to Bill Clinton), I described Chef configuration as telling a node what it “is” while a service provides verbs that change a node. The critical difference is that a node is expected to maintain configuration as its composition changes (e.g., node is now connected for VLAN 666) while a service responds to specific change requests (node adds tag for VLAN 666). Our goal is to maintain Chef’s configuration management concept for the external entities.

Managed nodes also have a resource discovery concept that must align with the current ohai discovery model. Like a regular node, the managed node’s data attributes reflect the state of the managed entity; consequently we’d expect a blade chassis managed node to enumerate the blades that are included. This creates an expectation that the managed node appears to be “root” for the entity that it represents. We are also assuming that the Chef server can be trusted with the sharable discovered data. There may be cases where these assumptions do not have to be true, but we are making them for now.
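As a rough illustration of the discovery idea, the sketch below turns a raw chassis inventory into node-style attributes. The inventory format and the attribute names are assumptions of mine, not what ohai or Crowbar actually produce.

```python
def discover_chassis(inventory):
    """Build node-style attributes for a chassis from a raw slot inventory.

    `inventory` maps slot number to blade details, or to None for an empty slot.
    """
    blades = {
        str(slot): {"serial": info["serial"], "powered_on": info["powered_on"]}
        for slot, info in sorted(inventory.items())
        if info is not None  # empty slots are left out of the blade list
    }
    return {"chassis": {"slot_count": len(inventory), "blades": blades}}


# Hypothetical inventory as a chassis management controller might report it.
raw_inventory = {
    1: {"serial": "ABC123", "powered_on": True},
    2: None,
    3: {"serial": "DEF456", "powered_on": False},
}
attributes = discover_chassis(raw_inventory)
print(attributes)
```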

Another essential element of managed nodes is that their agent location matters because the external resource generally has restricted access. There are several examples of this requirement. Switch configuration may require a serial connection from a specific node. Blade, SAN, and PDU management ports are restricted to specific networks. This means that the managed node agents must run from a specific location. This location is not important to the Chef server or the nodes’ actions against the managed node; however, it’s critical for the system when starting the managed node agent. While it’s possible for managed nodes to run on nodes that are outside the overall Chef infrastructure, our use cases make it more likely that they will run as independent processes from regular nodes. This means that we’ll have to add some relationship information for managed nodes and perhaps a barclamp to install and manage managed nodes.

All of our use cases for managed nodes have a direct physical linkage between the managed node and server nodes. For a switch, it’s the ports connected. For a chassis, it’s the blades installed. For a SAN, it’s the LUNs exposed. These links imply a hierarchical graph that is not currently modeled in Chef data – in fact, it’s completely missing and difficult to maintain. At this time, it’s not clear how we or Opscode will address this. My current expectation is that we’ll use yet more roles to capture the relationships and add some hierarchical UI elements into Crowbar to help visualize it. We’ll also need to comprehend node types because “managed nodes” are too generic in our UI context.
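One way to picture the missing relationship data is as a list of links that can be indexed in both directions. This is only a sketch under my own assumptions; the node names and the link tuple format are hypothetical and say nothing about how Crowbar or Opscode will eventually model it.

```python
from collections import defaultdict

# (managed node, server node, physical link) tuples; all names are made up.
links = [
    ("switch-1", "node-a", "port 1"),
    ("switch-1", "node-b", "port 2"),
    ("chassis-1", "node-a", "blade slot 3"),
    ("san-1", "node-b", "LUN 7"),
]


def build_indexes(links):
    """Index the links by managed node and by server node."""
    children = defaultdict(list)  # managed node -> servers attached to it
    parents = defaultdict(list)   # server node -> managed nodes it depends on
    for managed, server, detail in links:
        children[managed].append((server, detail))
        parents[server].append((managed, detail))
    return dict(children), dict(parents)


children, parents = build_indexes(links)
print(parents["node-a"])
```

The `children` index is what a hierarchical UI would walk, while the `parents` index answers "what does this server depend on?"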

Finally, we have to consider the sequence of actions between managed nodes and nodes.  In all of our use cases, the steps to bring up a node require orchestration with the managed node.  Specifically, there needs to be a hand-off between the managed node and the node.  For example, installing an application that uses VLANs does not work until the switch has created the VLAN.  There are the same challenges with LUNs and SANs and with blades and chassis.  Crowbar provides orchestration that we can leverage, assuming we can declare the linkages.
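If the linkages are declared, the ordering problem reduces to a dependency sort. The sketch below uses Python's standard `graphlib` module (Python 3.9+) to show the idea; the step names are invented, and this is not how Crowbar's orchestration is actually implemented.

```python
from graphlib import TopologicalSorter

# Each key lists the steps that must finish before it can start (illustrative).
depends_on = {
    "switch: create VLAN 666": set(),
    "san: expose LUN": set(),
    "node: configure network": {"switch: create VLAN 666"},
    "node: mount storage": {"san: expose LUN"},
    "node: install application": {"node: configure network", "node: mount storage"},
}

# static_order() yields every step after all of its prerequisites.
order = list(TopologicalSorter(depends_on).static_order())
for step in order:
    print(step)
```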

For now, a hack to get started…

For now, we’ve started on a workable hack for managed nodes. This involves running multiple chef-clients on the admin server in their own paths & processes. We’ll also have to add yet more roles to comprehend the relationships between the managed nodes and the things that are connected to them. Watch the crowbar listserv for details!
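As a sketch of what "their own paths & processes" could look like, the snippet below builds one chef-client command line per managed node, each pointing at its own config file. The directory layout and node names are hypothetical; the only real pieces are chef-client's `-c` (config file) and `-N` (node name) options.

```python
from pathlib import PurePosixPath


def chef_client_command(node_name, base_dir="/opt/managed-nodes"):
    """Build the command for one managed node's chef-client run.

    Assumes each managed node has its own directory holding a client.rb,
    so that keys and caches do not collide between agents.
    """
    config = PurePosixPath(base_dir) / node_name / "client.rb"
    return ["chef-client", "-c", str(config), "-N", node_name]


managed_nodes = ["switch-1", "chassis-1", "pdu-1"]  # made-up names
commands = [chef_client_command(name) for name in managed_nodes]
for command in commands:
    print(" ".join(command))
```

Each command list could then be handed to `subprocess.Popen` on the admin server so that every managed node runs as a separate process.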

Extra Credit

Notes on the Opscode wiki from the Crowbar & Managed Node sessions

Barclamps: now with added portability!

I had a question about moving barclamps between solutions.  Since Victor just changed the barclamp build to create a tar for each barclamp (with the debs/rpms), I thought it was the perfect time to explain the new feature.

You can find the barclamps on the Crowbar ISO under “/dell/barclamps” and you can install the TAR onto a Crowbar system using “./barclamp_install foo.tar.gz” where foo is the name of your barclamp.

Here’s a video of how to find and install barclamp tars:

Note: while you can install OpenStack into a Hadoop system, that combination is NOT tested.  We only test OpenStack on Ubuntu 10.10 and Hadoop on RHEL 5.7.   Community help in expanding support is always welcome!

Opscode Summit Recap – taking Chef & DevOps to a whole new level

Opscode Summit Agenda created by open space

I have to say that last week’s Opscode Community Summit was one of the most productive summits that I have attended. Their use of the open-space meeting format proved to be highly effective for a team of motivated people to self-organize and talk about critical topics. I especially liked the agenda negotiations (see picture for an agenda snapshot) because everyone worked to adjust session times and locations based on what other sessions were being offered. Of course, it also helped to have an unbelievable level of Chef expertise on tap.

Overall

Overall, I found the summit to be a very valuable two days; consequently, I feel some need to pay it forward with a good summary. Part of the goal was for the community to document their sessions on the event wiki (which I have done).

The roadmap sessions were of particular interest to me. In short, Chef is converging the code bases of their three products (hosted, private and open). The primary change will be moving from CouchDB to a SQL-based DB and moving the API calls away from Merb/Ruby to Erlang. They are also improving search so that we can make more fine-tuned requests that perform better and return less extraneous data.

I had a lot of great conversations. Some of the companies represented included: Monster, Oracle, HP, DTO, Opscode (of course), InfoChimps, Reactor8, and Rackspace. There were many others – overall >100 people attended!

Crowbar & Chef

Greg Althaus and I attended for Dell with a Crowbar-specific agenda, so my notes reflect the fact that I spent 80% of my time on sessions related to features we need and explaining what we have done with Chef.

Observations related to Crowbar’s use of Chef

  1. There is a class of “orchestration” products that have objectives similar to Crowbar’s. Ones that I remember are Cluster Chef, Run Deck, and Domino.
  2. Crowbar uses Chef in a way that is different than users who have a single application to deploy. We use roles and databags to store configuration that other users inject into their recipes. This is due to the fact that we are trying to create generic recipes that can be applied to many installations.
  3. Our heavy use of roles enables something of a cookbook service pattern. We found that this was confusing to many chef users who rely on the UI and knife. It works for us because all of these interactions are automated by Crowbar.
  4. We picked up some smart security ideas that we’ll incorporate into future versions.

Managed Nodes / External Entities

Our primary focus was creating an “External Entity” or “Managed Node” model. Matt Ray prefers the term “managed node” so I’ll defer to that name for now. This model is needed for Crowbar to manage system components that cannot run an agent such as a network switch, blade chassis, IP power distribution unit (PDU), and a SAN array. The concept for a managed node is that there is an instance of the chef-client agent that can act as a delegate for the external entity. I had so much to say about that part of the session that I’m posting it as its own topic shortly.

Hadoop Crowbar released to open source! (plus AN HOUR of videos!)

I’m proud to announce that my team at Dell has open sourced our Apache Hadoop barclamps!  This release follows our Dell | Cloudera Hadoop Solution open source commitment from Hadoop World earlier this month.

As part of this release, we’ve created nearly AN HOUR of video content showing the Hadoop Barclamps in action, installing Crowbar (on CentOS), building Crowbar ISOs in the cloud and specialized developer focused builds.

If you want to talk to the Crowbar team, we’re attending events in Boston 11/29, Seattle 11/30, and Austin 12/8.

Here are links to the videos:

More Hadoop perspectives from Dell: Joseph George on what it means and Barton George’s backgrounder about barclamps.

Greg Althaus at 11/15 Austin Cloud User Group meeting (annotated 90 min audio)

Greg Althaus did a 90-minute Crowbar deep dive at this week’s Austin Cloud User Group.  Brad Knowles recorded audio and posted it, so I thought I’d share the link and my annotations.  There are many more opportunities to catch up with our team at Dell in Austin, Boston, and Seattle.

Audio Annotations – (##:## time stamp)

  • 00:00 Intros & Meeting Management
  • 12:00 Joseph George Introduction / Sponsorship
  • 16:00 Greg Starts – why Crowbar
  • 19:00 DevOps slides
  • 21:00 What does Crowbar do for DevOps
    • make it easier to manage
    • make it easier to repeat
  • 24:00 What’s included – how we grow / where to start
  • 27:20 Starting to show crowbar – what’s included as barclamps
    • pluggable / configuration
    • Barclamps!
  • 28:10 What is a barclamp
    • discussion about the barclamps in the base
  • 34:30 We ❤ Chef. Puppet vs Chef
  • 36:00 Why barclamps are more than cookbooks
  • 36:30 State machine & transitions
  • Q&A Section
    • 38:50 Reference Architectures
    • 43:00 Barclamps work outside of Crowbar?
    • 44:15 Hardware models supported
    • 47:30 Storage Question
    • 49:00 HA progress
    • 53:00 Ceph as a distributed cloud on all nodes
    • 56:20 Deployer has a map of how to give out roles
  • 58:00 Demo Fails
  • 58:30 Crowbar Architecture
  • 62:00 How Crowbar can be extended
  • 63:00 Workflow & Proposals
  • 65:40 Meta Barclamps
  • 71:10 Chef Environments
  • 73:40 Taking OpenStack releases and Environments
  • 75:00 The case for remove recipes
  • 77:33 GitHub Tour
  • 79:00 Network Barclamp deep dive
  • 84:00 Adding switch config (roadmap topic)
  • 86:30 Conduits
  • 87:40 Barclamp Extensions / Services
  • 89:00 Questions
    • 89:20 Proposal operations
    • 93:30 OpenStack Readiness & Crowbar Design Approach
    • 93:10 Network Teaming
    • 94:30 Which OS & Hypervisors
    • 96:30 Continuous Integration & Tools
    • 98:40 BDD (“cucumberesque”) & Testing
    • 99:40 Build approach & barclamp construction
  • 100:00 Wrap up by Joseph

Crowbar community support and 111111 sprint plan

The Dell Crowbar team is working to improve road map transparency. In the last few weeks, the Crowbar community has become more active on our lists, testing builds, and helping with documentation.

We love the engagement and continue to make supporting the list a priority.

Participation in Crowbar, OpenStack and Hadoop has been exceeding our expectations and we’re working to implement more community support and process. Thank you!!!

Our next steps:

  1. I’ve committed to post sprint plans and summary pages (this is the first)
  2. New Crowbar Twitter account
  3. I’m going to set up feature voting on the Crowbar Facebook page (like to vote)
  4. Continue to work the listserv and videos. We need help converting those to documentation on the crowbar wiki.
  5. Formalize collaborator agreements – we’re working with legal on this
  6. Exploring the option of a barclamp certification program and Crowbar support
  7. Moving to a gated trunk model for internal commits to improve quality
  8. Implementing a continuous integration system that includes core and barclamps. This will be part of our open source components.

We are working towards the 1.2 release (Beta 1). That release is focused on supporting OpenStack but includes enhancements for upgrades, Hadoop, and additional OS support.

Our Sprint 111111 plan.

Source: Crowbar Wiki: [[sprint 111111]]

  • Theme: OpenStack Diablo Final release candidate.
  • Core Work: Refine Deployment for Nova, Glance, Nova Dashboard (Horizon), Keystone, Swift
  • New additions: MySQL barclamp, Nova HA networking, kong
  • Crowbar internals: expose error states for proposals, allow packages to be included with barclamps to make upgrades easier, barclamp group pages
  • Operating system: added CentOS
  • Documentation: we’ve split the user guides into distinct books so Crowbar, OpenStack, and Hadoop each have their own user guide.
  • Pending action: expose the Hadoop barclamps
  • OS note: OpenStack is being tested (at Dell) against Ubuntu 10.10 only. Hadoop was tested against RHEL 5.7 and we expect it to work against CentOS also.

Rackspace unveils OpenStack reference architecture & private cloud offering

Yesterday, Rackspace Cloud Builders unveiled both their open reference architecture (RA) and a private cloud offering (on GigaOM) based upon the RA.  The RA (which is well aligned with our Dell OpenStack RA) does a good job laying out the different aspects of an OpenStack deployment.  It also calls for the use of Dell C6100 servers and the open source version of Crowbar.

The Rackspace RA and Crowbar deployment barclamps share the same objective: sharing of best practices for OpenStack operations.

Over the last 12+ months, my team at Dell has had the opportunity to work with many customers on OpenStack deployment designs.  While no two of these are identical, they do share many similarities.  We are pleased to collaborate with Rackspace and others on capturing these practices as operational code (or “opscode” if you want a reference to the Chef cookbooks that are an intrinsic part of Crowbar’s architecture).

In our customer interactions, we hear clearly that Crowbar must remain flexible and ready to adapt to both customer on-site requirements and evolution within the OpenStack code base.  You are also telling us that there is a broader application space for Crowbar and we are listening to that too.

I believe that it will take some time for the community and markets to process today’s Rackspace announcements.  Rackspace is showing strong leadership in both sharing information and commercialization around OpenStack.  Both of these actions will drive responses from the community members.

Dell is open sourcing Crowbar Apache Hadoop barclamps!

I’m very excited to announce that my team at Dell will be open sourcing our Apache Hadoop Crowbar barclamps by the end of the month.

This release raises the bar on open Hadoop deployments by making them faster, more scalable, more integrated, and more repeatable.

These barclamps were developed in conjunction with our licensed Dell | Cloudera Solution. The licensed solution is for customers seeking large scale and professionally supported big data solutions. The purpose of the open barclamps (which pull the open source parts from the Cloudera distro) is to help you get started with Hadoop and reduce your learning curve. Our team invested significant testing effort in ensuring that these barclamps work smoothly because they are the foundational layer of our for-pay Hadoop solution.

Included in the Hadoop barclamp suite are Hadoop Map Reduce, Hive, Pig, ZooKeeper and Sqoop running on RHEL 5.7. These barclamps cover the core parts of the Hadoop suite. Like other Crowbar deployments (see OpenStack), the barclamps automatically discover the service configurations and interoperate. One of our team members (call him Scott Jensen) said it very simply: “I can deploy a fully integrated Hadoop cluster in a few hours. That friggin’ rocks!” I just can’t put it more eloquently than that!

I’ll post again when we flip the “open” bit and invite our community to dig in and help us continue to set the standards on open Hadoop deployments.

For more perspectives on this release, check out posts by Barton George (just for devs), Joseph George (About Hadoop) and Aurelian Dumitru.

Barton posted these two videos of me talking about the release too:

Hadoop & Crowbar:

Dev’s Only Short:

Talk with Team Crowbar! Online 11/8, Austin 11/15, Boston 11/29 & Seattle 11/30

My team at Dell has been getting a great response from our community about Crowbar. Thanks! We’re actively working on a rock-solid OpenStack deployment that will raise the bar on ease of deployment and drive operational excellence.

We have also heard that we need to improve access to the team; consequently, I’m delighted to announce a long list of places and dates where you can access us online AND in person.

Here’s the list:

Or in a calendar view:

  • 11/8 Online: Crowbar Chat
  • 11/15 Austin: Cloud User
  • 11/29 Boston: OpenStack Meetup
  • 11/30 Seattle: Crowbar Drinks TBD
  • 12/6 Boston: Opscode BoaF
  • 12/8 Austin: OpenStack Meetup