Cloudcast interview with Dell Cloud Solutions Team (quotes with time stamps)

Lloyds BarbershopTheCloudcast.net – Thank you for such a great series of questions. Wow, nearly 36+ minutes of cloudicious interview about the work my team at Dell is doing!

Thanks to our hosts for putting together a great series! They are:

Highlights from Episode 16: Dell, Dude you’re getting a cloud

  • 3:40 JBG “we are listening to our customers tell us what they want to accomplish”
  • 4:40 RAH “humility is part of [what we’re] doing … cloud is about learning and collaboration”
  • 6:40 RAH “OpenStack filled a niche. It was the first open source community cloud. … Not just open source, its open community.”
  • 7:15 RAH “We’re beyond critical mass. We’re seeing acceleration… we are transitioning into a community development.”
  • 7:30 RAH “It’s accelerating. It happening so fast.”
  • 8:00 RAH “We felt it was really important for people to be able to use it. We felt that it was important to get away from just people developing into people using. “
  • 8:57 – RAH “Cloud is not just one thing. You have to have all the pieces.”
  • 10: 15 – RAH “Cloud is always ready, never finished”
  • 10:50 – RAH “OpenStack is an alternative to public cloud including hosting providers seeking to offer their own cloud”
  • 12:40 – AD “Dell has been in the Big data space for many years now”
  • 20:15 – JBG “There’s a legacy of great partnerships that we leverage”
  • 20:48 – JBG “Conflicts have not come up because we are focused on the customer”
  • 21:30 – RAH “Shout out to Greg Althaus for solving these problems in such an elegant way. And we rewrote it 3 times”
  • 22:02 – RAH “Crowbar started from our frustration of bringing up a cloud quickly … so we took a DevOps approach.”
  • 22:41 – RAH “You had to have a system view AND a boot strapping view simultaneously”
  • 23:50 – JBG “Crowbar was born out of necessity because we were setting up and blowing away our clouds over and over and over again. “
  • 24:40 – JBG “We realized there were not many people thinking about all the pieces before OpenStack was installed”
  • 25:20 – RAH “We don’t think customers have all the answers before we show up. This is not unique to OpenStack.”
  • 28:20 – JBG “We’re seeing the community pick up Crowbar as a way to deploy”

Collaboration between Dell Crowbar & VMware Cloud Foundry – unleashes your inner cloud

Sometimes a single sprint can deliver magic: when I signed up to document how to create a Crowbar module (aka a barclamp) two weeks ago, I had no idea that it would add a new flavor to Crowbar .

I’m proud to announce that the first public non-Dell Crowbar module will be supporting the VMware Cloud Foundry Open PaaS project.

Development is still in progress (on the Crowbar “CF” branch) and you’ll be able to watch us (even help!) collaborate on this project.  Initially, the deployment will be to single server but we’re hoping to quickly expand to a distributed install that fully leverages the capabilities of both projects.

By creating a Crowbar module, Cloud Foundry™ is able to leverage the cloud deployment capabilities that allow it to be setup on any physical or virtualized data center.  This is core to the Crowbar message: the value of a cloud solution can best be realized when it’s coupled with open practices for deploying it.

There are many significant aspects of this collaboration:

  1. Cloud Foundry is taking the right approach to PaaS.  Their team’s perspective on PaaS mirrors my own: A PaaS is a collection of application services.  That approach makes it extensible and flexible.  Plus, they are also multi-language and multi-platform.
  2. Crowbar is proving our breadth of support.  Last week we announced coming RHEL support and now adding Cloud Foundry is a natural extension.  We did not design Crowbar to be a one-trick pony.  It’s modular design makes it easy to extend while leveraging the existing body of work.
  3. Big companies are acting like start-ups.  Both Crowbar and Cloud Foundry are projects that focus on putting the core functionality out quickly to prove their value proposition, get feedback, and change the game.  This collaboration is positive proof of these companies being Agile and starting a project Lean.
  4. Big companies are acting in the open.  Both Dell via Crowbar and VMware via Cloud Foundry are contributing their source and working on it in the open.

Stay tuned for that “how to create a barclamp” post (or check out the barclamp rake task).

For more information:

Build Sledgehammer, the Crowbar discovery image / build prerequisite

Note: This content has been copied to the Crowbar Wiki.
Victor “got your back” Lowther, CI & build automation czar on our team at Dell, spent a lot of time cleaning up the open source build to make it MUCH easier.  The latest build only requires ONE server for all components.  To make it repeatable and fast, I’m using a hosted VM from Rackspace Cloud.
Here are the steps that you should follow (cool: if you build before the prereqs are in place, the script will tell you what’s missing).
Note: You must build the discovery image (build_sledgehammer.sh) before building Crowbar.  This image does not change very often, so it’s helpful to cache it somewhere (like in the Crowbar cache where it normally lives) and save time.
  1. Starting from a Rackspace Cloud Ubuntu 10.10 image (512 RAM is OK, $0.03/hr)
  2. Get libraries for git, RPM, & Ruby: apt-get install git rpm ruby
  3. Get the sledgehammer repo: git clone git://github.com/dellcloudedge/crowbar-sledgehammer.git
  4. Go to sledgehammer: cd crowbar-sledgehammer
  5. Download the CentOS image: curl -o ../CentOS-5.6-x86_64-bin-DVD-1of2.iso http://mirror.cs.vt.edu/pub/CentOS/5.6/isos/x86_64/CentOS-5.6-x86_64-bin-DVD-1of2.iso
    1. takes some time (10+ mins) even in the cloud
  6. Tell the build where to look for the CentOS image: CENTOS_ISO=~/CentOS-5.6-x86_64-bin-DVD-1of2.iso ./build_sledgehammer.sh
    1. you may need to change the path of the image if you did not put it in your home directory
    1. wait a long time while magic happens and the tar gets created
    2. check out the tar ball in the /bin directory!
  7. Create the cache location for Sledgehammer: mkdir -p ~/.crowbar-build-cache
  8. Move the the cache location: cd ~/.crowbar-build-cache
  9. Extract the Sledgehammer tar: tar xzvf ~/crowbar-sledgehammer/bin/sledgehammer-tftpboot.tar.gz 
Or, use the tar copy that I’ve cached it on zehicle.com!  Then you can start at step 8.
Now you can build crowbar as per instructions (duplicated below)
  1. cd ~
  2. git clone git://github.com/dellcloudedge/crowbar.git
  3. apt-get update
  4. apt-get install build-essential mkisofs debootstrap
  5. crowbar/build_crowbar.sh
    1. kicks off a long download to create the cache (first time only!)
    2. look in the home directory for the openstack-dev.iso

Of course, you still need to INSTALL CROWBAR (as root, /tftpboot/ubuntu_dvd/extra/install) after you use the ISO to boot a VM.  Instructions on that shortly…

Videos about Crowbar, CloudOps, and Dell OpenStack Cloud

I’m not usually a big fan of launch videos (too much markitecture); however, these turned out to be nice and meaty.  The meaty part explains why it looks like I’m about to eat a big sandwich in the last video.  yum!

  • What is Crowbar: Dell Crowbar Software Overview  

Continue reading

Crowbar source released, includes OpenStack Cloud install

I’m delighted to announce (official version) that my team at Dell has opened the Crowbar source under the Apache 2 license. This action is part of the broader Dell OpenStack Cloud Solution which includes OpenStack install packages, Crowbar, reference hardware architectures, and services/consulting to support deployments.

There are two important components to this news:

  1. Dell is officially offering our OpenStack Solution and helping advance the community’s ability to implement OpenStack quickly and consistently.
  2. Dell is releasing the Crowbar code (which is included in the solution) as open source.

Both are significant items; however, my focus here is on the Crowbar release.

Crowbar started as a Dell OpenStack installer project and then grew beyond that in scope.  Now it can be extended to work with other vendors’ kits and other solutions bits.

We are contributing Crowbar to the community because we believe that everyone benefits by sharing in the operational practices that Crowbar embodies. These are rooted in Opscsode Chef (which Crowbar tightly integrates with) and the cloud & hyper-scale proven DevOps practices that are reflected in our deployment model.

Where to get it?

What’s included?

  • A comprehensive set of barclamps to set up an OpenStack cloud.
  • Crowbar UI and Remote APIs to make it easy to set up your cloud
  • Automated testing scripts for community members doing continuous integration with OpenStack.
  • Build scripts so you can create your own Crowbar install ISO
  • Switch discovery so you can create Chef Cookbooks that are network aware.
  • Open source Chef server that powers much of Crowar’s functionality

What’s not included?

  • Non-open source license components (BIOS+RAID config) that we could not distribute under the Apache 2 license.  We are working to address this and include them in our release.  They are available in the Dell Licensed version of Crowbar.
  • Dell Branded Components (skin + overview page).   Crowbar has an OpenSource skin with identical functionality.
  • Pre-built ISOs with install images (you must download the open source components yourself, we cannot redistribute them to you as a package)

Important notes:

  • Crowbar uses Chef Server as its database and relies on cookbooks for node deployments.  It is installed (using Chef Solo) automatically as part of the Crowbar install.
  • Crowbar has a modular architecture so individual components can be removed, extended, and added. These components are known individually as barclamps.
  • Each barclamp has its own Chef configuration, UI sub-component, deployment configuration, and documentation.

On the project roadmap:

  • Hadoop support
  • Additional operating system support (specifically RHEL)
  • Barclamp version repository
  • Network configuration
  • We’d like suggestions!  Please comment!

Sites for more information: Joseph George, Barton George (launch day), Dell

Crowbar modules (aka barclamps) perform many functions and enable multi-vendor hardware

10/18 Update:

More recent information about Barclamps can be found at https://robhirschfeld.com/2011/09/14/details-of-crowbar-changes/.  We’ve also created videos showing how you can create your own barclamps.

Original Post

Just after we’d started deep Crowbar development, Andi Abes, Paul Webster and Victor Lowther joined the Dell Crowbar+OpenStack team.  They immediately started to dig into our Swift, BIOS/RAID, and Network components.  They also started to bump into each other in our original code base.  It quickly became apparent that we needed to modularize Crowbar.

Restructuring Crowbar into modules has proved essential as a method for safe community collaboration.

Greg Althaus coined the name “barclamps” during the modularization rearchitecture.  A barclamp is a class extension of the Crowbar ServiceObject that allows Crowbar to identify the Chef components used by the barclam

p (name p

attern in Chef is bc-template-[barclamp]) and provides capabilities that are specific to each barclamp.

  • In the simplest case, the barclamp is a minimal wrapper that just provides naming hooks for your Chef cookbooks.  This makes it very easily to adapt existing Chef work to work with Crowbar.
  • In more complex cases, the barclamp will help identity how nodes are allocated, interacts with other barclamps, extends the provisioner state machine and provides custom user interfaces.
  • In most cases, the barclamp’s generic integration and UI are sufficient.
Initially, barclamps were entirely exposed via REST using the ServiceObject.  We quickly wrapped those into a CLI for our continuous integration system.  Lately, we’ve expressed them in the user interface.
At launch, you’ll find all but two in the open source repository.  Unfortunately, we were not able to include BIOS and RAID barclamps in the open version because they use licensed components – we are working to correct this.  They are available in the Dell licensed version.
When looking at the barclamps, it is critical to understand that even the most core Crowbar functionality is expressed as a barclamp.
This exposure of Crowbar internals as barclamps is important because it
  1. helps modularize the code and
  2. reflects the deep integration between Chef and Crowbar.

Consequently, the core logic of the state machine, networking configuration, and provisioning are all exposed in barclamps.  This makes it possible to modify and extend the most basic Crowbar operations; however, there are currently no guards against breaking these barclamps either!

The following list includes all the barclamps that we’ve created for Crowbar.
Barclamp   Function  Included
Crowbar The roles and recipes to set up the barclamp framework.  Yes
Deployer Initial classification system for the Crowbar environment (aka the state machine)  Yes
Provisioner The roles and recipes to set up the provisioning server and a base environment for all nodes  Yes
Network Instantiates network interfaces on the crowbar managed systems. Also manages the address pool.  Yes
NTP Common NTP service for the cluster. An NTP server or servers can be specified and all other nodes will be clients of them.  Yes
DNS manages the DNS subsystem for the cluste  Yes
Logging centralized logging system based on syslog  Yes
IPMI Integrates with IP management to allow direct hardware control bypassing the operating system.  Yes
RAID LSI Licensed components.  Cannot be included in open source release at this time.  No
BIOS
PowerEdge C series: Dell License component.  Cannot be included in open source release at this time.  No
Ganglia Optional: a common Ganglia service for the cluster that can be used by other barclamps  Yes
Nagios Optional : common monitoring service for the cluster that can be used by other barclamps  Yes
Nova OpenStack: installs and configures the Openstack Nova (Cactus Release) component. It relies upon the network and glance barclamps for normal operation.  Yes
Swift OpenStack: part of Openstack (Cactus Release) , and provides a distributed blob storage  Yes
Glance OpenStack: Glance service (Cactus Release, Nova image management) for the cloud  Yes
Test provides a shell for writing tests against  Yes

Crowbar design: solving the multi master update issue and adding a pause before configuration

The last few weeks for my team at Dell have been all about testing as Crowbar goes through our QA cycle and enters field testing. These activities are the run up to Dell open sourcing the bits.

The Crowbar testing cycle drove two significant architectural changes that are interesting as general challenges and important in the details for Crowbar adopters.

Challenge #1: Configuration Sequence.

Crowbar has control of every step of deployment from discovery, BIOS/RAID configuration, base image, core services and applications. That’s a great value prop but there’s a chicken and egg problem: how do you set the RAID for a system when you have not decided which applications you are going to install on it?

The urgency of solving this problem became obvious during our first full integration tests. Nova and Swift need very different hardware configurations. In our first Crowbar flows, we would configure the hardware before you selected the purpose of the node.  This was an effect of “rushing” into a Chef client ready state. 

We also needed a concept of collecting enough nodes to deploy a solution.  Building an OpenStack cloud requires that you have enough capacity to build the components of the system in the correct sequence.

Our solution was to inject a “pause” state just after node discovery.  In the current Crowbar state machine, nodes pause after discovery.  This allows you to assign them into the roles that you want them to play in your system.

In testing, we’ve found that the pause state helps manage the system deployment; however, it also added a new user action requirement. 

Challenge #2: Multi-Master Updates

In Chef, the owner of a node’s data in the centralized database is the node, not the server.  This is a logical (but not a typical) design pattern and has interesting side effects.  Specifically, updates from Chef Client runs on the nodes are considered authoritative and will over-write changes made on the server. 

This is correct behavior because Chef’s primary focus is updating the node (edge) and not the central system (core).  If the authority was reversed then we would miss critical changes that Chef effected on the nodes.   From this perspective, the server is a collection point for data that is owned/maintained at the nodes.

Unfortunately, Crowbar’s original design was to inject configuration into the Chef server’s node objects.  We found that Crowbar’s changes could be silently lost since the server is not the owner of the data.  This is not a locking issue – it is a data ownership issue.  Crowbar was not talking to the master of the data when it made updates!

To correct this problem, we (really Greg Althaus in a coding blitz) changed Crowbar to store data in a special role mapped to each node.  This works because roles are mastered on the server.  Crowbar can make reliable updates to the node’s dedicated role without worrying the remote data will override changes. 

This pattern is a better separation of concerns because Crowbar and barclamp configuration in stored in a very clearly delineated location (a role named crowbar-[node] and is not mixed with edge configuration data.

It turns out that these two design changes are tightly coupled.  Simultaneous edge/server writes became very common after we added the pause state.  They are infrequent for single node changes; however, the frequency increases when you are changing a system of interconnected nodes through multiple state.

More simply put: Crowbar is busy changing the node configs at the exactly same time the nodes are busy changing their own configuration.

Whew!  I hope that helped clarify some interesting design considerations behind Crowbar design.

Note: I want to repeat that Crowbar is not tied to Dell hardware! We have modules that are specifically for our BIOS/RAID, but Crowbar will happily do all the other great deployment work if those barclamps are missing.

I’ll be at OSCON 7/25-29/11 (Dell=sponsor & speaking w/ @jbgeorge)

As part of our commitment to open source, Dell is a sponsor of OSCON 2011.  The Dell OpenStack Cloud team will have a booth presence with our well-travelled Crowbar Install rack (now with BOTH PowerEdge C6100 & C6105s).  We’re doing our famous 30 minute OpenStack installs and handing out goodies including USB keys. 

Joseph George (@jbgeorge) and I are speaking:

We’ll be giving specifics about how Crowbar works to deliver the Dell OpenStack Cloud Solution including a narrated demo and details about how the community can extend Crowbar using barclamps.

Note:  Stephen Spector (opnstk_com_mgr), the amazing OpenStack community manager, wanted me to remind everyone that we’re celebrating OpenStack’s 1 year anniversary with activites at OSCON.  He’s asking for video commentary about OpenStack and RSVPs if you can attend the events.  More at OpenStack Blog.

Hungry for Nova Cuisine? Adding Chef recipes for OpenStack Nova

As promised, here’s the other drop in advance of our OpenStack team’s Crowbar release. 

This is the second part of the Swift and Nova recipes that we are intentionally leaking out to the community.

USAGE NOTE: These recipes are designed to work with Crowbar!  They are not intended to stand alone.

As part of our collaboration with Opscode, Matt Ray, has been merging our recipes into his most excellent OpenStack cookbook tree.  If you want to see our unmerged recipes, we’re also posting those to our github

In addition to our Swift recipes, you can now check out the Nova recipes.

ADDITIONAL USAGE NOTE: The Matt’s tree is more complete – these are released for reference only.  They will ultimately be maintained as part of the Crowbar.

The OpenStack Rocket: How Citrix, Dell & Rackspace collaboration propels OpenStack

OpenStack has grown amazingly and picked up serious corporate support in its first year.  Understanding how helps to explain why the initiative has legs and where adopters should invest.  While OpenStack has picked up a lot of industry partners, early participation by Dell and Citrix have been important to a meteoric trajectory.

So why are we (Dell) working so hard to light a fire under OpenStack?

To explain OpenStack support and momentum, we have to start with a self-reflective fact: Dell (my employer), Citrix and Rackspace are seeking to gain the dominate positions in their respective areas of “cloud.”  Individually, we have competitors in more entrenched positions for cloud silos in solutions, ecosystem, and hypervisor; however, our competitors are not acting in concertto maintain their position.   

OpenStack offers an opportunity for companies trying to gain market share to leverage the strengths of partners against their competitors.

The surprising aspect of OpenStack is how much better this collaboration is working in practice than in theory.  This, dare I say it, synergistic benefit comes to OpenStack from all the partners (Dell, Citrix, Rackspace and others) working together because:

  1. Real scale clouds are big and complex with a lot of moving pieces (too big for a single vendor)
  2. A vibrant ecosystem needs to see commercial commitment from large players (and avoids lock-in)
  3. Having alternatives drives innovation and manages costs (because differentiation is still needed)
  4. Interlocking expertise is required (because no vendor has all the pieces)
  5. Customer needs are diverse and changing (since cloud is accelerating the rate of innovation)

Today, three major cloud players are standing together to demonstrate commitment to the community.  This announcement is a foundation for the OpenStack ecosystem.  In the next few months, I expect to see more and more collaborative announcements as the community proves the value of working together.

By aligning around an open platform, we collectively out flank previously dominant players who choose to go it alone.  The technology is promising; however, the power of OpenStack flows mainly from the ecosystem that we are building together.