Please support me for the OpenStack Policy Board

I’m posting my OpenStack bio here and asking for support putting me on the Policy Board by voting for me.  NOTE: You can only vote if you’re registered and you got the “Poll: OpenStack Governance Elections” email.

Project Policy Board Objective

I am seeking a role on the OpenStack Policy Board to further the adoption of OpenStack within and beyond the community.  As the OpenStack technology lead within Dell, I am the engineer who is most actively engaged with field deployments; consequently, I am uniquely positioned to represent our development community, hosters and enterprise user bases.  I bring substantial process experience (Agile/Lean/CI) into my decision making.  My focus will be on ensuring OpenStack is deployable and ready for use.

Background

I am a Principal Engineer at Dell working as the lead for our OpenStack cloud initiative (http://dell.com/openstack).  My team at Dell is responsible for bringing hyper-scale cloud solutions to market and works closely with our cloud optimized hardware division (DCS).  Before working on the OpenStack project, I was involved in cloud projects for Azure, Eucalyptus, and Joyent at Dell.
My involvement with OpenStack goes back to the very earliest days before the project was launched where I was part of the evaluation team that advocated for Dell to join the project.  Since then, I have been active participant at every design conference.  It was my recommendation that Dell focus on making deployment capabilities for OpenStack and to ensure that those contributions are open sourced (Apache 2).  At this point in the project, I am Dell’s technical authority on OpenStack for community and customer interactions.
My team is responsible for the Crowbar cloud deployer (http://github.com/dellcloudedge/crowbar).  The purpose of this project is to ensure that OpenStack is be quickly and reliably deployed in a wide range of configurations on any hardware platform.  I believe that ease of deployability is essential for the success of OpenStack as a project because it ensure adoption by non-developers.  I also believe strongly in continuous integration and am working to adapt Crowbar as a CI platform.  I have been the primary driver to ensure that the Crowbar project is open sourced and accepting of input from the community.
My team also designs technical reference architectures (RAs) for OpenStack.  These RAs help drive adoption by providing crisp guidance on how deploy OpenStack.  I am a vocal proponent of open operations (keeping best practices public) and following a DevOps approach for ongoing cloud deployment life-cycles.
In addition to my work at Dell, I work to ensure community access and communication.  My independent blog provides technical detail and insights about the OpenStack and other cloud initiatives.  My blog also focuses on Agile and Lean practices that I believe are essential to success in technology innovation.
I have been working with cloud computing since 2001.  The company I founded with Dave McCrory (@mccrory), now owned by Quest, was the first multi-server VMware ESX deployment ouside of VMware.  We pioneered the concept of elastic vm management (look up the patents!) so I have a very deep understanding of the problems and architectures required.

OpenStack Design Summit regististration open for DEVS! See you in October

Registration for the OpenStack Conference (which my team at Dell is co-sponsoring), which is now a separate event (from the User Conference the same week) with a separate registration mechanism.

From the OpenStack List:

The Essex Design Summit is an event for the OpenStack developers community (existing and prospective coders) to gather and discuss the upcoming development cycle. It is highly technical in nature and is not targeted to the general OpenStack public, which will find the OpenStack Conference much more enjoyable.

If you think you belong in the former category, please join us. The Summit is free to attend, but with limited attendance, so we require registration.

More details at: http://wiki.openstack.org/Summit/Essex

Crowbar near-term features: increasing DevOps mojo and brewing Diablo

We’ve been so busy working on getting RHEL support ready to drop into the Crowbar repos that I have not had time to post about what’s coming next for Crowbar. The RHEL addition has required a substantial amount of work to accommodate different packaging models and capabilities. This change moves Crowbar closer to being able allow nodes’ operating systems (the allocated TFTP Boot Image) to be unique per node.

I will post more forward looking details soon but wanted to prime the pump and invite suggestions from our community.

We are tracking two major features for delivery by the OpenStack October Design Conference

  1. OpenStack Diablo Barclamps. Expect to see individual barclamps for various components like Keystone, Dashboard, Glace, Nova, Swift, etc)
  2. Barclamp versioning / connected imports. This feature will enable Crowbar to pull in the latest components for barclamps from remote repositories. I consider this a critical feature for Crowbar’s core DevOps/CloudOps capabilities and to support more community development for barclamps.

We are also working on some UI enhancements

  • Merging together the barclamps/proposals/active views into a single view
  • Enabling bulk actions for nodes (description, BIOS types, and allocate)
  • Allowing users to set node names and showing the names throughout the UI
  • More clarity on state of proposal application process (stretch goal)

I am planning to post more about our design ideas as work begins.

If you want to help with Diablo barclamps, these will be worked in the open and we’d be happy to collaborate. We’re also open to suggestion for what’s next.

Cloudcast interview with Dell Cloud Solutions Team (quotes with time stamps)

Lloyds BarbershopTheCloudcast.net – Thank you for such a great series of questions. Wow, nearly 36+ minutes of cloudicious interview about the work my team at Dell is doing!

Thanks to our hosts for putting together a great series! They are:

Highlights from Episode 16: Dell, Dude you’re getting a cloud

  • 3:40 JBG “we are listening to our customers tell us what they want to accomplish”
  • 4:40 RAH “humility is part of [what we’re] doing … cloud is about learning and collaboration”
  • 6:40 RAH “OpenStack filled a niche. It was the first open source community cloud. … Not just open source, its open community.”
  • 7:15 RAH “We’re beyond critical mass. We’re seeing acceleration… we are transitioning into a community development.”
  • 7:30 RAH “It’s accelerating. It happening so fast.”
  • 8:00 RAH “We felt it was really important for people to be able to use it. We felt that it was important to get away from just people developing into people using. “
  • 8:57 – RAH “Cloud is not just one thing. You have to have all the pieces.”
  • 10: 15 – RAH “Cloud is always ready, never finished”
  • 10:50 – RAH “OpenStack is an alternative to public cloud including hosting providers seeking to offer their own cloud”
  • 12:40 – AD “Dell has been in the Big data space for many years now”
  • 20:15 – JBG “There’s a legacy of great partnerships that we leverage”
  • 20:48 – JBG “Conflicts have not come up because we are focused on the customer”
  • 21:30 – RAH “Shout out to Greg Althaus for solving these problems in such an elegant way. And we rewrote it 3 times”
  • 22:02 – RAH “Crowbar started from our frustration of bringing up a cloud quickly … so we took a DevOps approach.”
  • 22:41 – RAH “You had to have a system view AND a boot strapping view simultaneously”
  • 23:50 – JBG “Crowbar was born out of necessity because we were setting up and blowing away our clouds over and over and over again. “
  • 24:40 – JBG “We realized there were not many people thinking about all the pieces before OpenStack was installed”
  • 25:20 – RAH “We don’t think customers have all the answers before we show up. This is not unique to OpenStack.”
  • 28:20 – JBG “We’re seeing the community pick up Crowbar as a way to deploy”

Video of Crowbar Install & Introduction (two 15 min videos)

These Crowbar videos are the first two in a series of how to setup and use your own local Crowbar dev environment (here’s more info & the ISO).   I used VMware Workstation, but any virtual hosts that support Ubuntu 10.10 will work fine.  We use ESX, KVM and Xen for testing too.

Creating this environment is the basis for learning Crowbar, experimenting with OpenStack and creating your own barclamps.  It’s also a handy way to play with Opscode Chef since it includes a stand alone Chef server.

Some items from the install video that I want to re-emphasize:

  • The admin server REQUIRES >1 GB of RAM (more is better)
  • All other nodes REQUIRE > 512 MB of RAM (more depending on what you install)
  • At least TWO NICS are required.
  • You must disable DHCP on the virtual network because Crowbar has a DHCP server and they will conflict.
  • The login for the Crowbar admin server is openstack/openstack.  You must “sudo -i” before you run the install script.
  • The default url for Crowbar is http://192.168.124.10:3000  (crowbar/crowbar)

Installing Crowbar (15 mins)

Introduction to Crowbar (15 mins)

9/30 Bonus Video: How to change the default networking

Continue reading

Videos about Crowbar, CloudOps, and Dell OpenStack Cloud

I’m not usually a big fan of launch videos (too much markitecture); however, these turned out to be nice and meaty.  The meaty part explains why it looks like I’m about to eat a big sandwich in the last video.  yum!

  • What is Crowbar: Dell Crowbar Software Overview  

Continue reading

Crowbar build using Ubuntu 10.10 vm on Rackspace Cloud from Github Repo

Our OpenStack team at Dell (especially Victor Lowthor) has been working hard with the public Crowbar repos to make it possible for the community to build their own version of a Crowbar ISO.   When you build the ISO, you’ll be downloading a whole bunch (that’s the technical term) of open source licensed components to make it work: we’re trying to maintain a list of licenses on the Github wiki.

To make sure that it was possible for mortals, I signed up for a Ubuntu 10.10 VM (512 Mb RAM, $0.03/hr) at RackSpace Cloud.  I did this from a non-Dell to ensure that it was as independent from our source as possible.

Once I had my vm, there were just a few steps to follow (these are NOT verbatim):

  • apt-get install debootstrap, mkisofs, git, build-essential packages
  • git clone git://github.com/dellcloudedge/crowbar.git
  • Got the results from a sledgehammer build (a fresh sledgehammer tarball) and extracted it into $HOME/.crowbar-build-cache/tftpboot, which is where build_crowbar.sh expects to find it cached.
    • NOTE: I’m not ready to document sledgehammer builds yet, but I will tell you that you’d need a CentOS VM.
  • In the crowbar directory, ran ./build_crowbar.sh
  • The build will pull down all the packages that you need and cache them to the VM.  Subsequent builds will be much faster!

The end result of the build is an “openstack-dev.iso” that will install Crowbar with the OpenStack barclamps (here’s how to do it on VMs).  Just for fun, I copied _my build_ output ISO off the build VM and to my web server.

Please let me know if you have problems with this process, we want people to try Crowbar!

$$ Note: Turn off your VM when you’re done so you don’t incur extra expenses.  Since this process only took about 2 hours, the whole build cost me less than a dime.  Which is good, since I was building it on “my own dime” anyway.

Crowbar source released, includes OpenStack Cloud install

I’m delighted to announce (official version) that my team at Dell has opened the Crowbar source under the Apache 2 license. This action is part of the broader Dell OpenStack Cloud Solution which includes OpenStack install packages, Crowbar, reference hardware architectures, and services/consulting to support deployments.

There are two important components to this news:

  1. Dell is officially offering our OpenStack Solution and helping advance the community’s ability to implement OpenStack quickly and consistently.
  2. Dell is releasing the Crowbar code (which is included in the solution) as open source.

Both are significant items; however, my focus here is on the Crowbar release.

Crowbar started as a Dell OpenStack installer project and then grew beyond that in scope.  Now it can be extended to work with other vendors’ kits and other solutions bits.

We are contributing Crowbar to the community because we believe that everyone benefits by sharing in the operational practices that Crowbar embodies. These are rooted in Opscsode Chef (which Crowbar tightly integrates with) and the cloud & hyper-scale proven DevOps practices that are reflected in our deployment model.

Where to get it?

What’s included?

  • A comprehensive set of barclamps to set up an OpenStack cloud.
  • Crowbar UI and Remote APIs to make it easy to set up your cloud
  • Automated testing scripts for community members doing continuous integration with OpenStack.
  • Build scripts so you can create your own Crowbar install ISO
  • Switch discovery so you can create Chef Cookbooks that are network aware.
  • Open source Chef server that powers much of Crowar’s functionality

What’s not included?

  • Non-open source license components (BIOS+RAID config) that we could not distribute under the Apache 2 license.  We are working to address this and include them in our release.  They are available in the Dell Licensed version of Crowbar.
  • Dell Branded Components (skin + overview page).   Crowbar has an OpenSource skin with identical functionality.
  • Pre-built ISOs with install images (you must download the open source components yourself, we cannot redistribute them to you as a package)

Important notes:

  • Crowbar uses Chef Server as its database and relies on cookbooks for node deployments.  It is installed (using Chef Solo) automatically as part of the Crowbar install.
  • Crowbar has a modular architecture so individual components can be removed, extended, and added. These components are known individually as barclamps.
  • Each barclamp has its own Chef configuration, UI sub-component, deployment configuration, and documentation.

On the project roadmap:

  • Hadoop support
  • Additional operating system support (specifically RHEL)
  • Barclamp version repository
  • Network configuration
  • We’d like suggestions!  Please comment!

Sites for more information: Joseph George, Barton George (launch day), Dell

OpenStack at OSCON schedule & event signup

If you’re at OSCON, here’s where to find OpenStack content:

OpenStack Wednesday Evening Event (RSVP REQUIRED):

Wednesday, July 27, 7-9 pm, at Spirit of 77 (right across from the Oregon
Convention Center at the close of the day).  Join us to toast the first
anniversary of the fastest-growing open source project! Please register here and
help promote the event: http://openstack-one-year.eventbrite.com

Speaking Sessions, Wednesday, July 27


Introduction to OpenStack, Eric Day

Wednesday, 1:40 pm http://www.oscon.com/oscon2011/public/schedule/detail/19146

Using OpenStack APIs, Present and Future, Mike Mayo
Wednesday, 4:10 pm http://www.oscon.com/oscon2011/public/schedule/detail/18550

OpenStack Fundamentals Training Part 1, Swift, John Dickinson
Wednesday, 4:10 pm http://www.oscon.com/oscon2011/public/schedule/detail/21287

OpenStack Fundamentals Training Part 2, Nova, Jason Cannavale
Wednesday, 5:00 pm http://www.oscon.com/oscon2011/public/schedule/detail/21347

OpenStack One-Year Anniversary Party, Spirit of 77
Wednesday, 7-9 pm http://openstack-one-year.eventbrite.com/

Speaking Sessions, Thursday, July 28

See why Rob says “No Soup for You” about Cloud Deployments.

Prying Open the Cloud with Dell Crowbar and OpenStack, Joseph George, Rob Hirschfeld
Thursday, 10:40 am http://www.oscon.com/oscon2011/public/schedule/detail/21206

OpenStack + Ceph, Ben Cherian, Jonathan Bryce
Thursday, 1:40 pm http://www.oscon.com/oscon2011/public/schedule/detail/21174

Achieving Hybrid Cloud Mobility with OpenStack and XCP, Paul Voccio, Ewan Mellor
Thursday, 2:30 pm http://www.oscon.com/oscon2011/public/schedule/detail/18726

Crowbar design: solving the multi master update issue and adding a pause before configuration

The last few weeks for my team at Dell have been all about testing as Crowbar goes through our QA cycle and enters field testing. These activities are the run up to Dell open sourcing the bits.

The Crowbar testing cycle drove two significant architectural changes that are interesting as general challenges and important in the details for Crowbar adopters.

Challenge #1: Configuration Sequence.

Crowbar has control of every step of deployment from discovery, BIOS/RAID configuration, base image, core services and applications. That’s a great value prop but there’s a chicken and egg problem: how do you set the RAID for a system when you have not decided which applications you are going to install on it?

The urgency of solving this problem became obvious during our first full integration tests. Nova and Swift need very different hardware configurations. In our first Crowbar flows, we would configure the hardware before you selected the purpose of the node.  This was an effect of “rushing” into a Chef client ready state. 

We also needed a concept of collecting enough nodes to deploy a solution.  Building an OpenStack cloud requires that you have enough capacity to build the components of the system in the correct sequence.

Our solution was to inject a “pause” state just after node discovery.  In the current Crowbar state machine, nodes pause after discovery.  This allows you to assign them into the roles that you want them to play in your system.

In testing, we’ve found that the pause state helps manage the system deployment; however, it also added a new user action requirement. 

Challenge #2: Multi-Master Updates

In Chef, the owner of a node’s data in the centralized database is the node, not the server.  This is a logical (but not a typical) design pattern and has interesting side effects.  Specifically, updates from Chef Client runs on the nodes are considered authoritative and will over-write changes made on the server. 

This is correct behavior because Chef’s primary focus is updating the node (edge) and not the central system (core).  If the authority was reversed then we would miss critical changes that Chef effected on the nodes.   From this perspective, the server is a collection point for data that is owned/maintained at the nodes.

Unfortunately, Crowbar’s original design was to inject configuration into the Chef server’s node objects.  We found that Crowbar’s changes could be silently lost since the server is not the owner of the data.  This is not a locking issue – it is a data ownership issue.  Crowbar was not talking to the master of the data when it made updates!

To correct this problem, we (really Greg Althaus in a coding blitz) changed Crowbar to store data in a special role mapped to each node.  This works because roles are mastered on the server.  Crowbar can make reliable updates to the node’s dedicated role without worrying the remote data will override changes. 

This pattern is a better separation of concerns because Crowbar and barclamp configuration in stored in a very clearly delineated location (a role named crowbar-[node] and is not mixed with edge configuration data.

It turns out that these two design changes are tightly coupled.  Simultaneous edge/server writes became very common after we added the pause state.  They are infrequent for single node changes; however, the frequency increases when you are changing a system of interconnected nodes through multiple state.

More simply put: Crowbar is busy changing the node configs at the exactly same time the nodes are busy changing their own configuration.

Whew!  I hope that helped clarify some interesting design considerations behind Crowbar design.

Note: I want to repeat that Crowbar is not tied to Dell hardware! We have modules that are specifically for our BIOS/RAID, but Crowbar will happily do all the other great deployment work if those barclamps are missing.