To improve flow, we must view OpenStack community as a Software Factory

This post was sparked by a conversation at OpenStack Atlanta between OpenStack Foundation board members Todd Moore (IBM) and Rob Hirschfeld (Dell/Community).  We share a background in industrial and software process and felt that sharing lean manufacturing translates directly to helping face OpenStack challenges.

While OpenStack has done an amazing job of growing contributors, scale has caused our code flow processes to be bottlenecked at the review stage.  This blocks flow throughout the entire system and presents a significant risk to both stability and feature addition.  Flow failures can ultimately lead to vendor forking.

Fundamentally, Todd and I felt that OpenStack needs to address system flows to build an integrated product.  The post expands on the “hidden influencers” issue and adds an additional challenge because improving flow requires that the community influences better understands the need to optimize work inter-project in a more systematic way.

Let’s start by visualizing the “OpenStack Factory”

Factory Floor

Factory Floor from Alpha Industries Wikipedia page

Imagine all of OpenStack’s 1000s of developers working together in a single giant start-up warehouse.  Each project in its own floor area with appropriate fooz tables, break areas and coffee bars.  It’s easy to visualize clusters of intent developers talking around tables or coding in dark corners while PTLs and TC members dash between groups coordinating work.

Expand the visualization so that we can actually see the code flowing between teams as little colored boxes.  Giving project has a unique color allows us to quickly see dependencies between teams.  Some features are piled up waiting for review inside teams while others are waiting on pallets between projects waiting on needed cross features have not completed.  At release time, we’d be able to see PTLs sorting through stacks of completed boxes to pick which ones were ready to ship.

Watching a factory floor from above is a humbling experience and a key feature of systems thinking enlightenment in both The Phoenix Project and The Goal.  It’s very easy to be caught up in a single project (local optimization) and miss the broader system implications of local choices.

There is a large body of work about Lean Process for Manufacturing

You’ve already visualized OpenStack code creation as a manufacturing floor: it’s a small step to accept that we can use the same proven processes for software and physical manufacturing.

As features move between teams (work centers), it becomes obvious that we’ve created a very highly interlocked sequence of component steps needed to deliver product; unfortunately, we have minimal coordination between the owners of the work centers.  If a feature is needs a critical resource (think programmer) to progress then we rely on the resource to allocate time to the work.  Since that person’s manager may not agree to the priority, we have a conflict between system flow and individual optimization.

That conflict destroys flow in the system.

The number #1 lesson from lean manufacturing is that putting individual optimization over system optimization reduces throughput.  Since our product and people managers are often competitors, we need to work doubly hard to address system concerns.  Worse yet our inventory of work in process and the interdependencies between projects is harder to discern.  Unlike the manufacturing floor, our developers and project leads cannot look down upon it and see the physical work as it progresses from station to station in one single holistic view.  The bottlenecks that throttle the OpenStack workflow are harder to see but we can find them, as can be demonstrated later in this post.

Until we can engage the resource owners in balancing system flow, OpenStack’s throughput will decline as we add resources.  This same principle is at play in the famous aphorism: adding developers makes a late project later.

Is there a solution?

There are lessons from Lean Manufacturing that can be applied

  1. Make quality a priority (expand tests from function to integration)
  2. Ensure integration from station to station (prioritize working together over features)
  3. Make sure that owners of work are coordinating (expose hidden influencers)
  4. Find and mange from the bottleneck (classic Lean says find the bottleneck and improve that)
  5. Create and monitor a system view
  6. Have everyone value finished product, not workstation output

Added Subscript: I highly recommend reading Daniel Berrange’s email about this.

7 takeaways from DevOps Days Austin

Block Tables

I spent Tuesday and Wednesday at DevOpsDays Austin and continue to be impressed with the enthusiasm and collaborative nature of the DOD events.  We also managed to have a very robust and engaged twitter backchannel thanks to an impressive pace set by Gene Kim!

I’ve still got a 5+ post backlog from the OpenStack summit, but wanted to do a quick post while it’s top of mind.

My takeaways from DevOpsDays Austin:

  1. DevOpsDays spends a lot of time talking about culture.  I’m a huge believer on the importance of culture as the foundation for the type of fundamental changes that we’re making in the IT industry; however, it’s also a sign that we’re still in the minority if we have to talk about culture evangelism.
  2. Process and DevOps are tightly coupled.  It’s very clear that Lean/Agile/Kanban are essential for DevOps success (nice job by Dominica DeGrandis).  No one even suggested DevOps+Waterfall as a joke (but Patrick Debois had a picture of a xeroxed butt in his preso which is pretty close).
  3. Still need more Devs people to show up!  My feeling is that we’ve got a lot of operators who are engaging with developers and fewer developers who are engaging with operators (the “opsdev” people).
  4. Chef Omnibus installer is very compelling.  This approach addresses issues with packaging that were created because we did not have configuration management.  Now that we have good tooling we separate the concerns between bits, configuration, services and dependencies.  This is one thing to watch and something I expect to see in Crowbar.
  5. The old mantra still holds: If something is hard, do it more often.
  6. Eli Goldratt’s The Goal is alive again thanks to Gene Kims’s smart new novel, The Phoenix project, about DevOps and IT  (I highly recommend both, start with Kim).
  7. Not DevOps, but 3D printing is awesome.  This is clearly a game changing technology; however, it takes some effort to get right.  Dell brought a Solidoodle 3D printer to the event to try and print OpenStack & Crowbar logos (watch for this in the future).

I’d be interested in hearing what other people found interesting!  Please comment here and let me know.