OpenStack DefCore Process Draft Posted for Review [major milestone]

OpenStack DefCore Committee is looking for community feedback about the proposed DefCore Process.

Golden PathMarch has been a month for OpenStack DefCore milestones.  At the March Board meeting, we approved the first official DefCore Guideline (called DefCore 2015.03) and we are poised to commit the first DefCore Process draft.

Once this initial commit is approved by the DefCore Committee (expected at DefCore Scale.8 Meeting 3/25 @ 9 PT), we’ll be ready for broader input by the community using the standard OpenStack Gerrit review process.  If you are not comfortable with Gerrit, we’ll take your input anyway that you want to give it except via telepathy (we’ve already got a lot on our minds).

Note: We’re also looking for input on the Guideline targeted for 2015.04,

The DefCore Process documents the rules (who, what, when and where) that will govern how we create the DefCore Guidelines.  By design, it has to be detailed and specific without adding complexity and confusion.  The why of DefCore is all that work we did on principles that shape the process.

This process reflects nearly a year of gestation starting from the June 2014 DefCore face-to-face.  Once of the notable recent refinements was to organize material into time phases and to be more specific about who is responsible for specific actions.

To make review easier, I’ve reposted the draft.  Comments are welcome here and on the patch (and here after it lands).

DRAFT: OpenStack DefCore Process 2015A (reposted from OpenStack/DefCore)

This document describes the DefCore process required by the OpenStack bylaws and approved by the OpenStack Technical Committee and Board.

Expected Time line:

Time Frame Milestone Activities Lead By
-3 months S-3 “preliminary” draft (from current) DefCore
-2 months S-2 ID new Capabilities Community
-1 month S-1 Score capabilities DefCore
Summit S “solid” draft Community
Advisory items selected DefCore
+1 month S+1 self-testing Vendors
+2 months S+2 Test Flagging DefCore
+3 months S+3 Approve Guidance Board

Note: DefCore may accelerate the process to correct errors and omissions.

Process Definition

Continue reading

To improve flow, we must view OpenStack community as a Software Factory

This post was sparked by a conversation at OpenStack Atlanta between OpenStack Foundation board members Todd Moore (IBM) and Rob Hirschfeld (Dell/Community).  We share a background in industrial and software process and felt that sharing lean manufacturing translates directly to helping face OpenStack challenges.

While OpenStack has done an amazing job of growing contributors, scale has caused our code flow processes to be bottlenecked at the review stage.  This blocks flow throughout the entire system and presents a significant risk to both stability and feature addition.  Flow failures can ultimately lead to vendor forking.

Fundamentally, Todd and I felt that OpenStack needs to address system flows to build an integrated product.  The post expands on the “hidden influencers” issue and adds an additional challenge because improving flow requires that the community influences better understands the need to optimize work inter-project in a more systematic way.

Let’s start by visualizing the “OpenStack Factory”

Factory Floor

Factory Floor from Alpha Industries Wikipedia page

Imagine all of OpenStack’s 1000s of developers working together in a single giant start-up warehouse.  Each project in its own floor area with appropriate fooz tables, break areas and coffee bars.  It’s easy to visualize clusters of intent developers talking around tables or coding in dark corners while PTLs and TC members dash between groups coordinating work.

Expand the visualization so that we can actually see the code flowing between teams as little colored boxes.  Giving project has a unique color allows us to quickly see dependencies between teams.  Some features are piled up waiting for review inside teams while others are waiting on pallets between projects waiting on needed cross features have not completed.  At release time, we’d be able to see PTLs sorting through stacks of completed boxes to pick which ones were ready to ship.

Watching a factory floor from above is a humbling experience and a key feature of systems thinking enlightenment in both The Phoenix Project and The Goal.  It’s very easy to be caught up in a single project (local optimization) and miss the broader system implications of local choices.

There is a large body of work about Lean Process for Manufacturing

You’ve already visualized OpenStack code creation as a manufacturing floor: it’s a small step to accept that we can use the same proven processes for software and physical manufacturing.

As features move between teams (work centers), it becomes obvious that we’ve created a very highly interlocked sequence of component steps needed to deliver product; unfortunately, we have minimal coordination between the owners of the work centers.  If a feature is needs a critical resource (think programmer) to progress then we rely on the resource to allocate time to the work.  Since that person’s manager may not agree to the priority, we have a conflict between system flow and individual optimization.

That conflict destroys flow in the system.

The number #1 lesson from lean manufacturing is that putting individual optimization over system optimization reduces throughput.  Since our product and people managers are often competitors, we need to work doubly hard to address system concerns.  Worse yet our inventory of work in process and the interdependencies between projects is harder to discern.  Unlike the manufacturing floor, our developers and project leads cannot look down upon it and see the physical work as it progresses from station to station in one single holistic view.  The bottlenecks that throttle the OpenStack workflow are harder to see but we can find them, as can be demonstrated later in this post.

Until we can engage the resource owners in balancing system flow, OpenStack’s throughput will decline as we add resources.  This same principle is at play in the famous aphorism: adding developers makes a late project later.

Is there a solution?

There are lessons from Lean Manufacturing that can be applied

  1. Make quality a priority (expand tests from function to integration)
  2. Ensure integration from station to station (prioritize working together over features)
  3. Make sure that owners of work are coordinating (expose hidden influencers)
  4. Find and mange from the bottleneck (classic Lean says find the bottleneck and improve that)
  5. Create and monitor a system view
  6. Have everyone value finished product, not workstation output

Added Subscript: I highly recommend reading Daniel Berrange’s email about this.

OpenStack DefCore Process Flow: Community Feedback Cycles for Core [6 points + chart]

If you’ve been following my DefCore posts, then you already know that DefCore is an OpenStack Foundation Board managed process “that sets base requirements by defining 1) capabilities, 2) code and 3) must-pass tests for all OpenStack™ products. This definition uses community resources and involvement to drive interoperability by creating the minimum standards for products labeled OpenStack™.”

In this post, I’m going to be very specific about what we think “community resources and involvement” entails.

The draft process flow chart was provided to the Board at our OSCON meeting without additional review.  It below boils down to a few key points:

  1. We are using the documents in the Gerrit review process to ensure that we work within the community processes.
  2. Going forward, we want to rely on the technical leadership to create, cluster and describe capabilities.  DefCore bootstrapped this process for Havana.  Further, Capabilities are defined by tests in Tempest so test coverage gaps (like Keystone v2) translate into Core gaps.
  3. We are investing in data driven and community involved feedback (via Refstack) to engage the largest possible base for core decisions.
  4. There is a “safety valve” for vendors to deal with test scenarios that are difficult to recreate in the field.
  5. The Board is responsible for approving the final artifacts based on the recommendations.  By having a transparent process, community input is expected in advance of that approval.
  6. The process is time sensitive.  There’s a need for the Board to produce Core definition in a timely way after each release and then feed that into the next one.  Ideally, the definitions will be approved at the Board meeting immediately following the release.

DefCore Process Draft

Process shows how the key components: designated sections and capabilities start from the previous release’s version and the DefCore committee manages the update process.  Community input is a vital part of the cycle.  This is especially true for identifying actual use of the capabilities through the Refstack data collection site.

  • Blue is for Board activities
  • Yellow is or user/vendor community activities
  • Green is for technical community activities
  • White is for process artifacts

This process is very much in draft form and any input or discussion is welcome!  I expect DefCore to take up formal review of the process in October.

Back of the Napkin to Presentation in 30 seconds

I wanted to share a handy new process for creating presentations that I’ve been using lately that involves using cocktail napkins, smart phones and Google presentations.

Here’s the Process:

  1. sketch an idea out with my colleagues on a napkin, whiteboard or notebook during our discussion.
  2. snap a picture and upload it to my Google drive from my phone,
  3. import the picture into my presentation using my phone,
  4. tell my team that I’ve updated the presentation using Slack on my phone.

Clearly, this is not a finished presentation; however, it does serve to quickly capture critical content from a discussion without disrupting the flow of ideas.  It also alerts everyone that we’re adding content and helps frame what that content will be as we polish it.  When we immediately position the napkin into a deck, it creates clear action items and reference points for the team.

While blindingly simple, having a quick feedback loop and visual placeholders translates into improved team communication.

Supply Chain Transparency drives Open Source adoption, 6 reasons besides cost

Author’s note: If you don’t believe that software is manufactured then go directly to your TRS80, do not collect $200.

I’m becoming increasingly impatient with people stating that “open source is about free software” because it’s blatantly untrue as a primary driver for corporate adoption.   Adopting open source often requires companies (and individuals) to trade-off one cost (license expense) for another (building expertise).  It is exactly the same balance we make between insourcing, partnering and outsourcing.

Full Speed Ahead

When I probe companies about what motivates their use of open source, they universally talk about transparency of delivery, non-single-vendor ownership of the source and their ability to influence as critical selection factors.  They are generally willing to invest more to build expertise if it translates into these benefits.  Viewed in this light, licensed software or closed services both cost more and introduce significant business risks where open alternatives exist.

This is not new: its basic manufacturing applied to IT

We had this same conversation in the 90s around manufacturing as that industry joltingly shifted from batch to just-in-time (aka Lean) manufacturing.  The key driver for that transformation was improved integration and management of supply chains.   We review witty doctoral dissertations about inventory, drum-buffer-rope flow and economic order quantity; however, trust my summary that it all comes down to companies need supply chain transparency.

As technology becomes more and more integral to delivering any type of product, companies must extend their need for supply chain transparency into their IT systems too.   That does not mean that companies expect to self-generate (insource) all of their technology.  The goal is to manage the supply chain, not to own every step.   Smart companies find a balance between control of owning their supply (making it themselves) and finding a reliable supply (multi-source is preferred).  If you cannot trust your suppliers then you must create inventory buffers and rigid contracts.  Both of these defenses limit agility and drive systemic dysfunction.  This was the lesson learned from Lean Just-In-Time manufacturing.

What does this look like for IT supply chains?

A healthy supply chain allows companies to address these issues.  They can:

  1. Change vendors / suppliers and get equivalent supply
  2. Check the status of deliveries (features)
  3. Review and impact quality
  4. Take deliverables in small frequent batches
  5. Collaborate with suppliers to manage & control the process
  6. Get visibility into the pipeline

None of these items are specific to software; instead, they are general attributes of a strong supply chain.  In a closed system, companies lose these critical supply chain values.  While tightly integrated partnerships can provide these benefits, they carry a cost premium and inherently limit vendor choice.

This sounds great!  What’s the cost?

You need to consider the level of supply chain transparency that’s right for you.  Most companies are no more likely to refine their own metal than to build from pure open source repositories.  There are transparency benefits from open source even from a single supplier.  Yet in some cases like the OpenStack community, systems are so essential that they are warrant investing as core competencies and joining the contributing community.  Even in those cases, most rely on vendors to package and extend their chosen open source software.

But that misses the point: contributing to an open source project is not required in managing your IT supply chain.  Instead, you need to build the operational infrastructure and processes that is open source ready.  They may require investing in skills and capabilities related to underlying technologies like the operating system, database or configuration management.  For cloud, it is likely to require more investment fault-tolerant architecture and API driven deployment.  Companies that are strong in these skills are better able to manage an open source IT supply chain.  In fact, they are better able to manage any IT supply chain because they have more control.

So, it’s not about cost…

When considering motivations for open source adoption, cost (or technology sizzle) should not be the primary factor.  In my experience, the most successful implementations focus first about operational readiness and project stability, and program transparency.  These questions indicate companies are thinking with an IT supply chain focus.

PS: If you found this interesting, you’ll also like my upstream imperative post.

Lean Process’ strength is being Honest and Humble

Lean process and methodology is important to me because I think it is central to the work that we are doing in the community.  Even more, it’s changing how my team at Dell creates and delivers products for customers?
This post may be long, but my answer to “why Lean” ends up being very simple: Lean process is honest and humble.
I believe Lean process is more honest because it assumes a lack of knowledge.  It’s more “truthy” to admit there are a lot of things that we don’t know (we can’t know!) until we’ve started doing the work.  It’s very hard to admit we don’t have answers for things until we are further along because we want to feel like  experts and we to lock deliveries.
The “building software is like building a house” analogy is often used to claim that Lean lacks the design “blue prints” that other processes have.  The argument goes that builders needed to understand how the entire house works with structural support, plumbing lines and electrical circuits and things like that.  However, if I was going to build a house I would still leave a lot of things to the last-minute.  The process of building a house evolves so that the basic outlines of structural elements are known. In a lot of cases the position of rooms, the outlets, air conditioning ducts, a lot of the functional components, even windows and doors while they are often placed in the design can easily be moved and changed as you go through.  You can do a walk-through of a house after it’s been framed out and make all sorts of changes and adjustments.  As things go forward in the design of the house things become more and more difficult to change. You are building a brick façade, moving the windows within the façade are very difficult. However, interior places they aren’t.  Likewise, I don’t want to order my life-gem counter-tops from the blue-prints – it’s much safer to order off actual measurements.
Software projects are also building projects. You build a façade, you build a structure and within that structure you have a lot of flexibility. As you go you make more decisions and your choices become more limited. But, that is the nature of building.  For that reason, saying “we don’t know everything we want” is not just good practice, it is much more honest.
But honesty is not enough for a strong Lean process.  The need for humility in Lean architects and business people really stands out.  The Lean process is humble because it starts with the assumption that we don’t really understand the value, drivers, interests and features that make our product special.
We need very strong ideas and a vision; however, we need to be motivated by making something that is significant to other people.  They are the ones who give it value.
We have to give up the idea that we can convince someone who our idea will be significant to them – we have to show and collaborate instead.  The most important thing in building any project and taking any product to market is listening to the people who are using your product and understanding what their needs.  Instead of telling them what they need;  show them something interesting, interact with them and get their opinion.
Contrast that to waterfall methodology where the assumption is that we can put smart people in a room, have them figure out what the requirements are, build a team, get everything ready to go and then start executing.  That assumption is highly optimized and seems very efficient, but it has a huge amount of hubris in the process.  The idea that we can sit down two years in advance of market need and identify what those features and capabilities are seems outrageous to me in the current technology market.  It is so much harder to try to get that information correct and then execute on it that get a directional statement and begin and then get feedback and interact, it is a world of difference between the two processes.
Ultimately, Lean process about having requirements that are less defined or well-known.  It’s driven by giving respect to the people consuming the product.  We can hear their ideas and their reactions.  Where the users’ input can be evaluated and taken in to account.  It’s about collaboration.
Humility it not just about listening and collecting feedback: it is about interacting and building relationships.
So just as our customers are building a relationship with our product, they are also building a relationship with the people creating that product. And that relationship is what drives the product forward and what makes it a great product and it is what gives you a strong and loyal customer base, rather than dictating, “This is what you wanted. Here it is. I hope you enjoy it.”
This is a completely different and powerful way of delivering product.  I believe that honesty and humility in a Lean process inherently creates stronger products and ones that are both faster delivered and better suited to their markets.

How Good beats Great and avoids Process Interlock failure

Note: This is part 2 of a 3 part series about the “process interlock dilemma.”

This post addresses how to solve the Process Interlock dilemma I identified in part 1. It is critical to understand the failure of Process Interlock comes because the interlocks turn assumptions into facts. We must accept that any forward looking schedule is a guess. If your guesses are accurate then your schedule should be accurate. That type of insight and $5 will get you a Venti Carmel Frappuccino.

The problem of predicting the future and promising to deliver on that schedule results in one of two poor outcomes.

  1. The better poor outcome is that you are accurate and committed to a schedule.

    To keep on the schedule, you must focus on the committed deliverables. While this sounds ideal, there an opportunity cost to staying focused. Opportunity cost means that while your team is busy delivering on schedule, it is not doing work to pursue other opportunities. In a perfect world, your team picked the most profitable option before it committed the schedule. If you don’t live in a perfect world then it’s likely that while you are working on deliver you’ve learned about another opportunity. You may make your schedule but miss a more lucrative opportunity.

  2. The worse poor outcome is that you are not accurate and committed to a schedule.

    In that case, you miss both the opportunity you thought you had and the ones that you could not pursue while staying dedicated to your planning assumptions.

Let’s go back to our G.Mordler example and look at some better outcomes:

The “we’re going to try outcome.”

The Trans Ma’am team, Alpha, Omega and the supplier all get together and realize that the current design is not shippable; however, they realize that each team’s roadmaps converge within target time. To reduce interlocks, Omega takes Alpha in the low power form and begins integration. During integration, Omega identifies that Alpha can produce sufficient power for short periods of time travel but causes the exhaust vent of the power module to melt. Alpha determines that a change to the cooling system will address the problem. In consulting with their supplier, Alpha asks them to stop design on the new supply and adjust the current design as needed. The resulting time drive does not meet GM’s initial design for 4 hour time jumps, but is sufficient for lead footed mommies to retroactively avoid speeding tickets. GM decides it can still market the limited design.

The “we’re not ready outcome.

The Trans Ma’am team, Alpha and Omega all get together and realize that the current design is not shippable in their current state. While they cannot commit each realizes that there is a different market for their products: Alpha pursues dog poop power generation for high rises condo towers (aka brown energy) and Omega finds military applications for time travel nuclear submarines. In the experience gained from delivering products to these markets, Alpha improves power delivery by 20% and Omega improves efficiency by 20%. These modest mutual improvements allow Alpha to meet Omega’s requirement. While the combined product is too late for the target date, GM is able to incorporate the design into next design cycle.

While neither outcome delivers the desired feature at the original schedule, both provide better ROI for the company. One of the most common problems with process interlock is that we lost sight of ROI in our desire to meet an impractical objective.

Process interlock is a classic case of point optimization driving down system-wide performance.

If you’re interested in this effect, I recommend reading Eli Goldratt’s The Goal.

In the part, I’ve discussed some ways to escape from Process Interlock. I’ll talk about four alternative approaches in part 3 (to be published 3/16).