Crowbar and our Pivot (or, how we slipped and shipped Grizzly)

Crowbar Grizzly PostMy team at Dell uses Lean process because it forces us to be honest about making hard choices. Our recent decision to pivot back to Crowbar 1.x for the OpenStack Grizzly release is a great example how the pivot process works.

4/24 note: I have a longer post and ISO for Grizzly on Crowbar waiting until we enter QA. The Crowbar community is already very active around this work and you’re encouraged to join.

Like any refactor, there was schedule risk when we started the Crowbar 2.x release. To mitigate this risk, we made two critical choices. First, we choose to advance the OpenStack barclamps on the 1.x code base in parallel with the 2.x work. Second, we chose a pivot date for the team to choose releasing Grizzly on the 1.x or 2.x trunks.

Choosing to jump back to 1.x was one of the hardest choices I’ve made in my career. I’m proud that we had the foresight to keep that as an option and prouder that our team rallied to make it happen.

I acknowledge that 1.x has gaps; however, getting Grizzly into the field for PoCs and pilots with 1.x provide substantial benefits to the community.  That said, there are barclamps for HA deployments and other production features that are under development on the 1.x branch and will be available in the community.

The 2.x code base provides important features but we are building from on the 1.x deployment recipes. This means that development, testing and tuning applied to the Grizzly barclamps will translates directly into Crowbar 2.x field readiness. In fact, more completeness on OpenStack can dramatically simplify Crowbar 2.x testing efforts.  This is especially true on the OpenStack Networking (fka Quantum) barclamps because they are new work.

Delivering solutions is a balance between features, timing and field experience.  The Crowbar team’s preference is to collaborate with operators in the field and that means making workable software available quickly.

I hope that you’ll agree with our approach and help us make Grizzly the most deployable OpenStack yet.

As OpenStack enters rapids with Grizzly, watch for strong currents, hidden rocks & eddies.

White Water

Play Boating From Wikipedia

I enjoy kayaking white water rapids – they are exhilarating and demanding. The water accelerates around obstacles and shows its power. You cannot simply ride the current; you must navigate your way around obstacles, stay clear of eddies that pull you back and watch for hidden rocks. The secret to success is to read the current and make small adjustments as you are carried along – resistance is futile.

After the summit, I see the OpenStack with the Grizzly release like water entering the rapids. The quality and capability of the code base continues to improve while the number of players with offerings in the ecosystem is also increasing rapidly. Until now, there was plenty of room to play together; however, as scope, activity and velocity increase there will more inter-vendor interactions.

As a member of the OpenStack board, I have tremendous enthusiasm for what the OpenStack community has accomplished. There have been some really positive accounts of the summit including CSC “OpenStack gains maturity…“, Silicon Angle “OpenStack has reached a Flash Point”, Randy Bias’ “OpenStack is THE Stack”, Wayne Walls “Hallway Track” and much more on the Planet OpenStack aggregator.

In fact, we’ve created such a love fest for OpenStack that I fear we are drinking our own kool aide.

I have a responsibility to be transparent and honest about challenges facing the us because it’s the Foundation’s job to guide us forward. My positions result from many conversations that I had throughout the week of the Summit. They are also the result of my first hand experiences along with my 14 years of cloud experience.

Over the next posts, I’ll explore a number of these topics with the goal of helping navigate a path through the potential turbulence. The simple fact is the OpenStack is growing quickly and that creates challenges:

  1. A growing number of new developers are joining. Since our work surface area is expanding, it’s both easier than ever to participate and harder to navigate where to begin. We need to get ahead of the design cycles.
  2. A growing number of non-devs are participating and bringing important contributions and experience. We must include them in the OpenStack meritocracy because they speak for the quality and usability of the project.
  3. A growing number of companies (many “name brands”) who are still trying to figure out how to participate and collaborate in open source projects. Lack of experience increases the risk of divergence (forking) and market confusion.
  4. A growing number of products based on OpenStack also increases forking risk as OpenStack contributors feel compelled to differentiate.
  5. A growing number of core components (compute+block+network+…) that are required to have base functionality.
  6. A growing number of incubated projects that continue to stress innovation and pace of change that challenges the very question of “what is OpenStack?”
  7. A growing number of deployed sites offering OpenStack clouds but the community lacks a way to verify (or really discuss) compatibility between the sites.

This list is a cause for celebration not a cause for alarm – every item is a challenges based on our success. The community and Foundation are already working to address the risks.

While some of us enjoy the chaos and excitement of rapids, other can take comfort from the fact that they are always followed by calm waters. Don’t worry – we’ll navigate through this together.

“Stack Shop” cover of Macklemore’s Thrift Shop

Sometimes a meme glitters too strongly for me to resist getting pulled in… that happened to great effect that just before the OpenStack Havana summit. When my code-addled mind kept swapping “poppin’ tags” for “OpenStack” on the radio edit, I stopped fighting and rewrote the Thrift Shop lyrics for OpenStack (see below the split).

With a lot of help from summit attendees (many of them are OpenStack celebrities, CEOs, VPs and members OpenStack Foundation board), I was able to create a freaking awesome cover of Macklemore’s second hand confection (NSFW).

Frankly, I don’t know everyone in the video (what, what?)!

But here’s a list of those that I do know.  I’m happy to update so the victims actors get credit.  Singers (in order):

Rob Hirschfeld (me) & Monty Taylor, Peter Poulliot, Judd Maltin, Forrest Norrod, Josh Kleinpeter, Tristan Goode, Dan Bode, Jay Pipes, Prabhakar Gopalan, Peter Chadwick, Simon Andersen, Vish Ishaya, Wayne Walls, Alex Freedland, Niki Acosta, Ops Track Monday Session 1, Ben Cherian, Eric Windisch, Brandon Draeger, Joseph B George,  Mark Collier, Joseph Heck, Tim Bell,  Chris Kemp, Kyle McDonald & Joshua McKenty,

Continue reading

OpenStack’s next hurdle: Interoperability. Why should you care?

SXSW life size Newton's Cradle

SXSW life size Newton’s Cradle

The OpenStack Board spent several hours (yes, hours) discussing interoperability related topics at the last board meeting.  Fundamentally, the community benefits when uses can operate easily across multiple OpenStack deployments (their own and/or public clouds).

Cloud interoperability: the ability to transfer workloads between systems without changes to the deployment operations management infrastructure.

This is NOT hybrid (which I defined as a workload transparently operating in multiple systems); however it is a prereq to achieve scalable hybrid operation.

Interoperability matters because the OpenStack value proposition is all about creating a common platform.  IT World does a good job laying out the problem (note, I work for Dell).  To create sites that can interoperate, we have to some serious lifting:

At the OpenStack Summit, there are multiple chances to engage on this.   I’m moderating a panel about Interop and also sharing a session about the highly related topic of Reference Architectures with Monty Tayor.

The Interop Panel (topic description here) is Tuesday @ 5:20pm.  If you join, you’ll get to see me try to stump our awesome panelists

  • Jonathan LaCour, DreamHost
  • Troy Toman, Rackspace
  • Bernard Golden,  Enstratius
  • Monty Taylor, OpenStack Board (and HP)
  • Peter Pouliot, Microsoft

PS: Oh, and I’m also talking about DevOps Upgrades Patterns during the very first session (see a preview).

DevOps approaches to upgrade: Cube Visualization

I’m working on my OpenStack summit talk about DevOps upgrade patterns and got to a point where there are three major vectors to consider:

  1. Step Size (shown as X axis): do we make upgrades in small frequent steps or queue up changes into larger bundles? Larger steps mean that there are more changes to be accommodated simultaneously.
  2. Change Leader (shown as Y axis): do we upgrade the server or the client first? Regardless of the choice, the followers should be able to handle multiple protocol versions if we are going to have any hope of a reasonable upgrade.
  3. Safeness (shown as Z axis): do the changes preserve the data and productivity of the entity being upgraded? It is simpler to assume to we simply add new components and remove old components; this approach carries significant risks or redundancy requirements.

I’m strongly biased towards continuous deployment because I think it reduces risk and increases agility; however, I laying out all the vertices of the upgrade cube help to visualize where the costs and risks are being added into the traditional upgrade models.

Breaking down each vertex:

  1. Continuous Deploy – core infrastructure is updated on a regular (usually daily or faster) basis
  2. Protocol Driven – like changing to HTML5, the clients are tolerant to multiple protocols and changes take a long time to roll out
  3. Staged Upgrade – tightly coordinate migration between major versions over a short period of time in which all of the components in the system step from one version to the next together.
  4. Rolling Upgrade – system operates a small band of versions simultaneously where the components with the oldest versions are in process of being removed and their capacity replaced with new nodes using the latest versions.
  5. Parallel Operation – two server systems operate and clients choose when to migrate to the latest version.
  6. Protocol Stepping – rollout of clients that support multiple versions and then upgrade the server infrastructure only after all clients have achieved can support both versions.
  7. Forced Client Migration – change the server infrastructure and then force the clients to upgrade before they can reconnect.
  8. Big Bang – you have to shut down all components of the system to upgrade it

This type of visualization helps me identify costs and options. It’s not likely to get much time in the final presentation so I’m hoping to hear in advance if it resonates with others.

PS: like this visualization? check out my “magic 8 cube” for cloud hosting options.

What foo is “contribution” to open source? Mik Kersten & Tasktop @ SXSW

Nested

How do we really know who influences most in a software project?  We can easily track code commits, but there are more bits to the project than the commits.

I had the good fortune to attend Mik Kersten’s Code Graph presentation at SXSW last week. Mik started the Eclipse Mylyn project and went on to found Tasktop. Both are built on the very intriguing concepts that software development production (aka work) is organized around tasks.

His premise is that organizing around tasks provides a more manageable and actionable view of a project than a more traditional application life-cycle management (ALM) approaches.  I’m a sucker for any presentation about lean development process that includes references to both DevOps and industrial engineering (I have an MS in IE), but Mik surprised me by taking his code graph concept to a whole ‘nutha level.

The software value chain is much deeper than just the people who write code. Mik’s approach included managers, testers and operators in the interaction graphs for his projects.

By including all of the ALM artifacts in the analysis, you get a much richer picture of the influencers for a project.

For example, the development manager may never show up as a code committer; however, they are hugely influential in which work gets prioritized. If your graph includes who is touching the work assignments and stories then the manager’s influence jumps out immediately. That knowledge would completely change how and who you may interact with a team. It effectively brings a shadow contributor into the light.

The same is true for QA members who are running tests and opening defects and operators who are building deployment scripts. Ideally, it should include users who exercise different parts of the applications capabilities.

Mik’s graphs clearly showed the influence impact of managers because they touched all of the story cards for the project.  The people who own the story cards are the most potent influencers in a project, yet they are invisible in code repositories!

I would love to see an impact graph for a software project that equally reflected the wide range of contributions that people make to its life-cycle.  This type of information helps rebalance the power in a project.

Industrial engineering legend W.E. Demming‘s advice is to look at production as a system.  Finding ways to show everyone’s contributions is an important step towards bringing lean processes fully into software manufacturing.

Creating Communities: the intersection between Twitter celebrities and open source

calvin_leeOne of the unexpected perks of my Chevy SXSW experience was access to some real social media celebrities such as Josh Estrin, Calvin LeeKristin Brandt, Doug MoraSamantha Needham and Jennie Chen.  They are all amazing, fun, wicked smart and NOT INTO CLOUD COMPUTING.

While I already knew Samantha (via Dell) and Jennie (via TechRanch), all of Chevy’s guests brought totally different perspectives to Chevy’s SXSW team ranging from pop culture  and mommies to hypermilers and gearheads.

The common thread is that we are all looking to engage our communities.

We each wanted to find something that would be interesting for our very different audiences to discuss.  That meant using our experiences at SXSW, Chevy and with each other to start a conversation within our communities.  We need good content as a seed but the goal is to drive the interaction.

Josh was the most articulate about this point saying that he measured his success when his followers talked to each other more than to him.   Being able to create content that engages people to do that is a true talent.

Calvin’s focus was more on helping people connect.  He felt successful when he was able to bring people together through his extended network. In those cases and others, the goals and challenges of a social media celebrity were remarkably similar to those helping lead open source projects.

In building communities, you must measure success in member communication and interaction.

If you are intent on being at the center of the universe then your project cannot grow; however, people also need celebrities to bring them together.  The amazing thing about the the people I met at SXSW through Chevy is that they managed to both attract the attention needed to build critical mass and get out of the way so communities could form around them. That’s a skill that we all should practice and foster.

PS: I also heard clearly that “I ate …” tweets are some of their most popular.  Putting on my collaboration hat: if you’re looking to engage a community then food is the most universal and accessible discussion topic.  Perhaps I’ll have to eat crow on that one.