Connecting the dots: Dell stays course on OpenStack private

rob pdx drivingWhen it comes to OpenStack, I don’t just work for Dell: I’m the technical lead for our OpenStack-powered private Cloud Solution and an elected director to the OpenStack Foundation board.

Frankly, the announcement of our change in public cloud strategy overshadowed our increasing level of investment in OpenStack-powered private cloud solutions (we are hiring!).  Sam Greenblatt, Dell Product Group VP and Chief Architect, is very specific that the recent announcements are about increasing investment where Dell is already successful plus accelerating with new features (such as leadership in HyperV enablement).

The fact that we focused on our decision to pivot away from Dell hosted public cloud distracted from the strategic choices that we’ve been making.  In the lean process that we use, pivots are a positive sign of listening and self-honesty.  Sadly, that distraction led to confusion, misleading comments, and implications that Dell was dropping OpenStack or questioning OpenStack sustainability and market success.

For the record, Dell was one of the first companies to support OpenStack with supporting quotes from Forrest Norrod (Dell GM for Servers and my direct boss) way back  in July 2010.  Our private OpenStack based cloud, built on open source Crowbar, was the first to market 2 years ago (deploying Cactus!).  We’ve been investing steadily in both fundamental improvements to OpenStack deployment and being early supporting the Grizzly release.

I am not implying that OpenStack’s future is certain (we have a lot of work to do) or that Dell OpenStack strategy will not change again; however, I know first-hand that both are on much firmer footing than some reports have implied.

Crowbar cuts OpenStack Grizzly (“pebbles”) branch & seeks community testing

Pebbles CutThe Crowbar team (I work for Dell) continues to drive towards “zero day” deployment readiness. Our Hadoop deployments are tracking Dell | Cloudera Hadoop-powered releases within a month and our OpenStack releases harden within three months.

During the OpenStack summit, we cut our Grizzly branch (aka “pebbles”) and switched over to the release packages. Just a reminder, we basically skipped Folsom. While we’re still tuning out issues on OpenStack Networking (OVS+GRE) setup, we’re also looking for community to start testing and tuning the Chef deployment recipes.

We’re just sprints from release; consequently, it’s time for the Crowbar/OpenStack community to come and play! You can learn Grizzly and help tune the open source Ops scripts.

While the Crowbar team has been generating a lot of noise around our Crowbar 2.0 work, we have not neglected progress on OpenStack Grizzly.  We’ve been building Grizzly deploys on the 1.x code base using pull-from-source to ensure that we’d be ready for the release. For continuity, these same cookbooks will be the foundation of our CB2 deployment.

Features of Crowbar’s OpenStack Grizzly Deployments

  • We’ve had Nova Compute, Glance Image, Keystone Identity, Horizon Dashboard, Swift Object and Tempest for a long time. Those, of course, have been updated to Grizzly.
  • Added Block Storage
    • importable Ceph Barclamp & OpenStack Block Plug-in
    • Equalogic OpenStack Block Plug-in
  • Added Quantum OpenStack Network Barclamp
    • Uses OVS + GRE for deployment
  • 10 GB networking configuration
  • Rabbit MQ as its own barclamp
  • Swift Object Barclamps made a lot of progress in Folsom that translates to Grizzly
    • Apache Web Service
    • Rack awareness
    • HA configuration
    • Distribution Report
  • “Under the covers” improvements for Crowbar 1.x
    • Substantial improvements in how we configure host networking
    • Numerous bug fixes and tweaks
  • Pull from Source via the Git barclamp
    • Grizzly branch was switched to use Ubuntu & SUSE packages

We’ve made substantial progress, but there are still gaps. We do not have upgrade paths from Essex or Folsom. While we’ve been adding fault-tolerance features, full automatic HA deployments are not included.

Please build your own Crowbar ISO or check our new SoureForge download site then join the Crowbar List and IRC to collaborate with us on OpenStack (or Hadoop or Crowbar 2). Together, we will make this awesome.

OpenStack steps toward Interopability with Temptest, RAs & RefStack.org

Pipes are interoperableI’m a cautious supporter of OpenStack leading with implementation (over API specification); however, it clearly has risks. OpenStack has the benefit of many live sites operating at significant scale. The short term cost is that those sites were not fully interoperable (progress is being made!). Even if they were, we are lack the means to validate that they are.

The interoperability challenge was a major theme of the Havana Summit in Portland last week (panel I moderated) .  Solving it creates significant benefits for the OpenStack community.  These benefits have significant financial opportunities for the OpenStack ecosystem.

This is a journey that we are on together – it’s not a deliverable from a single company or a release that we will complete and move on.

There were several themes that Monty and I presented during Heat for Reference Architectures (slides).  It’s pretty obvious that interop is valuable (I discuss why you should care in this earlier post) and running a cloud means dealing with hardware, software and ops in equal measures.  We also identified lots of important items like Open OperationsUpstreamingReference Architecture/Implementation and Testing.

During the session, I think we did a good job stating how we can use Heat for an RA to make incremental steps.   and I had a session about upgrade (slides).

Even with all this progress, Testing for interoperability was one of the largest gaps.

The challenge is not if we should test, but how to create a set of tests that everyone will accept as adequate.  Approach that goal with standardization or specification objective is likely an impossible challenge.

Joshua McKenty & Monty Taylor found a starting point for interoperability FITS testing: “let’s use the Tempest tests we’ve got.”

We should question the assumption that faithful implementation test specifications (FITS) for interoperability are only useful with a matching specification and significant API coverage.  Any level of coverage provides useful information and, more importantly, visibility accelerates contributions to the test base.

I can speak from experience that this approach has merit.  The Crowbar team at Dell has been including OpenStack Tempest as part of our reference deployment since Essex and it runs as part of our automated test infrastructure against every build.  This process does not catch every issue, but passing Tempest is a very good indication that you’ve got the a workable OpenStack deployment.

Crowbar and our Pivot (or, how we slipped and shipped Grizzly)

Crowbar Grizzly PostMy team at Dell uses Lean process because it forces us to be honest about making hard choices. Our recent decision to pivot back to Crowbar 1.x for the OpenStack Grizzly release is a great example how the pivot process works.

4/24 note: I have a longer post and ISO for Grizzly on Crowbar waiting until we enter QA. The Crowbar community is already very active around this work and you’re encouraged to join.

Like any refactor, there was schedule risk when we started the Crowbar 2.x release. To mitigate this risk, we made two critical choices. First, we choose to advance the OpenStack barclamps on the 1.x code base in parallel with the 2.x work. Second, we chose a pivot date for the team to choose releasing Grizzly on the 1.x or 2.x trunks.

Choosing to jump back to 1.x was one of the hardest choices I’ve made in my career. I’m proud that we had the foresight to keep that as an option and prouder that our team rallied to make it happen.

I acknowledge that 1.x has gaps; however, getting Grizzly into the field for PoCs and pilots with 1.x provide substantial benefits to the community.  That said, there are barclamps for HA deployments and other production features that are under development on the 1.x branch and will be available in the community.

The 2.x code base provides important features but we are building from on the 1.x deployment recipes. This means that development, testing and tuning applied to the Grizzly barclamps will translates directly into Crowbar 2.x field readiness. In fact, more completeness on OpenStack can dramatically simplify Crowbar 2.x testing efforts.  This is especially true on the OpenStack Networking (fka Quantum) barclamps because they are new work.

Delivering solutions is a balance between features, timing and field experience.  The Crowbar team’s preference is to collaborate with operators in the field and that means making workable software available quickly.

I hope that you’ll agree with our approach and help us make Grizzly the most deployable OpenStack yet.

5 things keeping DevOps from playing well with others (Chef, Crowbar and Upstream Patterns)

Sharing can be hardSince my earliest days on the OpenStack project, I’ve wanted to break the cycle on black box operations with open ops. With the rise of community driven DevOps platforms like Opscode Chef and Puppetlabs, we’ve reached a point where it’s both practical and imperative to share operational practices in the form of code and tooling.

Being open and collaborating are not the same thing.

It’s a huge win that we can compare OpenStack cookbooks. The real victory comes when multiple deployments use the same trunk instead of forking.

This has been an objective I’ve helped drive for OpenStack (with Matt Ray) and it has been the Crowbar objective from the start and is the keystone of our Crowbar 2 work.

This has proven to be a formidable challenge for several reasons:

  1. diverging DevOps patterns that can be used between private, public, large, small, and other deployments -> solution: attribute injection pattern is promising
  2. tooling gaps prevent operators from leveraging shared deployments -> solution: this is part of Crowbar’s mission
  3. under investing in community supporting features because they are seen as taking away from getting into production -> solution: need leadership and others with join
  4. drift between target versions creates the need for forking even if the cookbooks are fundamentally the same -> solution: pull from source approaches help create distro independent baselines
  5. missing reference architectures interfere with having a stable baseline to deploy against -> solution: agree to a standard, machine consumable RA format like OpenStack Heat.

Unfortunately, these five challenges are tightly coupled and we have to progress on them simultaneously. The tooling and community requires patterns and RAs.

The good news is that we are making real progress.

Judd Maltin (@newgoliath), a Crowbar team member, has documented the emerging Attribute Injection practice that Crowbar has been leading. That practice has been refined in the open by ATT and Rackspace. It is forming the foundation of the OpenStack cookbooks.

Understanding, discussing and supporting that pattern is an important step toward accelerating open operations. Please engage with us as we make the investments for open operations and help us implement the pattern.

double Block Head with OpenStack+Equallogic & Crowbar+Ceph

Block Head

Whew….Yesterday, Dell announced TWO OpenStack block storage capabilities (Equallogic & Ceph) for our OpenStack Essex Solution (I’m on the Dell OpenStack/Crowbar team) and community edition.  The addition of block storage effectively fills the “persistent storage” gap in the solution.  I’m quadrupally excited because we now have:

  1. both open source (Ceph) and enterprise (Equallogic) choices
  2. both Nova drivers’ code is in the open at part of our open source Crowbar work

Frankly, I’ve been having trouble sitting on the news until Dell World because both features have been available in Github before the announcement (EQLX and Ceph-Barclamp).  Such is the emerging intersection of corporate marketing and open source.

As you may expect, we are delivering them through Crowbar; however, we’ve already had customers pickup the EQLX code and apply it without Crowbar.

The Equallogic+Nova Connector

block-eqlx

If you are using Crowbar 1.5 (Essex 2) then you already have the code!  Of course, you still need to have the admin information for your SAN – we did not automate the configuration of the storage system, but the Nova Volume integration.

We have it under a split test so you need to do the following to enable the configuration options:

  1. Install OpenStack as normal
  2. Create the Nova proposal
  3. Enter “Raw” Attribute Mode
  4. Change the “volume_type” to “eqlx”
  5. Save
  6. The Equallogic options should be available in the custom attribute editor!  (of course, you can edit in raw mode too)

Want Docs?  Got them!  Check out these > EQLX Driver Install Addendum

Usage note: the integration uses SSH sessions.  It has been performance tested but not been tested at scale.

The Ceph+Nova Connector

block-ceph

The Ceph capability includes a Ceph barclamp!  That means that all the work to setup and configure Ceph is done automatically done by Crowbar.  Even better, their Nova barclamp (Ceph provides it from their site) will automatically find the Ceph proposal and link the components together!

Ceph has provided excellent directions and videos to support this install.

Interview Transcript about OpenStack, Foundation, Dell & Crowbar

Rafael Knuth, Dell Rafael Knuth (@RafaelKnuth with Dell as EMEA Social Media Community Manager) has set an ambitious objective to interview all 24 of the OpenStack Foundation Board members.  I have the privilege of being the first interview posted (Japanese version!).   Here’s the whole series.

This is not puff interview – We spent an hour together and Rafael did not shy away from asking hard questions like “Why did Dell jump into OpenStack?” and “is VMware a threat to OpenStack?”  Rather than posting the whole transcript (it’s posted here), I’m including the questions (as a teaser) below.   There is some real meat in these answers about OpenStack, Dell, Crowbar and challenges facing the project.

WARNING: My job is engineering, not marketing.  You may find my answers (which are MY OWN ANSWERS) to be more direct that you are expecting.  If you find yourself needing additional circumlocution then simply close your browser and move on.

Some highlights…

Dell’s interest in OpenStack has been very pragmatic. OpenStack is something we really see a market need for.

Rackspace …  runs on OpenStack pretty much off trunk …  That’s exactly the type of vibrant community we want to see.  At the same time, there is a growing community that wants to use OpenStack distributions with support, certifications and they are fine with being 6 months behind OpenStack off trunk. That’s good, and we want that shadow, we want that combination of pure minded early adopters and less sophisticated OpenStack users both working together.

We are working with different partners to bring OpenStack to different customers in different ways. It is confusing. Your question about Dell Crowbar was right … it is targeted at a certain class of users, and I don’t want enterprise customers who expect a lot of shiny chrome and zero touch. That’s not the target by now for Dell Crowbar. We definitely need that sort of magic decoder page to help customers understand our commercial offering.

Questions from Interview:

  1. Dell is one of the very early contributors to OpenStack. Why is Dell engaging in this project?
  2. How does Dell contribute to OpenStack?
  3. Let’s talk a bit about Dell Crowbar, your team’s deployment mechanism for OpenStack.
  4. Let’s talk a bit about OpenStack raw vs. OpenStack distributions.
  5. What are the biggest barriers to OpenStack adoption as of now?
  6. What does a customer specifically need to do when moving from OpenStack Essex to Folsom for example?
  7. My next question is around proof of concept versus production, Rob. How are customers using OpenStack and can you give examples for both scenarios?
  8. I hear very often two different statements: “Open Stack is an alternative to Amazon.” The other is: “OpenStack is an alternative to VMware … maybe, hopefully in two or three years from now.”  Which of both statements is true?
  9. How do you view VMware joining OpenStack. Is it a threat to OpenStack or does VMware add value to the project?
  10. Let us speak about market adoption. Who are the early adopters of OpenStack? And when do you expect OpenStack to hit the tipping point for mass market adoption?
  11. Rob, for all those interested in Dell’s commercial offering around OpenStack … can you give a brief overview?
  12. Dell TechCenter that provides customers an overview over our OpenStack offering: Dell Crowbar as our DevOps tool in its various shapes and forms, OpenStack distros we support … cloud services we build around OpenStack … hardware capabilities optimized for OpenStack.
  13. What are the challenges for the OpenStack Board of Directors?

Crowbar releases for OpenStack Essex and Folsom

On the eve of the OpenStack design summit, it’s worth noting that the Crowbar team at Dell cut our final Essex release (aka Betty) last week. We’ve also committed the initial Folsom deployment scripts to the 1x development trunk under “feature/folsom” if you are doing Crowbar builds from DevTool (see bit.ly/crowbardevtool).

If you don’t want to build it, I’ve cut ISOs from the open source bits and posted them to http://crowbar.zehicle.com.

Andi Abes is presenting about the new Pull From Source (pfs) feature at the Summit on Monday. There’s a feature branch for that too and I’m going to check with him and try to post and ISO for that too.

20121014-132246.jpg

OpenStack Summit: Let’s talk DevOps, Fog, Upgrades, Crowbar & Dell

If you are coming to the OpenStack summit in San Diego next week then please find me at the show! I want to hear from you about the Foundation, community, OpenStack deployments, Crowbar and anything else.  Oh, and I just ordered a handful of Crowbar stickers if you wanted some CB bling.

Matt Ray (Opscode), Jason Cannavale (Rackspace) and I were Ops track co-chairs. If you have suggestions, we want to hear. We managed to get great speakers and also some interesting sessions like DevOps panel and up streaming deploy working sessions. It’s only on Monday and Tuesday, so don’t snooze or you’ll miss it.

My team from Dell has a lot going on, so there are lots of chances to connect with us:

At the Dell booth, Randy Perryman will be sharing field experience about hardware choices. We’ve got a lot of OpenStack battle experience and we want to compare notes with you.

I’m on the board meeting on Monday so likely occupied until the Mirantis party.

See you in San Diego!

PS: My team is hiring for Dev, QA and Marketing. Let me know if you want details.

Why doesn’t Chef call them “bowls” instead of roles?

The extended Crowbar team (my employer Dell and community) recently had a bit of a controversy heated discussion over the renaming of “proposals” to “configurations.” It was pretty clear that the term “proposal” confused users because an “active proposal” seems like a bit of an oxymoron. Excepting Scott Jensen, our schedule-czar and director of engineering, we had relatively few die-hard “I love proposal” advocates; however, deciding on an alternative was not quite so easy.

Ultimately, the question came down to “do we use an invented term or an intuitive term.”  The answer lies in when to use or avoid the congruity theory as articulated by Roger Cauvin.

We considered many alternatives like calling them “fixtures” to go along with the Crowbar & Barclamp tool theme. Even “Chuck Norris” was considered until copyright issues were flagged. The top alternative, “configuration” seemed just too bland. Frankly and amazingly, we originally considered it “too descriptive!”

The crux of the argument really revolved around the users’ ability to intuitively grasp a concept or to force them learn a new term. For example, we specifically chose “barclamp” instead of “module” because we felt that there were more components to a barclamp than just being a Crowbar module. In many ways, module would be sufficiently descriptive; however, we saw that there was benefit to the user tax in introducing a new term. It also fit nicely within our tool theme.

Opscode Chef is an example of investing heavily in a naming theme. For example, the concept of “cookbooks” and “recipes” seems relatively intuitive for users but starts getting stretched for “knife” because it is not immediately clear to users what that component does (it executes instructions on nodes and the server). After learning Chef, I appreciate “knife” as the universal tool but still remember having to figure it out.

A good theme is awesome, but it can quickly encumber usability.

For example, what if Chef has used “bowl” instead of role. It’s logical: you put a group of ingredients to mix into a bowl that acts as a container. While it may be logical to the initiated, it mainly extends the learning curve for new users. A role is a commonly accepted term for an operational classification so it is a much better term for users. The same is true for “node” and “data bag.”

I love a good incongruent theme as much as any meme-enabled tech geek but themes must not hinder usability. After all, we all fight for the users.