About Rob H

A Baltimore transplant to Austin, Rob thinks about ways of building scale infrastructure for the clouds using Agile processes. He sat on the OpenStack Foundation board for four years. He co-founded RackN to enable software that creates hyperscale converged infrastructure.

What does “enable upstream recipes” mean? Not just fishing for community goodness!

One of the major Crowbar 2.0 design targets is to allow you to “upstream” operations scripts more easily.  “Upstream code” means that parts of Crowbar’s source code could be maintained in other open source repositories.  This goes beyond a simple dependency (like Rails, Curl, Java or Apache): upstreaming allows Crowbar to use code that is managed in other open source repositories and written for more general application.  This is important because Crowbar users can leverage DevOps logic that is more broadly targeted than just Crowbar.  Even more importantly, upstreaming means that we can contribute to and take advantage of community efforts to improve the upstream source.
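As a hypothetical Chef-flavored sketch (the cookbook names are illustrative, not actual Crowbar repositories), upstreaming would let a barclamp declare a dependency on a community-maintained cookbook in its metadata.rb instead of carrying a modified in-tree copy:

```ruby
# Hypothetical metadata.rb for a barclamp cookbook. The community
# cookbook is declared as a normal dependency and tracked upstream,
# rather than being cloned and deCrowbarized in-tree.
name    "barclamp-glance"   # illustrative name
depends "glance"            # maintained in an upstream repository
depends "database"          # another shared upstream dependency
```

The point is that nothing Crowbar-specific has to live inside the upstream cookbook itself; Crowbar consumes it as-is and layers its integration on top.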

Specifically, Crowbar maintains a set of OpenStack cookbooks that make up the core of our OpenStack deployment.  These scripts have been widely cloned (not forked) and deCrowbarized for other deployments.  Unfortunately, that means that we do not benefit from downstream improvements and the cloners cannot easily track our updates.  This happened because our deployment scripts required Crowbar, so Crowbar was not considered a valid upstream OpenStack repository.  The consequence of this cloning is that incompatible OpenStack recipes have propagated like cracks in a windshield.

While there are concrete benefits to upstreaming, there are risks too.  We have to evaluate if the upstream code has been adequately tested, operates effectively, implements best practices and leverages Crowbar capabilities.  I believe strongly that untested deployment code is worse than useless; consequently, the Dell Crowbar team provides significant value by validating that our deployments work as an integrated system.  Even more importantly, we will not upstream from unmoderated sources where changes are accepted without regard for downstream impacts.  There is a significant amount of trust required for upstreaming to work.

If upstreaming is so good, why did we not start out with upstream code?  It was simply not an option at the time – Crowbar was the first (and is still the most complete!) set of DevOps deployment scripts for OpenStack in a public repository.
By design, Crowbar 1.0 was tightly coupled to Opscode Chef and required users to inject Crowbar dependencies into their Chef Recipes.  This approach allowed us to more quickly integrate capabilities between recipes and with nascent Crowbar features.  Our top design requirement was that our deployment was tightly integrated between hardware, networking, operating system, operations infrastructure and the application.  Figuring out the correct place to separate concerns was impractical; consequently, we injected dependencies into our Chef code.
We have reached a point with Crowbar development that we can correctly decouple Crowbar and Chef.
The benefits to upstreaming go far beyond enabling more collaboration on OpenStack deployments.  These same changes make it easier for Crowbar to leverage community deployment scripts without one-way modifications.  If you have a working Chef Recipe then making it work with Crowbar will no longer require changes that break it outside of Crowbar; therefore, you can leverage Crowbar capabilities without losing community input and without being locked into Crowbar.
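The pattern behind that decoupling can be sketched in plain Ruby (a hypothetical illustration, not actual Crowbar code): treat Crowbar-injected data as an optional override of the cookbook's own attributes, so the same recipe works whether or not Crowbar is present. Plain hashes stand in for Chef's node object here, and the attribute names are made up for the example.

```ruby
# Hypothetical sketch of the decoupling pattern: a recipe consults
# generic cookbook attributes first and treats Crowbar-specific data
# as an optional override. Plain hashes stand in for Chef's node
# object so the idea can be shown without a Chef server.

def resolve_bind_address(node)
  # Crowbar 1.0 style required node["crowbar"] to exist; the decoupled
  # form falls back gracefully when it does not.
  crowbar = node["crowbar"] || {}
  crowbar.dig("network", "admin", "address") ||
    node.dig("apache", "listen_address") ||
    "0.0.0.0"
end

# Outside Crowbar: the community attribute is used unchanged.
plain_node = { "apache" => { "listen_address" => "10.0.0.5" } }
puts resolve_bind_address(plain_node)    # => 10.0.0.5

# Inside Crowbar: the injected network data takes precedence.
crowbar_node = {
  "apache"  => { "listen_address" => "10.0.0.5" },
  "crowbar" => { "network" => { "admin" => { "address" => "192.168.124.10" } } }
}
puts resolve_bind_address(crowbar_node)  # => 192.168.124.10
```

With this precedence scheme, no forked copy of the recipe is needed: the community version keeps working on its own, and Crowbar simply supplies richer inputs when it is in the loop.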

OSCON presentation graphic about upstreaming (added 7/23).

Open Source Vision, Strategy & Execution: The Community Garden Analogy

I’ve been working on a white paper series about open source culture and projects based on my experience at Dell with Crowbar and OpenStack.  I’m hoping to show off the first result of that collaboration soon; until then, I’m glad to share some ideas that we’ve been throwing around to help explain the fundamental shift that we see taking place in the technology community.

It’s obvious that executing a collaborative open source project is fundamentally different from either a licensed project or an open single-contributor project.

However, we have yet to describe the culture, process and success criteria needed to drive a collaborative open source project.

I am not saying that projects are not being successful – there are many great examples (OpenStack, CloudFoundry, Apache). My point is that we have not yet articulated the vision, strategy or execution needed for people grounded in traditional software delivery to understand why these projects are different and how to navigate the differences. Lack of alignment within the lead delivery team can be highly disruptive to the project.

Community gardening is all about people working together to produce something tangible that is bigger than they could accomplish individually. Each participant has an expectation of return – a garden is not a charity, it must produce; however, there is a community dynamic that rewards a garden-wide focus.

  • You could just pull weeds from your plot, but doing extra means that the garden as a whole will prosper.
  • Fixing the fence keeps the “MQ” rabbits out of your carrots and your neighbor’s lettuce.
  • The person growing the mint also keeps out the ants and so helps the entire garden.
  • The oddball who wants inverted Chinese radishes may also be an expert vermiculturist who nurtures the whole worm population as a side interest.
  • While everyone may want tomatoes, they may not want the same variety or amount. You may want a small batch of heirlooms for your salad while someone making pasta sauce wants a large crop of a sauce variety.

For a community garden, like an open source project, the specific objectives of the participants do not have to be identical for the garden to flourish. In fact, the very diversity of intent is what makes the garden successful. A single gardener may only plant watermelons and corn, but the community group will likely have a complete crop.

But the analogy does not end with the gardeners. A community garden is also strengthened by the cooks and diners who enjoy the food because they are the audience.

This post was designed to plant some seeds of understanding.  I know it does not get to the meat of the vision, strategy or execution for open source, that will come in future posts.  Specifically, I’m planning to discuss how OpenStack and Crowbar measure up as they near their respective second and first anniversaries.

Crowbar deploying Dell | Cloudera 4 | Apache Hadoop

Hopefully you wrote “Cloudera 3.7” in pencil on your to-do list because the Dell Crowbar team has moved to CDH4 & Cloudera Enterprise 4.0. This aligns with the Cloudera GA announcement on Tuesday 6/5 and continues our drive to keep Crowbar deployments both fresh and spicy.

With the GA drop, the Crowbar Cloudera Barclamps are effectively at release candidate state (ISO). The Cloudera Barclamps include a freemium version of Cloudera Enterprise 4 that supports up to 50 nodes.

I’m excited about this release because it addresses concerns around fault tolerance, multi-tenancy and upgrades.

These tools are solving real world problems ranging from data archival to ad hoc and click-stream analysis. We’ve invested a lot of Crowbar development effort in making it fast and easy to build a Hadoop cluster. Now, Cloudera makes it even easier to manage and maintain.

A SuPEr New Linux for Crowbar! SuSE shows off port and OpenStack deploy

During last week’s OpenStack Essex Deploy Day, we featured several OpenStack ecosystem presentations including SuSE, Morphlabs, enStratus, Opscode, and Inktank (Ceph).

SuSE’s presentation (video) was deploying OpenStack using a SuSE port of Crowbar (including a reskinned UI)!

This is significant for both SuSE and Crowbar:

  1. SuSE, a platinum member of the OpenStack Foundation, now has an OpenStack Essex distribution. They are offering this deployment as an on-request beta.
  2. Crowbar is now demonstrably operating on the three top Linux distributions.

SuSE is advancing some key architectural proposals for Crowbar because their implementation downloads Crowbar as a package rather than bundling everything into an ISO.

With the Hadoop 4 & OpenStack Essex releases nearly put to bed, it’s time to bring some of this great innovation into the Crowbar trunk.

OpenStack Deploy Day generates lots of interest, less coding

Last week, my team at Dell led a world-wide OpenStack Essex Deploy event. Kamesh Pemmaraju, our OpenStack-powered solution product manager, did a great summary of the event results (200+ attendees!). What started as a hack-a-thon for deploy scripts morphed into a stunning 14+ hour event with rotating intro content and an ecosystem showcase (videos).  Special kudos to Kamesh, Andi Abes, Judd Maltin, Randy Perryman & Mike Pittaro for leadership at our regional sites.

Clearly, OpenStack is attracting a lot of interest. We’ve been investing time in content to help people who are curious about OpenStack to get started.

While I’m happy to be fueling the OpenStack fervor with an easy on-ramp, our primary objective for the Deploy Day was to collaborate on OpenStack deployments.

On that measure, we have room for improvement. We had some great discussions about how to handle upgrades and market drivers for OpenStack; however, we did not spend as much time improving Essex deployments as I had hoped. I know it’s possible – I’ve talked with developers in the Crowbar community who want this.

If you wanted more expert interaction, here are some of my thoughts for future events.

  • The expert track did not get to deployment coding. I think we need to focus even more tightly on Crowbar deployments. That means having a Crowbar hack with an OpenStack focus instead of vice versa.
  • Efforts to serve OpenStack n00bs did not protect time for experts. If we offer expert sessions then we won’t try to have parallel intro sessions. We’ll simply have to direct novices to the homework pages and videos.
  • Combining on-site and on-line is too confusing. As much as I enjoy meeting people face-to-face, I think we’d have a more skilled audience if we kept it online only.
  • Connectivity! Dropped connections, sigh.
  • Better planning for videos (not by the presenters) to make sure that we have good results on the expert track.
  • This event was too long. It’s just not practical to serve Europe, US and Asia in a single event. I think that 2-3 hours is a much more practical maximum. 10-12am Eastern or 6-8pm Pacific would be much more manageable.

Do you have other comments and suggestions? Please let me know!

With Dell ARM-based “Copper” servers, Crowbar footprint grows

One of my team at Dell’s most critical lessons from hyperscale cloud deployments was that DevOps tooling and operations processes are key to success.  Our Crowbar project was born out of this realization.

I have been tracking the progress of the Copper ARM-based server from design to implementation internally.  Now, I’m excited to see it getting some deserved attention.

The Copper platform is really cool because the cost, power, and density ratios of the nodes are unparalleled.  This makes it an ideal platform for distributed mixed compute/store workloads like Hadoop.  The nodes in the platform have excellent RAM/CPU/Spindle ratios.

While Copper is driving huge density, it also drives forward the same hyperscale challenges that we’ve been trying to address with Crowbar; consequently, we’re already working to ensure that we can deploy and manage Copper with Crowbar at scale.

Copper and Crowbar make a natural team and we’re excited to be part of today’s announcement:

Dell is staging clusters of the Dell “Copper” ARM server within the Dell Solution Centers and with TACC so developers may book time on the platforms. Dell also will deliver an ARM-supported version of Crowbar, Dell’s open-source management infrastructure software, to the industry in the future.

Congratulations to the Copper team!

OSED OMG: OpenStack Essex Deploy Day!! A day-long four-session two-track International Online Conference

Curious about OpenStack? Know it, but want to tune your Ops chops? JOIN US on Thursday 5/31 (or Friday 6/1 if you are in Asia)!

Already know the event logistics? Skip back to my OSED observations post.

Some important general notes:

  1. We are RECORDING everything and will link posts from the event page.
  2. There is HOMEWORK if you want to get ahead by installing OpenStack yourself.
  3. For last minute updates about the event, I recommend that you join the Crowbar Listserver.

Content logistics work like this:

  1. Everything will be available ONLINE. We are also coordinating many physical sites as rally points.
  2. Introductory: FOUR 3-hour sessions for people who do not have OpenStack or Crowbar experience. These sessions will show how to install OpenStack using Crowbar, discuss DevOps and showcase companies that are in the OpenStack ecosystem. They are planned to have 2 European slots (afternoon & evening), 3 US slots (morning, afternoon & evening), and 1 Asian slot (morning).
  3. Expert: ON-GOING deep technical sessions for engineers who have OpenStack and/or Crowbar experience. There will be one main screen and voice channel in which we are planning to highlight and discuss these topics in blocks throughout the day. We have a long list of topics to discuss and will maintain an ongoing Google Hangout for each topic. Depending on interest, we will jump back and forth to different hangouts.

Intro/Overview session logistics work like this:

We’re planning FOUR introductory sessions throughout the day (read ahead?). Each session should be approximately 3 hours. The first hour of the sessions will be about OpenStack Essex and installing it using Crowbar. After some Q&A, we’re going to highlight the OpenStack ecosystem. The schedule for the ecosystem is in flux and will likely shift even during the event.

The Session start times for Overview & Ecosystem content

Region             vs EDT   Session 1   Session 2   Session 3   Session 4
Europe             -5       3pm         6pm         *           *
Americas Eastern   0        10am        1pm         4pm         *
Americas Central   +1       9am         Noon        3pm         *
Americas Mtn       +2       *           11am        2pm         7pm
Americas West      +3       *           10am        1pm         6pm
Asia (Tokyo)       +10      *           *           *           6/1 10am

* There are no planned live venues at this time/region. You are always welcome to join online!

Experts Track Logistics

Note: we expect experts to have already installed OpenStack (see homework page). Ideally, an expert has already set up a build environment.

We have a list of topics (Essex, Quantum, Networking, Pull from Source, Documentation, etc) that we plan to cover on a 30-60 minute rotation.

We will cover the OpenStack Essex deploy at the start of each planned session (9am, Noon, 3pm & 8pm EDT). Before we cover the OpenStack deploy, we’ll spend 10 minutes setting (and posting) the agenda for the next three hours based on attendee input.

Even if we are not talking about a topic on the main channel, we will keep a dialog going on topic specific Google hangouts. The links to the hangouts will be posted with the Expert track agenda.

We need an OpenStack Reference Deployment (My objectives for Deploy Day)

I’m overwhelmed and humbled by the enthusiasm my team at Dell is seeing for the OpenStack Essex Deploy day on 5/31 (or 6/1 for Asia). What started as a day for our engineers to hack on Essex Cookbooks with a few fellow Crowbarians has morphed into an international OpenStack event spanning Europe, Americas & Asia.

If you want to read more about the event, check out my event logistics post (link pending).

I do not apologize for my promotion of the Dell-led open source Crowbar as the deployment tool for the OpenStack Essex Deploy. For a community to focus on improving deployment tooling, there must be a stable reference infrastructure. Crowbar provides a fast and repeatable multi-node environment with scriptable networking and packaging.

I believe that OpenStack benefits from a repeatable multi-node reference deployment. I’ll go further and state that this requires DevOps tooling to ensure consistency both within and between deployments.

DevStack makes trunk development more canonical between different developers. I hope that Crowbar will provide a similar experience for operators so that we can truly share deployment experience and troubleshooting. I think Crowbar deployments are already repeatable enough to provide a reference for defect documentation and reproduction.

Said more plainly, it’s a good thing if a lot of us use OpenStack in the same way so that we can help each other out.

My team’s choice to accelerate releasing the Crowbar barclamps for OpenStack Essex makes perfect sense if you accept our rationale for creating a community baseline deployment.

Crowbar is Dell-led, not Dell-specific.

One of the reasons that Crowbar is open source and we do our work in the open (yes, you can see our daily development on GitHub) is to make it safe for everyone to invest in a shared deployment strategy. We encourage and welcome community participation.

PS: I believe the same is true for any large scale software project. Watch out for similar activity around Apache Hadoop as part of our collaboration with Cloudera!

Quick turn OpenStack Essex on Crowbar (BOOM, now we’re at v1.4!)

Don’t blink if you’ve been watching the Crowbar release roadmap!

My team at Dell is about to turn another release of Crowbar. Version 1.3 released 5/14 (focused on Cloudera Apache Hadoop) and our original schedule showed several sprints of work on OpenStack Essex. Upon evaluation, we believe that the current community developed Essex barclamps are ready now.

The healthy state of the OpenStack Essex deployment is a reflection of 1) the quality of Essex and 2) our early community activity in creating deployments based on Essex RC1 and Ubuntu Beta1.

We are planning many improvements to our OpenStack Essex and Crowbar Framework; however, most deployments can proceed without these enhancements.  This also enables participants in the 5/31 OpenStack Essex Deploy Day.

By releasing a core stable Essex reference deployment, we are accelerating field deployments and enabling the OpenStack ecosystem. In terms of previous posts, we are eliminating release interlocks to enable more downstream development. Ultimately, we hope that we are also creating a baseline OpenStack deployment.

We are also reducing the pressure to rush more disruptive Crowbar changes (like enabling high availability, adding multiple operating systems, moving to Rails 3, fewer crowbarisms in cookbooks and streamlining networking). With this foundational Essex release behind us (we call it an MVP), we can work on more depth and breadth of capability in OpenStack.

One small challenge: some of the changes that we’d expected to drop have been postponed slightly, specifically markdown-based documentation (/docs) and some new UI pages (/network/nodes, /nodes/families). All are already in the product but not wired into the default UI (basically, a split test).

On the bright side, we did manage to expose 10g networking awareness for barclamps; however, we have not yet refactored the barclamps to leverage the change.

Asia-Pac Session for OpenStack Essex Global Deploy day

I did not want us to neglect Asia-Pac for the upcoming OpenStack Deploy day, so I was delighted when Mike Pittaro offered to help host the online content for the last session. Mike is an OpenStack contributor who recently joined my team at Dell.

This addresses the concern that our first Essex hack day ran during Americas daytime only, so it was difficult for time zones east of GMT to participate.

We are working with Dell teams in Asia-Pac to setup more information to support Japan, China, Korea and Australia.

This picture, taken by Dan Choquette (my team too!), is from Tokyo DevOpsDays.