jump to navigation

Open Source Cloud Bootstrapping Revisised March 27, 2012

Posted by Rob H in Crowbar, Dell, DevOps, Greg Althaus, Open source, OpenStack, OpenStack Design Summit.
add a comment

At the OpenStack last design conference, Greg Althaus and I presented about updates (presentation here) we were making to a Nov 2010 cloud architecture white paper.

The revised “Bootstrapping Open Source Clouds” white paper has been out for a few months so I thought it was past time to throw out a link.

I’m really pleased about this update because it reflects real world experience my team has working with customers and partners on OpenStack (and Hadoop) deployments.

Executive Summary
Bringing a cloud infrastructure online can be a daunting bootstrapping challenge. Before
hanging out a shingle as a private or public cloud service provider, you must select a platform,
acquire hardware, configure your network, set up operations services, and integrate it to work
well together. That is a lot of moving parts before you have even installed a sellable application.
This white paper walks you through the decision process to get started with an open source
cloud infrastructure based on OpenStack™ and Dell™ PowerEdge™ C servers. At the end, you’ll
be ready to design your own trial system that will serve as the foundation of your hyperscale
cloud.
2011 Revision Notes
In the year since the the original publication of this white paper, we worked with many
customers building OpenStack clouds. These clouds range in size from small six-node lab
systems to larger production deployments. Based on these experiences, we updated this white
paper to reflect lessons learned.

Four alternatives to Process Interlock March 16, 2012

Posted by Rob H in Agile, Crowbar, Lean, Open source, Process Interlock, Quality, RackSpace.
Tags: , , ,
add a comment

Note: This is the third and final part of 3 part series about the “process interlock dilemma.”

In post 1, I’ve spelled out how evil Process Interlock causes well intentioned managers to add schedule risk and opportunity cost even as they appear to be doing the right thing. In post 2, I offered some alternative outcomes when process interlock is avoided. In this post, I attempt to provide alternatives to the allure of process interlock. We must have substitute interlocks types to replace our de facto standard because there are strong behavioral and traditional reasons to keep broken processes. In other words, process Interlock feels good because it gives you the illusion that your solution is needed and vital to other projects.

If your product is vital to another team then they should be able to leverage what you have, not what you’re planning to have.

We should focus on delivered code instead of future promises. I am not saying that roadmaps and projections are bad – I think they are essential. I am saying that roadmaps should be viewed as potential not as promises.

  1. No future commits (No interlock)

    The simplest way to operate without any process interlock is to never depend on other groups for future deliveries. This approach is best for projects that need to move quickly and have no tolerance for schedule risk. This means that your project is constrained to use the “as delivered” work product from all external groups. Depending on needs, you may further refine this as only rely on stable and released work.

    For example, OpenStack Cactus relied on features that were available in the interim 10.10 Ubuntu version. This allowed the project to advance faster, but also limited support because the OS this version was not a long term support (LTS) release.

  2. Smaller delivery steps (MVP interlock)

    Sometimes a new project really needs emerging capabilities from another project. In those cases, the best strategy is to identify a minimum viable feature set (or “product”) that needs to be delivered from the other project. The MVP needs to be a true minimum feature set – one that’s just enough to prove that the integration will work. Once the MVP has been proven, a much clearer understanding of the requirements will help determine the required amount of interlock. My objective with an MVP interlock is to find the true requirements because IMHO many integrations are significantly over specified.

    For example, the OpenStack Quantum project (really, any incubated OpenStack projects) focuses on delivering the core functionality first so that the ecosystem and other projects can start using it as soon as possible.

  3. Collaborative development (Shared interlock)

    A collaborative interlock is very productive when the need for integration is truly deep and complex. In this scenario, the teams share membership or code bases so that the needs of each team is represented in real time. This type of transparency exposes real requirements and schedule risk very quickly. It also allows dependent teams to contribute resources that accelerate delivery.

    For example, our Crowbar OpenStack team used this type of interlock with the Rackspace OpenStack team to ensure that we could get Diablo code delivered and deployed as fast as possible.

  4. Collaborative requirements (Fractal interlock)

    If you can’t collaborate or negotiate an MVP then you’re forced into working at the requirements level instead of development collaboration. You can think of this as a sprint-roadmap fast follow strategy because the interlocked teams are mutually evolving design requirements.

    I call this approach Fractal because you start at big concepts (road maps) and drill down to more and more detail (sprints) as the monitored project progresses. In this model, you interlock on a general capability initially and then work to refine the delivery as you learn more. The goal is to avoid starting delays or injecting false requirements that slow delivery.

    For example, if you had a product that required power from hamsters running in wheels then you’d start saying that you needed a small fast running animal. Over the next few sprints, you’d likely refine that down to four legged mammals and then to short tailed high energy rodents. Issues like nocturnal or bites operators could be addressed by the Hamster team or by the Wheel team as the issues arose. It could turn out that the right target (a red bull sipping gecko) surfaces during short tail rodent design review. My point is that you can avoid interlocks by allowing scope to evolve.

Breaking Process Interlocks delivers significant ROI

I have been trying to untangle both the cause and solution of process interlock for a long time. My team at Dell has an interlock-averse culture and it accelerates our work delivery. I write about this topic because I have real world experience that eliminating process interlocks increases

  1. team velocity
  2. collaboration
  3. quality
  4. return on investment

These are significant values that justify adoption of these non-interlock approachs; however, I have a more selfish motivation.

We want to work with other teams that are interlock-averse because the impacts multiply. Our team is slowed when others attempt to process interlock and accelerated when we are approached in the ways I list above.

I suspect that this topic deserves a book rather than a three part blog series and, perhaps, I will ultimately create one. Until then, I welcome your comments, suggestions and war stories.

Cloudera Manager Barclamp posted! (part of updated Dell | Cloudera Apache Hadoop Solution) March 16, 2012

Posted by Rob H in Big Data Analytics, Cloudera, Crowbar, Dell, Hadoop, Open source.
Tags: , , , , ,
add a comment

My team at Dell has been driving to transparency and openness around Crowbar plus our OpenStack and Hadoop powered solutions.  Specifically, our work for our coming release is maintained in the open on the Dell CloudEdge Github site.  You can see (and participate in!) our development and validation work in advance of our official release.

I’m pleased to note that our Cloudera Manager barclamp has been posted to Github!

This barclamp supersedes  the Hadoop barclamp in the next release of the Dell | Cloudera Apache Hadoop solution.  You can built it in Crowbar using the “cloudera-os-build”  branch for Crowbar.  Do not fear!  The Hadoop barclamp still exists (hadoop-os-build branch).

Both the new and original Hadoop barclamp use the Cloudera Hadoop distribution (aka CDH); however, the new barclamp is able to leverage Cloudera‘s latest management capabilities.  For the Dell solution, Cloudera Manager has always been part of the offering.  The primary difference is that we are improving the level of integration.  I promise to post more about the features of the solution as we get closer to release.

OpenStack Essex Deploy Day: First Steps to Production March 12, 2012

Posted by Rob H in Andi Abes, Canonical, Crowbar, Dell, Greg Althaus, Matt Ray, Meetup, OpenStack, Opscode, RackSpace, Victor Lowther.
Tags: , , , , ,
2 comments

One March 8th, 70 people from around the world gathered on the Crowbar IM channel to begin building a production grade OpenStack Essex deployment. The event was coordinated as meet-ups by the Dell OpenStack/Crowbar team (my team) in two physical locations: the Nokia offices in Boston and the TechRanch in Austin.

My objective was to enable the community to begin collaboration on Essex Deployment. At that goal, we succeeded beyond my expectations.

IMHO, the top challenge for OpenStack Essex is to build a community of deploying advocates. We have a strong and dynamic development community adding features to the project. Now it is time for us to build a comparable community of deployers. By providing a repeatable, shared and open foundation for OpenStack deployments, we create a baseline that allows collaboration and co-development. Not only must we make deployments easy and predictable, we must also ensure they are scalable and production ready.

Having solid open production deployment infrastructure drives OpenStack adoption.

Our goal on the 8th was not to deliver finished deployments; it was to the start of Essex deployment community collaboration. To ensure that we could focus on getting to an Essex baseline, our team invested substantial time before the event to make sure that participants had a working Essex reference deployment.

By the nature of my team’s event leadership and our approach to OpenStack, the event was decidedly Crowbar focused. I feel like this is an acceptable compromise because Crowbar is open and provides a repeatable foundation. If everyone has the same foundation then we can focus on the truly critical challenges of ensuring consistent OpenStack deployments. Even using Crowbar, we waste a lot of time trying to figure out the differences between configurations. Lack of baseline consistency seriously impedes collaboration.

The fastest way to collaborate on OpenStack deployment is to have a reference deployment as a foundation.

Success By The Numbers

This was a truly international community collaborative event. Here are some of the companies that participated:

Dell (sponsor), Nokia (sponsor), Rackspace, Opscode, Canonical, Fedora, Mirantis, Morphlabs, Nicira, Enstratus, Deutsche Telekom Innovation Laboratories, Purdue University, Orbital Software Solutions, XepCloud and others.

PLEASE COMMENT here if I missed your company and I will add it to the list.

On the day of the event, we collected the following statistics:

  • 70 people on Skype IM channel (it’s not too late to join by pinging DellCrowbar with “Essex barclamps”).
  • 14+ companies
  • 2 physical sites with 10-15 people at each
  • 4 fold increase in traffic on the Crowbar Github to 813 hits.
  • 66 downloads of the Deploy day ISO
  • 8 videos capture from deploy day sessions.
  • World-wide participation

For over 70 people to spend a day together at this early stage in deployment is a truly impressive indication of the excitement that is building around OpenStack.

Improvements for Next Deploy Day

This was a first time that Andi Abes (Boston event lead), Rob Hirschfeld (Austin event lead) or Jean-Marie Martini (Dell event lead) had ever coordinated an event like this. We owe much of the success to efforts by Greg Althaus, Victor Lowther and the Canonical 12.04/Essex team before the event. Also, having physical sites was very helpful.

We are planning to do another event, so we are carefully tracking ways to improve.

Here are some issues we are tracking.

  • Issues with setting up a screen and voice share that could handle 70 people.
  • Lack of test & documentation on Crowbar meant too much time focused on Crowbar
  • Connectivity issues distributed voice
  • Should have started with DevStack as a baseline
  • more welcome in the comments!

Thank you!

I want to thank everyone who participated in making this event a huge success!

OpenStack Essex Deploy Day 3/8 – Get involved and install with us March 5, 2012

Posted by Rob H in Andi Abes, Crowbar, Dell, Greg Althaus, Matt Ray, Meetup, OpenStack, Suse, Victor Lowther.
Tags: , , , , , ,
1 comment so far

My team at Dell has been avidly tracking the upsdowns, and breakthroughs of the OpenStack Essex release.  While we still have a few milestones before the release is cut, we felt like the E4 release was a good time to begin the work on Essex deployment.  Of course, the final deployment scripts will need substantial baking time after the final release on April 5th; however, getting deployments working will help influence the quality efforts and expand the base of possible testers.

To rally behind Essex Deployments, we are hosting a public work day on Thursday March 8th.

For this work day, we’ll be hosting all-day community events online and physically in Austin and Boston.  We are getting commitments from other Dell teams, partners and customers around the world to collaborate.  The day is promising to deliver some real Essex excitement.

The purpose of these events is to deliver the core of a working OpenStack Essex deployment.  While my team is primarily focused on deploys via Crowbar/Chef, we are encouraging anyone interested in laying down OpenStack Essex to participate.  We will be actively engaged on the OpenStack IRC and mailing lists too.

We have experts in OpenStack, Chef, Crowbar and Operating Systems (Canonical, SUSE, and RHEL) engaged in these activities.

This is a great time to start learning about OpenStack (or Crowbar) with hands-on work.  We are investing substantial upfront time (checkout out the Crowbar wiki for details) to ensure that there is a working base OpenStack Essex deploy on Ubuntu 12.04 beta.  This deploy includes the Crowbar 1.3 beta with some new features specifically designed to make testing faster and easier than ever before.

In the next few days, I’ll cut a 12.04 ISO and OpenStack Barclamp TARs as the basis for the deploy day event.  I’ll also be creating videos that help you quickly get a test lab up and running.  Visit the wiki or meetup sites to register and stay tuned for details!

Austin OpenStack Meetup: Keystone & Knife (2/20 notes via Greg Althaus) February 24, 2012

Posted by Rob H in Amazon, Crowbar, Dell, Greg Althaus, Matt Ray, Meetup, OpenStack, Opscode, RackSpace, Video.
Tags: , , , , , , ,
add a comment

I could not make it to the recent Austin OpenStack Meetup, but Greg Althaus generously let me post his notes from the event.

Background

Matt Ray talks about Chef

Matt Ray from Opscode presented some of the work with Chef and OpenStack. He talked about the three main chef repos floating around. He called out Anso’s original cookbook set that is the basis for the Crowbar cookbooks (his second set), and his final set is the emerging set of cookbooks in OpenStack proper. The third one is interesting and what he plans to continue working on to make into his public openstack cookbooks. These are an amalgamation of smokestack, RCB, Anso improvements, and his (Crowbar’s).

He then demoed his knife plugin (slideshare) to build openstack virtual servers using the Openstack API. This is nice and works against TryStack.org (previously “Free Cloud”) and RCB’s demo cloud. All of that is on his github repo with instructions how to build and use. Matt and I talked about trying to get that into our Crowbar distro.

There were some questions about flow and choice of OpenStack API versus Amazon EC2 API because there was already an EC2 knife set of plugins.

Ziad Sawalha talks about Keystone

Ziad Sawalha is the PLT (Project Technical Lead) for Keystone. He works for Rackspace out of San Antonio. He drove up for the meeting.

He split his talk into two pieces, Incubation Process and Keystone Overview. He asked who was interested in what and focused his talk more towards overview than incubation.

Some key take-aways:

  • Keystone comes from Rackspace’s strong, flexible, and scalable API. It started as a known quantity from his perspective.
  • Community trusted nothing his team produced from an API perspective
  • Community is python or nothing
    • His team was ignored until they had a python prototype implementing the API
    • At this point, comments on API came in.
  • Churn in API caused problems with implementation and expectations around the close of Diablo.
    • Because comments were late, changes occurred.
    • Official implementation lagged and stalled into arriving.
  • API has been stable since Diablo final, but code is changing. that is good and shows strength of API.
  • Side note from Greg, Keystone represents to me the power of API over Code. You can have innovation around the implementation as long all the implementations have a fair ground work to plan under which is an API specification. The replacement of Keystone with the Keystone Light code base is an example of this. The only reason this is possible is that the API was sound and documented.  (Rob’s post on this)

Ziad spent the rest of his time talking about the work flow of Keystone and the API points. He covered the API points.

  • Client to Keystone, Keystone to Client for initial auth token
  • Client to Middleware API for the services to have a front.
  • Middleware to Keystone to verify and establish identity.
  • Middleware to Service to pass identity

Not many details other then flow and flexibility. He stressed the API design separated protocol from actions and data at all the layers. This allows for future variations and innovations while maintaining the APIs.

Ziad talked about the state of Essex.

  • Planned
    • RBAC (aka Role Based Access Control)
    • Stability
    • Many backends
  • Actual
    • Code replacement Keystone Light
    • Stability
    • LDAP backend
    • SQL backend

Folsum work:

  • RBAC
  • Stability
  • AD backend
  • Another backend
  • Federation was planned but will most likely be pushed to G
    • Federation is the ability for multiple independent Keystones to operate (bursting use case)
    • Dependent upon two other federation components (networking and billing/metering)

Crowbar+OpenStack Insights for the week: Food Fight Podcast & Boston Meetup 2/1 January 30, 2012

Posted by Rob H in Crowbar, Hadoop, Meetup, OpenStack.
Tags: , , , , , , ,
add a comment

Please don’t confuse a lack of posts with a lack of activity!  I’ve been in the center of a whirlwind of Crowbar, OpenStack and Hadoop for my team at Dell.  I’ve also working on an interesting side project with Liquid Leadership author (and would-be star ship captain) Brad Szollose.

I just don’t have time to post all of the awesomeness.  I can tell you that my team is very focused on Hadoop (RHEL 6.2/CentOS 6.2 + open Cloudera Distro) barclamps as we get some Diablo deployments done.  Also the Crowbar list has been very active about Diablo.  If you’re looking for advanced information, there is  some inside scoop on the Crowbar FoodFight podcast I did with Bryan Berry & Matt Ray.

I’ll be in BOSTON THIS WEDNESDAY 2/1 for the OpenStack Meetup there.  We’re going to be talking about Quantum and the OpenStack Foundation.  I suspect that Keystone will come up too (but that’s the subject of another post).  Of course, it’s not just your humble blogger: the whole Dell CloudEdge OpenStack/Crowbar team will be on hand!  So put on your cloud geek hat and take a trip to Harvard for the meetup!

CloudOps white paper explains “cloud is always ready, never finished” January 16, 2012

Posted by Rob H in CloudFoundry, CloudOps, Crowbar, DevOps, Greg Althaus, Hadoop, Linux, Open source, OpenStack.
2 comments

I don’t usually call out my credentials, but knowing the I have a Masters in Industrial Engineering helps (partially) explain my passion for process as being essential to successful software delivery. One of my favorite authors, Mary Poppendiek, explains undeployed code as perishable inventory that you need to get to market before it loses value. The big lessons (low inventory, high quality, system perspective) from Lean manufacturing translate directly into software and, lately, into operation as DevOps.

What we have observed from delivering our own cloud products, and working with customers on thier’s, is that the operations process for deployment is as important as the software and hardware. It is simply not acceptable for us to market clouds without a compelling model for maintaining the solution into the future. Clouds are simply moving too fast to be delivered without a continuous delivery story.

This white paper [link here!] has been available since the OpenStack conference, but not linked to the rest of our OpenStack or Crowbar content.

Early crop of Crowbar 1.3 features popping up January 11, 2012

Posted by Rob H in Crowbar, Hadoop, Video.
Tags: , , , ,
7 comments

My team at Dell is still figuring out some big items for the 1.3 release; however, somethings were just added that is worth calling out.

  1. Ubuntu 11.04 support!   Thanks to Justin Shepherd from Rackspace Cloud Builders!
  2. Alias names for nodes in the UI
  3. User managed node groups in the UI
  4. Ability to pre-populate the alias, description and group for a node (not integrated with DNS yet)
  5. Hadoop is working again – we addressed the missing Ganglia repo issue.  Thanks to Victor Lowther.
For items 2 – 4, I made a short video tour: Node Alias & Group

Also, I’ve spun new open source ISOs with the new features.  User beware!

Crowbar 1.2 released includes OpenStack Diablo Final December 21, 2011

Posted by Rob H in Crowbar, Hadoop, OpenStack.
Tags: , , ,
17 comments

With the holiday rush, I neglected to post about Monday’s Crowbar v1.2 release (ISO here)!

The core focus for this release was to support the OpenStack Diablo Final bits (which my employer, Dell, includes as part of the “Dell OpenStack Powered Cloud Solution“); however, we added a lot of other capability as we continue to iterate on Crowbar.

I’m proud of our team’s efforts on this release on both on features and quality.  I’m equally delighted about the Crowbar community engagement via the Crowbar list server.  Crowbar is not hardware or operating system specific so it’s encouraging to hear about deployments on other gear and see the community helping us port to new operating system versions.

We driving more and more content to Crowbar’s Github as we are working to improve community visibility for Crowbar.  As such, I’ve been regularly updating the Crowbar Roadmap.  I’m also trying to make videos for Crowbar training (suggestions welcome!).  Please check back for updates about upcoming plans and sprint activity.

Crowbar Added Features in v1.2:

  • Central feature was OpenStack Diablo Final barclamps (tag “openstack-os-build”)
  • Improved barclamp packaging
  • Added concepts for “meta” barclamps that are suites of other barclamps
  • Proposal queue and ordering
  • New UI states for nodes & barclamps (led spinner!)
  • Install includes self-testing
  • Service monitoring (bluepill)

Looking forward

Dell has a long list of pending Hadoop and OpenStack deployments using these bits so you can expect to see updates and patches matching our field experiences.  We are very sensitive to community input and want to make Crowbar the best way to deliver a sustainable repeatable reference deployment of OpenStack, Hadoop and other cloud technologies.

Follow

Get every new post delivered to your Inbox.

Join 980 other followers