Digital Rebar Releases V3.2 – Stage Workflow

In v3.2, Digital Rebar continues to refine the groundbreaking provisioning workflow introduced in v3.1. Updates make the workflow easier for external systems like Terraform to consume. We’ve also improved the consistency and performance of both the content and the service.

Note: we are accelerating the release schedule for Digital Rebar with a target of 4 to 6 weeks per release. The goal is to incrementally capture new features in stable releases so there is not a lengthy delay before fixes and features are available.

Here’s a list of features for the v3.2 release.

  • Promoted stage automation to release status in open source – these were RackN content during beta
  • Plugins now include content layers – they don’t require separate content and versioning is easier
  • Feature flags on endpoint and content – allows automation to detect whether required capabilities are in place before attempting to use them (see the sketch after this list)
  • Improve exit codes from jobs – improves coordination and consistency in jobs
  • Allow runner to continue processing into new installed OS – helps with Terraform handoff and direct disk imaging
  • Add tooling for direct image deploy to sledgehammer – self-explanatory
  • Change CLI to use Server models instead of swagger generated code – improves consistency and maintainability of the CLI
  • Machine Inventory (gohai utility) – collects machine information (in Golang!) so that automation can make decisions based on configuration
  • General bug fixes and performance enhancements – this was a release theme
  • Make it easier to export content from an endpoint – user requested feature
  • Improve how tokens and secrets are handled by the server – based on audit
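
To make the feature-flag item above concrete, here’s a minimal Go sketch of how external automation (a Terraform provider, say) might check an endpoint’s advertised features before using them.  The URL, port, JSON field names and the “stage-workflow” flag are illustrative assumptions, not the documented API.

    package main

    import (
        "encoding/json"
        "fmt"
        "net/http"
    )

    // info mirrors the subset of an endpoint info document we care about.
    // The field names are assumptions for illustration.
    type info struct {
        Version  string   `json:"version"`
        Features []string `json:"features"`
    }

    func hasFeature(i info, want string) bool {
        for _, f := range i.Features {
            if f == want {
                return true
            }
        }
        return false
    }

    func main() {
        resp, err := http.Get("http://drp.example.local:8092/api/v3/info")
        if err != nil {
            panic(err)
        }
        defer resp.Body.Close()

        var i info
        if err := json.NewDecoder(resp.Body).Decode(&i); err != nil {
            panic(err)
        }

        // Gate the automation on the advertised flag before relying on it.
        if hasFeature(i, "stage-workflow") {
            fmt.Println("endpoint supports stage workflow; proceeding")
        } else {
            fmt.Println("feature missing; fall back or stop")
        }
    }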

The release of workflow and the addition of inventory mean that Digital Rebar v3 effectively replaces all key functions of v2 with a significantly smaller footprint, a minimal learning curve and improved performance. One major v2 feature, multi-node coordination, is not on any roadmap for v3 because we believe those use cases are well served by upstack integrations like Terraform and Ansible.


Manage Hardware like a BOSS – latest OpenCrowbar brings API to Physical Gear

A few weeks ago, I posted about VMs being squeezed between containers and metal.  That observation comes from our experience fielding the latest metal provisioning feature sets for OpenCrowbar, so it’s exciting to see that the team has cut the next quarterly release:  OpenCrowbar v2.2 (aka Camshaft).  Even better, you can top it off with official software support.

Camshaft coordinates activity (image: dual overhead camshaft housing by Neodarkshadow, via Wikimedia Commons)

The Camshaft release had two primary objectives: Integrations and Services.  Both build on the unique functional operations and ready state approach in Crowbar v2.

1) For Integrations, we’ve been busy leveraging our ready state API to make physical servers work like a cloud.  It gets especially interesting with the RackN burn-in/tear-down workflows added in.  Our prototype Chef Provisioning driver showed how you can use the Crowbar API to spin servers up and down.  We’re now expanding this cloud-like capability for Saltstack, Docker Machine and Pivotal BOSH.
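
As a sketch of what that cloud-like round trip looks like, the snippet below POSTs a node allocation request the way a provisioning driver might.  The host, port, path and payload fields are illustrative assumptions, not the documented Crowbar API.

    package main

    import (
        "bytes"
        "encoding/json"
        "fmt"
        "net/http"
    )

    func main() {
        // Ask the ready-state API to drive a node toward a target state,
        // roughly the request a Chef Provisioning driver would issue.
        payload, _ := json.Marshal(map[string]string{
            "deployment": "burn-in",        // hypothetical deployment name
            "bootenv":    "ubuntu-install", // hypothetical boot environment
        })
        resp, err := http.Post("http://crowbar.example.local:3000/api/v2/nodes",
            "application/json", bytes.NewReader(payload))
        if err != nil {
            panic(err)
        }
        defer resp.Body.Close()
        fmt.Println("allocation request:", resp.Status)
    }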

2) For Services, we’ve taken ops decomposition to a new level.  The “secret sauce” for Crowbar is our ability to interweave ops activity between components in the system.  For example, building a cluster requires setting up pieces on different systems in a very specific sequence.  In Camshaft, we’ve added externally registered services (using Consul) into the orchestration.  That means that Crowbar will either use existing DNS, Database, or NTP services or set up its own.  Basically, Crowbar can now FIT YOUR EXISTING OPS ENVIRONMENT without forcing dedicated Crowbar-only services like DHCP or DNS.
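
For instance, registering an already-running DNS server with the local Consul agent is a single HTTP call, after which orchestration can discover and reuse it.  This sketch uses Consul’s standard agent registration endpoint; the service name and address are made up.

    package main

    import (
        "bytes"
        "encoding/json"
        "fmt"
        "net/http"
    )

    // serviceDef matches the shape of Consul's agent service registration payload.
    type serviceDef struct {
        Name    string `json:"Name"`
        Address string `json:"Address"`
        Port    int    `json:"Port"`
    }

    func main() {
        // Tell the local Consul agent about an existing DNS server so the
        // orchestrator reuses it instead of deploying its own.
        body, _ := json.Marshal(serviceDef{Name: "dns", Address: "10.0.0.53", Port: 53})
        req, err := http.NewRequest(http.MethodPut,
            "http://127.0.0.1:8500/v1/agent/service/register", bytes.NewReader(body))
        if err != nil {
            panic(err)
        }
        resp, err := http.DefaultClient.Do(req)
        if err != nil {
            panic(err)
        }
        defer resp.Body.Close()
        fmt.Println("register:", resp.Status)
    }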

In addition to all these features, you can now purchase support for OpenCrowbar from RackN (my company).  The Enterprise version includes additional server life-cycle workflow elements and features like HA and Upgrade as they are available.

There are AMAZING features coming in the next release (“Drill”) including a message bus to broadcast events from the system, more operating systems (ESXi, Xenserver, Debian and Mirantis’ Fuel) and increased integration/flexibility with existing operational environments.  Several of these have already been added to the develop branch.

It’s easy to set up and test OpenCrowbar using containers, VMs or metal.  Want to learn more?  Join our community on Gitter, the email list or the weekly interactive community meetings (Wednesdays @ 9am PT).

OpenCrowbar.Anvil released – hammering out a gold standard in open bare metal provisioning

I’m excited to announce OpenCrowbar’s first release, Anvil, for the community.  Looking back on our original design from June 2012, we’ve accomplished all of our original objectives and more.
Now that we’ve got the foundation ready, our next release (OpenCrowbar Broom) focuses on workload development on top of the stable Anvil base.  This means that we’re ready to start working on OpenStack, Ceph and Hadoop.  So far, we’ve limited engagement on workloads to ensure that those developers would not also be trying to keep up with core changes.  We follow emergent design so I’m certain we’ll continue to evolve the core; however, we believe the Anvil release represents a solid foundation for workload development.
There is no more comprehensive open bare metal provisioning framework than OpenCrowbar.  The project’s focus on a complete operations model that comprehends hardware and network configuration with just enough orchestration delivers on a system vision that sets it apart from any other tool.  Yet, Crowbar also plays nicely with others by embracing, not replacing, DevOps tools like Chef and Puppet.
Now that the core is proven, we’re porting the Crowbar v1 RAID and BIOS configuration into OpenCrowbar.  By design, we’ve kept hardware support separate from the core because we’ve learned that hardware generation cycles need to be independent from the operations control infrastructure.  Decoupling them eliminates the release disruptions that we experienced in Crowbar v1 and makes it much easier to incorporate hardware from a broad range of vendors.
Here are some key components of Anvil:
  • UI, CLI and API stable and functional
  • Boot and discovery process working PLUS the ability to handle pre-population and configuration
  • Chef and Puppet capabilities including Berkshelf v3 support to pull in community upstream DevOps scripts
  • Docker, VMs and Physical Servers
  • Crowbar’s famous “late-bound” approach to configuration and, critically, networking setup
  • IPv6 native, Ruby 2, Rails 4, preliminary scale tuning
  • Remarkably flexible and transparent orchestration (the Annealer)
  • Multi-OS deployment capability: Ubuntu, CentOS, or different versions of the same OS
Getting the workloads ported is still a tremendous amount of work, but the rewards are tremendous.  With OpenCrowbar, the community has a new way to collaborate on and integrate this work.  It’s important to understand that while our goal is to start a quarterly release cycle for OpenCrowbar, the workload release cycles (including hardware) are NOT tied to OpenCrowbar.  The workloads choose which OpenCrowbar release they target.  From Crowbar v1, we learned that Crowbar needed to be independent of the workload releases, so we want OpenCrowbar to focus on maintaining a strong ops platform.
This release marks four years of hard-earned Crowbar v1 deployment experience and two years of v2 design, redesign and implementation.  I’ve talked with DevOps teams from all over the world and listened to their pains and needs.  We have a long way to go before we’re deploying 1000 node OpenStack and Hadoop clusters, but OpenCrowbar Anvil significantly moves the needle in that direction.
Thanks to the Crowbar community (Dell and SUSE especially) for nurturing the project, and congratulations to the OpenCrowbar team for getting us to this amazing place.

 

Looking to Leverage OpenStack Havana? Crowbar delivers 3xL!

The Crowbar community has a tradition of “day zero ops” community support for the latest OpenStack release at the summit using our pull-from-source capability.  This release we’ve really gone the extra mile by doing it on THREE Linux distros (Ubuntu, RHEL & SLES) in parallel, with a significant number of projects and new capabilities included.

I’m especially excited about Crowbar’s implementation of Havana Docker support, which required advanced configuration with Nova and Glance.  The community also added Heat and Ceilometer in the last release cycle, plus High Availability (“Titanium”) deployment work is in active development.  Did I mention that Crowbar is rocking OpenStack deployments?  No, because it’s redundant to mention that.  We’ll upload ISOs of this work for easy access later in the week.

While my team at Dell remains a significant contributor to this work, I’m proud to point out SUSE’s Cloud leadership and contributions too (including the new Ceph barclamp & integration).  Crowbar has become a true multi-party framework!

 

Want to learn more?  If you’re in Hong Kong, we are hosting a Crowbar Developer Community Meetup on Monday, November 4, 2013, 9:00 AM to 12:00 PM (HKT) in the SkyCity Marriott SkyZone Meeting Room.  Dell, dotCloud/Docker, SUSE and others will lead a lively technical session to review and discuss the latest updates, advantages and future plans for the Crowbar Operations Platform.  You can expect to see some live code demos and participate in a review of the results of a recent Crowbar 2 hackathon.  Confirm your seat here – space is limited!  (I expect that we’ll also stream this event using Google Hangout; watch Twitter #Crowbar for the feed.)

My team at Dell has a significant presence at the OpenStack Summit in Hong Kong (details about activities including sponsored parties).  Be sure to seek out my fellow OpenStack Board Member Joseph George, Dell OpenStack Product Manager Kamesh Pemmaraju and Enstratius/Dell Multi-Cloud Manager Founder George Reese.

Note: The work referenced in this post is about Crowbar v1.  We’ve also reached critical milestones with Crowbar v2 and will begin implementing Havana on that platform shortly.

OpenStack Havana provides foundation for XXaaS you need

It’s been a long time, and a lot of summits, since I posted about how OpenStack was ready for workloads (back in Cactus!).  We’ve seen remarkable growth of both the platform technology and the community surrounding it.  So much growth that now we’re struggling to define “what is core” for the project, and I’m proud to be on the Foundation Board helping to lead that charge.

So what’s exciting in Havana?

There’s a lot I am excited about in the latest OpenStack release.

Complete Split of Compute / Storage / Network services

In the beginning, OpenStack IaaS was one service (Nova).  We’ve been breaking that monolith into distinct concerns (Compute, Network, Storage) for the last several releases, and I think Havana is the first release where all three of the services are robust enough to take production workloads.

This is a major milestone for OpenStack because knowledge that the APIs were changing inhibited adoption.

ENABLING TECH INTEGRATION: Docker & Ceph

We’ve been hanging out with the Ceph and Docker teams, so you can expect to see some interesting integrations.  These two are proof of the fallacy that only OpenStack projects are critical to OpenStack, because neither of these technologies is moving under the official OpenStack umbrella.  I am looking forward to seeing both have dramatic impacts on how clouds are deployed.

Docker promises to make Linux Containers (LXC) more portable and easier to use.  This container-based approach provides near bare-metal performance without compromising VM-style portability.  More importantly, you can oversubscribe LXC much more than VMs.  This allows you to dramatically improve system utilization and unlocks some other interesting quality-of-service tricks.
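
To make the oversubscription point concrete, here’s a Go sketch that creates a memory-capped container through the Docker Engine’s local API; the image, command and 256 MB cap are arbitrary choices for illustration.

    package main

    import (
        "bytes"
        "context"
        "fmt"
        "net"
        "net/http"
    )

    func main() {
        // Talk to the local Docker daemon over its unix socket.
        client := &http.Client{Transport: &http.Transport{
            DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
                return net.Dial("unix", "/var/run/docker.sock")
            },
        }}

        // Create a container hard-capped at 256 MB of RAM; firm per-container
        // limits like this are what make dense packing practical.
        body := []byte(`{"Image": "ubuntu",
                         "Cmd": ["sleep", "600"],
                         "HostConfig": {"Memory": 268435456}}`)
        resp, err := client.Post("http://docker/containers/create",
            "application/json", bytes.NewReader(body))
        if err != nil {
            panic(err)
        }
        defer resp.Body.Close()
        fmt.Println("create:", resp.Status)
    }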

Ceph is showing signs of becoming the scale-out storage king.  Beyond its solid data dispersion algorithm, a key aspect of its mojo is that it delivers both block and object storage.  I’ve seen a lot of interest in consolidating both types of storage into a single service.  Ceph delivers on that plus performance and cost.  It’s a real winner.

Crowbar Integration & High Availability Configuration!

We’ve been making amazing strides in the Crowbar + OpenStack integration!  As usual, we’re planning our zero day community build (on the “Roxy” branch) to get people started thinking about operationalizing OpenStack.   This is going to be especially interesting because we’re introducing it first on Crowbar 1 with plans to quickly migrate to Crowbar 2 where we can leverage the attribute injection pattern that OpenStack cookbooks also use.  Ultimately, we expect those efforts to converge.  The fact that Dell is putting reference implementations of HA deployment best practices into the open community is a major win for OpenStack.

Tests, Tests, Tests & Continuous Delivery

OpenStack continues to drive higher standards for reviews, integration and testing.  I’m especially excited by the volume of activity around our review system (although backlogs in reviews are challenges).  In addition, the community continues to invest in test suites like the Tempest project.  These are direct benefits to operators beyond simple code quality.  Our team uses Tempest to baseline field deployments.  This means that OpenStack test suites help validate live deployments, not just lab configurations.

We achieve a greater level of quality when we gate code check-ins on tests that matter to real deployments.  In fact, that premise is the basis for our “what is core” process.  It also means that more operators can choose to deploy OpenStack continuously from trunk (which I consider to be a best practice for ops at scale).

Where did we fall short?

With growth comes challenges; Havana is the most complex release yet.  The number of projects that are part of the OpenStack integrated release family continues to expand.  While these new projects show the powerful innovation engine at work in OpenStack, they also make the project larger and more difficult to comprehend (especially for n00bs).  We continue to invest in Crowbar as a way to serve the community by making OpenStack more accessible and providing open best practices.

We are still struggling to resolve questions about interoperability (defining core should help) and portability.  We spent a lot of time at the last two summits on interoperability, but I don’t feel like we are much closer than before.  Hopefully, progress on Core will break the log jam.

Looking ahead to Icehouse?

I and many leaders from Dell will be at the Icehouse Summit in Hong Kong listening and learning.

The top of my list is the family of XXaaS services (Database aaS, Load Balancer aaS, Firewall aaS, etc.) that have appeared.  I’m a firm believer that clouds are more than compute + network + storage.  With a stable core, OpenStack is ready to expand into essential platform services.

If you are at the summit, please join Dell (my employer) and Intel for the OpenStack Summit Welcome Reception (RSVP!), a kickoff networking and social event on Tuesday November 5, 2013 from 6:30 – 8:30pm at the SkyBistro in the SkyCity Marriott.  My teammate, Kamesh Pemmaraju, has a complete list of all the Dell panels and events.

Crowbar cuts OpenStack Grizzly (“pebbles”) branch & seeks community testing

The Crowbar team (I work for Dell) continues to drive towards “zero day” deployment readiness. Our Hadoop deployments are tracking Dell | Cloudera Hadoop-powered releases within a month, and our OpenStack releases harden within three months.

During the OpenStack summit, we cut our Grizzly branch (aka “pebbles”) and switched over to the release packages. Just a reminder: we basically skipped Folsom. While we’re still tuning out issues on OpenStack Networking (OVS+GRE) setup, we’re also looking for the community to start testing and tuning the Chef deployment recipes.

We’re just sprints from release; consequently, it’s time for the Crowbar/OpenStack community to come and play! You can learn Grizzly and help tune the open source Ops scripts.

While the Crowbar team has been generating a lot of noise around our Crowbar 2.0 work, we have not neglected progress on OpenStack Grizzly.  We’ve been building Grizzly deploys on the 1.x code base using pull-from-source to ensure that we’d be ready for the release. For continuity, these same cookbooks will be the foundation of our CB2 deployment.

Features of Crowbar’s OpenStack Grizzly Deployments

  • We’ve had Nova Compute, Glance Image, Keystone Identity, Horizon Dashboard, Swift Object and Tempest for a long time. Those, of course, have been updated to Grizzly.
  • Added Block Storage
    • importable Ceph Barclamp & OpenStack Block Plug-in
    • EqualLogic OpenStack Block Plug-in
  • Added Quantum OpenStack Network Barclamp
    • Uses OVS + GRE for deployment
  • 10 GbE networking configuration
  • RabbitMQ as its own barclamp
  • Swift Object Barclamps made a lot of progress in Folsom that translates to Grizzly
    • Apache Web Service
    • Rack awareness
    • HA configuration
    • Distribution Report
  • “Under the covers” improvements for Crowbar 1.x
    • Substantial improvements in how we configure host networking
    • Numerous bug fixes and tweaks
  • Pull from Source via the Git barclamp
    • Grizzly branch was switched to use Ubuntu & SUSE packages

We’ve made substantial progress, but there are still gaps. We do not have upgrade paths from Essex or Folsom. While we’ve been adding fault-tolerance features, full automatic HA deployments are not included.

Please build your own Crowbar ISO or check our new SourceForge download site, then join the Crowbar List and IRC to collaborate with us on OpenStack (or Hadoop or Crowbar 2). Together, we will make this awesome.

Crowbar’s early twins: Cloudera Hadoop & OpenStack Essex

I’m proud to see my team announce the twin arrival of the Dell | Cloudera Apache Hadoop (Manager v4) and Dell OpenStack-Powered Cloud (Essex) solutions.

Not only are we simultaneously releasing both of these solutions, they reflect a significant acceleration in pace of delivery.  Both solutions had beta support for their core technologies (Cloudera 4 & OpenStack Essex) when the components were released and we have dramatically reduced the lag from component RC to solution release compared to past (3.7 & Diablo) milestones.

As before, the core deployment logic of these open source based solutions was developed in the open on Crowbar’s github.  You are invited to download and try these solutions yourself.   For Dell solutions, we include validated reference architectures, hardware configuration extensions for Crowbar, services and support.

The latest versions of Hadoop and OpenStack represent great strides for both solutions.  It’s great to have made them more deployable and faster to evaluate and manage.

Very Costly Accounting and why I value ship-ready code

Or be careful what you measure because you’ll get it.

One of Agile’s most effective, least understood and seldom realized disciplines is that teams should “maintain ship-readiness” of their product at all times.  Explaining the real value of this discipline in simple terms eluded me for years until I was talking to an accountant on a ski lift.

Before I talk about snow and mountain air inspired insights, let me explain what ship-ready means.  It means that you should always be ready to deliver the output from your last iteration to your most valuable customer.   For example, when my company, BEware, finished our last iteration we could have delivered it to our top customer, Fin & Key, without losing our much needed beauty sleep.   Because we maintain ship-readiness, we worked to ensure that they could upgrade, have complete documentation, and high quality without spending extra cycles on release hardening.

The version that we’d ship to Fin & Key would probably not have any new features enabled (see tippy towers & Eric Ries on split testing), but it would have fixed defects and the incremental code for those new features baked in.  While we may decide to limit shipments to fixed times for marketing reasons, that must not keep us from always being ready to ship.
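
In code, “baked in but not enabled” is just a feature toggle.  Here’s a minimal Go sketch of the pattern; the flag name is invented for illustration.

    package main

    import (
        "fmt"
        "os"
    )

    // enabled reports whether a feature flag is on, defaulting to off so a
    // build stays shippable even while the new code path is incomplete.
    func enabled(flag string) bool {
        return os.Getenv(flag) == "true"
    }

    func main() {
        if enabled("FEATURE_NEW_REPORTS") {
            fmt.Println("new (dark-launched) reporting path")
        } else {
            fmt.Println("stable reporting path") // what we'd ship to Fin & Key today
        }
    }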

Meanwhile, back at 8,200 feet, my accountant friend was enrolled in Cost Accounting 101.  To fulfill my mission as an Agile Subversive, I suggested reading “The Goal” by Eli Goldratt, which comes out very strongly against the evils of cost accounting.  Goldratt’s logic is simple – if you want people to sub-optimize and ignore overall system productivity, then assign costs to each sub-component of your system.  The result in manufacturing is that people will always try to keep the most expensive machine 100% utilized even if that causes lots of problems elsewhere and drives up costs all over the factory.

Cost Accounting’s process of measuring on a per-cost basis (instead of a system basis) causes everyone to minimize the cost in their own area rather than collaborate to make the system more efficient.  I’ve worked in many environments where management tried to optimize expensive developer time by offloading documentation, quality, and support to different teams.  The result was a much less effective system where defects were fixed late (if ever), test automation was never built, documentation was incomplete, and customer field issues lingered like the smell of stale malt beverage in a frat house.

No one wanted these behaviors, yet they were endemic because the company optimized developer time instead of working product.

Agile maintains ship readiness because it becomes Engineering’s primary measurement.  Making sellable product the top priority focuses the team on systems and collaboration.  It may not be as “efficient” to have a highly paid developer running tests; however, it does real economic harm if the developer continues to write untested code ahead of your ability to verify it.  Even more significantly, a developer who plays a larger part in test, documentation, and support is much more invested in fixing problems early.

If your company wants to ship product, then find a metric that gets you to that goal.  For Agile teams, that metric is the percent of iterations delivering ship-ready product.  My condolences if your team’s top values are milestones completed, bugs fixed, or hours worked.