OpenCrowbar Design Principles: Reintroduction [Series 1 of 6]

While “ready state” as a concept has been getting a lot of positive response, I forget that much of the innovation and learning behind that concept never surfaced as posts here.  For the Anvil (2.0) release, the OpenCrowbar team cataloged our principles in the docs.  Now it’s time to repost the team’s work as a short series over the next three days.

In architecting the Crowbar operational model, we’ve consistently adapted traditional computer science concepts like late binding, simulated annealing, emergent behavior, attribute injection and functional programming to create a repeatable platform for sharing open operations practice (post 2).

Functional DevOps aka “FuncOps”

Ok, maybe that’s not going to be the ’70s-era hype bubble name, but… the operational model behind Crowbar is entering its third generation, and it’s important to understand that the state isolation and integration principles behind that model are closer to functional than declarative programming.

Parliament is Crowbar’s official FuncOps soundtrack

The model is critical because it shapes how Crowbar approaches infrastructure at a fundamental level; seeing how we approach operations makes it easier to interact with the platform. Crowbar’s goal is to create emergent services.
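
To make the “functional” claim concrete, here is a minimal Ruby sketch (my illustration, not Crowbar code) of the FuncOps idea: an operations step is a pure function from desired state and observed state to a list of actions, rather than a script that mutates infrastructure in place.

    # A pure step: the same inputs always yield the same actions,
    # which makes the step repeatable and safe to re-run.
    configure_network = lambda do |desired, observed|
      desired[:interfaces]
        .reject { |nic| observed[:interfaces].include?(nic) }
        .map    { |nic| { action: :add_interface, interface: nic } }
    end

    actions = configure_network.call(
      { interfaces: %w[eth0 eth1] },
      { interfaces: %w[eth0] }
    )
    puts actions.inspect  # => [{:action=>:add_interface, :interface=>"eth1"}]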

We’ll explore those topics in this series to explain Crowbar’s core architectural principles.  Before we get into that, I’d like to review some history.

The Crowbar Objective

Crowbar delivers repeatable best-practice deployments. Crowbar is not just about installation: we define success as a sustainable operations model where we continuously improve how people use their infrastructure. The complexity and pace of technology change are accelerating, so we must have an approach that embraces continuous delivery.

Crowbar’s objective is to help operators become more efficient, stable and resilient over time.

Background

When Greg Althaus (github @GAlthaus) and Rob “zehicle” Hirschfeld (github @CloudEdge) started the project, we had some very specific targets in mind. We’d been working towards using organic emergent swarming (think ants) to model continuous application deployment. We had also been struggling with the most routine foundational tasks (BIOS, RAID, O/S install, networking, ops infrastructure) when bringing up early scale cloud & data applications. Another key contributor, Victor Lowther (github @VictorLowther), has critical experience in Linux operations, networking and dependency resolution that led to significant contributions to the Annealer and networking model. These backgrounds heavily influenced how we approached Crowbar.

First, we started with the best-of-field DevOps infrastructure: Opscode Chef. There was already a remarkable open source community around the tool and an enthusiastic following among cloud and scale operators. Using Chef to do the majority of the installation left the Crowbar team free to focus on:

Key Features

  • Heterogeneous Operating Systems – choose which operating system you want to install on the target servers.
  • CMDB Flexibility – don’t be locked into a single DevOps toolset. Attribute injection allows clean abstraction boundaries so you can use multiple tools (Chef and Puppet playing together; see the sketch after this list).
  • Ops Annealer – the orchestration at Crowbar’s heart combines the best of directed graphs with late binding and parallel execution. We believe annealing is the key ingredient for repeatable and shared OpenOps code upgrades.
  • Upstream Friendly – infrastructure as code works best as a community practice, so Crowbar uses upstream code without injecting the “crowbarisms” that were previously required. You can share your learning with the broader DevOps community even if they don’t use Crowbar.
  • Node Discovery (or not) – Crowbar maintains the same proven discovery-image-based approach that we used before, but we’ve streamlined and expanded it. You can use Crowbar’s API outside of the PXE discovery system to accommodate Docker containers, existing systems and VMs.
  • Hardware Configuration – Crowbar maintains the same optional, hardware-neutral approach to RAID and BIOS configuration. Configuring hardware with repeatability is difficult and requires much iterative testing. While our approach is open and generic, the team at Dell works hard to validate on a specific set of gear: it’s impossible to make statements beyond that test matrix.
  • Network Abstraction – Crowbar dramatically extended our DevOps network abstraction. We’ve learned that networking is the key to success for deployment and upgrade, so we’ve made Crowbar networking flexible and concise. Crowbar networking works with attribute injection so that you can avoid hardwiring networking into DevOps scripts.
  • Out-of-Band Control – when the Annealer hands off work, Crowbar gives the worker implementation the flexibility to do it on the node (using SSH) or remotely (using an API). Making agents optional allows operators and developers to make the best choices for the actions they need to take.
  • Technical Debt Paydown – We’ve also updated the Crowbar infrastructure to use the latest libraries like Ruby 2, Rails 4 and Chef 11. Even more importantly, we’ve dramatically simplified the code structure, including in-repo documentation and a Docker-based developer environment that makes building a working Crowbar environment fast and repeatable.
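
To make attribute injection (referenced in the list above) concrete, here is a minimal Ruby sketch; the {{…}} template syntax and the inject helper are hypothetical illustrations, not Crowbar’s actual mechanism. The point is that tool-specific templates only name the keys they need while the orchestrator supplies the values at deploy time.

    # Deploy-time attributes live in one abstracted bag...
    attributes = {
      "network" => { "admin_ip" => "192.168.124.10", "vlan" => "100" }
    }

    # ...and are injected into whichever tool runs the work, so Chef
    # and Puppet scripts read the same values instead of hardwiring them.
    def inject(template, attrs)
      template.gsub(/\{\{([\w.]+)\}\}/) do
        attrs.dig(*Regexp.last_match(1).split(".")).to_s
      end
    end

    puts inject("node ip is {{network.admin_ip}}", attributes)
    # => node ip is 192.168.124.10
    puts inject("vlan tag is {{network.vlan}}", attributes)
    # => vlan tag is 100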

OpenCrowbar (CB2) vs Crowbar (CB1)?

Why change to OpenCrowbar? This new generation of Crowbar is structurally different from Crowbar 1, and we’ve invested substantially in refactoring the tooling, paying down technical debt and cleaning up documentation. Since Crowbar 1 is still being actively developed, splitting the repositories allows both versions to progress with less confusion. The majority of the principles and deployment code are very similar, so I think of Crowbar as a single community.

Continue Reading > post 2

OpenCrowbar: ready to fly as OpenOps neutral platform – Dell stepping back

Two of Crowbar’s founders: me with Greg Althaus [taken Jan 2013]

With the Anvil release in the bag, Dell announced on the community list yesterday that it has stopped actively contributing to the Crowbar project.  This effectively relaunches Crowbar as a truly vendor-neutral physical infrastructure provisioning tool.

While I cannot speak for my employer, Dell, about Crowbar, I continue to serve in my role as a founder of the Crowbar Project.  I agree with Eric S. Raymond that founders of open source projects have a responsibility to sustain their community and ensure its longevity.

In the open DevOps bare metal provisioning market, there is nothing that matches the capabilities developed in either Crowbar v1 or OpenCrowbar.  The operations model and system focused approach is truly differentiated because no other open framework has been able to integrate networking, orchestration, discovery, provisioning and configuration management like Crowbar.

It is time for the community to take Crowbar beyond the leadership of a single hardware vendor, OS vendor, workload or CMDB tool.  OpenCrowbar offers operations freedom and flexibility to build upon an abstracted physical infrastructure (what I’ve called “ready state”).

We have the opportunity to make open operations a reality together.

As a Crowbar founder and its acting community leader, I welcome you to contact me directly or through the crowbar list about how to get engaged in the Crowbar community or get connected with like-minded Crowbar resources.

OpenCrowbar.Anvil released – hammering out a gold standard in open bare metal provisioning

I’m excited to announce OpenCrowbar’s first release, Anvil, for the community.  Looking back on our original design from June 2012, we’ve accomplished all of our original objectives and more.
Now that we’ve got the foundation ready, our next release (OpenCrowbar Broom) focuses on workload development on top of the stable Anvil base.  This means we’re ready to start working on OpenStack, Ceph and Hadoop.  So far, we’ve limited engagement on workloads so that those developers would not also have to keep up with core changes.  We follow emergent design, so I’m certain we’ll continue to evolve the core; however, we believe the Anvil release represents a solid foundation for workload development.
There is no more comprehensive open bare metal provisioning framework than OpenCrowbar.  The project’s focus on a complete operations model that comprehends hardware and network configuration with just enough orchestration delivers on a system vision that sets it apart from any other tool.  Yet, Crowbar also plays nicely with others by embracing, not replacing, DevOps tools like Chef and Puppet.
Now that the core is proven, we’re porting the Crowbar v1 RAID and BIOS configuration into OpenCrowbar.  By design, we’ve kept hardware support separate from the core because we’ve learned that hardware generation cycles need to be independent from the operations control infrastructure.  Decoupling them eliminates the release disruptions we experienced in Crowbar v1 and makes it much easier to incorporate hardware from a broad range of vendors.
Here are some key components of Anvil:
  • UI, CLI and API stable and functional
  • Boot and discovery process working, PLUS the ability to handle pre-population and configuration
  • Chef and Puppet capabilities, including Berkshelf v3 support to pull in community upstream DevOps scripts
  • Docker, VMs and Physical Servers
  • Crowbar’s famous “late-bound” approach to configuration and, critically, networking setup
  • IPv6 native, Ruby 2, Rails 4, preliminary scale tuning
  • Remarkably flexible and transparent orchestration (the Annealer; see the sketch after this list)
  • Multi-OS deployment capability: Ubuntu, CentOS, or different versions of the same OS
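As promised above, here is a minimal Ruby sketch of annealing-style orchestration under assumed names, not the real Annealer: roles form a directed graph, every role whose dependencies are met runs (in parallel), and the loop repeats until nothing is left. Late binding shows up because a role reads its inputs only at the moment it actually runs.

    # Toy role graph: role => list of dependencies.
    graph = {
      "dns"       => [],
      "ntp"       => [],
      "provision" => %w[dns ntp],
      "openstack" => %w[provision]
    }

    done = []
    until done.size == graph.size
      # Every role whose dependencies are all done is ready to run.
      ready = (graph.keys - done).select { |role| (graph[role] - done).empty? }
      raise "dependency cycle" if ready.empty?
      # In the real system each role would run against many nodes at once.
      ready.map { |role| Thread.new { puts "running #{role}" } }.each(&:join)
      done.concat(ready)
    end
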
Getting the workloads ported is still a tremendous amount of work, but the rewards are equally large.  With OpenCrowbar, the community has a new way to collaborate on and integrate this work.  It’s important to understand that while our goal is to start a quarterly release cycle for OpenCrowbar, the workload release cycles (including hardware) are NOT tied to OpenCrowbar.  The workloads choose which OpenCrowbar release they target.  From Crowbar v1, we learned that Crowbar needed to be independent of the workload releases, so we want OpenCrowbar to focus on maintaining a strong ops platform.
This release marks four years of hard-earned Crowbar v1 deployment experience and two years of v2 design, redesign and implementation.  I’ve talked with DevOps teams from all over the world and listened to their pains and needs.  We have a long way to go before we’re deploying 1000-node OpenStack and Hadoop clusters, but OpenCrowbar Anvil significantly moves the needle in that direction.
Thanks to the Crowbar community (Dell and SUSE especially) for nurturing the project, and congratulations to the OpenCrowbar team for getting us to this amazing place.

Rocking Docker – OpenCrowbar builds solid foundation & life-cycle [VIDEOS]

Docker has been gathering a substantial amount of interest as an additional way to solve application portability and dependency hell.  We’ve been enthusiastic participants in this fledgling community, both through Docker in OpenStack and through my work on DefCore’s Tempest in a Container (TCUP).

In OpenCrowbar, we’ve embedded Docker much deeper to solve a few difficult and critical problems: speeding up the development of multi-node deployments and building the environment for the containers.  Check out my OpenCrowbar does Docker video or the community demo!

Bootstrapping Docker into a DevOps management framework turns out to be non-trivial because integrating new nodes into a functioning operating environment is very different on Docker than on physical servers or VMs.  Containers don’t PXE boot and have more limited configuration options.

How did we do this?  Unlike other bare metal provisioning frameworks, we made sure that Crowbar did not require DHCP+PXE as the only node discovery process.  While we default to and fully support PXE with our sledgehammer discovery image, we also allow operators to pre-populate the Crowbar database using our API and make configuration adjustments before the node is discovered/created.
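
For illustration, here is a hedged Ruby sketch of pre-populating a node through the API before discovery; the route and attribute names below are assumptions, not the documented contract, so check the in-repo API docs for the real shape.

    require "net/http"
    require "json"
    require "uri"

    # Assumed admin endpoint and node attributes (hypothetical values).
    uri = URI("http://127.0.0.1:3000/api/v2/nodes")
    request = Net::HTTP::Post.new(uri, "Content-Type" => "application/json")
    request.body = {
      "name"  => "docker-01.example.com",
      "alive" => false  # assumed flag: the node exists before it is discovered
    }.to_json

    response = Net::HTTP.start(uri.hostname, uri.port) { |http| http.request(request) }
    puts response.code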

We even went a step further and enabled the Crowbar dependency graph to take alternate routes (we call this the “provides” role).  This enhancement is essential for dealing with “alike but different” infrastructure like Docker.
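
Here is a minimal sketch of the “provides” idea, with assumed names rather than Crowbar’s implementation: a requirement names a capability, and any available role that provides it can satisfy the edge, so containers and physical machines take different routes through the same graph.

    # role => capability it provides (assumed names).
    roles = {
      "pxe-discover"  => "node-discovery",
      "docker-attach" => "node-discovery"
    }

    def resolve(capability, roles, available)
      roles.find { |name, provides| provides == capability && available.include?(name) }&.first
    end

    puts resolve("node-discovery", roles, %w[pxe-discover])   # => pxe-discover (metal)
    puts resolve("node-discovery", roles, %w[docker-attach])  # => docker-attach (container)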

The result is that you can request Docker nodes in OpenCrowbar (using the API only, for now) and it will automatically create the containers and attach them to Crowbar management.  It’s important to stress that we are not adding existing containers to Crowbar by installing an agent; instead, Crowbar manages the container’s life-cycle and then the work inside the container.

Getting around the PXE cycle by using containers as part of Crowbar substantially improves the Ops development cycle time because we don’t have to wait for boot > discovery > reboot > install to create a clean environment.  Bringing fresh Docker containers into a dev system takes seconds instead.

The next step is equally powerful: Crowbar should be able to configure the Docker host environment on host nodes (not just the Admin node as we are now demonstrating).  Setting up the host can be very complex: you need to have the correct RAID, BIOS, Operating System and multi-NIC networking configuration.  All of these factors must be handled from a system perspective that matches your Ops environment.  Luckily, this is exactly Crowbar’s sweet spot!

Until we’ve pulled that together, OpenCrowbar’s ability to use upstream cookbooks and this latest Dev/Test-focused step provide remarkable out-of-the-gate advantages for everyone building multi-node DevOps tools.

Enjoy!

PS: It’s worth noting that we’ve already been using Docker to run and develop the Crowbar Admin server.  This extra step makes Crowbar even more Dockeriffic.