OpenCrowbar: ready to fly as OpenOps neutral platform – Dell stepping back

Two of the Crowbar founders: me with Greg Althaus [taken Jan 2013]

With the Anvil release in the bag, Dell announced on the community list yesterday that it has stopped actively contributing to the Crowbar project.  This effectively relaunches Crowbar as a truly vendor-neutral physical infrastructure provisioning tool.

While I cannot speak for my employer, Dell, about Crowbar, I continue to serve in my role as a founder of the Crowbar project.  I agree with Eric S. Raymond that founders of open source projects have a responsibility to sustain their community and ensure its longevity.

In the open DevOps bare metal provisioning market, there is nothing that matches the capabilities developed in either Crowbar v1 or OpenCrowbar.  The operations model and system-focused approach are truly differentiated because no other open framework has been able to integrate networking, orchestration, discovery, provisioning and configuration management like Crowbar does.

It is time for the community to take Crowbar beyond the leadership of a single hardware vendor, OS vendor, workload or CMDB tool.  OpenCrowbar offers operations freedom and the flexibility to build upon an abstracted physical infrastructure (what I’ve called “ready state”).

We have the opportunity to make open operations a reality together.

As a Crowbar founder and its acting community leader, I welcome you to contact me directly or through the crowbar list about how to get engaged in the Crowbar community or get connected with like-minded Crowbar resources.

Parable of Kitten Taming

It’s time to return to the story of Barney and Bailum.  Last year, I wrote about their separate paths through the circus business: Bailum succeeding with a lean model and Barney failing with a “go big” strategy.  This parable opens with Bailum taking pity on Barney and bringing him into her thriving animal training business.

Bailum had grown her lion-taming business from the ground up.  She started from humble beginnings with untrained dogs; consequently, she’d learned about building rapport and trust with her performers.  She never considered them to be just animals.  To her, everyone in her organization (especially the animals) was a valued contributor.  She’d seen first-hand that just one bad link in the chain could cause a great performance team to turn sour.  Her acts won awards, and she was proud to have them in the spotlight while she focused on building trust and a sustainable culture.

Unfortunately, Barney did not share his sister’s experience or values.  He only saw the name that she’d built for the company and felt that he could use his position and relationship to promote himself.   Even though he knew nothing of animal training, he was eager to redirect his staff into new areas.  Reading market data, and without consulting his trainers, he decided that cute kitten acts would attract more business than the company’s successful dog acts.

Overnight, he released the dogs and acquired kittens from a local shelter.  Some of his trainers simply quit while others made an attempt to follow the new direction.  Barney was impatient for success and started watching the trainers learning to work with the frisky felines.  Progress was slow and Barney vented his frustration by yelling at the trainers and ultimately putting shock collars on the kittens.  In short order, the trainers had left and Barney was being sprayed, scratched and bitten by the cats.

When Bailum learned about her brother’s management approach she was mortified; unfortunately, he had also signed contracts promising kitten acts to their customers.   After restructuring her familial entanglements, she took a personal interest in training the kittens.  She immediately recognized that, compared to dogs, cats require independence instead of direction.  Starting from careful handling, then bringing in her lion tamers and rewarding positive results, she created a working troupe.  The final results were so effective (and the logistics so much easier) that Bailum ultimately transformed her business to focus on the cats exclusively.

Moral: you can’t force cats to bark, but with the right approach, kittens can outperform lions.

OpenStack DefCore Matrix Cheat Sheet

DefCore sets base requirements by defining 1) capabilities, 2) code and 3) must-pass tests for all OpenStack products. This definition uses community resources and involvement to drive interoperability by creating minimum standards for products labeled “OpenStack.”

In the last week, the DefCore committee released the results of 6 months of work.  We chose gathering input over cleanup and polish, so please be patient if some of the data is overwhelming.

We’ve got enough feedback to put together this capabilities matrix cheat sheet to help interpret all the colors and data on the page (the headers link to details).

[Image: capabilities matrix explained]

DefCore Capabilities Scorecard & Core Identification Matrix [REVIEW TIME!]

Attribution Note: This post was collaboratively edited by members of the DefCore committee and cross posted with DefCore co-chair Joshua McKenty of Piston Cloud.

DefCore sets base requirements by defining 1) capabilities, 2) code and 3) must-pass tests for all OpenStack products. This definition uses community resources and involvement to drive interoperability by creating minimum standards for products labeled “OpenStack.”

The OpenStack Core definition process (aka DefCore) is moving steadily along and we’re looking for feedback from the community as we move into the next phase.  Until now, we’ve been mostly working out principles, criteria and processes that we will use to answer “what is core” in OpenStack.  Now we are applying those processes and actually picking which capabilities will be used to identify Core.

TL;DR! We are now RUNNING WITH SCISSORS because we’ve reached the point where you can review early thoughts about what’s going to be considered Core (and what’s not).  We now have a tangible draft list for community review.

While you will want to jump directly to the review draft matrix (red means needs input), it is important to understand how we got here, because that’s how DefCore will resolve the inevitable conflicts.  The very nature of defining core means that we have to say “not in” to a lot of capabilities.  Since community consensus seems to favor a “small core” in principle, that means many capabilities that people consider important are not included.

The Core Capabilities Matrix attempts to find the right balance between quantitative detail and too much information.  Each row represents an “OpenStack Capability” that is reflected by one or more individual tests.  We scored each capability equally on a 100 point scale using 12 different criteria.  These criteria were selected to respect the community’s different viewpoints and needs, ranging from popularity to technical longevity and quality of documentation.

While we’ve made the process more analytical, there’s still room for judgement.  Eventually, we expect to weight some criteria more heavily than others.  We will also be adjusting the score cut-off.  Our goal is not to create a perfect evaluation tool – it should inform the board and facilitate discussion.  In practice, we’ve found this approach to bring needed objectivity to the selection process.
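
To make the mechanics concrete, here is a minimal sketch of that scoring model in Python.  The criteria names, example flags and cut-off comparison are illustrative placeholders, not the committee’s actual data:

```python
# Illustrative sketch of the DefCore scoring model: each capability is
# scored on a 100-point scale across 12 equally weighted criteria.
# Criteria names and example data are placeholders, not the real matrix.
CRITERIA = [
    "widely_deployed", "used_by_tools", "used_by_clients",
    "future_direction", "stable", "complete",
    "discoverable", "documented", "core_in_last_release",
    "foundation", "atomic", "proximity",
]

def score(capability, weights=None):
    """Return a 0-100 score; weights are equal unless overridden later."""
    weights = weights or {c: 100.0 / len(CRITERIA) for c in CRITERIA}
    return sum(weights[c] for c in CRITERIA if capability.get(c))

example = {"widely_deployed": True, "used_by_tools": True, "stable": True}
print(score(example))  # compare this against the board's score cut-off
```

Because the weights live in one dictionary, adjusting them later (as we expect to do) changes scores without touching the per-capability data.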

So, where does this take us?  The first matrix is, by design, old news.  We focused on getting a score for Havana to give us a stable and known quantity; however, much of that effort will translate forward.  Using Havana as the base, we are hoping to score Icehouse ninety days after the Juno summit and score Juno at the K Summit in Paris.

These are ambitious goals and there are challenges ahead of us.  Since every journey starts with small steps, we’ve put our feet on the path while keeping our eyes on the horizon.

Specifically, we know there are gaps in OpenStack test coverage.  Some important capabilities do not have tests and so will not be included.  Further, starting with a small core means that OpenStack will be enforcing an interoperability target that is relatively permissive and minimal.  Universally, the community has expressed that including short-term or incomplete items is undesirable.  It’s vital to remember that we are looking for evolutionary progress that accelerates our developer, user, operator and ecosystem communities.

How can you get involved?  We are looking for community feedback on the DefCore list on this 1st pass – we do not think we have the scores 100% right.  Of course, we’re happy to hear from you however you want to engage: we intentionally named the committee “defcore” to make it easier to cross-reference and search.

We will eventually use Refstack to collect voting/feedback on capabilities directly from OpenStack community members.

Open Operations [4/4 series on Operating Open Source Infrastructure]

This post is the final in a 4 part series about Success factors for Operating Open Source Infrastructure.

tl;dr Note: This is really TWO tightly related posts: 
  part 1 is OpenOps background. 
  part 2 is about OpenStack, Tempest and DefCore.

One of the substantial challenges of large-scale deployments of open source software is that it is very difficult to come up with a best practice, or a reference implementation, that can be widely explained or described by the community.

Having a best practice deployment is essential for the growth of the community because it enables multiple people to deploy the software in a repeatable, stable way. This, in turn, fosters community growth so that more people can adopt software in a consistent way. It does little good if operators have no consistent pattern for deployment, because that undermines the developers’ abilities to extend, the testers’ abilities to ensure quality, and users’ ability to repeat the success of others.

Fundamentally, the goal of an open source project, from a user’s perspective, is that they can quickly achieve and repeat the success of other people in the community.

When we look at these large-scale projects we really try to create a pattern of success that can be repeated over and over again. This ensures growth of the user base, and it also helps the developer reduce time spent troubleshooting problems.

That does not mean that every single deployment should be identical, but there is substantial value in having a limited number of success patterns. Customers can then be assured not only of quick time to value with these projects, but also of getting help without having everybody else in the community attempt to untangle how one person created a site-specific deployment. This is especially problematic if someone created an unnecessarily unique scenario; that simply creates noise and confusion in the environment. Noise is a huge cost for the community, and needs to be eliminated for an open source project to flourish.

This isn’t any different in proprietary software, but there most of these activities are hidden. A proprietary software vendor can make much stronger recommendations and installation guidance because they are the only source of truth for that project. In an open source project, there are multiple sources of truth, and there are very few people who are willing to publish their exact reference implementation or test patterns. Consequently, my team has taken a strong position on creating a repeatable reference implementation for OpenStack deployments, based on extensive testing. Our test patterns and practices are grounded in successful customer deployments and actual, physical infrastructure deployments. So, they are very pragmatic, repeatable, and sustainable.

We found that this type of testing, while expensive, is also a significant value to our customers, and something that they appreciate and have been willing to pay for.

OpenStack as an Example: Tempest for Reference Validation

The Crowbar project incorporated the OpenStack Tempest project as an essential part of every OpenStack deployment. From the earliest introduction of the Tempest suite, we have understood the value of a baselining test suite for OpenStack. We believe that running the same tests the developers use to gate code acceptance on a single node against a multi-node deployment creates significant value both for our customers and the OpenStack project as a whole.  This was part of why I embraced the suggestion of basing DefCore on tests.

While it is important to have developer tests that gate code check-ins, the ultimate goal for OpenStack is to create scale-out multi-node deployments. This is a fundamental design objective for OpenStack.

With developers and operators using the same test suite, we are able to proactively measure the success of the code in scale deployments in a way that provides quick feedback for the developers. If Tempest tests cannot pass in a multi-node environment, they are not providing significant value in ensuring that developers’ code operates against best-practice scenarios. Our objective is to continue to extend the Tempest suite of tests so that they are an accurate reflection of the use cases that are encountered in a best practice, reference deployment.
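
As a rough illustration of the pattern, here is a hedged sketch that drives a Tempest run against a deployment and treats failure as a provisioning gate.  The regex, config location and the “tempest run” CLI form are assumptions that vary by Tempest version:

```python
# Hedged sketch: run the same Tempest tests the developers use as a gate,
# but pointed at a multi-node deployment.  The paths, regex and "tempest
# run" CLI are assumptions; adjust for your Tempest version and config.
import os
import subprocess
import sys

def validate_deployment(regex="tempest.api", config_dir="/etc/tempest"):
    """Return True only if the selected Tempest tests pass."""
    env = dict(os.environ, TEMPEST_CONFIG_DIR=config_dir)
    result = subprocess.run(["tempest", "run", "--regex", regex], env=env)
    return result.returncode == 0

if __name__ == "__main__":
    # Non-zero exit stops the surrounding deployment automation.
    sys.exit(0 if validate_deployment() else 1)
```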

Along these lines, we expect that the community will continue to expand the Tempest test suite to match actual deployment scenarios reflected in scale and multi-node configurations. Having developers be responsible for passing these tests as part of their day-to-day activities ensures that development activities do not disrupt scale operations. Ultimately, making proactive gating tests ensures that we are creating scenarios in which code quality is continually increasing, as is our ability to respond and deploy the OpenStack infrastructure.

I am very excited and optimistic that expanding the Tempest suite holds the key to making OpenStack the most stable, reliable, performant cloud implementation available in the market. The fact that this test suite can be extended by the community, and contributed to by a broad range of implementations, only makes it more valuable and more likely to fully encompass all use cases necessary for reference implementations.

Networking in Cloud Environments, SDN, NFV, and why it matters [part 2 of 2]

Scott Jensen is an Engineering Director and colleague of mine from Dell with deep networking and operations experience.  He has first-hand experience deploying OpenStack and Hadoop and plays a critical role in defining Dell’s Reference Architectures in those areas.  When I saw this writeup about cloud networking (first post), I asked if it would be OK to post it here and share it with you.

GUEST POST 2 OF 2 BY SCOTT JENSEN:

So what is different about Cloud, and how does it impact the network?

In a traditional data center this was not all that difficult (relatively speaking).  You knew what was going to be running on what system (physically) and could plan your infrastructure accordingly.  The majority of the traffic moved in a North/South direction: basically, it came from outside the infrastructure (the internet, for example) to the inside, and responses flowed back out.  You knew that if you had to design a communication channel from an application server to a database server, this could be isolated from the other traffic, as they did not usually reside on the same system.

Virtualization made this more difficult.  In this model you are sharing system resources among different applications.  From the network’s point of view, there are a large number of systems available behind a couple of links.  Live Migration puts another wrinkle in the design, as you now have to deal with a specific system moving from one physical server to another.  Network virtualization helps out a lot with this: you can now move virtual ports from one physical server to another to ensure that when a virtual machine moves between physical servers, the network is still available.  In many cases you managed these virtual networks the same way you managed your physical network.  As a matter of fact, they were designed to emulate the physical as much as possible.  The virtual machines still looked a lot like the physical ones they replaced and could be treated in very much the same way from a traffic flow perspective.  The traffic still primarily followed a North/South pattern.

Cloud, however, is a different ball of wax.  Think about the characteristics of the Cattle described in part 1.  A cloud application is smaller and purpose built.  The majority of its traffic is between VMs, as different tiers which were traditionally on the same system or in the same VM are now spread across multiple VMs; therefore its traffic patterns are primarily East/West.  You cannot forget that there is still a North/South pattern, the same as in the other models, which is typically user interaction.  The application is stateless so that many copies of itself can run in tandem, allowing it to elastically scale up and down based on need; as such, VMs are appearing and disappearing from the network.  As these VMs are spawned on the system they may be right next to each other, on different servers, or potentially in different data centers.  But it gets even better.

Cloud architectures are typically multi-tenant.  This means that multiple customers will utilize the infrastructure and need to be isolated from each other.  And of course Clouds are self-service: users/developers can design, build and deploy whenever they want, including designing the network interconnects that their applications need to function.  All of this will cause overlapping IP address domains, multiple virtual networks (both L2 and L3), and requirements for dynamically configuring QoS, load balancers and firewalls.  The last headache on our list is not the least: cloud systems tend to breed like rabbits or multiply like coat hangers in the closet.  There are more and more systems, as 10 servers become 40, which becomes 100, then 1000 and so on.


So what is a poor Network Engineer to do?

First, get a handle on what this Cloud thing is supposed to be for.  If you are one of the lucky ones who can dictate the use of the infrastructure, then rock on!  Unfortunately, that does not seem to be the way it goes for many.  In the case where you just cannot predict how the infrastructure will be used, I am reminded of the phrase “there is no replacement for displacement”.  Fast links, non-blocking switches and network fabrics are all necessary for the physical network, but they will not get you there.  Since you as a network administrator cannot predict the traffic patterns, who can?  Well, the developer and the application itself.  This is what SDN is all about.  It allows a programmatic interface to what is called an overlay network: a series of tunnels/flows which can build virtual networks on top of the physical network, giving that pesky application what it was looking for.  In some cases you may want to make changes to the physical infrastructure, for example changing the configuration of a firewall, load balancer or other network equipment; SDN vendors are creating plug-ins that can make those types of configurations.  But if this is not good enough for you, there is NFV.  The basic idea here is: why have specialized hardware for your core network infrastructure when we can run those functions virtualized as well?  Let’s run them in VMs too, hook them into the virtual network, use SDN to configure them, and we can now virtualize the routers, load balancers, firewalls and switches.  These technologies are very much in a state of flux right now, but they are promising nonetheless.  Now if we could just virtualize the monitoring and troubleshooting of these environments, I’d be happy.
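
To make the “programmatic interface to an overlay network” idea concrete, here is a small sketch using the OpenStack Neutron client, one common SDN-style API.  The credentials and names are placeholders and error handling is omitted:

```python
# Sketch: building an overlay network programmatically via the OpenStack
# Neutron API (one concrete SDN-style interface).  Credentials and names
# are placeholders; error handling is omitted for brevity.
from neutronclient.v2_0 import client

neutron = client.Client(username="demo", password="secret",
                        tenant_name="demo",
                        auth_url="http://keystone:5000/v2.0")

# A tenant-private L2 network and subnet, tunneled over the physical net.
net = neutron.create_network({"network": {"name": "app-tier"}})
subnet = neutron.create_subnet({"subnet": {
    "network_id": net["network"]["id"],
    "ip_version": 4,
    "cidr": "10.0.0.0/24"}})

# A virtual router gives the overlay its L3 path to other networks.
router = neutron.create_router({"router": {"name": "app-router"}})
neutron.add_interface_router(router["router"]["id"],
                             {"subnet_id": subnet["subnet"]["id"]})
```

Because each tenant builds its own overlay this way, overlapping CIDRs like the 10.0.0.0/24 above can coexist on the same physical fabric.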


Ops Validation using Development Tests [3/4 series on Operating Open Source Infrastructure]

This post is the third in a 4 part series about Success factors for Operating Open Source Infrastructure.

In an automated configuration deployment scenario, problems surface very quickly. They prevent deployment and force resolution before progress can be made. Unfortunately, many times this appears to be a failure within the deployment automation. My personal experience has been exactly the opposite: automation creates a “fail fast” environment in which critical issues are discovered and resolved during provisioning instead of sleeping until later.

Our ability to detect issues and stop until they are resolved creates exactly the type of repeatable, successful deployment that is essential to long-term success. When we look at these deployments, the most important success factors are that the deployment is consistent, known and predictable. Our ability to quickly identify and resolve issues that do not match those patterns dramatically improves the long-term stability of the system by creating an environment that has been benchmarked against a known reference.
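
A minimal sketch of that fail-fast pattern follows; the individual checks are hypothetical placeholders for whatever gates a real provisioner runs:

```python
# Minimal fail-fast sketch: run each validation gate in order and halt the
# deployment on the first failure, so issues surface during provisioning
# instead of after the workload is live.  Checks are placeholders.
def check_dns(node):
    return True   # placeholder: resolve the node's FQDN

def check_ntp(node):
    return True   # placeholder: verify clocks are in sync

def check_network(node):
    return True   # placeholder: confirm expected bridges/VLANs exist

CHECKS = [check_dns, check_ntp, check_network]

def provision(node):
    for check in CHECKS:
        if not check(node):
            raise SystemExit("fail fast: %s failed on %s"
                             % (check.__name__, node))
    # ...only now continue with the actual workload deployment...
```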

Benchmarking against a known reference is ultimately the most significant value that we can provide in helping customers bring up complex solutions such as OpenStack and Hadoop. Being successful with these deployments over the long term means that you have established a known configuration, and that you have maintained it in a way that is explainable and reference-able to other places.

Reference Implementation

The concept of a reference implementation provides tremendous value in deployment. Following a pattern that is a reference implementation enables you to compare notes, get help and ultimately upgrade and change your deployment in known, predictable ways. Customers who can follow and implement a vendor’s reference, or the community’s reference implementation, are able to ask for help on the mailing lists, call in for help and work with the community in ways that are consistent and predictable.

Let’s explore what a reference implementation looks like.

In a reference implementation you have a consistent, known state of your physical infrastructure that has been implemented based upon a reference architecture (RA). That implementation follows a known best practice using standard gear in a consistent, known configuration. You can therefore explain your configuration to a community of other developers, or other people who have a similar configuration, and can validate that your problem is not the physical configuration. Fundamentally, everything in a reference implementation is driving towards the elimination of possible failure causes. In this case, we are making sure that the physical infrastructure is not causing problems (getting to a ready state), because other people are using a similar (or identical) physical infrastructure configuration.

The next components of a reference implementation are the underlying software configurations: the operating system, management, monitoring, network configuration and IP networking stacks. Pretty much the entire application is riding on these components. There are a lot of moving parts and complexity in this scenario, with a high likelihood of causing failures. Implementing and deploying the software stacks in an automated way has enabled us to dramatically reduce the potential for problems caused by misconfiguration. Because the number of permutations of software in the reference stack is so high, it is essential that a successful deployment tightly manages what exactly is deployed, in such a way that operators can identify, name, and compare notes with other deployments.

Achieving Repeatable Deployments

In this case, our referenced deployment consists of the exact composition of the operating system, infrastructure tooling, and capabilities for the deployment. By having a reference capability, we can ensure that we have the same:

  • Operating system
  • Monitoring
  • Configuration stacks
  • Security tooling
  • Patches
  • Network stack (including bridge, VLAN and IP table configurations)

Each one of these components is a potential failure point in a deployment. By being able to configure and maintain that configuration automatically, we dramatically increase the opportunities for success by enabling customers to have a consistent configuration between sites.
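
One way to picture this is to treat the reference configuration as data and diff each site against it.  A small sketch, with made-up component names and versions:

```python
# Sketch: detect drift from the reference deployment.  Component names
# and versions here are invented for illustration only.
REFERENCE = {
    "operating_system": "ubuntu-12.04",
    "monitoring": "nagios-3.5",
    "config_stack": "chef-11.8",
    "network_stack": "ovs-1.10",
}

def drift(site):
    """Return {component: (reference, actual)} for every mismatch."""
    return {k: (v, site.get(k))
            for k, v in REFERENCE.items() if site.get(k) != v}

site_a = dict(REFERENCE, config_stack="chef-11.10")
print(drift(site_a))  # {'config_stack': ('chef-11.8', 'chef-11.10')}
```

An empty result means the site matches the reference, which is what lets lessons learned at one site transfer directly to the others.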

Repeatable reference deployments enable customers to compare notes with Dell and with others in the community. It enables us to take and apply what we have learned from one site to another. For example, if a new patch breaks functionality, then we can quickly determine what caused it. We can then fix the solution, add in the complementary fix, and deploy it at that one site. If we are aware that 90% of our other sites have exactly the same configuration, it enables those other sites to avoid a similar problem. In this way, having both a pattern and practice reference deployment enables the community to absorb change and respond much more quickly, and be successful with a changing code base. We found that it is impractical to expect things not to change.

The only thing that we can do is build resiliency for change into these deployments. Creating an automated and tested referenceable deployment is the best way to cope with change.


DefCore Core Capabilities Selection Criteria SIMPLIFIED -> how we are picking Core

I’ve posted about the early DefCore core capabilities selection process before, and we’ve since put the criteria into practice and discussed them with the community.  The feedback was simple: tl;dr.  You’ve got the right direction but make it simpler!

So we pulled the 12 criteria into four primary categories:

  1. Usage: the capability is widely used (Refstack will collect data)
  2. Direction: the capability advances OpenStack technically
  3. Community: the capability builds the OpenStack community experience
  4. System: the capability integrates with other parts of OpenStack

These categories summarize critical values that we want in OpenStack, and so it makes sense for them to be the primary factors used when we select core capabilities.  While we strive to make the DefCore process objective and quantitative, we must recognize that these choices drive community behavior.

With this perspective, let’s review the selection criteria.  To make it easier to cross reference, we’ve given each criterion a shortened name:

Shows Proven Usage

  • “Widely Deployed” Candidates are widely deployed capabilities.  We favor capabilities that are supported by multiple public cloud providers and private cloud products.
  • “Used by Tools” Candidates are widely used capabilities: should be included if supported by common tools (RightScale, Scalr, CloudForms, …)
  • “Used by Clients” Candidates are widely used capabilities: should be included if part of common libraries (Fog, Apache jclouds, etc.)

Aligns with Technical Direction

  • “Future Direction” Should reflect future technical direction (from the project technical teams and the TC) and help manage deprecated capabilities.
  • “Stable” Test is required to be stable for >2 releases because we don’t want core capabilities that do not have dependable APIs.
  • “Complete” Where the code being tested has a designated area of alternate implementation (extension framework) as per the Core Principles, there should be parity in capability tested across extension implementations.  This also implies that the capability test is not configuration specific or locked to non-open technology.

Plays Well with Others

  • “Discoverable” Capability being tested is Service Discoverable (can be found in Keystone and via service introspection)
  • “Doc’d” Should be well documented, particularly the expected behavior.  This can be a very subjective measure and we expect to refine this definition over time.
  • “Core in Last Release”  A test that is a must-pass test should stay a must-pass test.  This makes core capabilities sticky from release to release.  Leaving Core is disruptive to the ecosystem.

Takes a System View

  • “Foundation” Test capabilities that are required by other must-pass tests and/or depended on by many other capabilities.
  • “Atomic” Capability is unique and cannot be built out of other must-pass capabilities.
  • “Proximity” (sometimes called a Test Cluster) selects for capabilities that are related to Core Capabilities.  This helps ensure that related capabilities are managed together.

Note: The 13th “non-admin” criterion has been removed because Admin APIs cannot be used for interoperability and cannot be considered Core.
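
For reference, here is the grouping above expressed as data, with a simple rollup helper.  This is only a sketch using the shortened names; it does not represent the actual scoring weights:

```python
# The four categories and their shortened criteria names, as a lookup,
# plus a simple rollup.  Mirrors the grouping above; the real DefCore
# scoring weights are not represented here.
CATEGORIES = {
    "Usage":     ["Widely Deployed", "Used by Tools", "Used by Clients"],
    "Direction": ["Future Direction", "Stable", "Complete"],
    "Community": ["Discoverable", "Doc'd", "Core in Last Release"],
    "System":    ["Foundation", "Atomic", "Proximity"],
}

def category_rollup(flags):
    """Count how many criteria a capability meets in each category."""
    return {cat: sum(1 for c in crit if flags.get(c))
            for cat, crit in CATEGORIES.items()}

print(category_rollup({"Stable": True, "Atomic": True, "Doc'd": True}))
```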

Networking in Cloud Environments, SDN, NFV, and why it matters [part 1 of 2]

Scott Jensen is an Engineering Director and colleague of mine from Dell with deep networking and operations experience.  He has first-hand experience deploying OpenStack and Hadoop and plays a critical role in defining Dell’s Reference Architectures in those areas.  When I saw this writeup about cloud networking, I asked if it would be OK to share it with you.

Guest Post 1 of 2 by Scott Jensen:

Coming from a basis in enterprise data center networking and cloud computing, I have many conversations with customers implementing a cloud infrastructure.  The design of their networking infrastructure can and should be different from a classic network configuration, and many do not understand why, either due to a lack of knowledge in networking or a lack of understanding of why cloud computing is different from virtualization.  Once you have an understanding of both of these areas, you can begin to see why emerging technologies such as SDN (Software Defined Networking) and NFV (Network Function Virtualization) begin to address some of the issues that Cloud Computing can cause with your network.

Networking is all about traffic flows.  In order to properly design your infrastructure you need to understand where traffic is originating, where it is going and how much traffic will be following a specific route and at what times.

There are many differences between Cloud Computing and virtualization.  In many cases, people I talk to think of Cloud as virtualization in a different environment.  Of course this will work just fine; however, it does not take advantage of the goodness that a Cloud infrastructure can bring.  Some of the major differences between Virtualization and Cloud Computing have profound effects on how the network is utilized.  This all has to do with the application.  That is really what it is all about anyway.  Rob Hirschfeld has a great post on the difference between Pets and Cattle which describes this well.

Pets and Cattle as a workload evolution

In typical virtualized infrastructures, the applications have a fairly common pattern.  Many people describe these as Pets, and they are managed largely the same as a physical system.  They have a name, they are one of a kind, they are cared for, and when they die it can be traumatic (I know, I have been there).

  • They run on large stateful VMs
  • They have a lifecycle which is typically very long such as years
  • The applications themselves are not designed to tolerate failures.  Other technologies are brought in to ensure uptime.
  • The application is scaled up when demands increase.  This is done by adding more memory or CPU to the VM.

Cloud applications are different.  Some people describe them as cattle and they are treated like cattle in many ways.  They do not necessarily have a name and if one dies it is sad but not a really big deal.  We should probably figure out what killed it but life goes on.

  • They run on smaller stateless VMs
  • They have a lifecycle measured in hours or months.  Sometimes even less than an hour.
  • The application is designed to expect failures
  • The application scales out by increasing the number of instances that are running when demand increases (see the sketch below).
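
To make that cattle-style scale-out concrete, here is a hedged sketch using the OpenStack Nova client; the credentials, image and flavor values are placeholders:

```python
# Hedged sketch of "cattle" scale-out: launch more identical, stateless
# instances rather than resizing one large VM.  Credentials, image and
# flavor values are placeholders, not a working configuration.
from novaclient import client

nova = client.Client("2", "demo", "secret", "demo",
                     "http://keystone:5000/v2.0")

def scale_out(prefix, image, flavor, count):
    """Add `count` identical instances; any one of them is disposable."""
    for i in range(count):
        nova.servers.create(name="%s-%d" % (prefix, i),
                            image=image, flavor=flavor)

scale_out("web", image="cirros-image-id", flavor="m1.small", count=3)
```

Scaling back down is simply deleting instances, which is exactly why no single VM can be allowed to hold irreplaceable state.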

In his follow-up post next week, Scott discusses how this impacts the network and how SDN and NFV promises to help.


OpenCrowbar.Anvil released – hammering out a gold standard in open bare metal provisioning

I’m excited to announce OpenCrowbar’s first release, Anvil, for the community.  Looking back on our original design from June 2012, we’ve accomplished all of our original objectives and more.
Now that we’ve got the foundation ready, our next release (OpenCrowbar Broom) focuses on workload development on top of the stable Anvil base.  This means that we’re ready to start working on OpenStack, Ceph and Hadoop.  So far, we’ve limited engagement on workloads to ensure that those developers would not also be trying to keep up with core changes.  We follow emergent design so I’m certain we’ll continue to evolve the core; however, we believe the Anvil release represents a solid foundation for workload development.
There is no more comprehensive open bare metal provisioning framework than OpenCrowbar.  The project’s focus on a complete operations model that comprehends hardware and network configuration with just enough orchestration delivers on a system vision that sets it apart from any other tool.  Yet, Crowbar also plays nicely with others by embracing, not replacing, DevOps tools like Chef and Puppet.
Now that the core is proven, we’re porting the Crowbar v1 RAID and BIOS configuration into OpenCrowbar.  By design, we’ve kept hardware support separate from the core because we’ve learned that hardware generation cycles need to be independent from the operations control infrastructure.  Decoupling them eliminates release disruptions that we experienced in Crowbar v1 and makes it much easier to incorporate hardware from a broad range of vendors.
Here are some key components of Anvil:
  • UI, CLI and API stable and functional (a small API sketch follows this list)
  • Boot and discovery process working PLUS ability to handle pre-populating and configuration
  • Chef and Puppet capabilities including Berkshelf v3 support to pull in community upstream DevOps scripts
  • Docker, VMs and Physical Servers
  • Crowbar’s famous “late-bound” approach to configuration and, critically, networking setup
  • IPv6 native, Ruby 2, Rails 4, preliminary scale tuning
  • Remarkably flexible and transparent orchestration (the Annealer)
  • Multi-OS deployment capability: Ubuntu, CentOS, or different versions of the same OS
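
As a taste of the API item above, here is a hypothetical sketch of listing nodes over REST.  The /api/v2/nodes path, port and digest credentials are assumptions on my part, so check the project docs before relying on them:

```python
# Hypothetical sketch of talking to the Anvil API.  The /api/v2/nodes
# path, port 3000 and the digest credentials are assumptions based on
# the v2 design, not verified documentation.
import requests
from requests.auth import HTTPDigestAuth

CROWBAR = "http://127.0.0.1:3000"
AUTH = HTTPDigestAuth("crowbar", "crowbar")

nodes = requests.get(CROWBAR + "/api/v2/nodes", auth=AUTH).json()
for node in nodes:
    print(node.get("name"), node.get("alive"))
```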
Getting the workloads ported is still a tremendous amount of work, but the rewards are tremendous.  With OpenCrowbar, the community has a new way to collaborate on and integrate this work.  It’s important to understand that while our goal is to start a quarterly release cycle for OpenCrowbar, the workload release cycles (including hardware) are NOT tied to OpenCrowbar.  The workloads choose which OpenCrowbar release they target.  From Crowbar v1, we learned that Crowbar needed to be independent of the workload releases, so we want OpenCrowbar to focus on maintaining a strong ops platform.
This release marks four years of hard-earned Crowbar v1 deployment experience and two years of v2 design, redesign and implementation.  I’ve talked with DevOps teams from all over the world and listened to their pains and needs.  While we have a long way to go before we’re deploying 1000-node OpenStack and Hadoop clusters, OpenCrowbar Anvil significantly moves the needle in that direction.
Thanks to the Crowbar community (Dell and SUSE especially) for nurturing the project, and congratulations to the OpenCrowbar team for getting us to this amazing place.