OpenCrowbar v2.1 Video Tour from Metal to OpenStack and beyond

With OpenCrowbar v2.1 out, I’ve been asked to update the video library of Crowbar demos.  Since a complete tour runs about 3 hours, I cut it down into focused demos so you can start at an area of interest and work backwards.

I’ve linked all the videos below by title.  Here’s a visual table of contents:

Video Progression

Crowbar v2.1 demo: Visual Table of Contents [click for playlist]

The heart of the demo series is the Annealer and Ready State (video #3).

  1. Prepare Environment
  2. Bootstrap Crowbar
  3. Add Nodes ♥ Ready State (good starting point)
  4. Boot Hardware
  5. Install OpenStack (Juno using PackStack on CentOS 7)
  6. Integrate with Chef & Chef Provisioning
  7. Integrate with SaltStack

I’ve tried to do some post-production to limit dead air and focus on key areas.  As always, I value content over production values, so feedback is very welcome!

API Driven Metal = OpenCrowbar + Chef Provisioning

The OpenCrowbar community created a Chef-Provisioning driver that allows you to quickly build hardware clusters using Chef cookbooks.

When we started using Chef in 2011, there was a distinct gap around bootstrapping systems.  The platform did a great job of automation and even connecting services together (via the Search anti-pattern, see below) but lacked a way to build the initial clusters automatically.

The current answer to this problem from Chef is refreshingly simple: a cookbook API extension called Chef Provisioning.  This approach uses the regular Chef DSL in recipes to request and bind a cluster into Chef.  Basically, the code builds an array of nodes using an API that creates any nodes missing from the array.  Specifically, when a node is missing, Chef calls out to an external system to create it.
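As a rough sketch of that pattern (the driver URL and cookbook names below are placeholders, not the actual OpenCrowbar driver configuration), a Chef Provisioning recipe might look like this:

```ruby
# Sketch only: placeholder driver URL and recipe names, not the real
# OpenCrowbar Chef Provisioning driver settings.
require 'chef/provisioning'

with_driver 'crowbar:https://admin.example.com'   # hypothetical driver URL

# Declare the array of nodes; any machine missing from the Chef server
# triggers a call out to the external system (here, Crowbar) to create it.
1.upto(3) do |i|
  machine "cluster-node-#{i}" do
    recipe 'my_cluster::worker'   # placeholder cookbook/recipe
    converge true
  end
end
```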

For clouds, that means using the API to request a server and then inject credentials for Chef management.  It’s trickier for physical gear because you cannot simply conjure a server in the configuration you need.  Physical systems must first be discovered and profiled to ready state: the system must know how many NICs and disk drives are available in order to configure the hardware correctly before laying down the operating system.

Consequently, Chef Provisioning automation is more about reallocation of existing discovered physical assets to Chef.  That’s exactly the approach the OpenCrowbar team took for our Chef Provisioning driver.

OpenCrowbar interacts with Chef Provisioning by pulling nodes from the System deployment into a Chef Provisioning deployment.  That action allows the API client to request specific configurations, like the operating system or network, that need to be set up before Chef can execute.  Once those requests are made, Crowbar simply runs its normal annealing process to ready state and then injects the Chef credentials.  Chef waits until the work queue is empty and then takes over management of the asset.  When Chef is finished, Crowbar can be instructed to reconfigure the node back to a base state.

Does that sound simple?  It is simple because the Crowbar APIs match the Chef needs very cleanly.

It’s worth noting that this integration is a great test of the OpenCrowbar API design.  Over the last two years, we’ve evolved the API to make it more focused on the final result.  Late binding is a critical concept for the project, and the APIs reflect that objective.  For Chef Provisioning, we let the integration make simple requests like “give me a node, put this O/S on it, and go.”  Crowbar has the logic needed to accomplish those objectives without much additional instruction.
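In driver terms, that kind of request could look something like the following sketch (the machine_options keys here are placeholders, not the actual OpenCrowbar driver options):

```ruby
# Hypothetical request: "give me a node, put this O/S on it, and go."
# The bootstrap_options keys are illustrative, not the real driver API.
machine 'rack1-node04' do
  machine_options bootstrap_options: { os: 'centos-7.0', network: 'admin' }
  recipe 'my_workload::default'   # placeholder workload recipe
  converge true
end
```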

Bonus Side Note: Why Search can become an anti-pattern

Search is an incredibly powerful feature in Chef that allows cross-role and cross-node integration; unfortunately, it’s also very difficult to maintain as complexity and contributor counts grow.  The reason is that search creates “forward dependencies” in the scripts: operators creating data have to be aware of downstream, hidden consumers.  High Availability (HA) is a clear example.  If I add a new “cluster database” role to the system, database searches are very likely to start returning multiple results.  That’s excellent until I learn that my scripts coded the search assuming database lookups return exactly one result.  These errors are very hard to find because the searches are decoupled and downstream of the database cookbook.  Ultimately, the community had to advise against embedded search for shared cookbooks.
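To make the failure mode concrete, here is a minimal, made-up recipe fragment that hides exactly this kind of forward dependency:

```ruby
# Fragile pattern: silently assumes the search returns exactly one node.
# (Illustrative only; role, file and attribute names are made up.)
db = search(:node, 'role:database').first

template '/etc/myapp/database.yml' do
  source 'database.yml.erb'
  variables(db_host: db['ipaddress'])
end

# Once an HA "cluster database" role is added, the search returns several
# nodes, `.first` picks an arbitrary one, and the breakage surfaces far
# downstream of the database cookbook.
```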

You need a Squid Proxy fabric! Getting Ready State Best Practices

Sometimes solving a small problem well makes a huge impact for operators.  From talking to operators, it appears that automated configuration of Squid does exactly that.


If you were installing OpenStack or Hadoop, you would not find “set up a Squid proxy fabric to optimize your package downloads” in the install guide.   That’s simply out of scope for those guides; however, it’s essential operational guidance.  That’s what I mean by open operations and creating a platform for sharing best practice.

Deploying a base operating system (e.g., CentOS) on a lot of nodes creates bit-tons of identical internet traffic.  By default, each node will attempt to reach internet mirrors for packages.  Multiply that by even 10 nodes and it’s a lot of traffic, and a significant performance impact if your connection is limited.

For OpenCrowbar developers, that external package resolution means each dev/test cycle with a node boot (up to 10+ times a day) gets bottlenecked.  For QA and install testing, the problem is even worse!

Our solution was 1) to embed Squid proxies into the configured environments and 2) to automatically configure nodes to use the proxies.   Making this the default behavior improves overall deployment performance, improves the network topology of the operating environment, and adds better control of traffic.
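For the second half of that, here is a minimal sketch of what the node-side configuration could look like in a Chef recipe (the attribute name and proxy port are assumptions, not OpenCrowbar’s actual implementation):

```ruby
# Point shell tools and yum at the local Squid proxy.
# node['proxy']['http'] is a hypothetical attribute; OpenCrowbar's real
# attribute names and proxy port may differ.
proxy_url = node['proxy']['http']   # e.g. 'http://192.168.124.10:3128'

file '/etc/profile.d/proxy.sh' do
  content "export http_proxy=#{proxy_url}\nexport https_proxy=#{proxy_url}\n"
  mode '0644'
end

# yum honors a proxy= line in /etc/yum.conf; append one if it is missing.
ruby_block 'set yum proxy' do
  block do
    conf = ::File.read('/etc/yum.conf')
    unless conf.include?('proxy=')
      ::File.open('/etc/yum.conf', 'a') { |f| f.puts "proxy=#{proxy_url}" }
    end
  end
  only_if { ::File.exist?('/etc/yum.conf') }
end
```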

This is a great example of how Crowbar uses existing operational tool chains (Chef configures Squid) in best practice ways to solve operations problems.  The magic is not in the tool or the configuration, it’s that we’ve included it in our out-of-the-box default orchestrations.

It’s time to stop fumbling around in the operational dark.  We need to compose our tool chains in an automated way!  This is how we advance operational best practice for ready state infrastructure.

OpenCrowbar Design Principles: Attribute Injection [Series 6 of 6]

This is part 5 of 6 in a series discussing the principles behind the “ready state” and other concepts implemented in OpenCrowbar.  The content is reposted from the OpenCrowbar docs repo.

Attribute Injection

Attribute Injection is an essential part of the “FuncOps” story because it helps create the clean boundaries needed to implement consistent scripting behavior across divergent sites.

It also allows Crowbar to abstract and isolate provisioning layers. This operational approach means that deployments are composed of layered services (see emergent services) instead of locked “golden” images. The layers can be maintained independently and allow users to compose specific configurations à la carte. This approach works if the layers have clean functional boundaries (FuncOps) that can be scoped and managed atomically.

To explain how Attribute Injection accomplishes this, we need to explore why search became an anti-pattern in Crowbar v1. Originally, being able to use server-based search functions in operational scripting was a critical feature. It allowed individual nodes to act as part of a system by searching for the global information needed to make local decisions. This greatly aided Crowbar’s mission of system-level configuration; however, it also created significant hidden interdependencies between scripts. As Crowbar v1 grew in complexity, searches became more and more difficult to maintain because they were difficult to scope correctly, hard to manage centrally and prone to timing issues.

Crowbar was not unique in dealing with this problem – the Attribute Injection pattern has become a preferred alternative to search in integrated community cookbooks.

Attribute Injection in OpenCrowbar works by establishing specific inputs and outputs for all state actions (NodeRole runs). By declaring the exact inputs needed and outputs provided, Crowbar can better manage each annealing operation. This control includes deployment scoping boundaries, time sequencing of information, and override or substitution of inputs based on execution paths.
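In cookbook terms, the contrast with the search example earlier in this post is that the consuming recipe reads only the attributes injected for that specific NodeRole run (the attribute paths below are illustrative, not OpenCrowbar’s actual namespace):

```ruby
# Consumes only injected inputs: no server-side search, no hidden
# forward dependencies. Attribute paths are illustrative placeholders.
db_host = node['my_deployment']['database']['host']
db_port = node['my_deployment']['database']['port']

template '/etc/myapp/database.yml' do
  source 'database.yml.erb'
  variables(db_host: db_host, db_port: db_port)
end

# Declared outputs (e.g., the endpoint this run created) flow back through
# Crowbar's database rather than being discovered by later searches.
```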

This concept is not unique to Crowbar; it has become best practice for operational scripts. Crowbar simply extends the paradigm to the system and orchestration levels.

Attribute Injection enables operations to be:

  • Atomic – only the information needed for the operation is provided so risk of “bleed over” between scripts is minimized. This is also a functional programming preference.
  • Isolated & Idempotent – the risk of accidentally picking up changed information from previous runs is reduced by controlling the inputs. That makes it more likely that scripts can be idempotent.
  • Cleanly Scoped – information passed into operations can be limited based on system deployment boundaries instead of search parameters. This allows the orchestration to manage when and how information is added into configurations.
  • Easy to troubleshoot – since the information is limited and controlled, it is easier to recreate runs for troubleshooting. This is a substantial value for diagnostics.

OpenCrowbar.Anvil released – hammering out a gold standard in open bare metal provisioning

I’m excited to be announcing OpenCrowbar’s first release, Anvil, for the community.  Looking back on our original design from June 2012, we’ve accomplished all of our original objectives and more.
Now that we’ve got the foundation ready, our next release (OpenCrowbar Broom) focuses on workload development on top of the stable Anvil base.  This means that we’re ready to start working on OpenStack, Ceph and Hadoop.  So far, we’ve limited engagement on workloads to ensure that those developers would not also be trying to keep up with core changes.  We follow emergent design so I’m certain we’ll continue to evolve the core; however, we believe the Anvil release represents a solid foundation for workload development.
There is no more comprehensive open bare metal provisioning framework than OpenCrowbar.  The project’s focus on a complete operations model that comprehends hardware and network configuration with just enough orchestration delivers on a system vision that sets it apart from any other tool.  Yet, Crowbar also plays nicely with others by embracing, not replacing, DevOps tools like Chef and Puppet.
Now that the core is proven, we’re porting the Crowbar v1 RAID and BIOS configuration into OpenCrowbar.  By design, we’ve kept hardware support separate from the core because we’ve learned that hardware generation cycles need to be independent from the operations control infrastructure.  Decoupling them eliminates the release disruptions we experienced in Crowbar v1 and makes it much easier to incorporate hardware from a broad range of vendors.
Here are some key components of Anvil:
  • UI, CLI and API stable and functional
  • Boot and discovery process working PLUS ability to handle pre-populating and configuration
  • Chef and Puppet capabilities, including Berkshelf v3 support to pull in community upstream DevOps scripts
  • Docker, VMs and Physical Servers
  • Crowbar’s famous “late-bound” approach to configuration and, critically, networking setup
  • IPv6 native, Ruby 2, Rails 4, preliminary scale tuning
  • Remarkably flexible and transparent orchestration (the Annealer)
  • Multi-OS deployment capability: Ubuntu, CentOS, or different versions of the same OS
Getting the workloads ported is still a tremendous amount of work, but the rewards are tremendous.  With OpenCrowbar, the community has a new way to collaborate on and integrate this work.  It’s important to understand that while our goal is to start a quarterly release cycle for OpenCrowbar, the workload release cycles (including hardware) are NOT tied to OpenCrowbar.  The workloads choose which OpenCrowbar release they target.  From Crowbar v1, we learned that Crowbar needed to be independent of the workload releases, so we want OpenCrowbar to focus on maintaining a strong ops platform.
This release marks four years of hard-earned Crowbar v1 deployment experience and two years of v2 design, redesign and implementation.  I’ve talked with DevOps teams from all over the world and listened to their pains and needs.  We have a long way to go before we’re deploying 1000-node OpenStack and Hadoop clusters, but OpenCrowbar Anvil significantly moves the needle in that direction.
Thanks to the Crowbar community (Dell and SUSE especially) for nurturing the project, and congratulations to the OpenCrowbar team for getting us to this amazing place.

 

OpenStack Neutron using Linux Bridges (technical explanation)

Apparently this is “Showcase Dell OpenStack/Crowbar Team Member Week” because today I’m proxy posting for Dell OpenStack engineer Chris Dearborn.  Chris has been leading our OpenStack Neutron deployment for Grizzly and Havana.

If you’re familiar with OpenStack Networking, skip over my introductory preamble and jump right down to the meat under “SDN Client Connection: Linux Bridge.”  Hopefully we can convince Chris to put together more posts in this series and cover GRE and VLAN configurations too.

OpenStack and Software Defined Network

Software Defined Networking (SDN) is an emerging concept that describes a family of functionality.  Like cloud, the exact meaning of SDN appears to be in the eye (or brochure) of the company providing the technology.  Overall, the concept for SDN is to have programmable networks that can be automatically provisioned.

Early approaches to this used the OpenFlow™ API to programmatically modify switch forwarding tables (OSI Layer 2) on a flow-by-flow basis across multiple switches.  While highly controlled, OpenFlow has proven difficult to implement at scale in dynamic environments; consequently, many SDN implementations now use overlay networks based on inventoried VLANs and/or dynamic tunnels.

Inventoried VLAN overlay networks create a stable base layer 2 infrastructure whose VLANs can be inventoried and handed out dynamically on demand.  Generally, the management infrastructure dynamically connects the end-points (typically virtual machines) to a dedicated existing layer 2 network.  This provides all of the isolation desired without having to thrash the underlying network switch infrastructure.

Dynamic tunnel overlay networks also use client connection points to isolate traffic but do not rely on switch layer 2.  Instead, they encapsulate traffic before sending it over a shared network.  This avoids having to match dynamic networks to static inventory; however, it also adds encapsulation overhead to the network communication.  Consequently, tunnels provide more flexibility and less up-front configuration, but with lower performance.

OpenStack Networking, project Neutron (previously Quantum), is responsible for connecting virtual machines set up by OpenStack Compute (aka Nova) to the software defined networking infrastructure.  By design, Neutron accommodates different implementation plug-ins, which allows operators to choose between different approaches, including commercial offerings.   While it is possible to use open source capabilities for small deployments and trials, most large-scale deployments choose proprietary SDN technologies.

The Crowbar OpenStack installation allows operators to choose between “Open vSwitch GRE Tunnels” and “Linux Bridge VLAN” configurations.  The GRE option is more flexible and requires less up-front configuration; however, the encapsulation overhead of GRE tunnels will degrade performance.  The Linux Bridge VLAN option requires more upfront configuration and design.

Since GRE works with minimal configuration, let’s explore what’s required for Crowbar to set up OpenStack Neutron Linux Bridge VLAN networking.

Note: This review assumes that you already have a working knowledge of Crowbar and OpenStack.

Background

Before we dig into how OpenStack configures SDN, we need to understand how the virtual machines running in the system connect to the physical network.  This connection uses Linux bridges.  For GRE tunnels, Crowbar configures an Open vSwitch (aka OVS) on the node to create and manage the tunnels.

One challenge with SDN traffic isolation is that we can no longer assume that virtual machines with network access can reach destinations on the same network.  This means the infrastructure must provide paths (gateways and routers) between the tenant and infrastructure networks.  A major part of the OpenStack configuration is setting up these connections when new tenant networks are created.

Note: In the OpenStack Grizzly and earlier releases, the open source network routers were not configured in a highly available or redundant way.  This problem is addressed in the Havana release.

For the purposes of this explanation, the “network node” is the shared infrastructure server that bridges networks.  The “compute node” is any one of the servers hosting guest virtual machines.  Traffic in the cloud can be between virtual machines within the cloud instance (internal) or between a virtual machine and something outside the OpenStack cloud instance (external).

Let’s make sure we’re on the same page with terminology.

  • OSI Layer 2 – just above the physical connections (Layer 1), Layer 2 manages traffic between servers, including logical separation of traffic.
  • VLAN – Virtual Local Area Networks are switch-enforced isolation zones created by adding one of 4096 possible tags to the network traffic (aka tagged traffic).
  • Tenant – a group of users in a cloud that are logically isolated (cannot see other traffic or information) but still using shared resources.
  • Switch – a physical device used to provide layer 1 networking connections between end points.  May provide additional services on other OSI layers such as VLANs.
  • Network Node – an OpenStack infrastructure server that connects tenant networks to infrastructure networks.
  • Compute Node – an OpenStack server that runs user workloads in virtual machines or containers.

SDN Client Connection: Linux Bridge 

chris_net1

The VLAN range for Linux Bridge is configurable in /etc/quantum/quantum.conf by changing the network_vlan_ranges parameter.  Note that this parameter is set by the Crowbar Neutron chef recipe.  The VLAN range starts at whatever the “vlan” attribute of the nova_fixed network in bc-template-network.json is set to; the end of the range is hard-coded to the VLAN start plus 2000.

Reminder: usable VLAN tags top out at 4094, so to be safe the VLAN tag for nova_fixed should never be set higher than about 2094.

Networks are assigned the next available VLAN tag as they are created.  For instance, the first manually created network will be assigned VLAN 501, the next VLAN 502, etc.  Note that this is independent of what tenant the new network resides in.
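Roughly, the range logic amounts to the following sketch (the physical network label and attribute path are placeholders, not the actual Crowbar Neutron recipe code):

```ruby
# Sketch of the VLAN range calculation described above.
vlan_start = node['network']['networks']['nova_fixed']['vlan']  # e.g. 500
vlan_end   = vlan_start + 2000                                  # hard-coded offset

network_vlan_ranges = "physnet1:#{vlan_start}:#{vlan_end}"
# => "physnet1:500:2500", written into /etc/quantum/quantum.conf;
# tenant networks then claim 501, 502, ... as they are created.
```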

The convention in Linux Bridge is to name the various network constructs using the first 11 characters of the UUID of the associated Neutron object.  This lets you run the quantum CLI to list the objects you are interested in and grep for the 11 UUID characters taken from the network construct’s name; the match shows which Neutron object a given network construct maps to.


Network Creation

When a network is created, a corresponding bridge is created and given the name br<network_uuid>.  A subinterface of the NIC is also created and named <interface_name>.<vlan_tag>.  This subinterface is slaved to the bridge.  Note that this only happens when the network is actually needed (i.e., when a VM is created on the network).

This occurs on both the network node and the compute nodes.

Additional Steps Taken On The Network Node During Network Creation

On the network node, a bridge and subinterface are created per network and the subinterface is slaved to the bridge as described above.  If the network is attached to the router, a TAP interface that the router listens on is created and slaved to the bridge.  If DHCP is selected, another TAP interface is created for the dnsmasq process to talk to, and that interface is also slaved to the bridge.

VM Creation On A Compute Node

When a VM is created, a TAP interface is created and named tap<port_uuid>, where the port is the Neutron port that the VM is plugged into.  This TAP interface is slaved to the bridge associated with the network the user selected when creating the VM.  Note that this occurs on compute nodes only.

Determining the dnsmasq port/tap interface for a network

The TAP port associated with dnsmasq for a network can be determined by first getting the UUID of the network, then looking on the network node in /var/lib/quantum/dhcp/<network_uuid>/interface.  The interface will be named ns-<uuid prefix>, where the prefix is only the first 11 characters of the UUID.  The corresponding TAP interface will be named tap<uuid prefix>, with the same 11 characters.
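A small illustrative helper that performs the same lookup (assuming the Grizzly-era paths described above; the UUID is made up):

```ruby
# Illustrative only: mirrors the manual lookup described above.
network_uuid = 'e8f5c9a0-1234-4d2b-9c3a-abcdef012345'   # made-up example

ns_if  = File.read("/var/lib/quantum/dhcp/#{network_uuid}/interface").strip
# ns_if is "ns-" plus the first 11 characters of the associated UUID.
tap_if = ns_if.sub(/^ns-/, 'tap')   # corresponding TAP interface slaved to the bridge
puts "dnsmasq listens on #{ns_if}; the matching TAP interface is #{tap_if}"
```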


Summary

Understanding OpenStack Networking is critical to operating a successful cloud deployment.  The Crowbar Team at Dell has invested significant effort to automate the configuration of Neutron.  This helps you eliminate the risk of manual configuration and leverage our extensive testing and field experience.

If you are interested in seeing the exact sequences used by Crowbar, please visit the Crowbar Github repository for the “Quantum Barclamp.”

7 takeaways from DevOps Days Austin


I spent Tuesday and Wednesday at DevOpsDays Austin and continue to be impressed with the enthusiasm and collaborative nature of the DOD events.  We also managed to have a very robust and engaged twitter backchannel thanks to an impressive pace set by Gene Kim!

I’ve still got a 5+ post backlog from the OpenStack summit, but wanted to do a quick post while it’s top of mind.

My takeaways from DevOpsDays Austin:

  1. DevOpsDays spends a lot of time talking about culture.  I’m a huge believer in the importance of culture as the foundation for the type of fundamental changes that we’re making in the IT industry; however, it’s also a sign that we’re still in the minority if we have to talk about culture evangelism.
  2. Process and DevOps are tightly coupled.  It’s very clear that Lean/Agile/Kanban are essential for DevOps success (nice job by Dominica DeGrandis).  No one even suggested DevOps+Waterfall as a joke (but Patrick Debois had a picture of a xeroxed butt in his preso which is pretty close).
  3. Still need more Dev people to show up!  My feeling is that we’ve got a lot of operators who are engaging with developers and fewer developers who are engaging with operators (the “opsdev” people).
  4. Chef Omnibus installer is very compelling.  This approach addresses packaging issues that were created because we did not have configuration management.  Now that we have good tooling, we can separate the concerns between bits, configuration, services and dependencies.  This is one thing to watch and something I expect to see in Crowbar.
  5. The old mantra still holds: If something is hard, do it more often.
  6. Eli Goldratt’s The Goal is alive again thanks to Gene Kim’s smart new novel, The Phoenix Project, about DevOps and IT (I highly recommend both; start with Kim).
  7. Not DevOps, but 3D printing is awesome.  This is clearly a game changing technology; however, it takes some effort to get right.  Dell brought a Solidoodle 3D printer to the event to try and print OpenStack & Crowbar logos (watch for this in the future).

I’d be interested in hearing what other people found interesting!  Please comment here and let me know.

5 things keeping DevOps from playing well with others (Chef, Crowbar and Upstream Patterns)

Since my earliest days on the OpenStack project, I’ve wanted to break the cycle of black box operations with open ops. With the rise of community-driven DevOps platforms like Opscode Chef and Puppet Labs, we’ve reached a point where it’s both practical and imperative to share operational practices in the form of code and tooling.

Being open and collaborating are not the same thing.

It’s a huge win that we can compare OpenStack cookbooks. The real victory comes when multiple deployments use the same trunk instead of forking.

This is an objective I’ve helped drive for OpenStack (with Matt Ray); it has been the Crowbar objective from the start and is the keystone of our Crowbar 2 work.

This has proven to be a formidable challenge for several reasons:

  1. diverging DevOps patterns that can be used between private, public, large, small, and other deployments -> solution: attribute injection pattern is promising
  2. tooling gaps prevent operators from leveraging shared deployments -> solution: this is part of Crowbar’s mission
  3. under-investing in community-supporting features because they are seen as taking away from getting into production -> solution: this needs leadership, and others will join
  4. drift between target versions creates the need for forking even if the cookbooks are fundamentally the same -> solution: pull from source approaches help create distro independent baselines
  5. missing reference architectures interfere with having a stable baseline to deploy against -> solution: agree to a standard, machine consumable RA format like OpenStack Heat.

Unfortunately, these five challenges are tightly coupled and we have to progress on them simultaneously. The tooling and community require patterns and RAs.

The good news is that we are making real progress.

Judd Maltin (@newgoliath), a Crowbar team member, has documented the emerging Attribute Injection practice that Crowbar has been leading. That practice has been refined in the open by AT&T and Rackspace. It is forming the foundation of the OpenStack cookbooks.

Understanding, discussing and supporting that pattern is an important step toward accelerating open operations. Please engage with us as we make the investments for open operations and help us implement the pattern.

OpenStack Summit: Let’s talk DevOps, Fog, Upgrades, Crowbar & Dell

If you are coming to the OpenStack summit in San Diego next week then please find me at the show! I want to hear from you about the Foundation, community, OpenStack deployments, Crowbar and anything else.  Oh, and I just ordered a handful of Crowbar stickers if you wanted some CB bling.

Matt Ray (Opscode), Jason Cannavale (Rackspace) and I were Ops track co-chairs. If you have suggestions, we want to hear them. We managed to get great speakers and some interesting sessions, like a DevOps panel and upstream deploy working sessions. It’s only on Monday and Tuesday, so don’t snooze or you’ll miss it.

My team from Dell has a lot going on, so there are lots of chances to connect with us:

At the Dell booth, Randy Perryman will be sharing field experience about hardware choices. We’ve got a lot of OpenStack battle experience and we want to compare notes with you.

I’m in the board meeting on Monday, so I’ll likely be occupied until the Mirantis party.

See you in San Diego!

PS: My team is hiring for Dev, QA and Marketing. Let me know if you want details.

Crowbar 2.0 Design Summit Notes (+ open weekly meetings starting)

I could not be happier with the results the Crowbar collaborators and my team at Dell achieved at the first Crowbar design summit. We had great discussions and even better participation.

The attendees represented major operating system vendors, configuration management companies, OpenStack hosting companies, OpenStack cloud software providers, OpenStack consultants, OpenStack private cloud users, and (of course) a major infrastructure provider. That’s a very complete cross-section of the cloud community.

I knew from the start that we had too little time and, thankfully, people were tolerant of my need to stop the discussions. In the end, we were able to cover all the planned topics. This was important because all these features are interlocked so discussions were iterative. I was impressed with the level of knowledge at the table and it drove deep discussion. Even so, there are still parts of Crowbar that are confusing (networking, late binding, orchestration, chef coupling) even to collaborators.

In typing up these notes, it becomes even more blindingly obvious that the core features for Crowbar 2 are highly interconnected. That’s no surprise technically; however, it will make the notes harder to follow because of knowledge bootstrapping. You need to take time to grok the gestalt and surf the zeitgeist.

Collaboration Invitation: I wanted to remind readers that this summit was just the kick-off for a series of open weekly design (Tuesdays 10am CDT) and coordination (Thursdays 8am CDT) meetings. Everyone is welcome to join in those meetings – information is posted, recorded, folded, spindled and mutilated on the Crowbar 2 wiki page.

These notes are my reflection of the online etherpad notes that were made live during the meeting. I’ve grouped them by design topic.

Introduction

  • Contributors need to sign CLAs
  • We are refactoring Crowbar at this time because we have a collection of interconnected features that could not be decoupled
  • Some items (database use, Rails 3, documentation, process) are not up for debate. They are core needs but require little design.
  • There are 5 key topics for the refactor: online mode, networking flexibility, OpenStack pull from source, heterogeneous/multi operating systems, and being CMDB agnostic
  • Due to time limits, we have to stop discussions and continue them online.
  • We are hoping to align the Crowbar 2 beta with the OpenStack Folsom release.

Online / Connected Mode

  • Online mode is more than simply internet connectivity. It is the foundation of how Crowbar stages dependencies and components for deploy. It’s required for heterogeneous O/S and pull from source, and it has dependencies on how we model networking so that nodes can access resources.
  • We are thinking of using caching proxies to stage resources. This would allow isolated production environments and preserve the ability to run everything from the ISO without a connection (that is still a key requirement for us).
  • SUSE’s Crowbar fork does not build an ISO; instead, it relies on RPM packages for barclamps and their dependencies.
  • Pulling packages directly from the Internet has proven to be unreliable, so this method cannot rely on that alone.

Install From Source

  • This feature is mainly focused on OpenStack, but it could be applied more generally. The principles we are looking at could apply to any application where the source code is changing quickly (all of them?!). Hadoop is an obvious second candidate.
  • We spent some time reviewing the use-cases for this feature. While this appears to be very dev and pre-release focused, there are important applications for production. Specifically, we expect that scale customers will need to run ahead of or slightly adjacent to trunk due to patches or proprietary code. In both cases, it is important that users can deploy from their repository.
  • We discussed briefly our objective to pull configuration from upstream (not just OpenStack, but potentially any common cookbooks/modules). This topic is central to the CMDB agnostic discussion below.
  • The overall sentiment is that this could be a very powerful capability if we can manage to make it work. There is a substantial challenge in tracking dependencies – current RPMs and Debs do a good job of this and other configuration steps beyond just the bits. Replicating that functionality is the real obstacle.

CMDB agnostic (decoupling Chef)

  • This feature is confusing because we are not eliminating the need for a configuration management database (CMDB) tool like Chef; instead, we are decoupling Crowbar from a single CMDB to a pluggable model using an abstraction layer.
  • It was stressed that Crowbar does orchestration – we do not rely on convergence over multiple passes to get the configuration correct.
  • We had strong agreement that the modules should not be tightly coupled but did need a consistent way (API? Consistent namespace? Pixie dust?) to share data between each other. Our priority is to maintain loose coupling and follow integration by convention and best practices rather than rigid structures.
  • The abstraction layer needs to have both import and export functions
  • Crowbar will use attribute injection so that Cookbooks can leverage Crowbar but will not require Crowbar to operate. Crowbar’s database will provide the links between the nodes instead of having to wedge it into the CMDB.
  • In 1.x, networking was the part most tightly coupled to Chef. This is a major part of the refactor and of the modeling for Crowbar’s database.
  • There are a lot of notes captured about this on the etherpad – I recommend reviewing them

Heterogeneous OS (bare metal provisioning and beyond)

  • This topic was the most divergent of all our topics because most of the participants were using some variant of their own bare metal provisioning project (check the etherpad for the list).
  • Since we can’t pack an unlimited set of stuff on the ISO, this feature requires online mode.
  • Most of these projects do nothing beyond OS provisioning; however, their simplicity is beneficial. Crowbar needs to consider users who just want a streamlined OS provisioning experience.
  • We discussed Crowbar’s late binding capability, but did not resolve how to reconcile that with these other projects.
  • Critical use cases to consider:
    • an API for provisioning (not sure if it needs to be more than the current one)
    • pick which Operating Systems go on which nodes (potentially with a rules engine?)
    • inventory capabilities of available nodes (like Ohai and Facter) into a database
    • inventory available operating systems