Heterogeneous Operating Systems – choose which operating system you want to install on the target servers.
CMDB Flexibility – don’t be locked into a single DevOps toolset. Attribute injection allows clean abstraction boundaries so you can use multiple tools (Chef and Puppet, playing together); see the sketch after this list.
Ops Annealer – the orchestration at Crowbar’s heart combines the best of directed graphs with late binding and parallel execution. We believe annealing is the key ingredient for repeatable and OpenOps shared code upgrades.
Upstream Friendly – infrastructure as code works best as a community practice, and Crowbar uses upstream code without injecting the “crowbarisms” that were previously required. So you can share your learning with the broader DevOps community even if they don’t use Crowbar.
Node Discovery (or not) – Crowbar maintains the same proven discovery-image-based approach that we used before, but we’ve streamlined and expanded it. You can use Crowbar’s API outside of the PXE discovery system to accommodate Docker containers, existing systems and VMs.
Hardware Configuration – Crowbar maintains the same optional, hardware-neutral approach to RAID and BIOS configuration. Configuring hardware with repeatability is difficult and requires much iterative testing. While our approach is open and generic, my team at Dell works hard to validate on a specific set of gear: it’s impossible to make statements beyond that test matrix.
Network Abstraction – Crowbar dramatically extends our DevOps network abstraction. We’ve learned that networking is the key to success for deployment and upgrade, so we’ve made Crowbar networking flexible and concise. Crowbar networking works with attribute injection so that you can avoid hardwiring networking into DevOps scripts.
Out of band control – when the Annealer hands off work, Crowbar gives the worker implementation flexibility to do it on the node (using SSH) or remotely (using an API). Making agents optional allows operators and developers to make the best choices for the actions that they need to take.
Technical Debt Paydown – we’ve also updated the Crowbar infrastructure to use the latest libraries like Ruby 2, Rails 4 and Chef 11. Even more importantly, we’ve dramatically simplified the code structure, including in-repo documentation and a Docker-based developer environment that makes building a working Crowbar environment fast and repeatable.
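To make the attribute injection idea concrete, here is a minimal Chef-flavored sketch. The cookbook and attribute names (myapp, bind_ip, port) are purely illustrative and not Crowbar’s actual schema; the point is that the cookbook carries its own upstream defaults and simply reads node attributes, so an orchestrator (Crowbar, or anything else) can inject values at a higher precedence without the cookbook ever referencing it.

```ruby
# attributes/default.rb -- upstream-style defaults so the cookbook runs with no orchestrator at all
default['myapp']['bind_ip'] = node['ipaddress']   # ohai-discovered address as a fallback
default['myapp']['port']    = 8080

# recipes/default.rb -- consume whatever was injected (or the defaults above)
template '/etc/myapp/myapp.conf' do
  source 'myapp.conf.erb'
  mode '0644'
  variables(
    bind_ip: node['myapp']['bind_ip'],
    port:    node['myapp']['port']
  )
end
```

In this sketch, Crowbar (or Puppet, via an equivalent mechanism) would set node['myapp']['bind_ip'] from its own network model before the run, which is what keeps networking decisions out of the script itself.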
Why change to OpenCrowbar? This new generation of Crowbar is structurally different from Crowbar 1, and we’ve invested substantially in refactoring the tooling, paying down technical debt and cleaning up documentation. Since Crowbar 1 is still being actively developed, splitting the repositories allows both versions to progress with less confusion. The majority of the principles and deployment code is very similar, so I think of Crowbar as a single community.
Interested? Our new Docker Admin node is quick to set up and can boot and manage both virtual and physical nodes.
The OpenStack Foundation Board has been having a broadening conversation about this topic. Feeling left out? Please don’t be! Now is the time to start getting involved: we had to start very narrowly focused to avoid having the discussion continue to go in circles. As we’ve expanded the dialog, we have incorporated significant feedback to drive consensus.
No matter where I go, people are passionate about the subject of OpenStack Core.
Overall, there is confusion about the scope covered by “what is core” because people bring their perspectives from public or private solutions, ecosystem or internal deployment objectives. In discussion, everyone sees that we have to deal with issues around the OpenStack mark and projects first, but they are impatient to get into the deep issues. Personally, I think we can get consensus on core, and we will always have a degree of healthy tension between user types.
The following are my notes, not my opinions. I strive to faithfully represent a wide range of positions here. Clarifications, comments and feedback are welcome!
Reference/Alternate Implementation (not plug-in): We are not using “plug-ins” to describe the idea that OpenStack projects should have a shared API with required code and clearly specified areas where code is replaceable. It is the Technical Committee (TC) that makes these decisions. The most meaningful language around this point is to say that OpenStack will have an open reference implementation with allowable alternate implementations.
Alternate implementations are useful: We want to ensure upstream contribution and collaboration on the code base. Reference implementations ensure that there’s a reason to keep open source OpenStack strong. Alternate implementations are important to innovation.
Small vs. Large Core: This is an ongoing debate about whether OpenStack should have a lot of projects as part of core. We don’t have an answer yet, but people feel like we’re heading in a direction that resolves this question.
Everyone likes tests: We’re heading towards a definition of core that relies heavily on tests. Everyone expresses concerns that this will place a lot of stress on Tempest (or another framework) and that needs to be addressed as we move forward.
Monolithic vs. Granular Trademark: We did not discuss whether vendors will be able to claim OpenStack trademarks on subcomponents of the whole. This is related to core but widely considered secondary.
API vs. implementation tension: We accept that OpenStack will lead with implementation. There’s no official policy that “we are not a standards body” but we may also have to state that tests are not a specification. There’s a danger that tests will be considered more than they are. What are they? “They are an implementation and a source of information. They are not the definition.” We expect to have a working model that drives the API not vice versa.
Brouhaha about EC2 APIs: It’s not clear if defining core helps address the OpenStack API discussion. I hope it will but have not tested it.
Usability as core: I had many people insist that usability and ease of use should be requirements for core because they support adoption. Our current positions do not have any statements to support this view.
Toxic neighbors: We have not discussed whether use of the mark and criteria could be limited by what else you put in your product. Are there implementation options that we’d consider toxic and automatically in violation of the mark? Right now, the positions are worded so that if you pass then you play, even if you otherwise stink.
Which tests are required? It appears that we’re moving towards using must-pass tests to define the core. Moving towards tests determining core, we want actual field data to drive which tests are required. That will allow actual user experience to shape which tests are important rather than having it be a theoretical decision. There’s some interest in asking the User Committee (UC) to recommend which tests are required. This would be an added responsibility for the UC and needs more discussion.
Need visualization: With 12 positions so far, it’s getting hard to keep it all together. I’ve taken on an action item to create a diagram that shows which statements apply to which projects against the roles of ownership.
I’ve had some great discussions about core and am looking forward to many more. I hope these notes help bring you up to speed. As always, comments and discussion are welcome!
A strong project has utility, community, and longevity.
Utility, community and longevity are the fundamental objectives of any project or product. It must do something that people find useful (utility). It’s not enough for one person to like the project, there must be a market (community). And that useful and popular work must be sustainable over multiple “generations” (longevity).
These goals are basic. The challenge is finding the right rules to keep OpenStack in the sustainable project zone. Unfortunately, as an open source project, the OpenStack Foundation ultimately has very little real power (like hiring flocks of developers) to enforce use or maintenance of the code base.
The Foundation’s tools are velocity, culture, and brand. Understanding “what is core” hones these tools to ensure they are effective.
Velocity – the rate of progress and quality of the code base. A project at sufficient velocity is not worth forking or duplicating. The fact that >1000 developers are contributing and 100s of companies are deploying OpenStack makes it profitable to remain in our community. Make no mistake: being part of a community takes effort, so there must be a return on that investment. The Foundation must ensure that commercial entities find an ROI from their participation.
Culture – open source culture strongly encourages sharing and collaboration. I have seen that culture be a more potent force than legalese and licenses. While a strong culture reinforces itself, a toxic culture will rot a project like ice cream in the summer. Culture maintenance is a chief Foundation objective and includes fostering new users, documentation, orderly interactions and co-opetitive collaboration.
Brand – when all else fails, OpenStack can use legal means to defend our brand. This is the weakest of all the tools because the strength of the defense is only as good as the brand. If we allow the OpenStack brand (sometimes we call it the mark) to become weak or diluted then people have little reason to support velocity or culture.
An important insight when looking at these three control levers is that they work very differently for individuals and corporations. While individuals may be highly motivated by culture, they are not as motivated by brand; conversely, corporations are highly motivated by brand and compliance and minimally by culture.
As the OpenStack Foundation Board takes up the “what is core” question, we must be cognizant of the duality between individual and corporate interests. OpenStack must be both meaningful culturally to individuals and strong brand-wise to corporations. Both are needed to sustain OpenStack’s velocity.
I’ve been leading an effort with Alan Clark to define “what is OpenStack core” for the Foundation Board. Now that I am sitting here at OSCON and celebrating OpenStack’s third birthday, I think it’s a great time to bring the general community into the discussion.
I could not be happier with the results Crowbar collaborators and my team at Dell achieved around the 1st Crowbar design summit. We had great discussions and even better participation.
The attendees represented major operating system vendors, configuration management companies, OpenStack hosting companies, OpenStack cloud software providers, OpenStack consultants, OpenStack private cloud users, and (of course) a major infrastructure provider. That’s a very complete cross-section of the cloud community.
I knew from the start that we had too little time and, thankfully, people were tolerant of my need to stop the discussions. In the end, we were able to cover all the planned topics. This was important because all these features are interlocked so discussions were iterative. I was impressed with the level of knowledge at the table and it drove deep discussion. Even so, there are still parts of Crowbar that are confusing (networking, late binding, orchestration, chef coupling) even to collaborators.
In typing up these notes, it becomes even more blindingly obvious that the core features for Crowbar 2 are highly interconnected. That’s no surprise technically; however, it will make the notes harder to follow because of knowledge bootstrapping. You need to take time, grok the gestalt and surf the zeitgeist.
Collaboration Invitation: I wanted to remind readers that this summit was just the kick-off for a series of open weekly design (Tuesdays 10am CDT) and coordination (Thursdays 8am CDT) meetings. Everyone is welcome to join in those meetings – information is posted, recorded, folded, spindled and mutilated on the Crowbar 2 wiki page.
These notes are my reflection of the online etherpad notes that were made live during the meeting. I’ve grouped them by design topic.
We are refactoring Crowbar at this time because we have a collection of interconnected features that could not be decoupled
Some items (Database use, Rails3, documentation, process) are not for debate. They are core needs but require little design.
There are 5 key topics for the refactor: online mode, networking flexibility, OpenStack pull from source, heterogeneous/multi operating systems, and being CMDB agnostic.
Due to time limits, we have to stop discussions and continue them online.
We are hoping to align the Crowbar 2 beta with the OpenStack Folsom release.
Online / Connected Mode
Online mode is more than simply internet connectivity. It is the foundation of how Crowbar stages dependencies and components for deploy. It’s required for heterogeneous O/S, pull from source and it has dependencies on how we model networking so nodes can access resources.
We are considering caching proxies to stage resources. This would allow isolated production environments and preserve the ability to run everything from the ISO without a connection (that is still a key requirement for us); a minimal proxy sketch follows these notes.
SuSE’s Crowbar fork does not build an ISO; instead it relies on RPM packages for barclamps and their dependencies.
Pulling packages directly from the Internet has proven to be unreliable, so this method cannot rely on that alone.
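As a rough illustration of the caching-proxy idea (not a committed design), a barclamp recipe could simply point each node’s package manager at a proxy running on the admin node. The attribute name and addresses below are hypothetical, not Crowbar’s real schema.

```ruby
# recipes/apt_proxy.rb -- point apt at a caching proxy on the admin node (sketch)
# The 'crowbar'/'web_proxy' attribute and the fallback address are illustrative only.
proxy_url = (node['crowbar'] && node['crowbar']['web_proxy']) || 'http://192.168.124.10:8123'

file '/etc/apt/apt.conf.d/01proxy' do
  content %Q(Acquire::http::Proxy "#{proxy_url}";\n)
  mode '0644'
end
```

The proxy caches whatever the first node pulls, so later nodes converge from the local cache even when the site is otherwise isolated.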
Install From Source
This feature is mainly focused on OpenStack, but it could be applied more generally. The principles that we are looking at could be applied to any application where the source code is changing quickly (all of them?!). Hadoop is an obvious second candidate.
We spent some time reviewing the use-cases for this feature. While this appears to be very dev and pre-release focused, there are important applications for production. Specifically, we expect that scale customers will need to run ahead of or slightly adjacent to trunk due to patches or proprietary code. In both cases, it is important that users can deploy from their repository.
We discussed briefly our objective to pull configuration from upstream (not just OpenStack, but potentially any common cookbooks/modules). This topic is central to the CMDB agnostic discussion below.
The overall sentiment is that this could be a very powerful capability if we can manage to make it work. There is a substantial challenge in tracking dependencies – current RPMs and Debs do a good job of this and other configuration steps beyond just the bits. Replicating that functionality is the real obstacle.
CMDB agnostic (decoupling Chef)
This feature is confusing because we are not eliminating the need for a configuration management database (CMDB) tool like Chef; instead, we are decoupling Crowbar from a single CMDB to a pluggable model using an abstraction layer.
It was stressed that Crowbar does orchestration – we do not rely on convergence over multiple passes to get the configuration correct.
We had strong agreement that the modules should not be tightly coupled but did need a consistent way (API? Consistent namespace? Pixie dust?) to share data between each other. Our priority is to maintain loose coupling and follow integration by convention and best practices rather than rigid structures.
The abstraction layer needs to have both import and export functions (see the sketch after these notes).
Crowbar will use attribute injection so that Cookbooks can leverage Crowbar but will not require Crowbar to operate. Crowbar’s database will provide the links between the nodes instead of having to wedge it into the CMDB.
In 1.x, the networking was the most coupled into Chef. This is a major part of the refactor and modeling for Crowbar’s database.
There are a lot of notes captured about this on the etherpad – I recommend reviewing them
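For readers trying to picture the abstraction layer, here is one hypothetical shape it could take; the class and method names are mine, not a committed Crowbar API. The key property is that barclamps would talk to the adapter’s import/export surface rather than to Chef directly.

```ruby
# Hypothetical CMDB adapter interface (names are illustrative only).
class CmdbAdapter
  # Export: push Crowbar-managed attributes into the CMDB before a run.
  def export(node_name, attributes)
    raise NotImplementedError
  end

  # Import: pull discovered or converged data back out after a run.
  def import(node_name)
    raise NotImplementedError
  end
end

# One possible plug-in: back the adapter with a Chef server.
class ChefAdapter < CmdbAdapter
  def export(node_name, attributes)
    # e.g. merge the attributes onto the Chef node object and save it
  end

  def import(node_name)
    # e.g. return ohai facts and run results read back from the Chef server
  end
end
```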
Heterogeneous OS (bare metal provisioning and beyond)
This topic was the most divergent of all our topics because most of the participants were using some variant of their own bare metal provisioning project (check the etherpad for the list).
Since we can’t pack an unlimited set of stuff on the ISO, this feature requires online mode.
Most of these projects do nothing beyond OS provisioning; however, their simplicity is beneficial. Crowbar needs to consider users who just want a streamlined OS provisioning experience.
The response to Crowbar has been exciting and humbling. I most appreciate those who looked at Crowbar and saw more than a bare metal installer. They are the ones who recognized that we are trying to solve a bigger problem: it has been too difficult to cope with change in IT operations.
During this year, we have made many changes. Many have been driven by customer, user and partner feedback while others support Dell product delivery needs. Happily, these inputs are well aligned in intent if not always in timing.
Introduction of barclamps as modular components
Expansion into multiple applications (most notably OpenStack and Apache Hadoop)
Working in the open (with public commits)
Collaborative License Agreements
Dell’s understanding of open source and open development has undergone a similar transformation. Crowbar was originally open sourced under Apache 2 because we imagined it becoming part of the OpenStack project. While that ambition has faded, the practical benefits of open collaboration have proven to be substantial.
The results from this first year are compelling:
For OpenStack Diablo, coordination with the Rackspace Cloud Builder team enabled Crowbar to include the Keystone and Dashboard projects into Dell’s solution
We’ve amassed hundreds of mail subscribers and Github followers
Support for multiple releases of RHEL, CentOS & Ubuntu, including Ubuntu 12.04 while it was still in beta.
SuSE does their own port of Crowbar to SuSE with important advances in Crowbar’s install model (from ISO to package).
We stand on the edge of many exciting transformations for Crowbar’s second year. Based on the amount of change from this year, I’m hesitant to make long term predictions. Yet, just within the next few months there are significant plans based on the Crowbar 2.0 refactor. We have line of sight to changes that expand our tool choices, improve networking, add operating systems and make us even more production-ops capable.
Today I presented about how Crowbar + DevOps + OpenStack = CloudOps. The highlight of the presentation (to me, anyway) is the Images vs Layers analogy of Soup vs Sandwiches. I hope it helps explain why we believe that a DevOps approach to Cloud is essential to success.
I’m not usually a big fan of launch videos (too much markitecture); however, these turned out to be nice and meaty. The meaty part explains why it looks like I’m about to eat a big sandwich in the last video. yum!
Crowbar started as a Dell OpenStack installer project and then grew beyond that in scope. Now it can be extended to work with other vendors’ kits and other solutions’ bits.
We are contributing Crowbar to the community because we believe that everyone benefits by sharing in the operational practices that Crowbar embodies. These are rooted in Opscode Chef (which Crowbar tightly integrates with) and the cloud- and hyper-scale-proven DevOps practices that are reflected in our deployment model.
Build scripts so you can create your own Crowbar install ISO
Switch discovery so you can create Chef Cookbooks that are network aware.
Open source Chef server that powers much of Crowbar’s functionality
What’s not included?
Non-open source license components (BIOS+RAID config) that we could not distribute under the Apache 2 license. We are working to address this and include them in our release. They are available in the Dell Licensed version of Crowbar.
Dell Branded Components (skin + overview page). Crowbar has an OpenSource skin with identical functionality.
Pre-built ISOs with install images (you must download the open source components yourself, we cannot redistribute them to you as a package)
Crowbar uses Chef Server as its database and relies on cookbooks for node deployments. It is installed (using Chef Solo) automatically as part of the Crowbar install.
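For context, a Chef Solo bootstrap is normally driven by a small Ruby configuration file plus a JSON run list; the paths below are an illustrative sketch, not the actual Crowbar install script.

```ruby
# solo.rb -- a minimal Chef Solo configuration (illustrative sketch)
cookbook_path   ['/opt/bootstrap/cookbooks']
json_attribs    '/opt/bootstrap/node.json'   # e.g. {"run_list":["recipe[chef-server]"]}
file_cache_path '/var/chef/cache'

# Converging the admin node is then a single command, with no pre-existing
# Chef server required:
#   chef-solo -c solo.rb
```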
Wednesday, July 27, 7-9 pm, at Spirit of 77 (right across from the Oregon Convention Center at the close of the day). Join us to toast the first anniversary of the fastest-growing open source project! Please register here and help promote the event: http://openstack-one-year.eventbrite.com