Yes, you can now build the open version of Crowbar, and it includes the code to configure bare metal servers.
Let me be very specific about this… my team at Dell tests Crowbar on a limited set of hardware configurations. Specifically, Dell server models R720 + R720XD (using WSMAN and iDRAC) and C6220 + C8000 (using open tools). Even on those servers, we have a limited RAID and NIC matrix; consequently, we are not positioned to duplicate other field configurations in our lab. So, while we’re excited to work with the community, caveat emptor applies to open source.
Another thing about RAID and BIOS is that it’s REALLY HARD to get right. I know this because our team spends a lot of time testing and tweaking these, now open, parts of Crowbar. I’ve learned that doing hard things creates value; however, it also means that contributors to these barclamps need to be prepared to get some silicon under their fingernails.
I’m proud that we’ve reached this critical milestone and I hope that it encourages you to play along.
PS: It’s worth noting that community activity on Crowbar has really increased. I’m excited to see all the energy.
Scale-out platforms like Hadoop have different operating rules. I heard an interesting story today in which the performance of the overall system improved threefold (a run went from 15 minutes down to 5) after the removal of a single node.
In a distributed system that coordinates work between multiple nodes, it only takes one bad node to dramatically degrade the performance of the entire system.
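To see why, consider a toy sketch (hypothetical numbers, not from the story above): if every stage of a job waits on the slowest node, the total run time is the maximum across nodes, so a single straggler sets the pace for the whole cluster.

```python
# Toy numbers: four healthy nodes and one straggler, in a job that
# cannot finish until every node finishes its share of the work.
task_minutes = [5, 5, 5, 5, 15]

with_straggler = max(task_minutes)          # the slowest node gates the run: 15 min
without_straggler = max(task_minutes[:-1])  # remove the bad node and the run takes 5 min

print(f"run with straggler: {with_straggler} minutes")
print(f"run without straggler: {without_straggler} minutes")
```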
Finding and correcting this type of failure can be difficult. While natural variability, hardware faults or bugs cause some issues, the human element is by far the most likely cause. If you can turn down the noise injected by human error, then you’ve got a chance to find the real, system-related issues.
Consequently, I’ve found that management tooling and automation are essential for success. Management tools help diagnose the cause of an issue, and automation creates repeatable configurations that reduce the risk of human-injected variability.
I’d also like to give a shout-out to benchmarks as part of your tooling suite. Without a reasonable benchmark, it is impossible to know whether your changes actually improved performance.
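As a minimal sketch of what that can look like in practice (run_job here is a hypothetical stand-in for whatever workload you are measuring):

```python
import statistics
import time

def run_job():
    # Hypothetical stand-in for the real workload (e.g., kicking off a
    # Hadoop job and waiting for it to complete).
    time.sleep(0.1)

def benchmark(runs=5):
    """Time the job several times and report the median, which is less
    sensitive to one-off outliers than the mean."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        run_job()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

print(f"median run time: {benchmark():.3f}s")
```

The point is repeatability: the same harness, run before and after a change, is what turns “it feels faster” into a defensible claim.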
Teaming Related Post Script: In considering the concept of system performance, I realized that distributed human systems (aka teams) have a very similar characteristic. A single person can have a disproportionate impact on overall team performance.
We’re just sprints from release; consequently, it’s time for the Crowbar/OpenStack community to come and play! You can learn Grizzly and help tune the open source Ops scripts.
While the Crowbar team has been generating a lot of noise around our Crowbar 2.0 work, we have not neglected progress on OpenStack Grizzly. We’ve been building Grizzly deploys on the 1.x code base using pull-from-source to ensure that we’d be ready for the release. For continuity, these same cookbooks will be the foundation of our CB2 deployment.
Features of Crowbar’s OpenStack Grizzly Deployments
We’ve had Nova Compute, Glance Image, Keystone Identity, Horizon Dashboard, Swift Object and Tempest for a long time. Those, of course, have been updated to Grizzly.
The Swift Object barclamp made a lot of progress in Folsom that carries forward to Grizzly
Apache Web Service
“Under the covers” improvements for Crowbar 1.x
Substantial improvements in how we configure host networking
Numerous bug fixes and tweaks
Pull from Source via the Git barclamp
Grizzly branch was switched to use Ubuntu & SUSE packages
We’ve made substantial progress, but there are still gaps. We do not have upgrade paths from Essex or Folsom. While we’ve been adding fault-tolerance features, full automatic HA deployments are not included.
Today my boss at Dell, John Igoe, is part of announcing the report from the TechAmerica Federal Big Data Commission (direct pdf). I was fully expecting the report to be a real snoozer brimming with corporate synergies and win-win externalities. Instead, I found myself reading a practical guide to applying Big Data to government. Flipping past the short obligatory “what is…” section, the report drives right into a survey of practical applications for big data spanning nearly every governmental service. Over half of the report is dedicated to case studies with specific recommendations and buying criteria.
Ultimately, the report calls for agencies to treat data as an asset. An asset that can improve how government operates.
There are a few items that stand out in this report:
Clear tables of case studies on page 16 and characteristics on page 11 that help pinpoint a path through the options.
Definitive advice to focus on a single data vector (velocity, volume or variety) for initial success on page 28 (and elsewhere)
I strongly agree with one repeated point in the report: although there is more data available, our ability to comprehend this data is reduced. The sheer volume of examples the report cites is proof enough that agencies are, and will continue to be, inundated with data.
One shortcoming of this report is that it does not flag the extreme shortage of data scientists. Many of the cases discussed assume a ready army of engineers to implement these solutions; however, I’m uncertain how the government will fill those positions in a very tight labor market. Ultimately, I think we will have to simply open the data for citizen & non-governmental analysis because, as the report clearly states, data is growing faster than our capability to use it.
I commend the TechAmerica commission for their Big Data clarity: success comes from starting with a narrow scope. So the answer, ironically, is in knowing which questions we want to ask.
“Double wide” is not a term I’ve commonly applied to servers, but that’s one of the cool things about this new class of servers that Dell, my employer, started shipping today.
My team has been itching for the chance to build cloud and big data reference architectures on this super dense and flexible chassis. You’ll see it included in our next Apache Hadoop release, and we’ve already got customers who are making it the foundation of their deployments (Texas Advanced Computing Center case study).
If you’re tracking the latest big data & cloud hardware then the Dell PowerEdge C8000 is worth some investigation.
Basically, the Dell C8000 is a chassis that holds a flexible configuration of compute or storage sleds. It’s not a blade frame because the sleds minimize shared infrastructure. In our experience, cloud customers like the dedicated i/o and independence of sleds (as per the Bootstrapping clouds white paper). Those attributes are especially well suited for Hadoop and OpenStack because they support a “flat edge,” scale-out design. While i/o independence is valued, we also want shared power infrastructure and density for efficiency reasons. A chassis design seems to capture the best of both worlds.
The novelty for the Dell PowerEdge C8000 is that the chassis are scary flexible. You are not locked into a pre-loaded server mix.
There are a plethora of sled choices, so you can mix options for power, compute density and spindle count. That includes double-wide sleds positively brimming with drives and expanded GPU processors. Drive density is important for big data configurations that are disk-i/o hungry; however, our experience is that customer deployments vary widely based on the planned workload. There are also significant big data trends toward compute-heavy, network-heavy and balanced hardware configurations. Using the C8000 as a foundation is powerful because it can cater to all of these use-case mixes.
Not only are we simultaneously releasing both of these solutions, they reflect a significant acceleration in pace of delivery. Both solutions had beta support for their core technologies (Cloudera 4 & OpenStack Essex) when the components were released and we have dramatically reduced the lag from component RC to solution release compared to past (3.7 & Diablo) milestones.
As before, the core deployment logic of these open source based solutions was developed in the open on Crowbar’s github. You are invited to download and try these solutions yourself. For Dell solutions, we include validated reference architectures, hardware configuration extensions for Crowbar, services and support.
The latest versions of Hadoop and OpenStack represent great strides for both solutions. It’s great to have made them more deployable and faster to evaluate and manage.
The response to Crowbar has been exciting and humbling. I most appreciate those who looked at Crowbar and saw more than a bare metal installer. They are the ones who recognized that we are trying to solve a bigger problem: it has been too difficult to cope with change in IT operations.
During this year, we have made many changes. Many have been driven by customer, user and partner feedback while others support Dell product delivery needs. Happily, these inputs are well aligned in intent if not always in timing. The changes included:
Introduction of barclamps as modular components
Expansion into multiple applications (most notably OpenStack and Apache Hadoop)
Working in the open (with public commits)
Collaborative License Agreements
Dell’s understanding of open source and open development has made a similar transformation. Crowbar was originally Apache 2 open sourced because we imagined it becoming part of the OpenStack project. While that ambition has faded, the practical benefits of open collaboration have proven to be substantial.
The results from this first year are compelling:
For OpenStack Diablo, coordination with the Rackspace Cloud Builder team enabled Crowbar to include the Keystone and Dashboard projects into Dell’s solution
We’ve amassed hundreds of mail subscribers and Github followers
Support for multiple releases of RHEL, CentOS & Ubuntu, including Ubuntu 12.04 while it was still in beta.
SUSE did its own port of Crowbar to SUSE Linux, with important advances in Crowbar’s install model (from ISO to package).
We stand on the edge of many exciting transformations for Crowbar’s second year. Based on the amount of change this year, I’m hesitant to make long-term predictions. Yet, just within the next few months, there are significant plans based on the Crowbar 2.0 refactor. We have line of sight to changes that expand our tool choices, improve networking, add operating systems and become even more production-ops capable.
With the GA drop, the Crowbar Cloudera Barclamps are effectively at release candidate state (ISO). The Cloudera Barclamps include a freemium version of Cloudera Enterprise 4 that supports up to 50 nodes.
We are planning many improvements to our OpenStack Essex and Crowbar Framework; however, most deployments can proceed without these enhancements. This also enables participants in the 5/31 OpenStack Essex Deploy Day.
By releasing a core stable Essex reference deployment, we are accelerating field deployments and enabling the OpenStack ecosystem. In terms of previous posts, we are eliminating release interlocks to enable more downstream development. Ultimately, we hope that we are also creating a baseline OpenStack deployment.
We are also reducing the pressure to rush more disruptive Crowbar changes (like enabling high availability, adding multiple operating systems, moving to Rails 3, fewer crowbarisms in cookbooks and streamlining networking). With this foundational Essex release behind us (we call it an MVP), we can work on more depth and breadth of capability in OpenStack.
One small challenge: some of the changes that we’d expected to drop have been postponed slightly. Specifically, markdown-based documentation (/docs) and some new UI pages (/network/nodes, /nodes/families). All are already in the product but not wired into the default UI (basically, a split test).
On the bright side, we did manage to expose 10g networking awareness for barclamps; however, we have not yet refactored the barclamps to leverage the change.
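To make “networking awareness” concrete, here is a rough sketch of the idea (hypothetical structure and names, not Crowbar’s actual schema): a barclamp asks for an interface by speed class instead of hard-coding a device name.

```python
# Hypothetical illustration only -- this is NOT Crowbar's actual schema,
# just the shape of the idea. A node advertises its interfaces with their
# link speeds, and a barclamp requests a speed class instead of
# hard-coding a name like eth0.
node_interfaces = {
    "eth0": "1g",
    "eth2": "10g",
    "eth3": "10g",
}

def pick_interface(interfaces, speed):
    """Return the first interface that matches the requested speed class."""
    for name, link_speed in interfaces.items():
        if link_speed == speed:
            return name
    raise LookupError(f"no {speed} interface available")

print(pick_interface(node_interfaces, "10g"))  # a data-hungry service gets eth2
print(pick_interface(node_interfaces, "1g"))   # the admin network stays on eth0
```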
My team at Dell has been driving to transparency and openness around Crowbar plus our OpenStack and Hadoop powered solutions. Specifically, our work for our coming release is maintained in the open on the Dell CloudEdge Github site. You can see (and participate in!) our development and validation work in advance of our official release.
Both the new and original Hadoop barclamps use the Cloudera Hadoop distribution (aka CDH); however, the new barclamp is able to leverage Cloudera’s latest management capabilities. For the Dell solution, Cloudera Manager has always been part of the offering. The primary difference is that we are improving the level of integration. I promise to post more about the features of the solution as we get closer to release.