“Double wide” is not a term I’ve commonly applied to servers, but that’s one of the cool things about this new class of servers that Dell, my employer, started shipping today.
My team has been itching for the chance to start cloud and big data reference architectures using this super dense and flexible chassis. You’ll see it included in our next Apache Hadoop release and we’ve already got customers who are making it the foundation of their deployments (Texas Adv Computing Center case study).
If you’re tracking the latest big data & cloud hardware then the Dell PowerEdge C8000 is worth some investigation.
Basically, the Dell C8000 is a chassis that holds a flexible configuration of compute or storage sleds. It’s not a blade frame because the sleds minimize shared infrastructure. In our experience, cloud customers like the dedicated i/o and independence of sleds (as per the Bootstrapping clouds white paper). Those attributes are especially well suited for Hadoop and OpenStack because they support a “flat edges” and scale out design. While i/o independence is valued, we also want shared power infrastructure and density for efficiency reasons. Using a chassis design seems to capture the best of both worlds.
The novelty for the Dell PowerEdge C8000 is that the chassis are scary flexible. You are not locked into a pre-loaded server mix.
There are a plethora of sled choices so that you can mix choices for power, compute density and spindle counts. That includes double-wide sleds positively brimming with drives and expanded GPU processers. Drive density is important for big data configurations that are disk i/o hungry; however, our experience is the customer deployments are highly varied based on the planned workload. There are also significant big data trends towards compute, network, and balanced hardware configurations. Using the C8000 as a foundation is powerful because it can cater to all of these use-case mixes.
Last week, my team at Dell led a world-wide OpenStack Essex Deploy event. Kamesh Pemmaraju, our OpenStack-powered solution product manager, did a great summary of the event results (200+ attendees!). What started as a hack-a-thon for deploy scripts morphed into a stunning 14+ hour event with rotating intro content and an ecosystem showcase (videos). Special kudos to Kamesh, Andi Abes, Judd Maltin, Randy Perryman & Mike Pittaro for leadership at our regional sites.
Clearly, OpenStack is attracting a lot of interest. We’ve been investing time in content to help people who are curious about OpenStack to get started.
On that measure, we have room for improvement. We had some great discussions about how to handle upgrades and market drivers for OpenStack; however, we did not spend the time improving Essex deployments that I was hoping to achieve. I know it’s possible – I’ve talked with developers in the Crowbar community who want this.
If you wanted more expert interaction, here are some of my thoughts for future events.
Expert track did not get to deploy coding. I think that we need to simply focus more even tightly on to Crowbar deployments. That means having a Crowbar Hack with an OpenStack focus instead of vice versa.
Efforts to serve OpenStack n00bs did not protect time for experts. If we offer expert sessions then we won’t try to have parallel intro sessions. We’ll simply have to direct novices to the homework pages and videos.
Combining on-site and on-line is too confusing. As much as I enjoy meeting people face-to-face, I think we’d have a more skilled audience if we kept it online only.
Connectivity! Dropped connections, sigh.
Better planning for videos (not by the presenters) to make sure that we have good results on the expert track.
This event was too long. It’s just not practical to serve Europe, US and Asia in a single event. I think that 2-3 hours is a much more practical maximum. 10-12am Eastern or 6-8pm Pacific would be much more manageable.
Do you have other comments and suggestions? Please let me know!
Matt Ray from Opscode presented some of the work with Chef and OpenStack. He talked about the three main chef repos floating around. He called out Anso’s original cookbook set that is the basis for the Crowbar cookbooks (his second set), and his final set is the emerging set of cookbooks in OpenStack proper. The third one is interesting and what he plans to continue working on to make into his public openstack cookbooks. These are an amalgamation of smokestack, RCB, Anso improvements, and his (Crowbar’s).
He then demoed his knife plugin (slideshare) to build openstack virtual servers using the Openstack API. This is nice and works against TryStack.org (previously “Free Cloud”) and RCB’s demo cloud. All of that is on his github repo with instructions how to build and use. Matt and I talked about trying to get that into our Crowbar distro.
There were some questions about flow and choice of OpenStack API versus Amazon EC2 API because there was already an EC2 knife set of plugins.
Ziad Sawalha talks about Keystone
Ziad Sawalha is the PLT (Project Technical Lead) for Keystone. He works for Rackspace out of San Antonio. He drove up for the meeting.
He split his talk into two pieces, Incubation Process and Keystone Overview. He asked who was interested in what and focused his talk more towards overview than incubation.
Some key take-aways:
Keystone comes from Rackspace’s strong, flexible, and scalable API. It started as a known quantity from his perspective.
Community trusted nothing his team produced from an API perspective
Community is python or nothing
His team was ignored until they had a python prototype implementing the API
At this point, comments on API came in.
Churn in API caused problems with implementation and expectations around the close of Diablo.
Because comments were late, changes occurred.
Official implementation lagged and stalled into arriving.
API has been stable since Diablo final, but code is changing. that is good and shows strength of API.
Side note from Greg, Keystone represents to me the power of API over Code. You can have innovation around the implementation as long all the implementations have a fair ground work to plan under which is an API specification. The replacement of Keystone with the Keystone Light code base is an example of this. The only reason this is possible is that the API was sound and documented. (Rob’s post on this)
Ziad spent the rest of his time talking about the work flow of Keystone and the API points. He covered the API points.
Client to Keystone, Keystone to Client for initial auth token
Client to Middleware API for the services to have a front.
Middleware to Keystone to verify and establish identity.
Middleware to Service to pass identity
Not many details other then flow and flexibility. He stressed the API design separated protocol from actions and data at all the layers. This allows for future variations and innovations while maintaining the APIs.
Ziad talked about the state of Essex.
RBAC (aka Role Based Access Control)
Code replacement Keystone Light
Federation was planned but will most likely be pushed to G
Federation is the ability for multiple independent Keystones to operate (bursting use case)
Dependent upon two other federation components (networking and billing/metering)
I had a question about moving barclamps between solutions. Since Victor just changed the barclamp build to create a tar for each barclamp (with the debs/rpms), I thought it was the perfect time to explain the new feature.
You can find the barclamps on the Crowbar ISO under “/dell/barclamps” and you can install the TAR onto a Crowbar system using “./barclamp_install foo.tar.gz” where foo is the name of your barclamp.
Here’s a video of how to find and install barclamp tars:
Note: while you can install OpenStack into a Hadoop system, that combination is NOT tested. We only test OpenStack on Ubuntu 10.10 and Hadoop on RHEL 5.7. Community help in expanding support is always welcome!
This release raises the bar on open Hadoop deployments by making them faster, scalable, more integrated and repeatable.
These barclamps were developed in conjunction with our licensed Dell | Cloudera Solution. The licensed solution is for customers seeking large scale and professionally supported big data solutions. The purpose of the open barclamps (which pull the open source parts from the Cloudera distro) is to help you get started with Hadoop and reduce your learning curve. Our team invested significant testing effort in ensuring that these barclamps work smoothly because they are the foundational layer of our for-pay Hadoop solution.
Included in the Hadoop barclamp suite are Hadoop Map Reduce, Hive, Pig, ZooKeeper and Sqoop running on RHEL 5.7. These barclamps cover the core parts of the Hadoop suite. Like other Crowbar deployments (see OpenStack), the barclamps automatically discover the service configurations and interoperate. One of our team members (call him Scott Jensen) said it very simply “I can deploy a fully an integrated Hadoop cluster in a few hours. That friggin’ rocks!” I just can’t put it more eloquently than that!
I’ll post again when we flip the “open” bit and invite our community to dig in and help us continue to set the standards on open Hadoop deployments.
Since some of you cannot make it to the show and see the demo in person, we’ve captured it as a video for your enjoyment. The OpenStack deployment is available in our open source distribution. We are currently in QA for the overall solution so expect additional refinement as we progress towards our next OpenStack solution release.
REMINDER: Dell Hardware is NOT required to use Crowbar for OpenStack. The open source version has everything you need – the BIOS and RAID barclamps are optional (but handy).