We had a great discussion about OpenStack, Ops, and Crowbar. I appreciated Niki's insightful questions and the chance to share my opinions; we covered years of material in just one hour, and I'm grateful for the opportunity to appear on the podcast.
51:00 We should be encouraging people to use OpenStack for its use cases
51:30 Existential question for OpenStack: are we a suite or a product? The community is split here.
51:30 In comparing with Amazon, does OpenStack have to implement everything itself, or build an ecosystem to compete?
53:00 As soon as you make something THE OpenStack project (like Heat), you are sending a message that the alternatives are not welcome.
54:30 OpenStack ends up in a trap if we pick a single project and make it the way that we are going to do something. New implementations are going to surface from WITHIN the projects, and we need to be ready for that.
55:15 New implementations are coming, and we have to be ready for them. We make ourselves vulnerable to splitting if we do not prepare.
56:00 API vs. implementation? This is something that splits the community. Ultimately we need to be an API spec, but we are not ready for that. We have a lot of work to do first using the same code base.
56:50 DefCore has taken a balanced approach using our diversity as a strength
57:20 The bylaws did not allow enough flexibility in defining what is core.
To get the meeting started, Marc Padovani from HP (this month's sponsor) provided some lessons learned from the HP OpenStack-Powered Cloud. While Marc noted that HP has not been able to share much of their development work on OpenStack, he was able to show performance metrics relating to a fix that HP contributed back to the OpenStack community. The defect related to the scheduler's ability to handle load. The pre-fix data showed a climb and then a gap where the scheduler simply stopped responding. Post-fix, the performance curve is flat without any "dead zones." (Sharing data like this is what I call "open operations.")
The meat of the meetup was a freeform discussion about what the group would like to see discussed at the Design Summit. My objective for the discussion was that the Austin OpenStack community could have a broader voice if we showed consensus for certain topics in advance of the meeting.
At Jim Plamondon's suggestion, we captured our brainstorming on the OpenStack etherpad. The Etherpad is super cool – it allows simultaneous editing by multiple parties, so the notes below were crowdsourced during the meeting as we discussed topics that we'd like to see highlighted at the conference. The etherpad preserves editor attributions, but I removed the highlights for clarity.
Imagine the late end-game: can Azure/VMWare adopt OpenStack's APIs and data formats to deliver interop without running OpenStack's code? Is this good? Are there conversations on displacing incumbents and spurring new adoption?
Dev docs vs user docs
Update lag and fragmentation (10 blogs, 10 different methods, 2 "work")
A per-release getting-started guide, validated and available at or before release.
Error messages and codes vs. Python stack traces
Alternatively put, "how can we make error messages more ops-friendly without making them less developer-friendly?" (See the error-handling sketch after this list.)
Upgrades and operations: rolling updates and upgrades? Hot migrations?
If OpenStack were installable on Windows/Hyper-V as a simple MSI/service installer, would you try it as a node?
Is Nova too big? How does it get fixed?
break it into smaller sub-projects?
shorter release cycles?
volume split out?
volume expansion of backend storage systems
Is nova-volume the canonical control plane for storage provisioning? Regardless of transport? It presently deals in block devices only… is the following blueprint correctly targeted to nova-volume?
What is a contribution that warrants an invitation?
Look at Launchpad's karma system, which confers karma for many different "contributory" acts, including bug fixes and doc fixes, in addition to code commits.
Is there a time for an operations summit?
How about an operators’ track?
Just a note: forums.openstack.org for users/operators to drive/show need and participation.
How can we capture the implicit knowledge (of mailing list and IRC content) in explicit content (documentation, forums, wiki, stackexchange, etc.)?
Hypervisors: room for discussion?
Do we want hypervisor feature parity?
From the cloud-app developer's perspective, I want to "write once, run anywhere," and hypervisor feature differences can preclude that (by producing incompatible VM images, for example).
(RobH: But “write once, run anywhere” [WORA] didn’t work for Java, right?)
(JimP: Yeah, but I was one of Microsoft's anti-Java evangelists, when we were actively preventing it from working — so I know the dirty tricks vendors can use to hurt WORA in OpenStack, and how to prevent those tricks from working.)
The Swift API is an evolving de facto open alternative to S3… CDMI is on the SNIA standards track. Should the Swift API become CDMI-compliant? Should CDMI exist as a shim… à la the S3 stuff? (A sketch of the shim idea follows this list.)
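To make the shim idea concrete: here is a minimal sketch of a CDMI-flavored container read layered over Swift as WSGI middleware, the way the S3 compatibility layer is done. This is illustration only; the header check and JSON field names are my assumptions, not a faithful SNIA CDMI mapping.

    # Hypothetical sketch of a CDMI shim as Swift WSGI middleware.
    # The CDMI field names below are illustrative, not a faithful mapping.
    import json

    class CDMIShim(object):
        """Answer CDMI-style container reads by translating to Swift GETs."""

        def __init__(self, app):
            self.app = app  # the downstream Swift proxy app

        def __call__(self, environ, start_response):
            accept = environ.get('HTTP_ACCEPT', '')
            if (environ['REQUEST_METHOD'] != 'GET'
                    or 'cdmi-container' not in accept):
                return self.app(environ, start_response)  # pass through

            # Ask Swift for a plain JSON listing of the same container.
            environ['QUERY_STRING'] = 'format=json'
            captured = {}

            def capture(status, headers, exc_info=None):
                captured['status'] = status
                captured['headers'] = headers

            body = b''.join(self.app(environ, capture))
            if not captured['status'].startswith('2'):
                start_response(captured['status'], captured['headers'])
                return [body]

            # Re-wrap the Swift listing in a CDMI-flavored envelope.
            objects = json.loads(body)
            cdmi = {'objectType': 'application/cdmi-container',
                    'objectName': environ['PATH_INFO'].rsplit('/', 1)[-1],
                    'children': [obj['name'] for obj in objects]}
            payload = json.dumps(cdmi).encode('utf-8')
            start_response('200 OK',
                           [('Content-Type', 'application/cdmi-container'),
                            ('Content-Length', str(len(payload)))])
            return [payload]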
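And returning to the error-message item above: a minimal sketch of one way to be ops-friendly without being less developer-friendly, assuming a made-up error-code scheme. The full Python traceback goes to the debug log for developers, while operators get a short, stable, searchable code.

    # Illustrative sketch: stable, ops-facing error codes wrapping the
    # developer-facing tracebacks. The code values here are made up.
    import logging

    LOG = logging.getLogger('nova.example')

    class OpsError(Exception):
        """An error with a stable code operators can search and alert on."""

        def __init__(self, code, summary):
            self.code = code
            self.summary = summary
            super(OpsError, self).__init__('%s: %s' % (code, summary))

    def attach_volume(instance_id, volume_id):
        try:
            raise IOError('iSCSI target unreachable')  # stand-in for real work
        except Exception:
            # The full stack trace goes to the debug log for developers...
            LOG.exception('attach_volume failed for %s/%s',
                          instance_id, volume_id)
            # ...while operators get a short, stable, searchable code.
            raise OpsError('VOL-1042', 'could not attach volume %s' % volume_id)

    if __name__ == '__main__':
        logging.basicConfig(level=logging.DEBUG)
        try:
            attach_volume('i-12345', 'vol-67890')
        except OpsError as err:
            print('operator-facing: %s' % err)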
Tomorrow (3/1), numerous sites are gathering around a World Wide Essex Hack Day. If you want to participate or even host a hack venue, get on the list and IRC channel (details).
My team at Dell is organizing a community follow-up, an OpenStack Essex Install Day, next week (3/8) in both Austin and Boston. Just like the Hack Day, the install fest will focus on Essex release code with both online and local presence. Unlike the Hack Day, our focus will be on deployments. For the Dell team, that means working on the Essex deployment for Crowbar. We're still working on a schedule and partner list, so stay tuned. I'm trying to webcast Crowbar & OpenStack training sessions during the install day.
I was very impressed by the quality of discussion at the Deployment topic meeting for Austin OpenStack Meetup (#OSATX). Of the 45ish people attending, we had representatives from at least 6 different OpenStack deployments (Dell, HP, AT&T, Rackspace Internal, Rackspace Cloud Builders, Opscode Chef)! Considering the scope of those deployments (several are aiming at 1000+ nodes), that's a truly impressive accomplishment for such a young project.
Even with the depth of the discussion (notes below), we did not go into details on how individual OpenStack components are connected together. The image my team at Dell uses is included below. I also recommend reviewing Rackspace’s published reference architecture.
Our deployment discussion was a round table so it is difficult to link statements back to individuals, but I was able to track companies (mostly).
picked Ubuntu & KVM because they were the most vetted. They are also using Chef for deployment.
running Diablo 2, moving to Diablo Final & a flat network model. The network controller is a bottleneck. Their biggest scale issue is RabbitMQ.
is creating their own Nova Volume plugin for their block storage.
At this point, scale limits are due to simultaneous loading rather than total number of nodes.
The Nova node image cache can get corrupted without any notification or way to force a refresh – this defect is being addressed in Essex.
has set up availability zones as completely independent (500-node) systems. Expecting to converge them in the future.
is using the latest Ubuntu. Always stays current.
using Puppet to set up their cloud.
They are expecting to go live on Essex and are keeping their deployment on the Essex trunk. This is causing some extra work but they expect it to pay back by allowing them to get to production on Essex faster.
Deploying on XenServer
“Devs move fast, Ops not so much.” Trying to not get behind.
Rackspace Cloud Builders (RCB) runs major releases through an automated test suite. The verified releases are published to https://github.com/cloudbuilders (note: Crowbar pulls our OpenStack bits from this repo).
Dell commented that our customers are using Crowbar primarily for pilots – they are learning how to use OpenStack.
Said they have >10 customer deployments pending
AT&T is using the open source version of Crowbar.
Keystone and Dashboard were considered essential additions to Diablo.
KVM is considered the top hypervisor for now.
Libvirt (which drives KVM) also supports LXC, which people found interesting.
XenServer via XAPI is also popular.
Not so much activity on ESX & Hyper-V.
We talked about why some hypervisors are more popular – it’s about the node agent architecture of OpenStack.
NetApp via Nova Volume appears to be a popular block storage choice.
Keystone / Dashboard
Customers want both together
Including keystone/dashboard was considered essential in Diablo. It was part of the reason why Diablo Final was delayed.
HP is not using dashboard
Members of the audience commented that we need to deprecate the EC2 APIs (because it does not help OpenStack long term to maintain EC2 APIs over its own). [1/5 Note: THIS IS NOT OFFICIAL POLICY; it is a reflection of what was discussed]
HP started on EC2 API but is moving to the OpenStack API
Next meeting is Tuesday 1/10 and sponsored by SUSE (note: Tuesday is just for this January). Topic TBD.
We’ve got sponsors for the next SIX meetups! Thanks to Dell (my employer), Rackspace, HP, SUSE, Canonical and PuppetLabs for sponsoring.
We discussed topics for the next meetings (see the post image). We’re going to throw it to a vote for guidance.
The OSATX tag is also being used by Occupy San Antonio. Enjoy the cross-chatter!
This turned out to be a major open cloud gab fest! In addition to the Dell OpenStack leads (Greg and me), we had the Nova Project Technical Lead (PTL), Vish Ishaya (@vish); HP's Cloud Architect, Alex Howells (@nixgeek); and Opscode OpenStack cookbook master Matt Ray (@mattray). We were joined by several other Chef Summit attendees with OpenStack interest, including a pair of engineers from Spain.
We’d planned to demo using Knife-OpenStack against the Crowbar Diablo build. Unfortunately, knife-openstack is out of date (August 15th?!). We need Keystone support. Anyone up for that?
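For context on what Keystone support entails: knife-openstack itself is Ruby, but the handshake it is missing looks roughly like this Python sketch of the Keystone 2.0-style tokens call (the endpoint, tenant, and credentials are placeholders).

    # Hedged sketch of the Keystone (v2.0-era) token handshake a client
    # like knife-openstack needs. Endpoint and credentials are placeholders.
    import requests

    KEYSTONE = 'http://keystone.example.com:5000/v2.0'  # placeholder

    def get_token(username, password, tenant):
        """Exchange credentials for a scoped token plus the service catalog."""
        body = {'auth': {'passwordCredentials': {'username': username,
                                                 'password': password},
                         'tenantName': tenant}}
        resp = requests.post(KEYSTONE + '/tokens', json=body)
        resp.raise_for_status()
        access = resp.json()['access']
        return access['token']['id'], access['serviceCatalog']

    def nova_endpoint(catalog):
        """Pick the compute endpoint out of the service catalog."""
        for service in catalog:
            if service['type'] == 'compute':
                return service['endpoints'][0]['publicURL']
        raise LookupError('no compute endpoint in catalog')

    if __name__ == '__main__':
        token, catalog = get_token('demo', 'secret', 'demo')
        # Every subsequent Nova call carries the token in X-Auth-Token.
        resp = requests.get(nova_endpoint(catalog) + '/servers',
                            headers={'X-Auth-Token': token})
        print(resp.status_code)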
There’s no way I can recapture everything that was said, but here are some highlights I jotted down on the way home.
After the miss with Keystone and the Diablo release, solving the project dependency problem has become a priority. Vish talked at length about the ambiguity challenge of Keystone being both required and incubated. He said we were not formal enough around new projects even though we had dependencies on them. In future releases, new projects (specifically, Quantum) will not be allowed to be dependencies.
The focus for Essex is on quality and stability. The plan is for Essex to be a long-term supported (LTS) release tied to the Ubuntu LTS. That’s putting pressure on all the projects to ensure quality, lock features early, and avoid unproven dependencies.
There is a lot of activity around storage and companies are creating volume plug-ins for Nova. Vish said he knew of at least four.
Networking has a lot of activity. Quantum in particular has momentum, but may not emerge as a core project in time for Essex. There was general agreement that Quantum is “the killer app” for OpenStack and will take cloud to the next level. The Quantum Open vSwitch implementation is completely open source and free. Some other plugins may require proprietary hardware and/or software, but there is definitely a (very) viable and completely open source option for Quantum networking.
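To ground the Open vSwitch point: the fully open source path boils down to wiring each VM's tap device into an OVS integration bridge and labeling the port so the agent can find it later. A rough sketch of that plumbing follows; this is not the actual Quantum agent code, and the bridge and device names are assumptions.

    # Rough sketch (not the actual Quantum agent) of how an Open vSwitch
    # plugin wires a VM's tap device into the integration bridge. The
    # 'br-int' bridge and tap/port names are assumptions for illustration.
    import subprocess

    def ovs_vsctl(*args):
        """Run an ovs-vsctl command, raising if it fails."""
        subprocess.check_call(('ovs-vsctl',) + args)

    def plug_vif(tap_dev, port_id, mac):
        """Attach a tap device to br-int, labeled so the agent can find it."""
        ovs_vsctl('--may-exist', 'add-br', 'br-int')
        ovs_vsctl('--', '--may-exist', 'add-port', 'br-int', tap_dev,
                  '--', 'set', 'Interface', tap_dev,
                  'external-ids:iface-id=%s' % port_id,
                  'external-ids:attached-mac=%s' % mac)

    if __name__ == '__main__':
        # Hypothetical values; on a real node these come from Nova/Quantum.
        plug_vif('tap0', 'a1b2c3d4', 'fa:16:3e:00:00:01')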
HP has some serious cloud mojo going on. Alex talked about defects they have found and submitted fixes back to core. He also hinted about some interesting storage and networking IP that’s going into their OpenStack deployment. Based on his comments, I don’t expect those to become public so I’m going to limit my observations about them here.
We talked about hypervisors for a while. KVM and XenServer (via XAPI) were the primary topics. We also talked about LXC & OpenVZ as popular approaches. Vish said that some of the XenServer work is using Xen Storage Manager to manage SAN images.
Vish is seeing a constant rise in committers. It’s hard to judge because some committers appear to be individuals acting on behalf of teams (10 to 20 people).
Based on our last meetup, it appears deployment is a hot topic, so we’ll kick off with that – bring your experiences, opinions, and thoughts! We’ll also open the floor to other OpenStack topics – open technical and business discussions – but no commercials, please!
We’ll also talk about organizing future OpenStack meetups! If your company is interested in sponsoring a future meetup, find Joseph George at the meetup and he can work with you on the details.
Dell has substantial IT assets to bring to bear on cloud solutions. All of them are ultimately tied to products that generate revenue for Dell; however, that does not prevent us from collaborating and sharing. On the contrary, we benefit from input from our partners, customers, and community to determine which features are needed to accelerate adoption. Our recent decision to accelerate Crowbar modularization is a clear example of that process.
It is essential to understand that this is not just about cloud technologies! It is about the collaborative way we are promoting them and the processes we are using to deliver them.
With Dell’s cloud moving at hurricane speed, it has been interesting to watch how other companies are setting up their own OpenStack initiatives. It seems to me that many of these efforts involve forks from OpenStack that cannot/will not be contributed back to the community. One (but not the only) example is from HP’s Emil Sayegh, who says that “HP developers … ideas will be shared…” He does not commit to sharing HP’s code in his post. I hope that is an oversight and not their plan.
In time, forking may be needed. Right now, we need to focus on building a strong foundation. Open contributions of code are the engine of that success.