Austin OpenStack Meetup (January Minutes) + OpenStack Foundation Webcast!

Sorry for the brevity… At the last Austin OpenStack meetup, we had >60 stackers!  Some from as far away as Portland and Boston (as in Oregon and Massachusetts).

Notes:

  • SUSE introduced their OpenStack beta and talked about SUSE Studio, which can deploy images against the OpenStack APIs
  • I showed off DevStack.org code that can set up the trunk of OpenStack (now Essex) in about 10 minutes on a single node.  Great for developers!
  • I showed an OpenStack Diablo Final deployment from Crowbar.  I focused mainly on Dashboard and used our reference architecture (see below) as an illustration of the many parts.
  • Matt Ray suggested everyone watch the webcasts about the OpenStack Foundation (Thursday 6pm Central & Friday 9am Central)
  • We planned the next few meetups.
    • For February, we’ll talk about Swift and Dashboard.
    • For March, we’ll talk about Essex and DevStack to prep for the next design summit (in SF).
    • For April, we’ll debrief the conference

Thank you SUSE and Dell (my employer) for sponsoring!   The next meetup is sponsored by Canonical.

OpenStack Deployments Abound at Austin Meetup (12/9)

I was very impressed by the quality of discussion at the Deployment topic meeting for the Austin OpenStack Meetup (#OSATX). Of the 45ish people attending, we had representatives from at least 6 different OpenStack deployments (Dell, HP, AT&T, Rackspace Internal, Rackspace Cloud Builders, Opscode Chef)!  Considering the scope of those deployments (several are aiming at 1000+ nodes), that’s a truly impressive accomplishment for such a young project.

Even with the depth of the discussion (notes below), we did not go into details on how individual OpenStack components are connected together.  The image my team at Dell uses is included below.  I also recommend reviewing Rackspace’s published reference architecture.

Figure 1: Diablo Software Architecture. Source: Dell/OpenStack (CC with attribution)

Notes

Our deployment discussion was a round table so it is difficult to link statements back to individuals, but I was able to track companies (mostly).

  • HP
    • picked Ubuntu & KVM because they were the most vetted. They are also using Chef for deployment.
    • running Diablo 2, moving to Diablo Final & a flat network model. The network controller is a bottleneck. Their biggest scale issue is RabbitMQ.
    • is creating their own Nova Volume plugin for their block storage.
    • At this point, scale limits are due to simultaneous loading rather than total number of nodes.
    • The Nova node image cache can get corrupted without any notification or way to force a refresh – this defect is being addressed in Essex.
    • has set up availability zones as completely independent (500 node) systems. Expecting to converge them in the future.
  • Rackspace
    • is using the latest Ubuntu. Always stays current.
    • using Puppet to set up their cloud.
    • They are expecting to go live on Essex and are keeping their deployment on the Essex trunk. This is causing some extra work but they expect it to pay back by allowing them to get to production on Essex faster.
    • Deploying on XenServer
    • “Devs move fast, Ops not so much.”  Trying to not get behind.
  • Rackspace Cloud Builders (RCB) runs major releases through an automated test suite. The verified releases are published to https://github.com/cloudbuilders (note: Crowbar is pulling our OpenStack bits from this repo).
  • Dell commented that our customers are using Crowbar primarily for pilots – they are learning how to use OpenStack
    • Said they have >10 customer deployments pending
    • AT&T is using the open source version of Crowbar
    • Keystone and Dashboard were considered essential additions to Diablo
  • Hypervisors
    • KVM is considered the top one for now
    • Libvirt (which manages KVM) also supports LXC, which people found interesting
    • XenServer via XAPI is also popular
    • Not so much activity on ESX & Hyper-V
    • We talked about why some hypervisors are more popular – it’s about the node agent architecture of OpenStack.
  • Storage
    • NetApp via Nova Volume appears to be a popular block storage
  • Keystone / Dashboard
    • Customers want both together
    • Including Keystone/Dashboard was considered essential in Diablo. It was part of the reason why Diablo Final was delayed.
    • HP is not using Dashboard
  • OpenStack API
    • Members of the audience commented that we need to deprecate the EC2 APIs (because it does not help OpenStack long term to maintain EC2 APIs over its own).  [1/5 Note: THIS IS NOT OFFICIAL POLICY, it is a reflection of what was discussed]
    • HP started on the EC2 API but is moving to the OpenStack API

Meetup Housekeeping

  • Next meeting is Tuesday 1/10 and sponsored by SUSE (note: Tuesday is just for this January).  Topic TBD.
  • We’ve got sponsors for the next SIX meetups! Thanks to Dell (my employer), Rackspace, HP, SUSE, Canonical and PuppetLabs for sponsoring.
  • We discussed topics for the next meetings (see the post image). We’re going to throw it to a vote for guidance.
  • The OSATX tag is also being used by Occupy San Antonio.  Enjoy the cross chatter!

Extending Chef’s reach: “Managed Nodes” for External Entities.

Note: this post is very technical and relates to detailed Chef design patterns used by Crowbar. I apologize in advance for the post’s opacity. Just unleash your inner DevOps geek and read on. I promise you’ll find some gems.

At the Opscode Community Summit, Dell’s primary focus was creating an “External Entity” or “Managed Node” model. Matt Ray prefers the term “managed node” so I’ll defer to that name for now. This model is needed for Crowbar to manage system components that cannot run an agent, such as a network switch, blade chassis, IP power distribution unit (PDU), or SAN array. The concept for a managed node is that there is an instance of the chef-client agent that can act as a delegate for the external entity. We’ve been reluctant to call it a “proxy” because that term is so overloaded.

My Crowbar vision is to manage an end-to-end cloud application life-cycle. This starts from power and network connections, to hardware RAID and BIOS, then up to the services that are installed on the node, and ultimately reaches up to applications installed in VMs on those nodes.

Our design goal is that you can control a managed node with the same Chef semantics that we already use. For example, adding a Network proposal role to the Switch managed node will force the agent to update its configuration during the next chef-client run. During the run, the managed node will see that the network proposal has several VLANs configured in its attributes. The node will then update the actual switch entity to match the attributes.
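
As a sketch, a recipe run by the switch’s delegate agent might look something like the following. To be clear, the attribute layout and the SwitchSession helper are hypothetical stand-ins for illustration, not actual Crowbar code:

    # Hypothetical managed-node recipe (sketch only): converge the physical
    # switch to match the VLANs declared in this node's Chef attributes.
    node["network"]["vlans"].each do |vlan|
      ruby_block "configure vlan #{vlan["id"]}" do
        block do
          # A real recipe would talk to the switch over its vendor CLI, SNMP,
          # or NETCONF; SwitchSession is an illustrative stand-in for that.
          switch = SwitchSession.new(node["switch"]["mgmt_ip"])
          switch.create_vlan(vlan["id"], vlan["name"]) unless switch.vlan_exists?(vlan["id"])
        end
      end
    end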

Design Considerations

There are five key aspects of our managed node design. They are configuration, discovery, location, relationships, and sequence. Let’s explore each in detail.

A managed node’s configuration is different from a service or actuator pattern. The core concept of a node in Chef is that the node owns its configuration. You make changes to the node’s configuration and it’s the node’s job to manage its state to maintain that configuration. In a service pattern, the consumer manages specific requests directly. At the summit (with apologies to Bill Clinton), I described Chef configuration as telling a node what it “is” while a service provides verbs that change a node. The critical difference is that a node is expected to maintain its configuration as its composition changes (e.g.: the node is now connected for VLAN 666) while a service responds to specific change requests (the node adds a tag for VLAN 666). Our goal is to maintain Chef’s configuration management concept for the external entities.
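
In code terms, the distinction might look like this minimal sketch (the switch_service call is illustrative, not a real API):

    # Configuration pattern: declare what the node *is*; the chef-client
    # re-converges toward this state on every run.
    node.set["network"]["vlans"] = [666]

    # Service/actuator pattern: issue a one-shot verb; nothing checks or
    # restores that state afterward.
    switch_service.add_vlan(666)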

Managed nodes also have a resource discovery concept that must align with the current ohai discovery model. Like a regular node, the managed node’s data attributes reflect the state of the managed entity; consequently, we’d expect a blade chassis managed node to enumerate the blades that it contains. This creates an expectation that the managed node appears to be “root” for the entity that it represents. We are also assuming that the Chef server can be trusted with the sharable discovered data. There may be cases where these assumptions do not hold, but we are making them for now.
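
For illustration, a discovery plugin in the Ohai style might enumerate the chassis’s blades like this (the static list is a placeholder for a real query against the chassis management module):

    # Hypothetical Ohai plugin: report the blades behind this managed node
    # the same way ohai reports CPUs and NICs on a regular node.
    provides "blade_chassis"

    blade_chassis Mash.new
    # A real plugin would query the chassis controller (IPMI or a vendor
    # API); this static enumeration is illustrative only.
    blade_chassis[:blades] = (1..8).map { |slot| { "slot" => slot, "present" => true } }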

Another essential element of managed nodes is that their agent location matters because the external resource generally has restricted access. There are several examples of this requirement. Switch configuration may require a serial connection from a specific node. Blade chassis, SAN, and PDU management ports are restricted to specific networks. This means that the managed node agents must run from a specific location. This location is not important to the Chef server or the nodes’ actions against the managed node; however, it’s critical for the system when starting the managed node agent. While it’s possible for managed node agents to run on nodes that are outside the overall Chef infrastructure, our use cases make it more likely that they will run as independent processes on regular nodes. This means that we’ll have to add some relationship information for managed nodes and perhaps a barclamp to install and manage them.

All of our use cases for managed nodes have a direct physical linkage between the managed node and server nodes. For a switch, it’s the ports connected. For a chassis, it’s the blades installed. For a SAN, it’s the LUNs exposed. These links imply a hierarchical graph that is not currently modeled in Chef data – in fact, it’s completely missing and difficult to maintain. At this time, it’s not clear how we or Opscode will address this. My current expectation is that we’ll use yet more roles to capture the relationships and add some hierarchical UI elements into Crowbar to help visualize it. We’ll also need to comprehend node types because “managed nodes” are too generic in our UI context.
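
Until the data model catches up, one way to capture such a link with today’s primitives is a role per relationship, sketched below (the naming convention and attribute path are invented for illustration):

    # Hypothetical relationship-as-role: tag nodes cabled to managed node "switch01".
    name "linked_to_switch01"
    description "Nodes physically connected to managed node switch01"
    default_attributes(
      "crowbar" => { "managed_node_links" => { "switch" => "switch01" } }
    )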

Finally, we have to consider the sequence of actions between managed nodes and nodes. In all of our use cases, the steps to bring up a node require orchestration with the managed node. Specifically, there needs to be a hand-off between the managed node and the node. For example, installing an application that uses VLANs does not work until the switch has created the VLAN. The same challenges apply to LUNs on a SAN and blades in a chassis. Crowbar provides orchestration that we can leverage, assuming we can declare the linkages.

For now, a hack to get started…

For now, we’ve started on a workable hack for managed nodes. This involves running multiple chef-clients on the admin server, each in its own path and process. We’ll also have to add yet more roles to comprehend the relationships between the managed nodes and the things that are connected to them. Watch the crowbar listserv for details!
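
Concretely, each managed-node agent can get its own client.rb so that several chef-clients coexist on the admin server (all paths and names below are illustrative):

    # /etc/chef/managed-switch01/client.rb -- one config file per managed-node agent
    node_name       "switch01.managed"
    chef_server_url "http://admin.example.com:4000"
    client_key      "/etc/chef/managed-switch01/client.pem"
    file_cache_path "/var/cache/chef/managed-switch01"
    pid_file        "/var/run/chef/managed-switch01.pid"

Each agent then runs as its own process with its own identity, e.g. chef-client -c /etc/chef/managed-switch01/client.rb -i 300.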

Extra Credit

Notes on the Opscode wiki from the Crowbar & Managed Node sessions

OpenStack Seattle Meetup 11/30 Notes

We had an informal OpenStack meetup after the Opscode Summit in Seattle.

This turned out to be a major open cloud gab fest! In addition to the Dell OpenStack leads (Greg and I), we had the Nova Project Technical Lead (PTL, Vish Ishaya, @vish), HP’s Cloud Architect (Alex Howells, @nixgeek), and Opscode OpenStack cookbook master (Matt Ray, @mattray). We were joined by several other Chef Summit attendees with OpenStack interest, including a pair of engineers from Spain.

We’d planned to demo Knife-OpenStack against the Crowbar Diablo build. Unfortunately, knife-openstack is out of date (August 15th?!). We need Keystone support. Anyone up for that?

Highlights

There’s no way I can recapture everything that was said, but here are some highlights I jotted down on the way home.

  • After the miss with Keystone and the Diablo release, solving the project dependency problem is an important priority. Vish talked at length about the ambiguity of Keystone being both required and incubated. He said we were not formal enough around new projects even though we had dependencies on them. In future releases, new projects (specifically, Quantum) will not be allowed to be dependencies.
  • The focus for Essex is on quality and stability. The plan is for Essex to be a long-term supported (LTS) release tied to the Ubuntu LTS. That’s putting pressure on all the projects to ensure quality, lock features early, and avoid unproven dependencies.
  • There is a lot of activity around storage and companies are creating volume plug-ins for Nova. Vish said he knew of at least four.
  • Networking has a lot of activity. Quantum may not emerge as a core project in time for Essex. There was general agreement that Quantum is “the killer app” for OpenStack and will take cloud to the next level.  The Quantum Open vSwitch implementation is completely open source and free. Some other plugins may require proprietary hardware and/or software, but there is definitely a (very) viable and completely open source option for Quantum networking.
  • HP has some serious cloud mojo going on. Alex talked about defects they have found and submitted fixes back to core. He also hinted about some interesting storage and networking IP that’s going into their OpenStack deployment. Based on his comments, I don’t expect those to become public so I’m going to limit my observations about them here.
  • We talked about hypervisors for a while. KVM and XenServer (via XAPI) were the primary topics. We did talk about LXC & OpenVZ as popular approaches too. Vish said that some of the XenServer work is using Xen Storage Manager to manage SAN images.
  • Vish is seeing a constant rise in committers. It’s hard to judge because some committers appear to be individuals acting on behalf of teams (10 to 20 people).

Note: cross posted on the OpenStack Blog.

Reminder: 12/8 Meetup @ Austin!

Missed us in Seattle? Join us at the 12/8 OpenStack meetup in Austin, co-hosted by Dell and Rackspace.

Based on our last meetup, deployment is clearly a hot topic, so we’ll kick off with that – bring your experiences, opinions, and thoughts! We’ll also open the floor to other OpenStack topics – open technical and business discussions – no commercials, please!

We’ll also talk about organizing future OpenStack meetups! If your company is interested in sponsoring a future meetup, find Joseph George at the meetup and he can work with you on details.

Notes from 10/27 OpenStack Austin Meetup (via Stephen Spector)

Stephen Spector (now a Dell Services employee!) gave me permission to repost his excellent notes from the first OpenStack Austin (#OSATX) Meetup Group.

Here are his notes:

[Stephen] wanted to update everyone on the Austin OpenStack Meetup last night at the Austin TechRanch sponsored by Joseph and Rob (that’s me!) of the Dell OpenStack team (I think I got that right?). You can find all the tweets from the event at https://twitter.com/#!/search/%23osatx as we created a new hashtag for tweeting during the event, #osatx.

Here are some highlights from the event:

  • About 60 or so attendees, with a good number from Dell (Barton George, Logan McCloud) and Rackspace, plus Opscode (Matt Ray), Puppet Labs, and Ubuntu folks; SUSE talked about their OpenStack commitment (http://t.co/bBnIO7xv)
  • John Dickinson, the Project Technical Lead for Swift (Object Storage), was there and presented information on the current Swift offering. It is interesting to note that Swift releases continuously, while most of OpenStack (like Nova Compute) releases on the 6-month development cycle.
  • Stephen and Jim Plamondon from Rackspace presented information on the overall community and talked about yesterday’s announcement from Internap about their Compute public cloud, plus information on the MercadoLibre 600-node Compute cloud running their business:

“With 58 million users of MercadoLibre.com and growing rapidly, we need to provide our teams instant access to computing resources without heavy administrative layers. With OpenStack, our internal users can instantly provision what they need without having to wait for a system administrator,” said Alejandro Comisario, Infrastructure Senior Engineer, MercadoLibre, the largest online trading platform in Latin America. “With our success running OpenStack Compute in production, we plan to roll OpenStack Diablo out more broadly across the company, and have appreciated the community support in this venture, especially through the OpenStack Forums, where we are also global moderators.”

  • Discussion on the OpenStack API issue, which is a significant open question at this time: should OpenStack create an API specification and let multiple implementations of that API move forward, or build one implementation of the API as the official OpenStack (see my post for more on this).
  • Greg Althaus gave a demo of the Nova Dashboard
  • Future Meetings
    • Three organizations have offered to help host (pizza $ and TechRanch space $) but we always need more!  You can offer to sponsor via the meetup site.
    • There will be future OpenStack Austin Meetups so sign up for the group and you’ll be notified automatically.
