Austin OpenStack Meetup: Keystone & Knife (2/20 notes via Greg Althaus)

I could not make it to the recent Austin OpenStack Meetup, but Greg Althaus generously let me post his notes from the event.

Background

Matt Ray talks about Chef

Matt Ray from Opscode presented some of the work with Chef and OpenStack. He talked about the three main Chef repos floating around: Anso's original cookbook set, which is the basis for the Crowbar cookbooks (his second set), and finally the emerging set of cookbooks in OpenStack proper. The third set is the most interesting and is the one he plans to keep working on as his public OpenStack cookbooks. It is an amalgamation of SmokeStack, RCB, Anso improvements, and his own (Crowbar's).

He then demoed his knife plugin (slideshare) to build OpenStack virtual servers using the OpenStack API. It works nicely against TryStack.org (previously "Free Cloud") and RCB's demo cloud. All of that is on his GitHub repo with instructions on how to build and use it. Matt and I talked about trying to get that into our Crowbar distro.
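
For context on what a knife plugin like this does under the hood, here is a rough Python sketch of booting a server through the OpenStack compute API. The endpoint, token, image and flavor values are placeholders (the real plugin is Ruby/knife based); this only illustrates the underlying API call, not the plugin's code.

    import requests

    # Placeholder endpoint and token; a real client would get both from Keystone.
    NOVA = "http://nova.example.com:8774/v1.1/demo-tenant"
    HEADERS = {"X-Auth-Token": "token-from-keystone"}

    # Ask the compute API to boot a server (image and flavor IDs are illustrative).
    server_request = {
        "server": {
            "name": "chef-node-1",
            "imageRef": "ubuntu-image-id",
            "flavorRef": "1",
        }
    }

    resp = requests.post(f"{NOVA}/servers", json=server_request, headers=HEADERS)
    resp.raise_for_status()
    print(resp.json()["server"]["id"])  # knife would then bootstrap Chef onto this node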

There were some questions about the workflow and the choice of the OpenStack API versus the Amazon EC2 API, since an EC2 knife plugin already exists.

Ziad Sawalha talks about Keystone

Ziad Sawalha is the PTL (Project Technical Lead) for Keystone. He works for Rackspace out of San Antonio and drove up for the meeting.

He split his talk into two pieces, the incubation process and a Keystone overview. He asked who was interested in what and focused his talk more on the overview than on incubation.

Some key take-aways:

  • Keystone comes from Rackspace’s strong, flexible, and scalable API. It started as a known quantity from his perspective.
  • The community trusted nothing his team produced from an API perspective.
  • The community is Python or nothing.
    • His team was ignored until they had a Python prototype implementing the API.
    • At that point, comments on the API came in.
  • Churn in the API caused problems with implementation and expectations around the close of Diablo.
    • Because comments were late, changes occurred.
    • The official implementation lagged and was slow to arrive.
  • The API has been stable since the Diablo final release, but the code is changing. That is good and shows the strength of the API.
  • Side note from Greg: Keystone represents to me the power of API over code. You can have innovation around the implementation as long as all the implementations share a common foundation to build on, which is the API specification. The replacement of Keystone with the Keystone Light code base is an example of this. The only reason this is possible is that the API was sound and documented.  (Rob's post on this)

Ziad spent the rest of his time talking about the workflow of Keystone and covered the main API touch points:

  • Client to Keystone and Keystone back to client for the initial auth token
  • Client to the middleware API, which gives the services a common front
  • Middleware to Keystone to verify and establish identity
  • Middleware to service to pass the identity along

Not many details other than flow and flexibility. He stressed that the API design separates protocol from actions and data at all the layers. This allows for future variations and innovations while maintaining the APIs.
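
To make the first leg of that flow concrete, here is a minimal Python sketch of a client requesting a token from Keystone, assuming the Keystone v2.0 API. The endpoint and credentials are placeholders, not values from the talk.

    import requests

    # Placeholder Keystone endpoint and credentials.
    KEYSTONE = "http://keystone.example.com:5000/v2.0"
    payload = {
        "auth": {
            "tenantName": "demo",
            "passwordCredentials": {"username": "demo", "password": "secret"},
        }
    }

    # Client -> Keystone: exchange credentials for a token and service catalog.
    resp = requests.post(f"{KEYSTONE}/tokens", json=payload)
    resp.raise_for_status()
    access = resp.json()["access"]

    token = access["token"]["id"]        # later sent to services as X-Auth-Token
    catalog = access["serviceCatalog"]   # endpoints for Nova, Glance, Swift, etc.

    # The middleware in front of each service then validates that token against
    # Keystone before passing the established identity on to the service itself.
    print(token, [service["type"] for service in catalog])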

Ziad talked about the state of Essex.

  • Planned
    • RBAC (aka Role Based Access Control)
    • Stability
    • Many backends
  • Actual
    • Code replacement Keystone Light
    • Stability
    • LDAP backend
    • SQL backend

Folsom work:

  • RBAC
  • Stability
  • AD backend
  • Another backend
  • Federation was planned but will most likely be pushed to G
    • Federation is the ability for multiple independent Keystones to operate (bursting use case)
    • Dependent upon two other federation components (networking and billing/metering)

Analyze This! Big Data | Apache Hadoop | Dell | Cloudera | Crowbar

This article about Target using buying patterns to figure out that a teen was pregnant before she told her parents puts big data analysis into everyday terms better than the following 555 words (of course, I recommend that you read both).

Recently, I had the pleasure of being one of the team members presenting Dell's BIG DATA story at an internal conference. From the questions and buzz, it's clear that big data is big news this year. My team is at the center of that storm because we are responsible for the Dell | Cloudera Apache™ Hadoop™ solution. The solution is significant because we've integrated the many pieces necessary to build and sustain a Hadoop cluster: Dell servers, the Cloudera Hadoop distribution, the Crowbar framework, and services to make it useful.

Big Data Analytics spins data straws into information gold.

Before I jump into technical details, it's worth stating the big data analytics value proposition. The problem is that we are awash in a tsunami of data: we've grown beyond the neat rows and columns of application databases; data today includes sources ranging from website click logs and emails to call records and cash register receipts to social media tweets and posts. While much of the data is unstructured noise, there is also incredibly valuable information.  (video of my Hadoop "escalator pitch")

Value is not just hidden inside the bulk data; it lies in correlations between sets of the data.

The big data analytics value proposition is to provide a system to hold a lot of loosely structured information (thus "big data") and then sift and correlate the information (thus "analytics"). The result is a technology that helps us make data-driven decisions. In many applications, the analysis is fed directly back into applications so they can alter behavior in near real-time. For example, an online retail store could offer you purple bunny slippers as you browse for crowbars in the hardware section, knowing that you're reading this post. That is the type of correlation on disparate data that I'm talking about.

This is really two problems: storing a lot of data and then computing over it.

Hadoop, the leading open source big data analytics project, is a suite of applications that implement and extend two core capabilities: a distributed file system (HDFS) and the map-reduce (M-R) algorithm. My point is not to define Hadoop (others have done it better here); instead, I want to highlight that big data analysis is a merger of storage and compute. When learning about any big data analysis solution, you cannot decouple how the data is stored from how the data is analyzed – storage and compute are fundamentally linked.
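
To make the map-reduce half concrete, here is a minimal word-count sketch in the Hadoop Streaming style: the mapper and reducer read stdin and write stdout, and Hadoop handles the distribution, sorting and shuffling between them. The script name and invocation are illustrative only.

    import sys
    from itertools import groupby

    def mapper(lines):
        """Emit a (word, 1) pair for every word in the input records."""
        for line in lines:
            for word in line.split():
                print(f"{word}\t1")

    def reducer(lines):
        """Sum counts per word; Hadoop delivers mapper output sorted by key."""
        pairs = (line.rstrip("\n").split("\t") for line in lines)
        for word, group in groupby(pairs, key=lambda kv: kv[0]):
            print(f"{word}\t{sum(int(count) for _, count in group)}")

    if __name__ == "__main__":
        # Run as `wordcount.py map` or `wordcount.py reduce` under Hadoop
        # Streaming, or pipe the two together locally for a quick test.
        (mapper if sys.argv[1] == "map" else reducer)(sys.stdin)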

For that reason, the architecture of a Hadoop cluster is different from that of either a traditional database cluster or a compute cluster. The IO and resiliency patterns are different. Since Hadoop is a distributed system, hardware redundancy is less important and eliminating IO bottlenecks is paramount. For this reason, our Hadoop clusters use a lot of local, non-RAID drives with a target of delivering a 1:1 CPU core to spindle ratio (ratios are tuned based on planned loads).
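
As a toy illustration of that tuning, the helper below computes how many local drives a node would need for a given core-to-spindle target. The numbers are purely illustrative, not Dell reference-architecture figures.

    def spindles_per_node(cores_per_node: int, core_to_spindle_ratio: float = 1.0) -> int:
        """Local (non-RAID) drive count needed to hit a target core:spindle ratio."""
        return max(1, round(cores_per_node / core_to_spindle_ratio))

    # A 12-core node at the 1:1 guideline wants roughly 12 local drives;
    # a CPU-heavy workload tuned to 1.5 cores per spindle needs only 8.
    print(spindles_per_node(12))       # -> 12
    print(spindles_per_node(12, 1.5))  # -> 8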

Imagine that you are looking for correlations in web click data. To do that analysis, Hadoop needs to spend a lot of time cracking open log files, sifting for specific data and then reporting back its results. That process involves thousands of tasks, each doing disk IO, consuming CPU & RAM, and then transferring data over the network; consequently, contention between network and disk demands reduces performance.

Wow… that's a lot of description and it just scratches the surface of Big Data Analytics. I'm going to have to add the technical details about the Dell solution architecture (hardware) and software components (Cloudera & Crowbar) in another post.

2/9 Webcast about mixing GPUs & Big Data Analysis

Here’s something from my employer (Dell) that may be interesting to you: it’s about using GPUs for Big Data Analytics.   I meant to discuss/post this earlier, but… oh well.  Here’s the information

Premieres LIVE: 2 pm EST (11 am PST) TODAY – Free – Register Now!

What You’ll Learn:

  • Not just for video games any more: GPUs for simulation and parallel processing
  • Impact on business workflows in seismic processing, interpretation and reservoir modeling
  • ROI: 5x performance in 5 days
  • Cost-effective and flexible cluster configurations
  • Show me the metrics: Tangible results from a variety of customers

Need More Details?

OpenStack Boston Meetup 2/1 covers Quantum & Foundation

My team at Dell was in Beantown (several of us are Nashua based) for an annual team meeting, so the timing for this Boston meetup was ideal for us. While we showed up in force, so did many other Stackers including people from HP, Nicira, Suse, Harvard, Voxel, RedHat, ESPN and many more! The turnout for the event was great, and I'm taking notes that Austin may need to upgrade our pizza and Boston may need to upgrade their cookies (just sayin'). Special thanks to Andi Abes for organizing and Suse for sponsoring!!

We covered two primary topics: Quantum and the OpenStack Foundation.

In typing up my notes from the sessions, I ended up with so much information that it made more sense to break them into independent blog posts. Wow – that's a lot of value from a free meetup!

The Quantum session by David Lapsley from Nicira talked about the architecture and applications of Quantum. I think that Quantum is an exciting incubated project for OpenStack; however, it is important to remember that Essex stands alone without it. I believe this fact gets forgotten in enthusiasm over Quantum’s shiny potential.

The OpenStack session by Rob Hirschfeld from Dell (me!) talked about the importance of governance for OpenStack and how the Foundation will play a key role in transitioning it from Rackspace to a neutral party. There are many feel-good community benefits that the Foundation brings; however, the collaborators' ROI is the driver for creating a strong foundation. There is nothing wrong with acknowledging that fact and using it to create a more sustainable OpenStack.

Quantum: Network Virtualization in the OpenStack Essex Release

This post is part of my notes from the 2/1 Boston OpenStack meetup.

Quantum

David Lapsley from Nicira gave the Quantum presentation (his slides). My notes include additional explication and interpretation so he is not to blame for errors (but I’ll share credit for clarity).

The objective for Quantum is to replace the current networking modes (flat, VLAN, DHCP, DHCP HA) with a programmatic networking API. The idea is that cloud users would use the API to request the network topology they want to implement rather than have it imposed by the infrastructure's network mode. To accomplish this, the API must allow users to create complex and hierarchical network topologies without being aware of the underlying network infrastructure (aka "an abstraction layer").

In simpler terms: Quantum allows users to design their own isolated networks without knowing how the network is actually deployed.

Quantum is a stand-alone service with its own API; it is not simply an extension of the Nova API. The Quantum API has an extensibility model similar to Nova's, and it also has a plug-in architecture so that it can be implementation agnostic. The plug-ins are needed to map the user's API abstraction into actual networking. For example, if the user requests a network tunnel between two VMs then the plug-in may choose to implement a tagged VLAN, OpenFlow connections, IPtable filters, or encapsulated tunnels. The goal is that the implementation of the API should not matter to the user of the API!
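
As a sketch of that plug-in pattern (the class and method names here are hypothetical, not the actual Quantum plug-in contract), the API layer calls an abstract interface and each backend decides how to realize it:

    from abc import ABC, abstractmethod

    class NetworkPlugin(ABC):
        """Hypothetical Quantum-style plug-in interface."""

        @abstractmethod
        def create_network(self, tenant_id: str, name: str) -> str:
            """Return the ID of a new isolated virtual network."""

        @abstractmethod
        def create_port(self, network_id: str) -> str:
            """Return the ID of a new port on the given network."""

        @abstractmethod
        def plug_interface(self, network_id: str, port_id: str, interface_id: str) -> None:
            """Attach a VM interface to a port; the backend wires the data path."""

    class VlanPlugin(NetworkPlugin):
        """One possible backend: map each virtual network onto a tagged VLAN.
        Another plug-in could use OpenFlow rules or encapsulated tunnels instead."""

        def __init__(self):
            self._next_vlan = 100
            self.networks = {}

        def create_network(self, tenant_id, name):
            net_id = f"{tenant_id}:{name}"
            self.networks[net_id] = {"vlan": self._next_vlan, "ports": {}}
            self._next_vlan += 1
            return net_id

        def create_port(self, network_id):
            port_id = f"port-{len(self.networks[network_id]['ports'])}"
            self.networks[network_id]["ports"][port_id] = None
            return port_id

        def plug_interface(self, network_id, port_id, interface_id):
            # A real backend would program switches for the VLAN tag here.
            self.networks[network_id]["ports"][port_id] = interface_id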

For (hopefully) obvious reasons, the use cases for Quantum are similar to those of the Amazon EC2 VPC. The notable exception is service injection. Quantum wants to allow vendors/providers to innovate around value-added services. This should result in a diversity of choices as vendors offer additional network services such as load balancers, IPS, IDS, etc. While this is a great concept, it's important to note that Quantum is currently limited to a single plug-in!  [see note in comments by Quantum PTL Dan Wendlandt (@danwendlandt)]

The expectation is that cloud users will want to create traditional application topologies with different tiers of access. For example, applications may require a dedicated network between web and database tiers or a DMZ between web and load balancer. The challenge is that these are patterns, not rigid requirements. Ultimately, the simplest solution for the feature is to allow users to create "virtual VLANs."

Essentially, the current Quantum API is creating virtual VLANs.

The Quantum API has four basic abstractions: interface, network, port and attachment. These primitives are used to build up a virtual network just as they are in physical networks.

  • Interface: cloud / tenant / server / GUID / eth0
  • Network: cloud / tenant / network / GUID
  • Port: cloud ID / tenant / network / GUID / port / GUID
  • Attachment: interface & network & port

To use the Quantum API, you create a network, add ports (to the network) and interfaces (to VMs), then attach the network, interface, and port together, as in the sketch below. This gives users very fine-grained control over their network topology. It is up to the plug-in to translate these primitives into a working physical topology.
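
Here is a rough Python sketch of that create-and-attach workflow. The endpoint paths and payload shapes are loosely modeled on a Quantum v1.x-style REST API but are illustrative rather than exact; the interface ID would come from the VM's VIF.

    import requests

    # Illustrative endpoint, tenant and token; adjust for a real deployment.
    QUANTUM = "http://quantum.example.com:9696/v1.0"
    TENANT = "demo-tenant"
    HEADERS = {"X-Auth-Token": "token-from-keystone"}

    def create_network(name):
        resp = requests.post(f"{QUANTUM}/tenants/{TENANT}/networks",
                             json={"network": {"name": name}}, headers=HEADERS)
        return resp.json()["network"]["id"]

    def create_port(network_id):
        resp = requests.post(f"{QUANTUM}/tenants/{TENANT}/networks/{network_id}/ports",
                             json={"port": {"state": "ACTIVE"}}, headers=HEADERS)
        return resp.json()["port"]["id"]

    def attach(network_id, port_id, interface_id):
        # Attachment binds the VM interface to the port on the virtual network.
        requests.put(
            f"{QUANTUM}/tenants/{TENANT}/networks/{network_id}/ports/{port_id}/attachment",
            json={"attachment": {"id": interface_id}}, headers=HEADERS)

    net = create_network("web-tier")
    port = create_port(net)
    attach(net, port, "vif-1234")  # hypothetical interface ID from the VM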

According to my teammate and OSBOS organizer, Andi Abes, the Quantum API reached consensus in the community quickly because it started with this basic but extensible API. In the meeting, I added that this approach is typical for OpenStack, where it is considered better to demonstrate working core functionality than to build extra complexity into the initial delivery. This approach links back to the API vs. Implementation debate I've discussed before. This simple API also provides room for innovation: while providing the basic constructs, it is light and does not encumber mappings of the API to different underlying technologies with lots of extras. OEM vendors and service providers thus have an easier time differentiating their offerings, be it equipment or services.

In my experience, people often link OpenFlow and Quantum into a single technology base. I have certainly been guilty of making that generalization. Quantum does not require OpenFlow or vice versa; however, they are highly complementary. OpenFlow takes over the switches' "flow table" and allows administrators to control how every packet that touches the switch is routed. The potential for OpenFlow is to create highly dynamic and controlled network conduits. Quantum needs exactly that functionality to most directly map virtual network requests onto a physical fabric. In this way, OpenFlow is the most direct approach to building a fully enabled Quantum plug-in.
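
Purely to illustrate the flow-table idea (the fields are simplified, not the literal OpenFlow protocol), a single entry pairs a packet match with a set of actions:

    # Simplified, illustrative flow-table entry; not actual OpenFlow wire format.
    flow_entry = {
        "match": {"in_port": 3, "vlan_id": 101, "dst_mac": "fa:16:3e:00:00:01"},
        "actions": [
            {"type": "set_vlan", "vlan_id": 202},  # rewrite the tag for the fabric
            {"type": "output", "port": 7},         # forward out a specific port
        ],
        "priority": 100,
    }

    # A Quantum plug-in built on OpenFlow would translate each virtual network
    # attachment into entries like this on every switch along the path.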

In the Essex release, progress has been made (and is still being made) towards integrating Nova and Quantum. The workflow of attaching a VIF (virtual interface) to the right network and assigning it an appropriate IP (using Melange, the OpenStack IP address management project) is making headway. That said, the dashboard integration still lags and more progress is required.

Overall, my impression is that Quantum has great potential; however, I think that Nova in Essex will be sufficient for real applications without Quantum. As my freshman roommate used to say, “potential means you’ve got to keep working on it.”

Why Governance Matters in Open Source: Discussing the OpenStack Foundation

This post is part of my notes from the 2/1 Boston OpenStack meetup.

OpenStack Foundation

Your’s truly (Rob Hirschfeld) gave the presentation about the OpenStack Foundation.  To readers of this blog, it’s obvious that I’m a believer in the OpenStack mission; however, it’s not obvious how creating a foundation helps with that mission and why OpenStack needs its own. As one person at the meetup put it, “Why not? Every major project needs a foundation!”

Governance does not sound sexy compared to writing code and deploying clouds, but it’s very important to the success of the project.

Here are my notes without the poetic elocution I exuded during the meetup…

The basics:

  • What: Creating a neutral body to govern OpenStack. Rackspace has been leading OpenStack, which means that they own the copyrights and the name and also pay the people who organize the community. Before the project was public and endorsed by its early partners, they committed (to executives at Dell and others) that they would ultimately set up a standalone body to govern the project. Dell (my employer), Citrix, Accenture and NASA were some of the biggest names at the Austin conference launch.
  • Why: A neutral body is needed because a lot of companies are committing significant time and money to the project. They cannot risk their investments on Rackspace's good will alone. This may mean many things: it could be that they don't like Rackspace's direction, or that they feel Rackspace is not investing enough.
  • When: Right now and over the next few releases.  You should give feedback right now on the OpenStack Foundation's mission.  The actual foundation will take more time to establish because it requires legal work and funding commitments.
  • Who: The community – all stakeholders. This is important stuff! While standing up a financially independent Foundation requires money, the little guys are not left out. There is a clear realization and desire to enable independent developers, contributors and small players to have a seat at the table.
  • How Much: The amounts are unclear, but establishing a foundation will require a significant ongoing investment from highly involved and moneyed parties (Rackspace, Dell, Cisco, HP, Citrix, NTT, startups?, etc).  The funding will pay salaries for people dedicated to the community doing the things that I’ll discuss below.  Overall, the ROI for those investments must be clear!

The foundation does “governance.” But, what does that mean? Here is a list of vitally important work that the foundation is responsible for.

  • Branding – Protecting, certifying, and promoting the OpenStack brand is important because it ensures that "OpenStack" has a valuable and predictable meaning to contributors and users. A strong brand also means a stronger temptation for people to abuse it by claiming compatibility, participation and integration.
  • API – Many would assume that the OpenStack API is the very heart of the project and there is merit to this position. As more and more OpenStack implementations emerge, it is essential that we have a body that can certify which implementations (and even which versions of the implementation!) are valid. This is a substantial value to the community because API integrity ensures project continuity and helps the ecosystem monetize the project. Note: my opinion differs from others here because I think we should favor API over implementation.
  • Community – The OpenStack community is not an accident. It is the function of deliberate actions and choices made by Rackspace and supported by key contributors. That community requires virtual and physical places to coalesce and leaders to organize and manage those meeting places. The excellent conferences, wikis, blogs, media awareness, documentation and meetups are a product of consistent community management.
  • Arbitration – An open source community is a family and siblings do not always get along. Today, Rackspace must be very careful about balancing their own interests because they are like the oldest sibling playing the parent role – you can get away with it until something serious happens. We need a neutral party so that Rackspace can protect their own interests (alternate spin: because Rackspace protects their own interests at the expense of the community).
  • Leadership – OpenStack today is a collection of projects with individual leadership. We will increasingly need coordinated leadership as the number of projects and users increases. Centralized leadership is essential because the good of the project as a whole may mean sacrifices within individual projects. It may even mean that some projects choose to leave the OpenStack tent. Stewarding these challenges will require a new level of leadership.
  • Legal – This is a function of all the above but also something more. From a legal standpoint, OpenStack must be able to represent itself. There is a significant amount of intellectual property being created. It would be foolish to overlook that this property is valuable and needs adequate legal representation.

I used “vitally important” to describe the above items. Is that an exaggeration? Our goal is collaboration and that requires some infrastructure and rules to make it sustainable. We must have a foundation that encourages innovation (multiple implementations) and collaboration (discourages forking). Innovation and collaboration are the heartbeat of an open source project.

The foundation is vitally important because collaboration by competitors is fragile.

In addition to the core areas above, the foundation needs to handle routine tactical items such as:

  • Delivering on milestones & releases
  • Moving new subprojects into OpenStack
  • Electing and maintaining Project Policy Board
  • Electing and maintaining Project Technical Leads
  • Ensuring adherence and extensions to the current bylaws

At the end of the day, OpenStack monetization is the central value for the Foundation.

In order for the OpenStack project, and thus its foundation, to flourish, the contributors, ecosystem, sponsors and users of the project must be able to see a reasonable return (ROI) on their investment. I would love to believe that the foundation is all about people banding together to solve important problems for the benefit of all; however, it is more realistic to embrace that we can both collaborate and profit simultaneously. Acknowledging the pragmatic, self-interested view allows us to create the right incentives and processes as embodied by the OpenStack Foundation.

OpenStack Keystone makes smart & bold move to improve quality

Just after the OpenStack Essex 3 milestone, Ziad Sawalha of Rackspace announced a major shift in the Keystone code base. I applaud the clarity of Ziad's email but want to restate my understanding of the changes here rather than simply parrot him.

These changes improve Keystone and OpenStack in several ways.

The Keystone team is keeping the current APIs while swapping their implementation. They recommend switching back to an implementation based on the Rackspace Cloud Builders' Keystone Light code base. I say switching back because my team at Dell has some experience with the Keystone Light (KSL) code: KSL was used in our first Diablo release work while legacy Keystone (Diablo Keystone?) was being readied for release. Upon reflection, the confusion around Keystone readiness for Diablo may have been an indicator of some disconnects that ultimately contributed to last week's decision.

This is not an 11th-hour rewrite. Keystone Light (now Essex Keystone?) offers:

  • An existing code base that has been proven in real deployments
  • Stronger identity pluggability, better EC2 compatibility and higher production readiness
  • An existing testing framework and proven extensibility and flexibility
  • Plus, the team has committed to ensure a simple migration path

Beyond the code and Keystone, making a change like this takes confidence and guts.

This change is not all sunshine and rainbows. Making a major change midway through the release cycle introduces schedule and delivery risk. Even though not fully graduated to core project status, Keystone is already an essential component in OpenStack. People will certainly raise valid questions about production readiness and code churn within the project. Changes like these are the reality for any major project and doubly so for platforms.

The very fact that this change is visible and discussed by the OpenStack community shows our strength.

Acknowledging and quickly fixing a weakness in the OpenStack code base is exactly the type of behavior that the community needs to be successful and converge towards a great platform. The fact that maintaining the API is a priority shows that OpenStack is moving in the direction of more API based standards. While the Keystone change is not a recommendation for dual implementations (the Diablo Keystone fork will likely die out), it should help set the stage for how the community will handle competing implementations. If nothing else, it is a strong argument for maintaining API tests and compliance.

The Keystone change is a forward-looking one. Our Crowbar team will investigate how we will incorporate it.  As part of OpenStack, the new Keystone code will (re)surface for the Essex deployment, and that code will be part of the Dell OpenStack-Powered Cloud.  This work, like the previous work, will be done in the open as part of the OpenStack barclamps that we maintain on the Crowbar GitHub.

Crowbar+OpenStack Insights for the week: Food Fight Podcast & Boston Meetup 2/1

Please don’t confuse a lack of posts with a lack of activity!  I’ve been in the center of a whirlwind of Crowbar, OpenStack and Hadoop for my team at Dell.  I’ve also working on an interesting side project with Liquid Leadership author (and would-be star ship captain) Brad Szollose.

I just don’t have time to post all of the awesomeness.  I can tell you that my team is very focused on Hadoop (RHEL 6.2/CentOS 6.2 + open Cloudera Distro) barclamps as we get some Diablo deployments done.  Also the Crowbar list has been very active about Diablo.  If you’re looking for advanced information, there is  some inside scoop on the Crowbar FoodFight podcast I did with Bryan Berry & Matt Ray.

I’ll be in BOSTON THIS WEDNESDAY 2/1 for the OpenStack Meetup there.  We’re going to be talking about Quantum and the OpenStack Foundation.  I suspect that Keystone will come up too (but that’s the subject of another post).  Of course, it’s not just your humble blogger: the whole Dell CloudEdge OpenStack/Crowbar team will be on hand!  So put on your cloud geek hat and take a trip to Harvard for the meetup!

CloudOps white paper explains “cloud is always ready, never finished”

I don’t usually call out my credentials, but knowing the I have a Masters in Industrial Engineering helps (partially) explain my passion for process as being essential to successful software delivery. One of my favorite authors, Mary Poppendiek, explains undeployed code as perishable inventory that you need to get to market before it loses value. The big lessons (low inventory, high quality, system perspective) from Lean manufacturing translate directly into software and, lately, into operation as DevOps.

What we have observed from delivering our own cloud products, and working with customers on theirs, is that the operations process for deployment is as important as the software and hardware. It is simply not acceptable for us to market clouds without a compelling model for maintaining the solution into the future. Clouds are simply moving too fast to be delivered without a continuous delivery story.

This white paper [link here!] has been available since the OpenStack conference, but it was not linked from the rest of our OpenStack or Crowbar content.