why is hardware hard? Ready State Physical Ops Meetup on Tuesday 12/2 9am PT

meh.  Compared to cloud, Ops on physical infrastructure sinks.physical outlet

Unfortunately, the cloud and scale platforms need to run someone so someone’s got to deal with it.  In fact, we’ve got to deal with crates of cranky servers and flocks of finicky platforms.  It’s enough to keep a good operator down.

If that’s you, or someone you care about, join us for a physical ops support group meetup on Tuesday 9am PT (11 central).

There is a light at the end of the tunnel!  We can make it repeatable to provision OpenStack, Hadoop and other platforms.

As a community, we’re steadily bringing best practices and proven automation from cloud ops down into the physical space.   On the OpenCrowbar project, we’re accelerating this effort using the ready state concept as a hand off point for “physical-cloud equivalency” and exploring the concept of “functional operations” to make DevOps scripts more portable.

Join me and Rafael Knuth for a discussion about how operators can work together to break the cycle.

Running with scissors > DefCore “must-pass” Road Show Starts [VIDEOS]

The OpenStack DefCore committee has been very active during this cycle turning the core definition principles into an actual list of “must-pass” capabilities (working page).  This in turn gives the community something tangible enough to review and evaluate.

Capabilities SelectionTL;DR!  We appreciate those in the community who have been patient enough to help define and learn the process we’re using the make selections; however, we also recognize that most people want to jump to the results.

This week, we started a “DefCore roadshow” with the goal of learning how to make this huge body of capabilities, process and impact easier to digest (draft write-up for review & Troy Toman’s notes).  So far we’ve had two great sessions on this topic.  We took notes and recorded at both meetups (San Francisco & Austin).

My takeaways of these initial meetups are:

  • Jump to the Capabilities right away, the process history is not needed up front
  • You need more graphics – specifically, one for the selection criteria (what do you think of my 1st attempt?)
  • Work from some examples of scored capabilities
  • Include some specific use-cases with a user, 2 types of private cloud and a public cloud to help show the impact

Overall, people like what they are hearing.  It makes sense and decisions are justified.

We need more feedback!  Please help us figure out how to explain this for the broader community.

OpenStack Core Online Forum, Oct 16 13:30 UTC / Oct 22 0100 UTC

Go Online!OpenStack Community, you are invited on an online discussion about OpenStack Core on October 16th at UTC 13:30 (8:30 am US Central) and October 22nd at UTC 0100 (8:00 pm US Central)

At the next OpenStack Foundation Board meeting, we will be setting a timeline for implementing an OpenStack Core Definition process that promotes a clear and implementation driven metric for deciding which projects should be considered “required.”  This is your chance to review and influence the process!

We’ll review the OpenStack Core Definition process (20 minutes) and then open up the channel for discussion using the IRC (#openstack-meeting) & Google Hangout on Air (link posted in IRC).

The forum will be coordinated through the IRC channel for links and questions.

Can’t make it?  The session was recorded > here!

Thinking about how to Implement OpenStack Core Definition

THIS POST IS #10 IN A SERIES ABOUT “WHAT IS CORE.”

Tied UpWe’ve had a number of community discussions (OSCON, SFO & SA-TX) around the process for OpenStack Core definition.  These have been animated and engaged discussions (video from SA-TX): my notes for them are below.

While the current thinking of a testing-based definition of Core adds pressure on expanding our test suite, it seems to pass the community’s fairness checks.

Overall, the discussions lead me to believe that we’re on the right track because the discussions jump from process to impacts.  It’s not too late!  We’re continuing to get community feedback.  So what’s next?

First…. Get involved: Upcoming Community Core Discussions

These discussions are expected to have online access via Google Hangout.  Watch Twitter when the event starts for a link.

Want to to discuss this in your meetup? Reach out to me or someone on the Board and we’ll be happy to find a way to connect with your local community!

What’s Next?  Implementation!

So far, the Core discussion has been about defining the process that we’ll use to determine what is core.  Assuming we move forward, the next step is to implement that process by selecting which tests are “must pass.”  That means we have to both figure out how to pick the tests and do the actual work of picking them.  I suspect we’ll also find testing gaps that will have developers scrambling in Ice House.

Here’s the possible (aggressive) timeline for implementation:

  • November: Approval of approach & timeline at next Board Meeting
  • January: Publish Timeline for Roll out (ideally, have usable definition for Havana)
  • March: Identify Havana must pass Tests (process to be determined)
  • April: Integration w/ OpenStack Foundation infrastructure

Obviously, there are a lot of details to work out!  I expect that we’ll have an interim process to select must-pass tests before we can have a full community driven methodology.

Notes from Previous Discussions (earlier notes):

  • There is still confusion around the idea that OpenStack Core requires using some of the project code.  This requirement helps ensure that people claiming to be OpenStack core have a reason to contribute, not just replicate the APIs.
  • It’s easy to overlook that we’re trying to define a process for defining core, not core itself.  We have spent a lot of time testing how individual projects may be effected based on possible outcomes.  In the end, we’ll need actual data.
  • There are some clear anti-goals in the process that we are not ready to discuss but will clearly going to become issues quickly.  They are:
    • Using the OpenStack name for projects that pass the API tests but don’t implement any OpenStack code.  (e.g.: an OpenStack Compatible mark)
    • Having speciality testing sets for flavors of OpenStack that are different than core.  (e.g.: OpenStack for Hosters, OpenStack Private Cloud, etc)
  • We need to be prepared that the list of “must pass” tests identifies a smaller core than is currently defined.  It’s possible that some projects will no longer be “core”
  • The idea that we’re going to use real data to recommend tests as must-pass is positive; however, the time it takes to collect the data may be frustrating.
  • People love to lobby for their favorite projects.  Gaps in testing may create problems.
  • We are about to put a lot of pressure on the testing efforts and that will require more investment and leadership from the Foundation.
  • Some people are not comfortable with self-reporting test compliance.   Overall, market pressure was considered enough to punish cheaters.
  • There is a perceived risk of confusion as we migrate between versions.  OpenStack Core for Havana seems to specific but there is concern that vendors may pass in one release and then skip re-certification.  Once again, market pressure seems to be an adequate answer.
  • It’s not clear if a project with only 1 must-pass test is a core project.  Likely, it would be considered core.  Ultimately, people seem to expect that the tests will define core instead of the project boundary.

What do you think?  I’d like to hear your opinions on this!

Stop the Presses! Austin OpenStack Meetup 7/12 features docs, bugs & cinder

Don’t miss the 7/12 OpenStack Austin meetup!  We’ve got a great agenda lined up.

This meetup is sponsored by HP (Mark Padovani will give the intro).

Topics will include

  1. 6:30 pre-meeting OpenStack intro & overview for N00bs.
  2. Anne Gentle, OpenStack Technical Writer at Rackspace Hosting, talking about How to contribute to docs & the areas needed. *
  3. Report on the Folsom.3 bug squash day (http://wiki.openstack.org/BugDays/20120712BugSquashing)
  4. (tentative) Greg Althaus, Dell, talking about the “Cinder” Block Storage project
  5. White Board – Next Meeting Topics

* if you contribute to docs then you’ll get an invite to the next design summit!   It’s a great way to support OpenStack even if you don’t write code.

OpenStack Meetup 4/12: Austin at Summit, DevStack Essex

Austin Stackers!  This Thursday is our April meetup at the Austin TechRanch.

Please RSVP so that we know how much food to get!  SUSE is this Month’s sponsor for food and my team at Dell continues to pickup the room rental.  We have 35 RSVPs as of Monday noon – this will be another popular meeting (last meeting minutes).

Topics for the meetup are:

With the Summit next week, I think it is very important that we pre-discuss Summit topics and priorities as a community.  It will help us be more productive individually and for our collective interests when we engage the larger community next week.

OpenStack Austin: What we’d like to see at the Design Summit

Last week, the OpenStack Austin user group discussed what we’d like to see at the upcoming OpenStack Design Summit. We had a strong turnout (48?!).

  1. To get the meeting started, Marc Padovani from HP (this month’s sponsor) provided some lessons learned from the HP OpenStack-Powered Cloud. While Marc noted that HP has not been able to share much of their development work on OpenStack; he was able to show performance metrics relating to a fix that HP contributed back to the OpenStack community. The defect related to the scheduler’s ability to handle load. The pre-fix data showed a climb and then a gap where the scheduler simply stopped responding. Post-fix, the performance curve is flat without any “dead zones.” (sharing data like this is what I call “open operations“)
  2. Next, I (Rob Hirschfeld) gave a brief overview of the OpenStack Essex Deploy Day (my summary) that Dell coordinated with world-wide participation. The Austin deploy day location was in the same room as the meetup so several of the OSEDD participants were still around.
  3. The meat of the meetup was a freeform discussion about what the group would like to see discussed at the Design Summit. My objective for the discussion was that the Austin OpenStack community could have a broader voice is we showed consensus for certain topics in advance of the meeting.

At Jim Plamondon‘s suggestion, we captured our brain storming on the OpenStack etherpad. The Etherpad is super cool – it allows simultaneous editing by multiple parties, so the notes below were crowd sourced during the meeting as we discussed topics that we’d like to see highlighted at the conference. The etherpad preserves editors, but I removed the highlights for clarity.

The next step is for me to consolidate the list into a voting page and ask the membership to rank the items (poll online!) below.

Brain storm results (unedited)

Stablity vs. Features

API vs. Code

  • What is the measurable feature set?
  • Is it an API, or an implementation?
  • Is the Foundation a formal-ish standards body?
  • Imagine the late end-game: can Azure/VMWare adopt OPenStack’s APIs and data formats to deliver interop, without running OpenStack’s code? Is this good? Are there conversations on displacing incumbents and spurring new adoption?
  • Logo issues

Documentation Standards

  • Dev docs vs user docs
  • Lag of update/fragmentation (10 blogs, 10 different methods, 2 “work”)
  • Per release getting started guide validated and available prior or at release.

Operations Focus

  • Error messages and codes vs python stack traces
  • Alternatively put, “how can we make error messages more ops-friendly, without making them less developer-friendly?”
  • Upgrade and operations of rolling updates and upgrades. Hot migrations?

If OpenStack was installable on Windows/Hyper-V as a simple MSI/Service installer – would you try it as a node?

  • Yes.

Is Nova too big?  How does it get fixed?

  • libraries?
  • sections?
  • make it smaller sub-projects
  • shorter release cycles?

nova-volume

  • volume split out?
  • volume expansion of backend storage systems
  • Is nova-volume the canonical control plane for storage provisioning?  Regardless of transport? It presently deals in block devices only… is the following blueprint correctly targeted to nova-volume?

https://blueprints.launchpad.net/nova/+spec/filedriver

Orchestration

  • Is the Donabe project dead?

Discussion about invitations to Summit

  • What is a contribution that warrants an invitation
  • Look at Launchpad’s Karma system, which confers karma for many different “contributory” acts, including bug fixes and doc fixes, in addition to code commitments

Summit Discussions

  • Is there a time for an operations summit?
  • How about an operators’ track?
  • Just a note: forums.openstack.org for users/operators to drive/show need and participation.

How can we capture the implicit knowledge (of mailing list and IRC content) in explicit content (documentation, forums, wiki, stackexchange, etc.)?

Hypervisors: room for discussion?

  • Do we want hypervisor featrure parity?
  • From the cloud-app developer’s perspective, I want to “write once, run anywhere,” and if hypervisor features preclude that (by having incompatible VM images, foe example)
  • (RobH: But “write once, run anywhere” [WORA] didn’t work for Java, right?)
  • (JimP: Yeah, but I was one of Microsoft’s anti-Java evangelists, when we were actively preventing it from working — so I know the dirty tricks vendors can use to hurt WORA in OpenStack, and how to prevent those trick from working.)

CDMI

Swift API is an evolving de facto open alternative to S3… CDMI is SNIA standards track.  Should Swift API become CDMI compliant?  Should CDMI exist as a shim… a la the S3 stuff.