OpenStack DefCore Review [interview by Jason Baker]

I was interviewed about DefCore by Jason Baker of Red Hat as part of my participation in OSCON Open Cloud Day (speaking Monday 11:30am).  This is just one of fifteen in a series of speaker interviews covering everything from Docker to Girls in Tech.

This interview serves as a good review of DefCore so I’m reposting it here:

Without giving away too much, what are you discussing at OSCON? What drove the need for DefCore?

I’m going to walk through the impact of the OpenStack DefCore process in real terms for users and operators. I’ll talk about how the process works and how we hope it will make OpenStack users’ lives better. Our goal is to take steps towards interoperability between clouds.

DefCore grew out of a need to answer hard and high stakes questions around OpenStack. Questions like “is Swift required?” and “which parts of OpenStack do I have to ship?” have very serious implications for the OpenStack ecosystem.

It was impossible to reach consensus about these questions in regular board meetings so DefCore stepped back to base principles. We’ve been building up a process that helps us make decisions in a transparent way. That’s very important in an open source community because contributors and users want ground rules for engagement.

It seems like there has been a lot of discussion on the OpenStack listservs about what DefCore is and what it isn’t. What’s your definition?

First, DefCore applies only to commercial uses of the OpenStack name. There are different rules for the integrated code base and community activity. That’s the source of most of the confusion.

Basically, DefCore establishes the required minimum feature set for OpenStack products.

The longer version includes that it’s a board managed process that’s designed to be very transparent and objective. The long-term objective is to ensure that OpenStack clouds are interoperable in a measurable way and that we also encourage our vendor ecosystem to keep participating in upstream development and creation of tests.

A final important component of DefCore is that we are defending the OpenStack brand. While we want a vibrant ecosystem of vendors, we must first have a community that knows what OpenStack is and trusts that companies using our brand comply with a meaningful baseline.

Are there other open source projects out there using “designated sections” of code to define their product, or is this concept unique to OpenStack? What lessons do you think can be learned from other projects’ control (or lack thereof) of what must be included to retain the use of the project’s name?

I’m not aware of other projects using those exact words. We picked up ‘designated sections’ because the community felt that ‘plug-ins’ and ‘modules’ were too limited and generic. I think the term can be confusing, but it was the best we found.

If you consider designated sections to be plug-ins or modules, then there are other projects with similar concepts. Many successful open source projects (Eclipse, Linux, Samba) are functionally frameworks that have very robust extensibility. These projects encourage people to use their code base creatively and then give back some (not all) of their lessons learned in the form of code contributions. If the scope for returning value to upstream is too broad, then sharing back can become onerous and forking ensues.

All projects must work to find the right balance between collaborative areas (which have community overhead to join) and independent modules (which allow small teams to move quickly). From that perspective, I think the concept is very aligned with good engineering design principles.

The key goal is to help the technical and vendor communities know where it’s safe to offer alternatives and where they are expected to work in the upstream. In my opinion, designated sections foster innovation because they allow people to try new ideas and to target specialized use cases without having to fight about which parts get upstreamed.

What is it like to serve as a community elected OpenStack board member? Are there interests you hope to serve that are different from the corporate board spots, or is that distinction even noticeable in practice?

It’s been like trying to row a dragon boat down class III rapids. There are a lot of people with oars in the water but we’re neither all rowing together nor able to fight the current. I do think the community members represent different interests than the sponsored seats but I also think the TC/board seats are different too. Each board member brings a distinct perspective based on their experience and interests. While those perspectives are shaped by their employment, I’m very happy to say that I do not see their corporate affiliation as a factor in their actions or decisions. I can think of specific cases where I’ve seen the opposite: board members have acted outside of their affiliation.

When you look back at how OpenStack has grown and developed over the past four years, what has been your biggest surprise?

Honestly, I’m surprised about how many wheels we’ve had to re-invent. I don’t know if it’s cultural or truly a need created by the size and scope of the project, but it seems like we’ve had to (re)create things that we could have leveraged.

What are you most excited about for the “K” release of OpenStack?

The addition of platform services like Database as a Service, DNS as a Service, and Firewall as a Service. I think these IaaS “adjacent” services are essential to completing the cloud infrastructure story.

Any final thoughts?

In DefCore, we’ve moved slowly and deliberately to ensure people have a chance to participate. We’ve also pushed some problems into the future so that we could resolve the central issues first. We need the community to speak up (either for or against) in order for us to accelerate: silence means we must pause for more input.

OpenStack DefCore Update & 7/16 Community Reviews

The OpenStack Board effort to define “what is core” for commercial use (aka DefCore) continues to advance.  I have blogged extensively about this topic and rely on you to review that material because this post focuses on updates from recent activity.

First, Please Join Our Community DefCore Reviews on 7/16!

We’re reviewing the current DefCore process & timeline then talking about the Advisory Havana Capabilities Matrix (decoder).

To support global access, there are TWO meetings (both will also be recorded):

  1. July 16, 8 am PDT / 1500 UTC
  2. July 16, 6 pm PDT / 0100 UTC July 17

Note: I’m presenting about DefCore at OSCON on 7/21 at 11:30!

We want community input!  The Board is going to discuss and, hopefully, approve the matrix at our next meeting on 7/22.  After that, the Board will be focused on defining Designated Sections for Havana and Icehouse (the TC is not owning that as previously expected).

The DefCore process is gaining momentum.  We’ve reached the point where there are tangible (yet still non-binding) results to review.  The Refstack effort to collect community test results from running clouds is underway: the Core Matrix will be fed into Refstack to validate against the DefCore required capabilities.

Now is the time to make adjustments and corrections!  

In the next few months, we’re going to be locking in more and more of the process as we get ready to make it part of the OpenStack by-laws (see bottom of minutes).

If you cannot make these meetings, we still want to hear from you!  The most direct way to engage is via the DefCore mailing list but 1×1 email works too!  Your input is important to us!

SDN’s got Blind Spots! What are these Projects Ignoring? [Guest Post by Scott Jensen]

Scott Jensen returns as a guest poster about SDN!  I’m delighted to share his pointed insights that expand on my previous two-part series about NFV and SDN.  I especially like his Rumsfeldian “unknowable workloads.”

In my [Scott's] last post, I talked about why SDN is important in cloud environments; however, I’d like to challenge the underlying assumption that SDN cures all ops problems.

The SDN implementations I have looked at make the following base assumption about the physical network.  From the OpenContrail documentation:

The role of the physical underlay network is to provide an “IP fabric” – its responsibility is to provide unicast IP connectivity from any physical device (server, storage device, router, or switch) to any other physical device. An ideal underlay network provides uniform low-latency, non-blocking, high-bandwidth connectivity from any point in the network to any other point in the network.

The basic idea is to build an overlay network on top of the physical network in order to utilize a variety of protocols (NetFlow, VLAN, VXLAN, MPLS, etc.) and build the networking infrastructure which is needed by the applications and, more importantly, to allow the applications to modify this virtual infrastructure to build the constructs that they need to operate correctly.
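To make that concrete, here is a rough sketch of how an application or orchestration tool might ask a Neutron-style networking API to build the overlay constructs it needs. The endpoint address, token, and names below are illustrative placeholders, not a recipe from the OpenContrail documentation:

```python
# Sketch: an application requesting overlay constructs from a Neutron-style
# OpenStack Networking (v2.0) API.  Endpoint, token, and names are placeholders.
import requests

NEUTRON = "http://controller:9696/v2.0"   # assumed API endpoint
HEADERS = {"X-Auth-Token": "<token>", "Content-Type": "application/json"}

# 1. Create a tenant network; the back-end plug-in decides whether this maps
#    to a VXLAN segment, a VLAN, MPLS labels, etc.
net = requests.post(f"{NEUTRON}/networks",
                    json={"network": {"name": "app-tier"}},
                    headers=HEADERS).json()["network"]

# 2. Attach an address plan so instances can be wired into the overlay.
requests.post(f"{NEUTRON}/subnets",
              json={"subnet": {"network_id": net["id"],
                               "ip_version": 4,
                               "cidr": "10.20.0.0/24"}},
              headers=HEADERS)
```

The point is that the application only ever sees the virtual constructs; nothing in this exchange says anything about the physical links those packets will actually cross.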

All well and good; however, what about the Physical Networks?

Under Provisioned (image credit: FunnyEarth.com)

That is where you will run into bandwidth issues, QoS issues, latency differences and where the rubber really meets the road.  Ignoring the physical network’s configuration can (and probably will) cause the entire system to perform poorly.

Does it make sense to just assume that you have uniform low latency connectivity to all points in the network?  In many cases, it does not.  For example:

  • Accesses to storage arrays have a different traffic pattern than accesses to a distributed storage system.
  • Compute resources used to house VMs running web applications are different than those which run database applications.
  • Some applications are specifically sensitive to certain networking issues such as available bandwidth, jitter, latency and so forth.
  • Other applications will perform actions over the network at certain times of the day but will not require the network resources for the rest of the day.  Classic examples of this are system backups or replication events (see the sketch below).
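To put rough numbers on that last example, here is a quick back-of-the-envelope sketch. The node counts, data sizes and link speeds are illustrative assumptions, not measurements:

```python
# Back-of-the-envelope look at how burst traffic (backups) dwarfs the
# steady-state average.  Every number here is an illustrative assumption.

def uplink_gbps_needed(nodes: int, gb_per_node: float, window_hours: float) -> float:
    """Bandwidth (Gbps) required to move nodes * gb_per_node within the window."""
    total_gigabits = nodes * gb_per_node * 8      # GB -> Gb
    return total_gigabits / (window_hours * 3600)

# Steady state: 100 web nodes each averaging 50 Mbps all day.
steady_gbps = 100 * 0.05
# Burst: the same 100 nodes each push a 200 GB backup inside a 2-hour window.
burst_gbps = uplink_gbps_needed(nodes=100, gb_per_node=200, window_hours=2)

print(f"steady-state demand: {steady_gbps:.1f} Gbps")    # ~5 Gbps
print(f"backup-window demand: {burst_gbps:.1f} Gbps")    # ~22 Gbps
```

A physical network sized for the steady-state average will be badly oversubscribed every night, and one sized for the burst will sit mostly idle the rest of the day; that is exactly the trade-off the overlay alone cannot see.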

Over Provisioned (image credit: zilya.net)

If the infrastructure you are trying to implement is truly unknown as to how it will be utilized, then you may have no choice but to over-provision the physical network.  In building a public cloud, the users will run whichever applications they wish, so it may not be possible to engineer for the appropriate traffic patterns.

This unknowable workload is exactly what these types of SDN projects are trying to target!

When designing these systems, however, you do have a good idea of how they will be utilized, or at least how specific portions of the system will be utilized, and you need to account for that when building up the physical network under the SDN.

It is my belief that SDN applications should not just create an overlay.  That is part of the story, but they should also take into account the physical infrastructure and assist with modifying the configuration of the physical devices.  This balance achieves the best use of the network for both the applications which are running in the environment AND for the systems which they run on or rely upon for their operations.

Correctly Provisioned

We need to reframe our thinking about SDN because we cannot just keep assuming that the speed of the network will follow Moore’s Law or that the network is an unlimited resource.

You need a Squid Proxy fabric! Getting Ready State Best Practices

Sometimes solving a small problem well makes a huge impact for operators.  Talking to operators, it appears that automated configuration of Squid does exactly that.

Not a SQUID but...

If you were installing OpenStack or Hadoop, you would not find “set up a squid proxy fabric to optimize your package downloads” in the install guide.   That’s simply out of scope for those guides; however, it’s essential operational guidance.  That’s what I mean by open operations and creating a platform for sharing best practices.

Deploying a base operating system (e.g., CentOS) on a lot of nodes creates bit-tons of identical internet traffic.  By default, each node will attempt to reach internet mirrors for packages.  If you multiply that by even 10 nodes, that’s a lot of traffic and a significant performance impact if your connection is limited.

For OpenCrowbar developers, the external package resolution means that each dev/test cycle with a node boot (which happens up to 10+ times a day) is bottlenecked.  For QA and install, the problem is even worse!

Our solution was 1) to embed Squid proxies into the configured environments and 2) to automatically configure nodes to use the proxies.   By making this behavior the default, we improve the overall performance of a deployment.   This further improves the overall network topology of the operating environment while adding improved control of traffic.
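To illustrate the node-side half of that (pointing each node at the proxy), here is a minimal sketch of the kind of configuration the orchestration drops onto a node. This is not the actual Chef recipe Crowbar uses; the proxy address and file paths are assumptions for illustration:

```python
# Sketch: point a freshly provisioned CentOS node at the deployment's Squid
# proxy so package downloads hit the local cache instead of internet mirrors.
# The proxy address and file paths are illustrative assumptions, not the
# actual Crowbar/Chef configuration.
PROXY = "http://10.124.0.10:8123"   # assumed Squid endpoint inside the deployment

def add_yum_proxy(path="/etc/yum.conf"):
    """Append a proxy= line to yum.conf unless one is already present."""
    with open(path) as f:
        if any(line.startswith("proxy=") for line in f):
            return
    with open(path, "a") as f:
        f.write(f"proxy={PROXY}\n")

def add_shell_proxy(path="/etc/profile.d/proxy.sh"):
    """Export proxy variables for tools (curl, pip, gem) that honor them."""
    with open(path, "w") as f:
        f.write(f'export http_proxy="{PROXY}"\n')
        f.write(f'export https_proxy="{PROXY}"\n')
        f.write('export no_proxy="localhost,127.0.0.1"\n')

if __name__ == "__main__":
    add_yum_proxy()
    add_shell_proxy()
```

The value is not in these few lines; it’s that they happen automatically on every node, every time, without an operator having to remember them.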

This is a great example of how Crowbar uses existing operational tool chains (Chef configures Squid) in best practice ways to solve operations problems.  The magic is not in the tool or the configuration, it’s that we’ve included it in our out-of-the-box default orchestrations.

It’s time to stop fumbling around in the operational dark.  We need to compose our tool chains in an automated way!  This is how we advance operational best practice for ready state infrastructure.

Who’s in charge here anyway? We need to start uncovering OpenStack’s Hidden Influencers

After the summit (#afterstack), a few of us compared notes and found a common theme in an underserved but critical part of the OpenStack community.  Sean Roberts (his post), Allison Randal (her post), and I committed to expand our discussion to the broader community.

Lack of Product Management was a common theme at the Atlanta OpenStack summit.  That effectively adds fuel to the smoldering “lacking a benevolent dictator” commentary that lingers like smog at summits.  While I’ve come to think this criticism has merit, I think that it’s a dramatic oversimplification of the leadership dynamic.  We have plenty of leaders in OpenStack but we don’t do enough to coordinate them because they are hidden.

One reason to reject “missing product management” as a description is that there are LOTS of PMs in OpenStack.  It’s simply that they all work for competing companies.  While we spend a lot of time coordinating developers from competing companies, we have minimal integration between their direct engineering managers or product managers.

We spend a lot of time getting engineers talking together, but we do not formally engage discussion between their product or line managers.  In fact, they likely encourage them to send their engineers instead of attending summits themselves; consequently, we may not even know who those influencers are!

When the managers are omitted, the commitments made by engineers to projects are empty promises.

At best, this results in a discrepancy between expected and actual velocity.  At worst, work may be waiting on deliveries that have been silently deprioritized by managers who do not directly participate or simply felt excluded from the technical discussion.

We need to recognize that OpenStack work is largely corporate sponsored.  These managers directly control the engineers’ priorities so they have a huge influence on what features really get delivered.

To make matters worse (yes, they get worse), these influencers are often invisible.  Our tracking systems focus on code committers and completely miss the managers who direct those contributors.  Even if they had the needed leverage to set priorities, OpenStack technical and governance leaders may not know whom to contact to resolve conflicts.

We’ve each been working with these “hidden influencers” at our own companies and they aren’t a shadowy spy-vs-spy lot; they’re just human beings.  They are every bit as enthusiastic about OpenStack as the developers, users and operators!  They are frequently the loudest voices saying “Could you please get us just one or two more headcount for the team? We want X and Y to be able to spend full-time on upstream contribution, but we’re stretched too thin to spare them at the moment.”

So it’s omission, not intent, that keeps the OpenStack project from engaging managers as a class of contributors. We have clear avenues for developers to participate, but pretty much entirely ignore the managers. We say that with a note of caution, because we don’t want to bring the managers in to “manage OpenStack”.

We should provide avenues for collaboration so that as they’re managing their team of devs at their company, they are also communicating with the managers of similar teams at other companies.

This is potentially beneficial for developers, managers and their companies: they can gain access to resources across company lines. Instead of being solely responsible for some initiative to work on a feature for OpenStack, they can share initiatives across teams at multiple companies. This does happen now, but the coordination for it is quite limited.

We don’t think OpenStack needs more management; instead, we think we need to connect the hidden influencers.   Transparency and dialog will resolve these concerns more directly than adding additional process or controls.


Understanding OpenStack Designated Code Sections – Three critical questions

A collaboration with Michael Still (TC member from Rackspace) & Joshua McKenty, cross-posted by Rackspace.

After nearly a year of discussion, the OpenStack board launched the DefCore process with 10 principles that set us on a path towards a validated interoperability standard.   We created the concept of “designated sections” to address concerns that using API tests to determine core would undermine commercial and community investment in a working, shared upstream implementation.

Designated sections provide the “you must include this” part of the core definition.  Having common code as part of core is a central part of how DefCore is driving OpenStack interoperability.

So, why do we need this?

From our very formation, OpenStack has valued implementation over specification; consequently, there is a fairly strong community bias to ensure contributions are upstreamed. This bias is codified into the very structure of the GNU General Public License (GPL) but intentionally missing in the Apache License v2 (APLv2) that OpenStack follows.  The choice of Apache2 was important for OpenStack to attract commercial interests, who often consider GPL a “poison pill” because of the upstream requirements.

Nothing in the Apache license requires consumers of the code to share their changes; however, the OpenStack foundation does have control of how the OpenStack™ brand is used.   Thus it’s possible for someone to fork and reuse OpenStack code without permission, but they cannot call it “OpenStack” code.  This restriction only has strength if the OpenStack brand has value (protecting that value is the primary duty of the Foundation).

This intersection between License and Brand is the essence of why the Board has created the DefCore process.

Ok, how are we going to pick the designated code?

Figuring out which code should be designated is highly project specific and ultimately subjective; however, it’s also important to the community that we have a consistent and predictable strategy.  While the work falls to the project technical leads (with ratification by the Technical Committee), the DefCore and Technical committees worked together to define a set of principles to guide the selection.

This Technical Committee resolution formally approves the general selection principles for “designated sections” of code, as part of the DefCore effort.  We’ve taken the liberty to create a graphical representation (above) that visualizes this table using white for designated and black for non-designated sections.  We’ve also included the DefCore principle of having an official “reference implementation.”

Here is the text from the resolution presented as a table:

Should be DESIGNATED:
  • code provides the project external REST API, or
  • code is shared and provides common functionality for all options, or
  • code implements logic that is critical for cross-platform operation

Should NOT be DESIGNATED:
  • code interfaces to vendor-specific functions, or
  • project design explicitly intended this section to be replaceable, or
  • code extends the project external REST API in a new or different way, or
  • code is being deprecated

The resolution includes the expectation that “code that is not clearly designated is assumed to be designated unless determined otherwise. The default assumption will be to consider code designated.”

This definition is a starting point.  Our next step is to apply these rules to projects and make sure that they provide meaningful results.

Wow, isn’t that a lot of code?

Not really.  It’s important to remember that designated sections alone do not define core: the must-pass tests are also a critical component.   Consequently, designated code in projects that do not have must-pass tests is not actually required for an OpenStack licensed implementation.

OpenCrowbar Design Principles: Attribute Injection [Series 6 of 6]

This is part 6 of 6 in a series discussing the principles behind the “ready state” and other concepts implemented in OpenCrowbar.  The content is reposted from the OpenCrowbar docs repo.

Attribute Injection

Attribute Injection is an essential aspect of the “FuncOps” story because it helps create the clean boundaries needed to implement consistent scripting behavior across divergent sites.

It also allows Crowbar to abstract and isolate provisioning layers. This operational approach means that deployments are composed of layered services (see emergent services) instead of locked “golden” images. The layers can be maintained independently and allow users to compose specific configurations à la carte. This approach works if the layers have clean functional boundaries (FuncOps) that can be scoped and managed atomically.

To explain how Attribute Injection accomplishes this, we need to explore why search became an anti-pattern in Crowbar v1. Originally, being able to use server-based search functions in operational scripting was a critical feature. It allowed individual nodes to act as part of a system by searching for global information needed to make local decisions. This greatly aided Crowbar’s mission of system level configuration; however, it also created significant hidden interdependencies between scripts. As Crowbar v1 grew in complexity, searches became more and more difficult to maintain because they were difficult to correctly scope, hard to centrally manage and prone to timing issues.

Crowbar was not unique in dealing with this problem – the Attribute Injection pattern has become a preferred alternative to search in integrated community cookbooks.

Attribute Injection in OpenCrowbar works by establishing specific inputs and outputs for all state actions (NodeRole runs). By declaring the exact inputs needed and outputs provided, Crowbar can better manage each annealing operation. This control includes deployment scoping boundaries, the time sequencing of information, and override and substitution of inputs based on execution paths.
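To make the pattern concrete, here is a small illustrative sketch (not OpenCrowbar’s actual code; all names are hypothetical) that contrasts the v1-style search approach with a run whose inputs and outputs are declared and injected by the orchestrator:

```python
# Illustrative sketch of the Attribute Injection pattern (hypothetical names,
# not OpenCrowbar code).

# Anti-pattern (Crowbar v1 style): the script reaches out and searches global
# state, creating hidden dependencies plus scoping and timing problems.
def configure_haproxy_v1(global_index):
    members = global_index.search("roles:apache AND deployment:prod")  # hidden coupling
    return {"backends": [m["ip"] for m in members]}

# Attribute Injection (OpenCrowbar style): the orchestrator resolves the
# declared inputs for this NodeRole run and passes only those in; the run
# declares exactly which outputs it publishes back.
REQUIRED_INPUTS = ["apache/members", "haproxy/listen_port"]
PROVIDED_OUTPUTS = ["haproxy/backend_map"]

def configure_haproxy(inputs: dict) -> dict:
    missing = [key for key in REQUIRED_INPUTS if key not in inputs]
    if missing:
        raise ValueError(f"orchestrator did not inject: {missing}")
    backends = {m["name"]: f'{m["ip"]}:{inputs["haproxy/listen_port"]}'
                for m in inputs["apache/members"]}
    return {"haproxy/backend_map": backends}   # only the declared outputs

# Orchestrator-side call with scoped, already-resolved attributes:
result = configure_haproxy({
    "apache/members": [{"name": "web1", "ip": "10.0.0.11"},
                       {"name": "web2", "ip": "10.0.0.12"}],
    "haproxy/listen_port": 8080,
})
```

Because the run only ever sees what was injected, the orchestration layer (not the script) decides which deployment’s data is visible and when it is current.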

This concept is not unique to Crowbar. It has become best practice for operational scripts. Crowbar simply extends the paradigm to the system and orchestration levels.

Attribute Injection enabled operations to be:

  • Atomic – only the information needed for the operation is provided so risk of “bleed over” between scripts is minimized. This is also a functional programming preference.
  • Isolated & Idempotent – risk of accidentally picking up changed information from previous runs is reduced by controlling the inputs. That makes it more likely that scripts can be idempotent.
  • Cleanly Scoped – information passed into operations can be limited based on system deployment boundaries instead of search parameters. This allows the orchestration to manage when and how information is added into configurations.
  • Easy to troubleshoot – since the information is limited and controlled, it is easier to recreate runs for troubleshooting. This is a substantial value for diagnostics.

Just for fun, putting themes to OpenStack Conferences

I’ve been to every OpenStack summit and, in retrospect, each one has a different theme.  I see these as community themes beyond the release train that cover how the OpenStack ecosystem has changed.

The themes are, of course, highly subjective and intended to spark reflection and discussion.

City | Release | Theme | My Commentary
ATL | Icehouse | It’s my sandbox! | The new marketplace is great and there are also a lot of vendors who want to differentiate their offering and are not sure where to play.
HK | Havana | Project land grab | It felt like a PTL gold rush as lots of new projects were tossed into the ecosystem mix.  I’m wary of perceived “anointed” projects that define “the way” to do things.
PDX | Grizzly | Shiny new things | We went from having a defined core set of projects to a much richer and more varied set of platforms, environments and solutions.
SD | Folsom | Breaking up is hard to do | Nova began to fragment (Cinder & Quantum, later renamed Neutron).
SF | Essex | New kids are here | Move over Rackspace.  Lots of new operating systems, providers, consulting and hosting companies participating.  Stackalytics makes it into a real commit race.
BOS | Diablo | Race to be the first | Everyone was trying to show that OpenStack could be used for real work.  Lots of startups launched.
SJC | Cactus | Oh, you like us! We need some process | This is real, so everyone was exploring OpenStack.  We clearly needed to figure out how to work together.  This is where we migrated to git.
SA | Bexar | We’re going to take over the world | We handed out rose-colored glasses that mostly turned out pretty accurate; however, some top names from that time are not in the community now (Citrix, NASA, Accenture, and others).
ATX | Austin | We choose “none of the above” | There was a building sense of potential energy while companies figured out that 1) there was a gap and 2) they wanted to fill it together.

OpenStack ATL Recap to the 11s: the danger of drama + 5 challenges & 5 successes

I’ve come to accept that the “Hallway Track” is my primary session at OpenStack events.  I want to thank the many people in the community who make that the best track.  It’s not only full of deep technical content; there are also healthy doses of intrigue, politics and “let’s fix that” in the halls.

I think honest reflection is critical to OpenStack growth (reflections from last year).  My role as a Board member must not translate into being a pom-pom waving robot cheerleader.


What I heard that’s working:

  1. Foundation event team did a great job on the logistics and many appreciate the user and operator focus.  There is no doubt that OpenStack is being deployed at scale and helping transform cloud infrastructure.  I think that’s a great message.
  2. DefCore criteria were approved by the Board.  The overall process and impact was talked about positively at the summit.  To accelerate, we need +1s and feedback because “crickets” means we need to go slower.  I’ll have to dedicate a future post to next steps and “designated sections.”
  3. Marketplace!  Great turn out by vendors of all types, but I’m not hearing about them making a lot of money from OpenStack (which is needed for them to survive).  I like the diversity of the marketplace: consulting, aaServices, installers, networking, more networking, new distros, and ecosystem tools.
  4. There’s some real growth in aaS services for OpenStack (database, load balancer, DNS, etc.).   This is the ecosystem that many want OpenStack to drive because it helps displace the Amazon cloud.  I also heard concerns about making sure these services are pluggable so companies can compete on implementation.
  5. Lots of process changes to adapt to growing pains.  People felt that the community is adapting (yeah!) but were concerned about having to re-invent tooling (meh).

There are also challenges that people brought to me:

  1. Our #1 danger is drama.  Users and operators want collaboration and friendly competition.  They are turned off by vendor conflict or strong-arming in the community (e.g.: the WSJ Red Hat article and fallout).  I’d encourage everyone to breathe more and react less.
  2. Lack of product management is risking a tragedy of the commons.  Helping companies work together and across projects is needed for our collaboration processes to work.  I’ll be exploring this with Sean Roberts in future posts.
  3. Making sure there’s profit being generated from shared code.  We need to remember that most of the development is corporate funded so we need to make sure that companies generate revenue.  The trend of everyone creating unique distros may indicate a problem.
  4. We need to be more operator friendly.  I know we’re trying but we create distance with operators when we insist on creating new tools instead of using the existing ecosystem.  That also slows down dealing with upgrades, resilient architecture and other operational concerns.
  5. Anointed projects concerns have expanded since Hong Kong.  There’s a perception that Heat (orchestration), TripleO (provisioning), and Solum (platform) are considered THE only way OpenStack solves those problems and other approaches are not welcome.  While that encourages collaboration, it also chills competition and discussion.
  6. There’s a lot of whispering about the status of challenged projects: Neutron (works with proprietary backends but not open ones, and may not stay integrated) and OpenStack bootstrap (the state of the TripleO/Ironic/Heat mix).  The issue here is NOT whether they are challenged but finding ways to discuss concerns openly (see the anointed projects concern).

I’d enjoy hearing more about success and deeper discussion around concerns.  I use community feedback to influence my work in the community and on the board.  If you think I’ve got it right or wrong then please let me know.

Hugs & Rants Welcome: OpenStack reaching out with “community IRC office hours”

This is a great move by the OpenStack community managers that I feel is worth amplifying.  Copied from the community email:

Hello folks

one of the requests in Atlanta was to setup carefully listening ears for developers and users alike so they can highlight roadblocks, vent frustration and hopefully also give kudos to people, suggest solutions, etc.

I and Tom have added two 1 hour slots to the OpenStack Meetings calendar

    •  Tuesdays at 0800 UTC on #openstack-community (hosted by Tom)
    • Fridays at 1800 UTC on #openstack-community (hosted by Stefano)

so if you have anything you’d like the Foundation to be aware of please hop on the channel and talk to us. If you don’t/can’t use IRC, send us an email and we’ll use something else: just talk to us.

Regards,

Stef

https://wiki.openstack.org/wiki/Meetings/Community