
About Rob H

A Baltimore transplant to Austin, Rob thinks about ways of building scale infrastructure for the clouds using Agile processes. He sat on the OpenStack Foundation board for four years. He co-founded RackN to enable software that creates hyperscale converged infrastructure.

Avoid false agreements and saying no with a yes. #TeamDeath


One of my favorite things about Agile is how it helps teams commit to a shared goal.  There are so many distractions and confusions that we need to double down on ways to help people get and then stay on the same page.  In some cases, it comes down to something as simple as word choice!

First, I feel like I need some explanation…

There comes a time in any disagreement when the team needs everyone to get on the same page even if they don’t agree.  As a rule, this should be a relatively small window (maybe 20 minutes max) because the team can defer issues by having a sprint-long spike* or exploration story that collects more information to settle arguments down the road.

Personal Experience Note: A team should NEVER spend much time arguing about the mid or long-term future!  It’s just not worth the time to convince someone that your vision is more compelling.  It’s more efficient to accept that there are MULTIPLE VALID FUTURES and that the team needs to watch to see which one(s) is  taking shape.  There is no need to be “right” about the future.

So, back to the fake agreement phrases that effective teams avoid.

#1 “Yes, but…”

This statement really means “Will you shut up already?  I don’t agree.”  The speaker says “yes” to acknowledge the first person has finished; however, it does not mean that they agree.  The confusing thing is the speaker typically does not even realize that they are sending you into a discussion death spiral. 

Anytime someone says “but,” they are disagreeing.  Just for fun, try having discussions where people are not allowed to say “but”; it creates a whole new positive dynamic.

#2 “I don’t disagree”

This statement really means “You are full of shit and my opinion is more right.”  The speaker is trying to avoid addressing your points directly and refocus discussion on their opinion.  Agreement means that everyone believes the same thing.  There are many ways to not agree and only one way to agree.

This is one of my pet peeves because the speaker thinks they are rewarding you with some back-handed pat on the head.  In reality, they are shutting your ideas down without validation or acknowledgement.

There are many such statements that waste team time and mask disagreement.  If you have some that bug you, please comment on this post and add to the dialog.  I’m sure that I won’t disagree with any of them!

* Spike stories are time bounded stories that have specific research or opinion deliverables.  They are intended to collect enough information that the team can take action and move forward.   Sometimes these are also called “time box” stories.

Preview Crowbar GUI (OpenStack Installer by Dell for Cloud)

I can’t show you the really cool Overview screen yet, but here’s the one that replaces the one we’ve demo’ed before.  The nodes are grouped by switch and ordered by port so it creates a very nice “rack” layout if your wiring is organized.
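For readers curious how that kind of “rack” view can fall out of the data, here is a minimal sketch of grouping nodes by switch and ordering them by port. The node records, field order and the `rack_layout` name are my own assumptions for illustration; this is not Crowbar’s actual data model or code.

```python
from collections import defaultdict

# Hypothetical node records: (name, switch, port). Illustrative only.
nodes = [
    ("node-c", "switch-1", 3),
    ("node-a", "switch-2", 1),
    ("node-b", "switch-1", 1),
]


def rack_layout(nodes):
    """Group nodes by switch, then order each group by port number."""
    groups = defaultdict(list)
    for name, switch, port in nodes:
        groups[switch].append((port, name))
    # Sort switches by name and members by port so the view is stable.
    return {
        switch: [name for _, name in sorted(members)]
        for switch, members in sorted(groups.items())
    }


layout = rack_layout(nodes)
```

If the wiring is tidy (one switch per rack, ports in physical order), each list reads top to bottom like the rack itself.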

Props to Jon Roberts (@emptyflask) for his excellent UI work!

PaaS Simplified: an application architecture that responds to load


In addition to attending the great sessions at the OpenStack Design Conference, our Dell team realized that we’ve been making Platform as a Service (PaaS) much more complex than it needs to be.  Stripping away the detritus is important because it looks like “What is a PaaS” is changing on a daily basis, so boiling it down to the most fundamental definition is essential.

At its core, a PaaS is an application that changes its architecture based on the load.  That’s it; no further definition is required.

I’ve been playing with this definition since April and am finding that it’s a much more productive definition of PaaS than any that I’ve used so far.  The reason is that it’s

  1. application focused,
  2. not language or services bound and
  3. captures the business use cases
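To make the “architecture responds to load” idea concrete, here is a small sketch of my own. The tier names, thresholds and the `choose_architecture` function are hypothetical and picked only for illustration; they do not describe any specific PaaS product.

```python
def choose_architecture(requests_per_second):
    """Return a hypothetical topology (instance counts per tier) for a load.

    The thresholds and tier names are illustrative assumptions.
    """
    if requests_per_second < 100:
        # Light load: a single web node talking directly to the database.
        return {"web": 1, "cache": 0, "db_replicas": 0}
    if requests_per_second < 1000:
        # Moderate load: add a cache tier and one read replica.
        return {"web": 3, "cache": 1, "db_replicas": 1}
    # Heavy load: scale the web tier with demand (ceiling division by 400).
    web_nodes = -(-requests_per_second // 400)
    return {"web": web_nodes, "cache": 2, "db_replicas": 2}


light = choose_architecture(50)
heavy = choose_architecture(2000)
```

The point is that the shape of the application changes (tiers appear and disappear), not just the number of copies of one server.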

Of course, I’m going to have to provide more backup in future posts.  I want to invite discussion about this perspective on PaaS.  I’m especially interested in seeing how recent offerings from VMware (OpenPaaS/CloudFoundry) or Amazon (Elastic Beanstalk) measure against this concept.

Bad Premise: Cloud Outages are *not* driving IT back to premises


I wrote this in response to Lauren Carlson’s (Software Advice) blog post.  Lauren, I’d be more likely to agree with the statement that “SLAs are dead.”  Here’s why…

<soapbox>

Recent industry buzz about cloud service level agreements (SLAs) and reliability misses the core point about cloud.  Cloud is about agility, business models, consumerization of software and merciless pursuit of efficiency.

The fact that Amazon EC2 built its base without an “enterprise” SLA is exhibit #1 that the IT world changed and it’s not going back.

Here are my reasons why IT pandoras can’t get cloud back into the box.

#1. Cloud has vastly superior network connectivity

The concept of your users accessing your applications from inside your firewall is so 2005.  Today’s reality is that a significant amount of network access is externally routed, which means that applications need to live where they have excellent bandwidth to their users and to other applications.

#2. Cloud has elastic consumption of resources

Cloud is not less expensive infrastructure; it is mainly more flexible.  If you’re worried about an outage, then cloud is exactly the investment for you because you can position a backup site at another location without having to pay for online resources.  It’s much harder to take down a site that invests the time to design a system that dynamically reallocates load between sites.
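As a rough illustration of what “dynamically reallocates load between sites” can mean, here is a minimal sketch. The site names, the weights and the `reallocate` function are assumptions of mine, not a real load balancer API.

```python
def reallocate(weights, healthy):
    """Shift traffic away from unhealthy sites.

    weights: hypothetical mapping of site name to relative traffic weight.
    healthy: set of site names currently passing health checks.
    Returns each site's share of traffic, preserving the proportions
    among the sites that are still healthy.
    """
    live = {site: w for site, w in weights.items() if site in healthy}
    total = sum(live.values())
    if total == 0:
        raise RuntimeError("no healthy site available")
    return {site: (live[site] / total if site in live else 0.0)
            for site in weights}


# Example: the "east" site goes down and its load moves to the others.
shares = reallocate({"east": 3, "west": 1, "central": 1}, {"west", "central"})
```

A real system would drive the `healthy` set from monitoring and push the result into DNS or a global load balancer; the sketch only shows the reallocation step.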

#3. Cloud drives more robust architecture

The fact that cloud delivery is more opaque and modular without a five 9s SLA has driven a cloud application architecture revolution (see CAP).  We have shifted the app paradigm from robust scale up hardware to robust scale out software.  Also significant, DevOps innovations have made deployments repeatable and adaptable.

The only “logical” argument for pulling applications back from the cloud is to assert control over more of the delivery chain for your application.  It’s the same reason that we think that driving is safer than flying – we’re the ones sitting behind the wheel when we drive.  News flash – driving is NOT safer than flying.

Cloud applications are not about hardware infrastructure, they are about SOFTWARE.  Perhaps one of the greatest disservices foisted on the market was saying cloud is synonymous with “Infrastructure as a Service” and “Virtualization.”  Cloud applications are powerful because we created ways that circumvent the limitations of IaaS and VMs!

</soapbox>

Cybera’s OpenStack efforts (includes Dell xrefs)

It’s awesome to see new deployments of OpenStack so I wanted to point out Cybera’s post about their OpenStack efforts!

Everett Toews does a nice job talking about the rationale for their decisions including some analysis of their hardware and vendor selections.  Of course, I’m also happy to have them posting links back to my Dell team’s white paper and content. 

I wanted to highlight one point that Everett makes:

“Is this the best mix of hardware possible for OpenStack? As always the answer is, “It depends.” It depends primarily on your the use cases for your cloud. We think we got a good mix of hardware but time will truely tell if it was the best mix possible for DAIR.”

I strongly agree.  We (Dell) still recommend starting with a smaller, general-purpose hardware configuration that can be easily repurposed.  Once you’ve figured out how your application maps into OpenStack, we’ll be ready to work together to tune that order for 1000s of servers.

Not all APIs are equal: the power of API + implementation (OpenStack vs LibCloud vs DeltaCloud)


I’ve been getting a lot of questions about Apache LibCloud and Red Hat’s DeltaCloud vs. OpenStack.  While all of these projects offer APIs, only OpenStack is based on an implementation.

Having an implementation means that the API is reflected by code that delivers the functionality of the API.  This means that the implementation-based API more closely reflects the actual workings of the system, while the “pure” API must abstract the workings of multiple systems.  The API-only approach ends up having to become a least common denominator instead of a vision of the pure use cases.

LibCloud and DeltaCloud are important and useful.  They provide abstractions that help developers write applications without being tied to a specific cloud vendor.  While lack of lock-in is a concrete benefit, it comes at a price.  The price is that the API shim cannot expose features that differentiate the platforms.  This may represent a significant loss of functionality or performance.
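The least-common-denominator effect is easy to see in a toy model. In this sketch the cloud names and feature names are entirely made up (they are not the real capabilities of LibCloud, DeltaCloud, OpenStack or any vendor); the only point is that a portable shim can safely expose just the intersection.

```python
# Hypothetical feature sets for three imaginary clouds. Illustrative only.
CLOUD_FEATURES = {
    "cloud_a": {"create_server", "delete_server", "snapshot", "live_migrate"},
    "cloud_b": {"create_server", "delete_server", "snapshot", "spot_pricing"},
    "cloud_c": {"create_server", "delete_server", "floating_ip"},
}


def portable_features(clouds):
    """Features an abstraction layer can expose on every listed cloud."""
    return set.intersection(*(CLOUD_FEATURES[c] for c in clouds))


def hidden_features(cloud, clouds):
    """Differentiating features of one cloud that the shim cannot expose."""
    return CLOUD_FEATURES[cloud] - portable_features(clouds)


everywhere = portable_features(["cloud_a", "cloud_b", "cloud_c"])
lost_on_a = hidden_features("cloud_a", ["cloud_a", "cloud_b", "cloud_c"])
```

Every cloud added to the supported list can only shrink the portable set, which is the price of avoiding lock-in.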

When developers implement directly against an implemented API, they can take advantage of the full feature set of their target cloud.  They can also test and verify more directly.  These are significant benefits that result in richer, more robust and faster to market products.

Both approaches have their place and are needed in the market.  If I needed to write against multiple clouds for portability then Libcloud is a slam dunk.  If I needed rich features and an ecosystem then OpenStack or Amazon are better choices.

Hungry for Nova Cuisine? Adding Chef recipes for OpenStack Nova

As promised, here’s the other drop in advance of our OpenStack team’s Crowbar release. 

This is the second part of the Swift and Nova recipes that we are intentionally leaking out to the community.

USAGE NOTE: These recipes are designed to work with Crowbar!  They are not intended to stand alone.

As part of our collaboration with Opscode, Matt Ray has been merging our recipes into his most excellent OpenStack cookbook tree.  If you want to see our unmerged recipes, we’re also posting those to our github.

In addition to our Swift recipes, you can now check out the Nova recipes.

ADDITIONAL USAGE NOTE: Matt’s tree is more complete; these are released for reference only.  They will ultimately be maintained as part of Crowbar.

Virtualizing #OpenStack Nova: looking at the many ways to skin the CAcTus (#KVM v #XenServer v #ESX)

<service bulletin> Server virtualization is not cloud: it is a commonly used technology that creates convenient  resource partitions for cloud operations and infrastructure as a service providers. </service bulletin>

OpenStack claims support for nearly every virtualization platform on the market.  While the basics of “what is virtualization” are common across all platforms, there are important variances in how these platforms are deployed.   It is important to understand these variances to make informed choices about virtualization platforms. 

Your virtualization model choice will have deep implications on your server/networking choice, deployment methodology and operations infrastructure.

My focus is on architecture, not specific hypervisors, so I’m generalizing to just three to make each architecture description more concrete:

  1. KVM (open source) is widely used by developers and on single-host systems
  2. XenServer (open/freemium) leads public cloud infrastructure (Amazon EC2, Rackspace Cloud, and GoGrid)
  3. ESX/vCenter (licensed) leads enterprise virtualized infrastructure

Of course, there are many more hypervisors and many different ways to deploy the three I’m referencing.

This picture shows all three options as a single system.  In practice, only operators wishing to avoid exposure to RESTful recreational activities would implement multiple virtualization architectures in a single system.   Let’s explore the three options:

OS + Hypervisor (KVM) architecture deploys the hypervisor as a free-standing application on top of an operating system (OS).  In this model, the service provider manages the OS and the hypervisor independently.  This means that the OS needs to be maintained, but it also allows the OS to be enhanced to better manage the cloud or add other functions (such as shared storage).  Because they are least restricted, free-standing hypervisors lead the virtualization innovation wave.

Bare Metal Hypervisor (XenServer) architecture integrates the hypervisor and the OS as a single unit.  In this model, the service provider manages the hypervisor as a single unit.  This makes it easier to support and maintain the hypervisor because the platform can be tightly controlled; however, it limits the operator’s ability to extend or multi-purpose the server.   In this model, operators may add agents directly to the individual hypervisor but would not make changes to the underlying OS or resource allocation.

Clustered Hypervisor (ESX + vCenter) architecture integrates multiple servers into a single hypervisor pool.  In this model, the service provider does not manage the individual hypervisor; instead, they operate the environment through the cluster supervisor.  This makes it easier to perform resource balancing and fault tolerance within the domain of the cluster; however, the operator must rely on the supervisor because directly managing the system creates a multi-master problem.  Lack of direct management improves supportability at the cost of flexibility.  Scale is also a challenge for clustered hypervisors because their span of control is limited to practical resource boundaries: this means that large clouds add complexity as they deal with multiple clusters.
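To put a number on the multi-cluster point, here is a back-of-the-envelope sketch. The per-cluster host limit is a made-up figure for illustration, not a documented limit of vCenter or any other product, and `clusters_needed` is a name I invented.

```python
import math


def clusters_needed(total_hosts, max_hosts_per_cluster):
    """How many clusters (and supervisors) a cloud of this size requires."""
    if max_hosts_per_cluster <= 0:
        raise ValueError("max_hosts_per_cluster must be positive")
    return math.ceil(total_hosts / max_hosts_per_cluster)


# Hypothetical example: 1000 hosts with an assumed limit of 32 per cluster.
supervisors = clusters_needed(1000, 32)
```

With an agent-per-host model the operator touches every host directly; with clusters the operator touches fewer control points, but has to coordinate across all of them once the cloud outgrows a single cluster.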

Clearly, choosing a virtualization architecture is difficult with significant trade-offs that must be considered.  It would be easy to get lost in the technical weeds except that the ultimate choice seems to be more stylistic.

Ultimately, the choice of virtualization approach comes down to your capability to manage and support cloud operations.  The Hypervisor+OS approach offers maximum flexibility and minimum cost but requires an investment to build a level of competence.  Generally, this choice pervades an overall approach to embrace open cloud operations.  Selecting more controlled models for virtualization reduces risk for operations and allows operators to leverage (at a price, of course) their vendor’s core competencies and mature software delivery timelines.

While all of these choices are seeing strong adoption in the general market, I have been looking at the OpenStack community in particular.  In that community, the primary architectural choice is an agent per host instead of clusters.  KVM is favored for development and is the hypervisor of NASA’s Nova implementation.  XenServer has strong support from both Citrix and Rackspace. 

Choice is good: know thyself.