Podcast: Gina Rosenthal (Minks) on Ops Challenges, Day 2 Ops Support, and Dev Ops Communication

In this week’s podcast, we speak with Gina Rosenthal (Minks), Product Marketing Manager at VMware and an experienced sys-admin/operator. She also hosts the Wide World of Tech podcast.

  • Cloud debate: are virtualization and hypervisors a requirement?
  • What makes Ops so hard?
  • Technical communities for Day 2 Ops
  • Community support for vendors and open source
  • Is DevOps different than it was 5 years ago?
  • Devs and operators communicating and working together

Topic                                     Time (Minutes.Seconds)

Introduction                              0.00 – 0.55
Background and Current Work               0.55 – 2.05
Wide World of Tech Podcast                2.05 – 4.00
Sys-Admins and Operators                  4.00 – 4.50
vSphere & Hypervisors for Cloud           4.50 – 5.28 (Hypervisors are a MUST for cloud?)
What is a Cloud? Virtualization           5.28 – 7.33 (Building blocks are virtual?)
OpenStack Experience                      7.33 – 8.16 (Didn’t fix the metal part)
What makes Ops so hard?                   8.16 – 12.25
Devs want the latest, Ops has the old     12.25 – 16.03 (Demos and stories)
Demo Day 2 for Ops                        16.03 – 19.10 (Maintaining the product post-install)
Community: Vendor vs. Open Source         19.10 – 25.03 (Vendors not accepted in open source)
Choosing Multiple Vendors/Tech            25.03 – 27.18 (Innovations and stability)
2 Classes of Operators                    27.18 – 30.00 (Tension between new and stable is good)
DevOps is Dead                            30.00 – 37.44 (VMware covered over Ops issues)
Too Much Abstraction for Devs?            37.44 – 49.55 (Key to Ops and Devs communication)
Wrap Up                                   49.55 – END

Podcast Guest
Gina Rosenthal (Minks), Product Marketing Manager, VMware

I have a varied background: technical trainer, *nix sysadmin, technical training developer, community manager, social media marketing manager, and now product marketing manager.

Those are just my paid gigs; I also have a social justice background and have been blogging for 12 years. All these threads weave together in interesting and powerful ways.

At my core, I’m a storyteller and educator. I’m interested in telling the story of technology in simple, clear terms.

To avoid an echo chamber, OpenStack must embrace the competitive cloud ecosystem

[Image: Japanese Bullet Train View]

I was in Japan before the Tokyo summit, on a bullet train to Kyoto, watching the mix of heavy industry and bucolic mountains pass by. That scene reflects an OpenStack duality: we want to be both a dominant platform delivering core cloud services and an open source, values-driven collective.

First, I fundamentally believe in the success of OpenStack as the open virtual infrastructure management platform.

I believe that we have solved the virtual compute/storage/network problem sufficiently to become the de facto open IaaS platform. While not perfect, the technologies are sufficient, assuming we continue to improve ease of use and operational hardening. Pursuing that base capability is my primary motivation for DefCore work.

I don’t believe that the OpenStack community is, or should try to become, the authority on “all things cloud.”

In the presence of Amazon, VMware, Microsoft and Google, we cannot make that claim with any degree of self-respect. Even newcomers like DigitalOcean have an undeniable footprint and influence. Those vendor platforms drive cloud ecosystems and technologies that foster fast innovation because there is no friction to joining their ecosystems and they are large and stable enough to represent a target market. We’ve seen clear signs from Rackspace, HP and others that platform diversity improves cloud strength.

I continue to think we (OpenStack) spend too much time evaluating what is “in” or “out” of the project and too little time talking about what’s “on,” “under” and “with” the project, like Kubernetes, Mesos, Docker, SDN, Hadoop and Ceph. That type of thinking creates distance between OpenStack efforts and the majority of the market.

What motivates the drive to an all-open captive community? It’s the reasonable concern that critical parts of the infrastructure will become pay-to-play. For example, what if a non-OpenStack alternative to Heat orchestration gained popularity with OpenStack implementers? Perhaps something that also ran on Amazon. That would create external pressure that would drive internal priorities. These “non-OpenStack” products would then have influence without having to contribute back upstream.

Can we afford to have external entities driving internal priorities? Hell yes, that’s what customer adoption looks like.

OpenStack does not own enough of the market to create a cloud echo chamber. The next wave of cloud innovation (my money is on container platforms) will follow the path of least resistance and widest adoption. We need to accept that these innovations will not all happen inside our community so that we can welcome them as part of our ecosystem. The community needs to find peace with that.

Hidden costs of Cloud? No surprises, it’s still about complexity = people cost

Last week, Forbes and ZDNet posted articles discussing the cost of various clouds (the 451 source material is behind a paywall), full of dollars-per-hour cost analysis.  Their analysis describes private infrastructure as an order of magnitude cheaper (yes, cheaper) to own than public cloud; however, the open source price advantage offered by OpenStack is swallowed by the added cost of finding skilled operators and by the platform’s relative immaturity.

At the end of the day, operational concerns are the differentiating factor.

[Image: The Magic 8 Cube]

These articles get bogged down trying to normalize clouds into a $/VM/hour analysis and bury the lede: it is operational decisions that drive cloud operating costs.  I explored this a while back in my “magic 8 cube” series about six added management variations between public and private clouds.
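
To see why the pure $/VM/hour framing hides the people cost, here is a toy back-of-the-envelope model.  Every number in it is a hypothetical placeholder for illustration, not a figure from the 451 analysis:

```python
# Toy TCO model: why a $/VM/hour comparison alone buries the lede.
# Every figure below is a hypothetical placeholder for illustration,
# NOT a number from the 451/Forbes/ZDNet analysis.

def effective_hourly_cost(infra_per_vm_hour: float,
                          ops_salaries_per_year: float,
                          vm_count: int) -> float:
    """Fold operator salaries into the per-VM hourly rate."""
    hours_per_year = 24 * 365
    people_per_vm_hour = ops_salaries_per_year / (vm_count * hours_per_year)
    return infra_per_vm_hour + people_per_vm_hour

# Hypothetical private cloud: cheap infrastructure, scarce/expensive operators.
small_private = effective_hourly_cost(0.03, 600_000, 500)     # ~$0.167
large_private = effective_hourly_cost(0.03, 600_000, 20_000)  # ~$0.033
# Hypothetical public cloud: higher sticker price, ops mostly outsourced.
small_public = effective_hourly_cost(0.10, 100_000, 500)      # ~$0.123

print(f"private @ 500 VMs: ${small_private:.3f}/VM/hour")
print(f"private @ 20k VMs: ${large_private:.3f}/VM/hour")
print(f"public  @ 500 VMs: ${small_public:.3f}/VM/hour")
```

At small scale the salary term dominates the hourly rate; only at larger scale does it amortize away, which is exactly the skills-cost point these comparisons gloss over.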

In most cases, operations decisions are not just about cost – they factor in flexibility, stability and organizational readiness.  From that perspective, the additional costs of public clouds and well-known stacks (VMware) are easily justified for smaller operations.  Using the alternatives means paying higher salaries for scarcer talent, and that investment requires larger scale to justify.

Operational complexity is a material cost that strongly detracts from new platforms (yes, OpenStack – we need to address this!)

Unfortunately, it’s hard for people building platforms to perceive the complexity experienced by people outside their community.  We need to make sure that stability and operability are top-line features, because complexity carries a very real price that comes directly back as cost of operation.

In my thinking, the winners will be solutions that reduce BOTH cost and complexity.  I’ve talked about that in the past and see the trend accelerating as more and more companies invest in ops automation.

VMware Integrated OpenStack (VIO) is a smart move: it’s like using a Volvo to tow your ski boat

I’m impressed with VMware’s VIO (beta) play and believe it will have a meaningful positive impact in the OpenStack ecosystem.  In the short-term, it paradoxically both helps enterprises stay on VMware and accelerates adoption of OpenStack.  The long term benefit to VMware is less clear.

[Image: Sure, you can use a Volvo to tow a boat (from VWVortex)]

Why do I think it’s good tactics?  Let’s explore an analogy….

My kids think owning a boat will be super fun with images of ski parties and lazy days drifting at anchor with PG13 umbrella drinks; however, I’ve got concerns about maintenance, cost and how much we’d really use it.  The problem is not the boat: it’s all of the stuff that goes along with ownership.  In addition to the boat, I’d need a trailer, a new car to pull the boat and driveway upgrades for parking.  Looking at that, the boat’s the easiest part of the story.

The smart move for me is to rent a boat and trailer for a few months to test my kids’ interest.  In that case, I’m going to be towing the boat with my Volvo instead of going “all in” and buying that new Ferd 15000 (you know you want it).  As a compromise, I’ll install a hitch on my trusty sedan and use it gently to tow the boat.  It’s not ideal and causes extra wear on the transmission, but it’s a very low-risk way to explore the boat-owning lifestyle.

Enterprise IT already has the Volvo (VMware vCenter) and likely sees calls for OpenStack as the illusion of cool ski parties without regard for the realities of owning the boat.  Pulling the boat for a while (using OpenStack on VMware) makes a lot of sense for these users.  If the boat gets used, then they will buy the truck and accessories (move off VMware).  Until then, they’re still learning about the open source boating lifestyle.

Putting open source concerns aside, this helps VMware lead the OpenStack play for enterprises, but it may ultimately backfire if they have not set up their long game to keep those customers.

The real workloads begin: Crowbar’s Sophomore Year

Given Crowbar’s frenetic Freshman year, it’s impossible to predict everything that Crowbar could become. I certainly aspire to see the project gain a stronger developer community, and the seeds of this transformation are sprouting. I also see community-driven work positioning Crowbar to break beyond being the platform for the OpenStack and Apache Hadoop solutions that pay the bills for my team at Dell to invest in Crowbar development.

I don’t have to look beyond this summer to see important developments for Crowbar, thanks to the substantial goals of the Crowbar 2.0 refactor.

Crowbar 2.0 is really just around the corner so I’d like to set some longer range goals for our next year.

  • Growing acceptance of Crowbar as an in-data-center extension of DevOps tools (what I call CloudOps)
  • Deeper integration into more operating environments beyond the core Linux flavors (like virtualization hosts and closed or special-purpose operating systems)
  • Improvements in dynamic networking configuration
  • Enabling more online network connected operating modes
  • Taking on production ops challenges of scale, high availability and migration
  • Formalization of our community engagement with summits, user groups, and broader developer contributions.

For example, Crowbar 2.0 will be able to handle downloading packages and applications from the internet. Online content is not a major benefit unless we can also stage and control how those new packages are deployed; consequently, our goals remain tightly focused on improvements in orchestration.
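
As a concrete illustration of that stage-and-control pattern, here is a minimal sketch. The paths, names, and flow are hypothetical illustrations of the idea, not Crowbar’s actual API:

```python
# Minimal sketch of "stage, then control deployment" for online content.
# Paths, names, and flow are hypothetical -- NOT Crowbar's actual API.
import hashlib
import urllib.request
from pathlib import Path

STAGING_DIR = Path("/var/cache/staging")  # hypothetical staging area

def stage_package(url: str, expected_sha256: str) -> Path:
    """Download a package into staging; it cannot deploy until verified."""
    STAGING_DIR.mkdir(parents=True, exist_ok=True)
    target = STAGING_DIR / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, target)
    digest = hashlib.sha256(target.read_bytes()).hexdigest()
    if digest != expected_sha256:
        target.unlink()  # reject tampered or corrupt downloads
        raise ValueError(f"checksum mismatch for {url}")
    return target

def deploy(package: Path, nodes: list[str]) -> None:
    """Orchestration step: roll the staged package out node by node."""
    for node in nodes:
        # A real orchestrator would drain, install, verify, and only then
        # advance; here we just record the intended rollout order.
        print(f"deploying {package.name} to {node}")
```

The key design point is the split: staging (download plus verify) is cheap and reversible, while deployment is the orchestrated, controlled step.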

These changes create a foundation that enables a more dynamic operating environment. Ultimately, I see Crowbar driving towards a vision of fully integrated continuous operations; however, Greg & Rob’s Crowbar vision is the topic for tomorrow’s post.

Please support me for the OpenStack Policy Board

I’m posting my OpenStack bio here and asking for your support in putting me on the Policy Board by voting for me.  NOTE: You can only vote if you’re registered and you got the “Poll: OpenStack Governance Elections” email.

Project Policy Board Objective

I am seeking a role on the OpenStack Policy Board to further the adoption of OpenStack within and beyond the community.  As the OpenStack technology lead within Dell, I am the engineer who is most actively engaged with field deployments; consequently, I am uniquely positioned to represent our development community, hosters and enterprise user bases.  I bring substantial process experience (Agile/Lean/CI) into my decision making.  My focus will be on ensuring OpenStack is deployable and ready for use.

Background

I am a Principal Engineer at Dell working as the lead for our OpenStack cloud initiative (http://dell.com/openstack).  My team at Dell is responsible for bringing hyper-scale cloud solutions to market and works closely with our cloud optimized hardware division (DCS).  Before working on the OpenStack project, I was involved in cloud projects for Azure, Eucalyptus, and Joyent at Dell.
My involvement with OpenStack goes back to the earliest days, before the project was launched, when I was part of the evaluation team that advocated for Dell to join the project.  Since then, I have been an active participant at every design conference.  It was my recommendation that Dell focus on building deployment capabilities for OpenStack and ensure that those contributions are open sourced (Apache 2).  At this point in the project, I am Dell’s technical authority on OpenStack for community and customer interactions.
My team is responsible for the Crowbar cloud deployer (http://github.com/dellcloudedge/crowbar).  The purpose of this project is to ensure that OpenStack can be quickly and reliably deployed in a wide range of configurations on any hardware platform.  I believe that ease of deployment is essential for the success of OpenStack as a project because it ensures adoption by non-developers.  I also believe strongly in continuous integration and am working to adapt Crowbar as a CI platform.  I have been the primary driver in ensuring that the Crowbar project is open sourced and accepting of input from the community.
My team also designs technical reference architectures (RAs) for OpenStack.  These RAs help drive adoption by providing crisp guidance on how to deploy OpenStack.  I am a vocal proponent of open operations (keeping best practices public) and of following a DevOps approach for ongoing cloud deployment life-cycles.
In addition to my work at Dell, I work to ensure community access and communication.  My independent blog provides technical detail and insights about the OpenStack and other cloud initiatives.  My blog also focuses on Agile and Lean practices that I believe are essential to success in technology innovation.
I have been working with cloud computing since 2001.  The company I founded with Dave McCrory (@mccrory), now owned by Quest, ran the first multi-server VMware ESX deployment outside of VMware.  We pioneered the concept of elastic VM management (look up the patents!), so I have a very deep understanding of the problems and architectures required.

Collaboration between Dell Crowbar & VMware Cloud Foundry – unleashes your inner cloud

Sometimes a single sprint can deliver magic: when I signed up to document how to create a Crowbar module (aka a barclamp) two weeks ago, I had no idea that it would add a new flavor to Crowbar.

I’m proud to announce that the first public non-Dell Crowbar module will be supporting the VMware Cloud Foundry Open PaaS project.

Development is still in progress (on the Crowbar “CF” branch) and you’ll be able to watch us (even help!) collaborate on this project.  Initially, the deployment will be to a single server, but we’re hoping to quickly expand to a distributed install that fully leverages the capabilities of both projects.

By creating a Crowbar module, Cloud Foundry™ is able to leverage the cloud deployment capabilities that allow it to be setup on any physical or virtualized data center.  This is core to the Crowbar message: the value of a cloud solution can best be realized when it’s coupled with open practices for deploying it.

There are many significant aspects of this collaboration:

  1. Cloud Foundry is taking the right approach to PaaS.  Their team’s perspective on PaaS mirrors my own: a PaaS is a collection of application services.  That approach makes it extensible and flexible.  Plus, they are also multi-language and multi-platform.
  2. Crowbar is proving our breadth of support.  Last week we announced upcoming RHEL support, and now adding Cloud Foundry is a natural extension.  We did not design Crowbar to be a one-trick pony.  Its modular design makes it easy to extend while leveraging the existing body of work.
  3. Big companies are acting like start-ups.  Both Crowbar and Cloud Foundry are projects that focus on putting the core functionality out quickly to prove their value proposition, get feedback, and change the game.  This collaboration is positive proof of these companies being Agile and starting a project Lean.
  4. Big companies are acting in the open.  Both Dell via Crowbar and VMware via Cloud Foundry are contributing their source and working on it in the open.

Stay tuned for that “how to create a barclamp” post (or check out the barclamp rake task).

For more information:

PaaS Simplified: an application architecture that responds to load


In addition to attending the great sessions at the OpenStack Design Conference, our Dell team realized that we’ve been making Platform as a Service (PaaS) much more complex than it needs to be.  Stripping away the detritus is important because “what is a PaaS” seems to change on a daily basis, so boiling it down to the most fundamental idea is essential.

At its core, a PaaS is an application that changes its architecture based on the load.  That’s it; no further definition is required.

I’ve been playing with this definition since April and am finding it much more productive than any definition of PaaS that I’ve used so far (a minimal sketch follows the list below).  The reason is that it’s

  1. application focused,
  2. not language- or service-bound, and
  3. captures the business use cases
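
Here is that definition reduced to a runnable strawman.  The thresholds, capacities, and function names are hypothetical illustrations, not any particular PaaS offering:

```python
# Strawman of the definition: "a PaaS is an application that changes its
# architecture based on the load."  All thresholds and capacities below
# are hypothetical illustrations.
import math

def desired_instances(requests_per_sec: float,
                      capacity_per_instance: float = 100.0,
                      min_instances: int = 1,
                      max_instances: int = 20) -> int:
    """The platform, not the developer, picks the instance count from load."""
    needed = math.ceil(requests_per_sec / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))

# The same application runs as a single instance at night...
print(desired_instances(40))    # -> 1
# ...and as a fleet behind a load balancer at peak.
print(desired_instances(1500))  # -> 15
```

The point of the sketch is who makes the decision: the platform observes load and changes the application’s shape; the developer never hand-picks an instance count.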

Of course, I’m going to have to provide more backup in future posts.  I want to invite discussion about this perspective on PaaS.  I’m especially interested in seeing how recent offerings from VMware (OpenPaaS/CloudFoundry) or Amazon (Elastic Beanstalk) measure against this concept.

Virtualizing #OpenStack Nova: looking at the many ways to skin the CAcTus (#KVM v #XenServer v #ESX)

<service bulletin> Server virtualization is not cloud: it is a commonly used technology that creates convenient resource partitions for cloud operations and infrastructure-as-a-service providers. </service bulletin>

OpenStack claims support for nearly every virtualization platform on the market.  While the basics of “what is virtualization” are common across all platforms, there are important variances in how these platforms are deployed.   It is important to understand these variances to make informed choices about virtualization platforms. 

Your virtualization model choice will have deep implications on your server/networking choice, deployment methodology and operations infrastructure.

My focus is on architecture, not specific hypervisors, so I’m generalizing to just three to make each architecture description more concrete:

  1. KVM (open source) is widely used by developers and on single-host systems
  2. XenServer (open/freemium) leads public cloud infrastructure (Amazon EC2, Rackspace Cloud, and GoGrid)
  3. ESX/vCenter (licensed) leads enterprise virtualized infrastructure

Of course, there are many more hypervisors and many different ways to deploy the three I’m referencing.

This picture shows all three options as a single system.  In practice, only operators wishing to avoid exposure to RESTful recreational activities would implement multiple virtualization architectures in a single system.   Let’s explore the three options:

OS + Hypervisor (KVM) architecture deploys the hypervisor as a free-standing application on top of an operating system (OS).  In this model, the service provider manages the OS and the hypervisor independently.  This means that the OS needs to be maintained, but it also allows the OS to be enhanced to better manage the cloud or to add other functions (shared storage).  Because they are least restricted, free-standing hypervisors lead the virtualization innovation wave.

Bare Metal Hypervisor (XenServer) architecture integrates the hypervisor and the OS as a single unit.  In this model, the service provider manages the hypervisor as a single unit.  This makes it easier to support and maintain the hypervisor because the platform can be tightly controlled; however, it limits the operator’s ability to extend or multi-purpose the server.   In this model, operators may add agents directly to the individual hypervisor but would not make changes to the underlying OS or resource allocation.

Clustered Hypervisor (ESX + vCenter) architecture integrates multiple servers into a single hypervisor pool.  In this model, the service provider does not manage the individual hypervisor; instead, they operate the environment through the cluster supervisor.  This makes it easier to perform resource balancing and fault tolerance within the domain of the cluster; however, the operator must rely on the supervisor because directly managing the system creates a multi-master problem.  Lack of direct management improves supportability at the cost of flexibility.  Scale is also a challenge for clustered hypervisors because their span of control is limited to practical resource boundaries: this means that large clouds add complexity as they deal with multiple clusters.
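
One way to see the management difference is structurally: per-host architectures put a driver or agent on every hypervisor, while clustered architectures talk only to the supervisor.  Here is a minimal sketch with hypothetical interfaces, not Nova’s actual driver API:

```python
# Sketch of the management-model difference across the three architectures.
# Hypothetical interfaces for illustration -- NOT Nova's actual driver API.
from abc import ABC, abstractmethod

class VirtDriver(ABC):
    @abstractmethod
    def start_vm(self, name: str) -> str: ...

class PerHostDriver(VirtDriver):
    """KVM / XenServer style: one agent per host; the cloud layer
    decides which host runs the VM."""
    def __init__(self, host: str):
        self.host = host

    def start_vm(self, name: str) -> str:
        return f"agent on {self.host}: booted {name}"

class ClusterDriver(VirtDriver):
    """ESX + vCenter style: talk only to the cluster supervisor; it places
    the VM and handles balancing/failover, so hosts are never addressed
    directly (avoiding the multi-master problem)."""
    def __init__(self, supervisor: str):
        self.supervisor = supervisor

    def start_vm(self, name: str) -> str:
        return f"{self.supervisor}: scheduled {name} somewhere in the pool"

# Per-host: the cloud schedules -- one driver (agent) per host.
print(PerHostDriver("host-07").start_vm("web-1"))
# Clustered: the supervisor schedules -- one driver per cluster.
print(ClusterDriver("vcenter-a").start_vm("web-1"))
```

Per-host models multiply the management endpoints but keep each one simple; the clustered model collapses them into one richer endpoint, which is exactly the flexibility-versus-supportability trade described above.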

Clearly, choosing a virtualization architecture is difficult with significant trade-offs that must be considered.  It would be easy to get lost in the technical weeds except that the ultimate choice seems to be more stylistic.

Ultimately, the choice of virtualization approach comes down to your capability to manage and support cloud operations.  The Hypervisor+OS approach offers maximum flexibility and minimum cost but requires an investment to build a level of competence.  Generally, this choice goes with an overall approach that embraces open cloud operations.  Selecting more controlled models for virtualization reduces risk for operations and allows operators to leverage (at a price, of course) their vendor’s core competencies and mature software delivery timelines.

While all of these choices are seeing strong adoption in the general market, I have been looking at the OpenStack community in particular.  In that community, the primary architectural choice is an agent per host instead of clusters.  KVM is favored for development and is the hypervisor of NASA’s Nova implementation.  XenServer has strong support from both Citrix and Rackspace. 

Choice is good: know thyself.

McCrory lays out VMware vision

Props are due to Dave McCrory for his fine investigative work reading the VMware cloudy tea leaves.  Over the weekend, he posted a series of articles about VMware’s Open PaaS and VMforce offerings.  This is a significant write-up based on information gleaned from their public code check-ins that he validated with them after the fact.

I have not had time to digest it yet – check back later for actual commentary.