AWS Ops patterns set the standard: embrace that and accelerate

RackN creates infrastructure-agnostic automation so you can run physical and cloud infrastructure with the same elastic operational patterns. If you want to make infrastructure unimportant, then your hybrid DevOps objective is simple:

Create multi-infrastructure Amazon equivalence for ops automation.

Ecosystem View of AWS

Even if you are not an AWS fan, they are the universal yardstick (15 minute & 40 minute presos). That goes for other clouds (public and private) and for physical infrastructure too. Their footprint is simply so pervasive that you cannot ignore “works on AWS” as a need even if you don’t need to work on AWS. Like PCs in the late-80s, we can use vendor competition to create user choice of infrastructure. That requires a baseline for equivalence between the choices. In the 90s, the Windows monopoly provided those APIs.

Why should you care about hybrid DevOps? As we increase operational portability, we empower users to make economic choices that foster innovation. That’s valuable even for AWS-locked users.

We’re not talking about “give me a VM” here! The real operational need is to build accessible, interconnected systems – what is sometimes called “the underlay.” It’s more about networking, configuration and credentials than simple compute resources. We need consistent ways to automate systems that can talk to each other and static services, have access to dependency repositories (code, mirrors and container hubs) and can establish trust with other systems and administrators.

These post-provisioning tasks are sophisticated and complex. They cannot be statically predetermined; they must be handled dynamically based on the actual resource being allocated. Without automation, this process becomes manual, glacial and impossible to maintain. Does that sound like traditional IT?
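
To make that concrete, here is a minimal sketch in plain Python. The names are entirely hypothetical and this is not Digital Rebar code; it only illustrates why underlay settings have to be computed from the resource that was actually allocated rather than baked in ahead of time:

```python
# Illustrative sketch only: not Digital Rebar code. All names (Node, pick_mirror,
# underlay_config) are hypothetical. The point: underlay settings must be derived
# from the resource that was actually allocated, not predetermined statically.
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    provider: str    # e.g. "aws", "google", "metal"
    region: str      # placement is only known after allocation
    private_ip: str  # assigned at allocation time


def pick_mirror(node: Node) -> str:
    """Choose a package/container mirror reachable from wherever the node landed."""
    # A hard-coded URL breaks as soon as the node lands in another region or provider.
    return f"https://mirror.{node.region}.{node.provider}.example.com"


def underlay_config(node: Node, peers: list[Node]) -> dict:
    """Build per-node underlay settings: repositories, trust, and peer connectivity."""
    return {
        "package_mirror": pick_mirror(node),
        # Trust bootstrap: which CA this node should accept for admin access.
        "trusted_ca": f"ca.{node.provider}.example.com",
        # Peer addresses must actually be routable between the systems involved.
        "peers": [p.private_ip for p in peers
                  if p.provider == node.provider and p is not node],
    }


if __name__ == "__main__":
    nodes = [
        Node("web-0", "aws", "us-east-1", "10.0.1.5"),
        Node("web-1", "google", "us-central1", "10.10.2.7"),
    ]
    for n in nodes:
        print(n.name, underlay_config(n, nodes))
```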

Side Note on Containers: For many developers, we are adding platforms like Docker, Kubernetes and Cloud Foundry that do these integrations automatically for their part of the application stack. This is a tremendous benefit for their use-cases. Sadly, hiding the problem from one set of users does not eliminate it! The teams implementing and maintaining those platforms still have to deal with underlay complexity.

I am emphatically not looking for AWS API compatibility: we are talking about emulating their service implementation choices.  We have plenty of ways to abstract APIs. Ops is a post-API issue.

In fact, I believe that red herring leads us to a bad place where innovation is locked behind legacy APIs. Steal APIs where it makes sense, but don’t blindly require them, because it’s the layer under them where the real compatibility challenges lurk.

Side Note on OpenStack APIs (why they diverge): Trying to implement AWS APIs without duplicating all their behaviors is more frustrating than a fresh API without the implied AWS contracts. This is exactly the problem with OpenStack variation: the APIs work, but there is no behavior contract behind them.

For example, transitioning to IPv6 is difficult to deliver because Amazon still relies on IPv4. That gap makes it impossible to create hybrid automation that leverages IPv6, because it won’t work on AWS. In my world, we had to disable default use of IPv6 in Digital Rebar when we added AWS. Another example? Amazon’s regional AMI pattern, thankfully, is not replicated by Google; however, the result is that there’s no consistent image naming pattern. In my experience, a bad pattern is generally better than inconsistent implementations.

As market dominance drives us to benchmark on Amazon, we are stuck with the good, bad and ugly aspects of their service.

For very pragmatic reasons, even AWS automation is highly fragmented. There are a large and shifting number of distinct system identifiers (AMIs, regions, flavors) plus a range of user-configured choices (security groups, keys, networks). Even within a single provider, these options make it impossible to maintain a generic automation process. Since other providers logically model themselves on AWS, we will continue to expect AWS-like behaviors from them. Variation from those norms adds effort.
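
As an illustration of that fragmentation, here is a hedged sketch using Python and boto3. The image-name filter and owner ID are assumptions added for the example (verify them for your own account); the point is the per-region lookup that even “generic” automation ends up carrying just to name the same image everywhere:

```python
# Illustration of the fragmentation problem: the "same" Ubuntu image has a different
# AMI ID in every AWS region, so even single-provider automation resolves identifiers
# dynamically. Assumes boto3 and AWS credentials are configured; the name pattern and
# owner ID below are assumptions for the example, verify them for your own account.
import boto3

REGIONS = ["us-east-1", "us-west-2", "eu-west-1"]
NAME_PATTERN = "ubuntu/images/hvm-ssd/ubuntu-*-amd64-server-*"  # assumed name filter
IMAGE_OWNER = "099720109477"  # assumed owner ID (commonly cited for Canonical)


def latest_ami(region: str) -> str:
    """Return the newest matching AMI ID for one region."""
    ec2 = boto3.client("ec2", region_name=region)
    images = ec2.describe_images(
        Owners=[IMAGE_OWNER],
        Filters=[{"Name": "name", "Values": [NAME_PATTERN]}],
    )["Images"]
    return max(images, key=lambda image: image["CreationDate"])["ImageId"]


if __name__ == "__main__":
    # Same logical image, different identifier per region: one of the per-provider
    # quirks that "generic" hybrid automation has to absorb.
    for region in REGIONS:
        print(region, latest_ami(region))
```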

Failure to follow AWS without a clear reason and an alternative path is frustrating to users.

Do you agree? Join us with Digital Rebar in creating a real hybrid operations platform.

Hidden costs of Cloud? No surprises, it’s still about complexity = people cost

Last week, Forbes and ZDNet posted articles discussing the cost of various clouds (the 451 source material is behind a paywall), full of dollar-per-hour cost analysis. Their analysis finds private infrastructure an order of magnitude cheaper (yes, cheaper) to own than public cloud; however, the open source price advantage offered by OpenStack is swallowed by the added cost of finding skilled operators and by its lack of maturity.

At the end of the day, operational concerns are the differential factor.

The Magic 8 Cube

These articles get bogged down trying to normalize clouds into a $/VM/hour analysis and bury the lede: it is the operational decisions that drive cloud costs. I explored this a while back in my “magic 8 cube” series about six added management variations between public and private clouds.

In most cases, operations decisions are not just about cost – they factor in flexibility, stability and organizational readiness. From that perspective, the additional costs of public clouds and well-known stacks (VMware) are easily justified for smaller operations. Using alternatives means paying higher salaries for scarcer talent, a cost that only larger scale can justify.

Operational complexity is a material cost that strongly detracts from new platforms (yes, OpenStack – we need to address this!)

Unfortunately, it’s hard for people building platforms to perceive the complexity experienced by people outside their community. We need to make sure that stability and operability are top-line features: complexity adds a very real cost because it comes directly back to the cost of operation.

In my thinking, the winners will be solutions that reduce BOTH cost and complexity.  I’ve talked about that in the past and see the trend accelerating as more and more companies invest in ops automation.

Talking Functional Ops & Bare Metal DevOps with vBrownBag [video]

Last Wednesday (3/11/15), I had the privilege of talking with the vBrownBag crowd about Functional Ops and bare metal deployment.  In this hour, I talk about how functional operations (FuncOps) works as an extension of ready state.  FuncOps is a critical concept for providing abstractions to scale heterogeneous physical operations.

Timing for this was fantastic since we’d just worked out ESXi install capability for OpenCrowbar (it will be exposed for work starting in Drill, the next Crowbar release cycle).

Here’s the brown bag:

If you’d like to see a demo, I’ve got hours of them posted:

Video Progression

Crowbar v2.1 demo: Visual Table of Contents [click for playlist]

VMware Integrated OpenStack (VIO) is a smart move: it’s like using a Volvo to tow your ski boat

I’m impressed with VMware’s VIO (beta) play and believe it will have a meaningful positive impact in the OpenStack ecosystem.  In the short-term, it paradoxically both helps enterprises stay on VMware and accelerates adoption of OpenStack.  The long term benefit to VMware is less clear.

From VWVortex

Sure, you can use a Volvo to tow a boat

Why do I think it’s good tactics?  Let’s explore an analogy….

My kids think owning a boat will be super fun with images of ski parties and lazy days drifting at anchor with PG13 umbrella drinks; however, I’ve got concerns about maintenance, cost and how much we’d really use it.  The problem is not the boat: it’s all of the stuff that goes along with ownership.  In addition to the boat, I’d need a trailer, a new car to pull the boat and driveway upgrades for parking.  Looking at that, the boat’s the easiest part of the story.

The smart move for me is to rent a boat and trailer for a few months to test my kids’ interest. In that case, I’m going to be towing the boat using my Volvo instead of going “all in” and buying that new Ferd 15000 (you know you want it). As a compromise, I’ll install a hitch on my trusty sedan and use it gently to tow the boat. It’s not ideal and causes extra wear on the transmission, but it’s a very low-risk way to explore the boat-owning lifestyle.

Enterprise IT already has the Volvo (VMware vCenter) and likely sees calls for OpenStack as the illusion of cool ski parties without regard for the realities of owning the boat. Pulling the boat for a while (using OpenStack on VMware) makes a lot of sense to these users. If the boat gets used, then they will buy the truck and accessories (move off VMware). Until then, they’re still learning about the open source boating lifestyle.

Putting open source concerns aside, this helps VMware lead the OpenStack play for enterprises, but it may ultimately backfire if they have not set up their long game to keep those customers.

Dell Crowbar Project: Open Source Cloud Deployer expands into the Community

Note: Cross posted on Dell Tech Center Blogs.

Background: Crowbar is an open source cloud deployment framework originally developed by Dell to support our OpenStack and Hadoop powered solutions. Recently, its scope has increased to include a DevOps operations model and other deployments for additional cloud applications.

It’s only been a matter of months since we open sourced the Dell Crowbar Project at OSCON in June 2011; however, the progress and response to the project have been overwhelming. Crowbar is transforming into a community tool that is hardware, operating system, and application agnostic. With that in mind, it’s time for me to provide a recap of Crowbar for those just learning about the project.

Crowbar started out simply as an installer for the “Dell OpenStack™-Powered Cloud Solution” with the objective of deploying a cloud from unboxed servers to a completely functioning system in under four hours. That meant doing all the BIOS, RAID, operations services (DNS, NTP, DHCP, etc.), networking, O/S installs and system configuration required to create a complete cloud infrastructure. It was a big job, but one that we’d been piecing together on earlier cloud installation projects. A key part of the project involved building on Opscode Chef Server for the many system configuration tasks. Ultimately, we met and exceeded the target with a complete OpenStack install in less than two hours.

In the process of delivering Crowbar as an installer, we realized that Chef, and tools like it, were part of a larger cloud movement known as DevOps.

The DevOps approach to deployment builds up systems in a layered model rather than using packaged images. This layered model means that parts of the system are relatively independent and highly flexible. Users can choose which components of the system they want to deploy and where to place those components. For example, Crowbar deploys Nagios by default, but users can disable that component in favor of their own monitoring system. The model also allows new components to detect that Nagios is available, automatically register themselves as clients, and set up application-specific profiles. In this way, Crowbar’s use of a DevOps layered deployment model provides flexibility for BOTH modularized and integrated cloud deployments.
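
Here is a conceptual sketch of that layered behavior in plain Python. The names are invented and this is not Crowbar code (Crowbar implements the pattern with Chef roles and search); it only shows how a later layer adapts to what earlier layers actually deployed:

```python
# Conceptual sketch of the layered model described above. This is NOT Crowbar code
# (Crowbar does this with Chef roles and search); the names are invented to show how
# a later layer adapts to whatever earlier layers actually deployed.
deployed = {}  # shared registry of "what is already running"


def deploy(name: str, config: dict) -> None:
    deployed[name] = config
    print(f"deployed {name}: {config}")


def deploy_monitoring(enabled: bool = True) -> None:
    # Operators can disable the default monitoring layer in favor of their own system.
    if enabled:
        deploy("nagios", {"server": "10.0.0.5"})


def deploy_database() -> None:
    config = {"port": 5432}
    # Layered behavior: if a monitoring layer is present, register with it and set up
    # an application-specific profile; if not, the component still deploys cleanly.
    if "nagios" in deployed:
        config["monitoring"] = {
            "server": deployed["nagios"]["server"],
            "profile": "postgresql-checks",
        }
    deploy("database", config)


if __name__ == "__main__":
    deploy_monitoring(enabled=True)  # flip to False to bring your own monitoring
    deploy_database()
```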

We believe that operations that embrace layered deployments are essential for success because they allow our customers to respond to the accelerating pace of change.  We call this model for cloud data centers “CloudOps.”

Based on the flexibility of Crowbar, our team decided to use it as the deployment model for our Apache™ Hadoop™ project (“Dell | Apache Hadoop Solution”).  While a good fit, adding Hadoop required expanding Crowbar in several critical ways.

  1. We had to make major changes in our installation and build processes to accommodate multi-operating system support (RHEL 5.6 and Ubuntu 10.10 as of Oct 2011).
  2. We introduced a modularization concept that we call “barclamps” that package individual layers of the deployment infrastructure.  These barclamps reach from the lowest system levels (IPMI, BIOS, and RAID) to the highest (OpenStack and Hadoop).

Barclamps are a very significant architecture pattern for Crowbar (a conceptual sketch follows this list):

  1. They allow other applications to plug into the framework and leverage other barclamps in the solution.  For example, VMware created a Cloud Foundry barclamp and DreamHost has created a Ceph barclamp.  Both barclamps are examples of applications that can leverage Crowbar for a repeatable and predictable cloud deployment.
  2. They are independent modules with their own life cycle.  Each one has its own code repository and can be imported into a live system after initial deployment.  This allows customers to expand and manage their system after initial deployment.
  3. They have many components such as Chef Cookbooks, custom UI for configuration, dependency graphs, and even localization support.
  4. They offer services that other barclamps can consume.  The Network barclamp delivers many essential services for bootstrapping clouds including IP allocation, NIC teaming, and node VLAN configuration.
  5. They can provide extensible logic to evaluate a system and make deployment recommendations.  So far, no barclamps have implemented more than the most basic proposals; however, they have the potential for much richer analysis.
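
To illustrate the pattern in the abstract, here is a hedged Python sketch. It is emphatically not the real Crowbar implementation (actual barclamps are Rails and Chef artifacts); every class and method name is invented to show the ideas of independent modules, consumable services, and deployment proposals:

```python
# Conceptual sketch of the barclamp pattern. This is NOT the Crowbar implementation
# (real barclamps are Rails and Chef artifacts with their own repositories); every
# name here is invented to illustrate independent modules, consumable services, and
# deployment proposals.
class Barclamp:
    name = "base"

    def services(self) -> dict:
        """Services this module exposes for other barclamps to consume."""
        return {}

    def propose(self, nodes: list) -> dict:
        """Evaluate the system and recommend a deployment (most stay basic today)."""
        return {"nodes": nodes}

    def apply(self, proposal: dict, registry: dict) -> None:
        registry[self.name] = self.services()


class NetworkBarclamp(Barclamp):
    name = "network"

    def services(self) -> dict:
        # Essential bootstrap services: IP allocation (teaming and VLANs omitted here).
        return {"allocate_ip": lambda node: f"10.0.0.{abs(hash(node)) % 250 + 1}"}


class CephBarclamp(Barclamp):
    name = "ceph"

    def apply(self, proposal: dict, registry: dict) -> None:
        # Consume a service offered by another barclamp instead of reimplementing it.
        allocate_ip = registry["network"]["allocate_ip"]
        for node in proposal["nodes"]:
            print(f"ceph on {node} at {allocate_ip(node)}")
        super().apply(proposal, registry)


if __name__ == "__main__":
    registry: dict = {}
    nodes = ["node-1", "node-2"]
    for barclamp in (NetworkBarclamp(), CephBarclamp()):
        barclamp.apply(barclamp.propose(nodes), registry)
```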

Making these changes was a substantial investment by Dell, but it greatly expands the community’s ability to participate in Crowbar development.  We believe these changes were essential to our team’s core values of open and collaborative development.

Most recently, our team moved Crowbar development into the open.  This change was reflected in our work on OpenStack Diablo (+ Keystone and Dashboard) with contributions by Opscode and Rackspace Cloud Builders.  Rather than work internally and push updates at milestones, we are now coding directly from the Crowbar repositories on GitHub.  It is important to note that, for licensing reasons, Dell has not open sourced the optional BIOS and RAID barclamps.  This level of openness better positions us to collaborate with the Crowbar community.

For a young project, we’re very proud of the progress that we’ve made with Crowbar.  We are starting a new chapter that brings new challenges such as expanding community involvement, roadmap transparency, and growing Dell support capabilities.  You will also begin to see optional barclamps that interact with proprietary and licensed hardware and software.  All of these changes are part of growing Crowbar into a framework that can support a vibrant and rich ecosystem.

We are doing everything we can to make it easy to become part of the Crowbar community.  Please join our mailing list, download the open source code or ISO, create a barclamp, and make your voice heard.  Since Dell is funding the core development on this project, contacting your Dell salesperson and telling them how much you appreciate our efforts goes a long way too.

Collaboration between Dell Crowbar & VMware Cloud Foundry – unleashes your inner cloud

Sometimes a single sprint can deliver magic: when I signed up to document how to create a Crowbar module (aka a barclamp) two weeks ago, I had no idea that it would add a new flavor to Crowbar.

I’m proud to announce that the first public non-Dell Crowbar module will be supporting the VMware Cloud Foundry Open PaaS project.

Development is still in progress (on the Crowbar “CF” branch) and you’ll be able to watch us (even help!) collaborate on this project.  Initially, the deployment will be to a single server, but we’re hoping to quickly expand to a distributed install that fully leverages the capabilities of both projects.

By creating a Crowbar module, Cloud Foundry™ is able to leverage the cloud deployment capabilities that allow it to be set up on any physical or virtualized data center.  This is core to the Crowbar message: the value of a cloud solution can best be realized when it’s coupled with open practices for deploying it.

There are many significant aspects of this collaboration:

  1. Cloud Foundry is taking the right approach to PaaS.  Their team’s perspective on PaaS mirrors my own: A PaaS is a collection of application services.  That approach makes it extensible and flexible.  Plus, they are also multi-language and multi-platform.
  2. Crowbar is proving our breadth of support.  Last week we announced upcoming RHEL support and now adding Cloud Foundry is a natural extension.  We did not design Crowbar to be a one-trick pony.  Its modular design makes it easy to extend while leveraging the existing body of work.
  3. Big companies are acting like start-ups.  Both Crowbar and Cloud Foundry are projects that focus on putting the core functionality out quickly to prove their value proposition, get feedback, and change the game.  This collaboration is positive proof of these companies being Agile and starting a project Lean.
  4. Big companies are acting in the open.  Both Dell via Crowbar and VMware via Cloud Foundry are contributing their source and working on it in the open.

Stay tuned for that “how to create a barclamp” post (or check out the barclamp rake task).

For more information:

Video of Crowbar Install & Introduction (two 15 min videos)

These Crowbar videos are the first two in a series on how to set up and use your own local Crowbar dev environment (here’s more info & the ISO).  I used VMware Workstation, but any virtualization host that supports Ubuntu 10.10 will work fine.  We use ESX, KVM and Xen for testing too.

Creating this environment is the basis for learning Crowbar, experimenting with OpenStack and creating your own barclamps.  It’s also a handy way to play with Opscode Chef since it includes a standalone Chef server.

Some items from the install video that I want to re-emphasize:

  • The admin server REQUIRES >1 GB of RAM (more is better)
  • All other nodes REQUIRE > 512 MB of RAM (more depending on what you install)
  • At least TWO NICs are required.
  • You must disable DHCP on the virtual network because Crowbar has a DHCP server and they will conflict.
  • The login for the Crowbar admin server is openstack/openstack.  You must “sudo -i” before you run the install script.
  • The default url for Crowbar is http://192.168.124.10:3000  (crowbar/crowbar)

Installing Crowbar (15 mins)

Introduction to Crowbar (15 mins)

9/30 Bonus Video: How to change the default networking


PaaS Simplified: an application architecture that responds to load


In addition to attending the great sessions at the OpenStack Design Conference, our Dell team realized that we’ve been making Platform as a Service (PaaS) much more complex than it needs to be.  Stripping away the detritus is important because “what is a PaaS” seems to change on a daily basis, so boiling it down to the most fundamental definition is essential.

At its core, a PaaS is an application that changes its architecture based on the load.  That’s it; no further definition is required.
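
As a toy illustration of that definition (the thresholds and tier names are invented, not drawn from any product), consider an application that reshapes its own architecture as load changes:

```python
# Toy illustration of the definition above: an "application" that changes its own
# architecture as load changes. The thresholds and tier names are invented for the
# example and are not drawn from any particular PaaS product.
def architecture_for(requests_per_second: float) -> dict:
    if requests_per_second < 100:
        return {"web": 1, "workers": 0, "cache": False}
    if requests_per_second < 1000:
        return {"web": 4, "workers": 2, "cache": True}
    # High load: widen every tier and shard the data layer.
    return {"web": 16, "workers": 8, "cache": True, "db_shards": 4}


if __name__ == "__main__":
    for load in (50, 500, 5000):
        print(load, "req/s ->", architecture_for(load))
```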

I’ve been playing with this definition since April and am finding it a much more productive definition of PaaS than any I’ve used so far.  The reason is that it is:

  1. application focused,
  2. not language- or service-bound, and
  3. able to capture the business use cases.

Of course, I’m going to have to provide more backup in future posts.  I want to invite discussion about this perspective on PaaS.  I’m especially interested in seeing how recent offerings from VMware (OpenPaaS/Cloud Foundry) or Amazon (Elastic Beanstalk) measure against this concept.

McCrory lays out VMware vision

Props are due to Dave McCrory for his fine investigative work reading the VMware cloudy tea leaves.  Over the weekend, he posted a series of articles about VMware’s Open PaaS and VMforce offerings.  This is a significant write-up based on information gleaned from their public code check-ins that he validated with them after the fact.

I have not had time to digest it yet – check back later for actual commentary.

Alert the villagers, it’s Frankencloud!

I’m growing more and more concerned about the preponderance of Frankencloud offerings that I see being foisted onto the marketplace (no, my employer, Dell, is not guiltless).  Frankenclouds are “cloud solutions” created with duct tape, twine, wishful marketing brochures, and at least four marginally cloud-enabled products.

The official Frankencloud recipe goes like this:

  • Take 1 product that includes server virtualization (substitutions to VMware at your own risk)
  • Take 1 product that does storage virtualization (substitutions to SAN at your own risk)
  • Take 1 product that does network virtualization (substitutions to VLANs at your own risk)
  • Take 1 product that does IT orchestration (your guess is as good as any)
  • Take 1 product that does IT monitoring
  • Take 1 product that does Virtualization monitoring
  • Recommended: an unlimited Pizza budget for your IT Ops team

Combine the ingredients at high voltage in a climate-conditioned environment.  Stir in seriously large amounts of consulting services, training, and Red Bull.  At the end of this process, you will have your very own Frankencloud!

Frankenclouds are notoriously difficult to maintain because each part has its own version life cycle.  More critically, they also lack a brain.

Unfortunately, there are few alternatives to the Frankencloud today.  I think that the alternatives will rewrite the rules that Ops uses to create clouds.  Here are the rules that I think help drive a wooden stake through the heart of the Frankencloud (yeah, I mixed monsters):

  • Don’t assume that server virtualization == cloud.
  • Keep it simple, simple and simpler than that.
  • Focus on applications (I need to write more about DevOps).
  • Start with networking, not computation.
  • Assume that software containers are replaced, not upgraded.

What do you think we can do to defeat Frankenclouds?