Container Migration 101: Cloudcast.net & Lachlan Evenson

Last week, the CloudCast.net interviewed Lachlan Evenson (now at Deis!).  I highly recommend listening to the interview because he has a unique and deep experience with OpenStack, Kubernetes and container migration.

15967I had the good fortune of lunching with Lachie just before the interview aired.  We got compare notes about changes going on in the container space.  Some of those insights will end up in my OpenStack Barcelona talk “Will it Blend? The Joint OpenStack Kubernetes Environment.”

There’s no practical way to rehash our whole lunch discussion as a post; however, I can point you to some key points [with time stamps] in his interview that I found highly insightful:

  • [7:20] In their pre-containers cloud pass, they’d actually made it clunky for the developers and it hurt their devops attempts.
  • [17.30] Developers advocating for their own use and value is a key to acceptance.  A good story follows…
  • [29:50] We’d work with the app dev teams and if it didn’t fit then we did not try to make it fit.

Overall, I think Lachie does a good job reinforcing that containers create real value to development when there’s a fit between the need and the technology.

Also, thanks Brian and Aaron for keeping such a great podcast going!

 

 

yes, we are papering over Container ops [from @TheNewStack #DockerCon]

thenewstackIn this brief 7 minute interview made at DockerCon 16, Alex Williams and I cover a lot of ground ranging from operations’ challenges in container deployment to the early seeds of the community frustration with Docker 1.12 embedding swarm.

I think there’s a lot of pieces we’re still wishing away that aren’t really gone. (at 4:50)

Rather than repeat TheNewStack summary; I want to highlight the operational and integration gaps that we continue to ignore.

It’s exciting to watch a cluster magically appear during a keynote demo, but those demos necessarily skip pass the very real provisioning, networking and security work needed to build sustained clusters.

These underlay problems are general challenges that we can address in composable, open and automated ways.  That’s the RackN goal with Digital Rebar and we’ll be showcasing how that works with some new Kubernetes automation shortly.

Here is the interview on SoundCloud or youtube:

 

Why Fork Docker? Complexity Wack-a-Mole and Commercial Open Source

Monday, The New Stack broke news about a possible fork of the Docker Engine and prominently quoted me saying “Docker consistently breaks backend compatibility.”  The technical instability alone is not what’s prompting industry leaders like Google, Red Hat and Huawei to take drastic and potentially risky community action in a central project.

So what’s driving a fork?  It’s the intersection of Cash, Complexity and Community.

hamsterIn fact, I’d warned about this risk over a year ago: Docker is both a core infrastucture technology (the docker container runner, aka Docker Engine) and a commercial company that manages the Docker brand.  The community formed a standard, runC, to try and standardize; however, Docker continues to deviate from (or innovate faster) that base.

It’s important for me to note that we use Docker tools and technologies heavily.  So far, I’ve been a long-time advocate and user of Docker’s innovative technology.  As such, we’ve also had to ride the rapid release roller coaster.

Let’s look at what’s going on here in three key areas:

1. Cash

The expected monetization of containers is the multi-system orchestration and support infrastructure.  Since many companies look to containers as leading the disruptive next innovation wave, the idea that Docker is holding part of their plans hostage is simply unacceptable.

So far, the open source Docker Engine has been simply included without payment into these products.  That changed in version 1.12 when Docker co-mingled their competitive Swarm product into the Docker Engine.  That effectively forces these other parties to advocate and distribute their competitors product.

2. Complexity

When Docker added cool Swarm Orchestration features into the v1.12 runtime, it added a lot of complexity too.  That may be simple from a “how many things do I have to download and type” perspective; however, that single unit is now dragging around a lot more code.

In one of the recent comments about this issue, Bob Wise bemoaned the need for infrastructure to be boring.  Even as we look to complex orchestration like Swarm, Kubernetes, Mesos, Rancher and others to perform application automation magic, we also need to reduce complexity in our infrastructure layers.

Along those lines, operators want key abstractions like containers to be as simple and focused as possible.  We’ve seen similar paths for virtualization runtimes like KVM, Xen and VMware that focus on delivering a very narrow band of functionality very well.  There is a lot of pressure from people building with containers to have a similar experience from the container runtime.

This approach both helps operators manage infrastructure and creates a healthy ecosystem of companies that leverage the runtimes.

Note: My company, RackN, believes strongly in this need and it’s a core part of our composable approach to automation with Digital Rebar.

3. Community

Multi-vendor open source is a very challenging and specialized type of community.  In these communities, most of the contributors are paid by companies with a vested (not necessarily transparent) interest in the project components.  If the participants of the community feel that they are not being supported by the leadership then they are likely to revolt.

Ultimately, the primary difference between Docker and a fork of Docker is the brand and the community.  If there companies paying the contributors have the will then it’s possible to move a whole community.  It’s not cheap, but it’s possible.

Developers vs Operators

One overlooked aspect of this discussion is the apparent lock that Docker enjoys on the container developer community.  The three Cs above really focus on the people with budgets (the operators) over the developers.  For a fork to succeed, there needs to be a non-Docker set of tooling that feeds the platform pipeline with portable application packages.

In Conclusion…

The world continues to get more and more heterogeneous.  We already had multiple container runtimes before Docker and the idea of a new one really is not that crazy right now.  We’ve already got an explosion of container orchestration and this is a reflection of that.

My advice?  Worry less about the container format for now and focus on automation and abstractions.

 

Notes from OSCON Container Podcast: Dan Berg, Phil Estes and Rob Hirschfeld

At OSCON, I had the pleasure of doing a IBM Dojo Podcast with some deep experts in the container and data center space: Dan Berg (@DanCBerg) and Phil Estes (@estesp).

ibm-dojo-podcast-show-art-16x9-150x150We dove into a discussion around significant trends in the container space, how open technology relates to containers and looked toward the technology’s future. We also previewed next month’s DockerCon, which is set for June 19-21 in Seattle.

Highlights!  We think containers will be considered MORE SECURE next year and also have some comments about the linguistic shift from Docker to CONTAINERS.”

Here are my notes from the recording with time stamps if you want to skip ahead:

  • 00:35 – What are the trends in Containers?
    • Rob: We are still figuring out how to make them work in terms of networking & storage
    • Dan: There are still a lot of stateful work moving into containers that need storage
    • Phil: We need to use open standards to help customers navigate options
  • 2:45 – Are the changes keeping people from moving forward?
    • Phil: Not if you start with the right guidelines and architecture
    • Dan: It’s OK to pick one and keep going because you need to build expertise
    • Rob: RackN experience changed Digital Rebar to microservices was an iterative experience
  • 5:00 Dan likes that there is so much experimentation that’s forcing us to talk about how applications are engineered
  • 5:45  Rob points out that we got 5 minutes in without saying “Docker”
    • There are a lot of orchestration choices but there’s confusion between Docker and the container ecosystem.
  • 7:00 We’re at OSCON, how far has the technology come in being open?
    • Phil thinks that open container initiative (OCI) is helping bring a lot of players to the field.
    • Dan likes that IBM is experimenting in community and drive interactions between projects.
    • Rob is not sure that we need to get everyone on the same page: open source allows people to pursue their own path.
  • 10:50 We have to figure out how to compensate companies & individuals for their work
    • Dan: if you’ve got any worthwhile product, you’ve got some open source component of it.  There are various ways to profit around that.
  • 13:00 What are we going to be talking about this time next year?
    • Rob (joking) we’ll say containers are old and microkernels are great!
    • Rob wants to be talking about operations but knows that it’s never interesting
    • Phil moving containers way from root access into more secure operations
    • Dan believes that we’ll start to consider containers as more secure than what we have today.  <- Rob strongly agrees!
  • 17:20 What is the impact of Containers on Ops?  Aka DevOps
    • Dan said “Impact is HUGE!”  > Developers are going to get Ops & Capabilities for free
    • Rob brings up impact of Containers on DevOps – the discussion has really gone underground
  • 19:30 Role of Service Registration (Consul & Etcd)
    • Life cycle management of Containers has really changed (Dan)
    • Rob brings up the importance of Service Registration in container management
  • 20:30 2016.Dockercon Docket- what are you expecting?
    • Phil is speaking there on the contribute track & OCI.
    • Rob is doing the hallway track and looking to talk about the “underlay” ops and the competitive space around Docker and Container.
    • Dan will be talking to customers and watching how the community is evolving and experimenting
    • Rob & Dan will be at Open Cloud Technology Summit, June 22 in Seattle

 

OpenStack is caught in a snowstorm – it’s status quo for ops implementations to be snowflakes

OpenStack got into exactly the place we expected: operations started with fragmented and divergent data centers (aka snowflaked) and OpenStack did nothing to change that. Can we fix that? Yes, but the answer involves relying on Amazon as our benchmark.

In advance of my OpenStack Summit Demo/Presentation (video!) [slides], I’ve spent the last few weeks mapping seven (and counting) OpenStack implementations into the cloud provider subsystem of the Digital Rebar provisioning platform. Before I started working on adding OpenStack integration, RackN already created a hybrid DevOps baseline. We are able to run the same Kubernetes and Docker Swarm provisioning extensions on multiple targets including Amazon, Google, Packet and directly on physical systems (aka metal).

Before we talk about OpenStack challenges, it’s important to understand that data centers and clouds are messy, heterogeneous environments.

These variations are so significant and operationally challenging that they are the fundamental design driver for Digital Rebar. The platform uses a composable operational approach to isolate and then chain automation tasks together. That allows configurations, like networking, from infrastructure specific functions to be passed into common building blocks without user intervention.

Composability is critical because it allows operators to isolate variations into modular pieces and the expose common configuration elements. Since the pattern works successfully for crossing other clouds and metal, I anticipated success with OpenStack.

The challenge is that there is not “one standard OpenStack” implementation.  This issue is well documented under OpenStack as Project Shade.

If you only plan to operate a mono-cloud then these are not concerns; however, everyone I’ve met is using at least AWS and one other cloud. This operational fact means that AWS provides the common service behavior baseline. This is not an API statement – it’s about being able to operate on the systems delivered by the API.

While the OpenStack API worked consistently on each tested cloud (win for DefCore!), it frequently delivered systems that could not be deployed or were unusable for later steps.

While these are not directly OpenStack API concerns, I do believe that additional metadata in the API could help expose material configuration choices. The challenge becomes defining those choices in a reference architecture way. The OpenStack principle of leaving implementation choices open makes it challenging to drive these options to a narrow set of choices. Unfortunately, it means it is difficult to create an intra-OpenStack hybrid automation without hard-coded vendor identities or exploding configuration flags.

As series of individually reasonable options dominoes together to make to these challenges.  These are real issues that I made the integration difficult.

  • No default of externally accessible systems. I have to assign floating IPs (an anti-pattern for individual VMs) or be on the internal networks. No consistent naming pattern for networks, types (flavors) or starting images.  In several cases, the “private” network is the publicly accessible one and the “external” network is visible but unusable.
  • No consistent naming for access user accounts.  If I want to ssh to a system, I have to fail my first login before I learn the right user name.
  • No data to determine which networks provide which functions.  And there’s no metadata about which networks are public or private.  
  • Incomplete post-provisioning processes because they are left open to user customization.

There is a defensible and logical reason for each example above; sadly, those reasons do nothing to make OpenStack more operationally accessible.  While intra-OpenStack interoperability is helpful, I believe that ecosystems and users benefit from Amazon-like behavior.

What should you do?  Help broaden the OpenStack discussions to seek interoperability with the whole cloud ecosystem.

 

At RackN, we will continue to refine and adapt to these variations.  Creating a consistent experience that copes with variability is the raison d’etre for our efforts with Digital Rebar. That means that we ultimately use AWS as the yardstick for configuration of any infrastructure from physical, OpenStack and even Amazon!

 

SIG-ClusterOps: Promote operability and interoperability of Kubernetes clusters

Originally posted on Kubernetes Blog.  I wanted to repost here because it’s part of the RackN ongoing efforts to focus on operational and fidelity gap challenges early.  Please join us in this effort!

openWe think Kubernetes is an awesome way to run applications at scale! Unfortunately, there’s a bootstrapping problem: we need good ways to build secure & reliable scale environments around Kubernetes. While some parts of the platform administration leverage the platform (cool!), there are fundamental operational topics that need to be addressed and questions (like upgrade and conformance) that need to be answered.

Enter Cluster Ops SIG – the community members who work under the platform to keep it running.

Our objective for Cluster Ops is to be a person-to-person community first, and a source of opinions, documentation, tests and scripts second. That means we dedicate significant time and attention to simply comparing notes about what is working and discussing real operations. Those interactions give us data to form opinions. It also means we can use real-world experiences to inform the project.

We aim to become the forum for operational review and feedback about the project. For Kubernetes to succeed, operators need to have a significant voice in the project by weekly participation and collecting survey data. We’re not trying to create a single opinion about ops, but we do want to create a coordinated resource for collecting operational feedback for the project. As a single recognized group, operators are more accessible and have a bigger impact.

What about real world deliverables?

We’ve got plans for tangible results too. We’re already driving toward concrete deliverables like reference architectures, tool catalogs, community deployment notes and conformance testing. Cluster Ops wants to become the clearing house for operational resources. We’re going to do it based on real world experience and battle tested deployments.

Connect with us.

Cluster Ops can be hard work – don’t do it alone. We’re here to listen, to help when we can and escalate when we can’t. Join the conversation at:

The Cluster Ops Special Interest Group meets weekly at 13:00PT on Thursdays, you can join us via the video hangout and see latest meeting notes for agendas and topics covered.

12 Predictions for ’16: mono-cloud ambitions die as containers drive more hybrid IT

I expect 2016 to be a confusing year for everyone in IT.  For 2015, I predicted that new uses for containers are going to upset cloud’s apple cart; however, the replacement paradigm is not clear yet.  Consequently, I’m doing a prognostication mix and match: five predictions and seven items on a “container technology watch list.”

TL;DR: In 2016, Hybrid IT arrives on Containers’ wings.

Considering my expectations below, I think it’s time to accept that all IT is heterogeneous and stop trying to box everything into a mono-cloud.  Accepting hybrid as current state unblocks many IT decisions that are waiting for things to settle down.

Here’s the memo: “Stop waiting.  It’s not going to converge.”

2016 Predictions

  1. Container Adoption Seen As Two Stages:  We will finally accept that Containers have strength for both infrastructure (first stage adoption) and application life-cycle (second stage adoption) transformation.  Stage one offers value so we will start talking about legacy migration into containers without shaming teams that are not also rewriting apps as immutable microservice unicorns.
  2. OpenStack continues to bump and grow.  Adoption is up and open alternatives are disappearing.  For dedicated/private IaaS, OpenStack will continue to gain in 2016 for basic VM management.  Both competitive and internal pressures continue to threaten the project but I believe they will not emerge in 2016.  Here’s my complete OpenStack 2016 post?
  3. Amazon, GCE and Azure make everything else questionable.  These services are so deep and rich that I’d question anyone who is not using them.  At least one of them simply have to be part of everyone’s IT strategy for financial, talent and technical reasons.
  4. Cloud API becomes irrelevant. Cloud API is so 2011!  There are now so many reasonable clients to abstract various Infrastructures that Cloud APIs are less relevant.  Capability, interoperability and consistency remain critical factors, but the APIs themselves are not interesting.
  5. Metal aaS gets interesting.  I’m a big believer in the power of operating metal via an API and the RackN team delivers it for private infrastructure using Digital Rebar.  Now there are several companies (Packet.net, Ubiquity Hosting and others) that offer hosted metal.

2016 Container Tech Watch List

I’m planning posts about all these key container ecosystems for 2016.  I think they are all significant contributors to the emerging application life-cycle paradigm.

  1. Service Containers (& VMs): There’s an emerging pattern of infrastructure managed containers that provide critical host services like networking, logging, and monitoring.  I believe this pattern will provide significant value and generate it’s own ecosystem.
  2. Networking & Storage Services: Gaps in networking and storage for containers need to get solved in a consistent way.  Expect a lot of thrash and innovation here.
  3. Container Orchestration Services: This is the current battleground for container mind share.  Kubernetes, Mesos and Docker Swarm get headlines but there are other interesting alternatives.
  4. Containers on Metal: Removing the virtualization layer reduces complexity, overhead and cost.  Container workloads are good choices to re-purpose older servers that have too little CPU or RAM to serve as VM hosts.  Who can say no to free infrastructure?!  While an obvious win to many, we’ll need to make progress on standardized scale and upgrade operations first.
  5. Immutable Infrastructure: Even as this term wins the “most confusing” concept in cloud award, it is an important one for container designers to understand.  The unfortunate naming paradox is that immutable infrastructure drives disciplines that allow fast turnover, better security and more dynamic management.
  6. Microservices: The latest generation of service oriented architecture (SOA) benefits from a new class of distribute service registration platforms (etcd and consul) that bring new life into SOA.
  7. Paywall Registries: The important of container registries is easy to overlook because they seem to be version 2.0 of package caches; however, container layering makes these services much more dynamic and central than many realize.  (more?  Bernard Golden and I already posted about this)

What two items did not make the 2016 cut?  1) Special purpose container-focused operating systems like CoreOS or RancherOS.  While interesting, I don’t think these deployment technologies have architectural level influence.  2) Container Security via VMs. I’m seeing patterns where containers may actually be more secure than VMs.  This is FUD created by people with a vested interest in virtualization.

Did I miss something? I’d love to know what you think I got right or wrong!