Hey Dockercon, let’s get Physical!

IMG_20170419_121918Overall, Dockercon did a good job connecting Docker users with information.  In some ways, it was a very “let’s get down to business” conference without the open source collaboration feel of previous events.  For enterprise customers and partners, that may be a welcome change.

Unlike past Dockercons, the event did not have major announcements or a lot of non-Docker ecosystem buzz.  That said, I miss that the event did not have major announcements or a lot of non-Docker ecosystem buzz.

One item that got me excited was an immutable operating system called LinuxKit which is powered by a Packer-like utility called Moby (ok, I know it does more but that’s still fuzzy to me).

RackN CTO, Greg Althaus, was able to turn around a working LinuxKit Kubernetes demo (VIDEO) overnight.  This short video explains Moby & LinuxKit plus uses the new Digital Rebar Provision in an amazing integration.

Want to hear more about immutable operating systems?  Check out our post on RackN’s site about three challenges of running things like LinuxKit, CoreOS Container Linux and RancherOS on metal.

Oh, and YES, that was my 15-year-old daughter giving a presentation at Dockercon about workplace diversity.  I’ll link the video when they’ve posted them.

https://www.slideshare.net/KateHirschfeld/slideshelf

What does it take to Operate Open Platforms? Answers in Datanaughts 72

Did I just let OpenStack ops off the hook….?  Kubernetes production challenges…?  

ix34grhy_400x400I had a lot of fun in this Datanaughts wide ranging discussion with unicorn herders Chris Wahl and Ethan Banks.  I like the three section format because it gives us a chance to deep dive into distinct topics and includes some out-of-band analysis by the hosts; however, that means you need to keep listening through the commercial breaks to hear the full podcast.

Three parts?  Yes, Chris and Ethan like to save the best questions for last.

In Part 1, we went deep into the industry operational and business challenges uncovered by the OpenStack project. Particularly, Chris and I go into “platform underlay” issues which I laid out in my “please stop the turtles” post. This was part of the build-up to my SRE series.

In Part 2, we explore my operations-focused view of the latest developments in container schedulers with a focus on Kubernetes. Part of the operational discussion goes into architecture “conceits” (or compromises) that allow developers to get the most from cloud native design patterns. I also make a pitch for using proven tools to run the underlay.

In Part 3, we go deep into DevOps automation topics of configuration and orchestration. We talk about the design principles that help drive “day 2” automation and why getting in-place upgrades should be an industry priority.  Of course, we do cover some Digital Rebar design too.

Take a listen and let me know what you think!

On Twitter, we’ve already started a discussion about how much developers should care about infrastructure. My opinion (posted here) is that one DevOps idea where developers “own” infrastructure caused a partial rebellion towards containers.

SRE role with DevOps for Enterprise [@HPE podcast]

sre-series

My focus on SRE series continues… At RackN, we see a coming infrastructure explosion in both complexity and scale. Unless our industry radically rethinks operational processes, current backlogs will escalate and stability, security and sharing will suffer.

Yes, DevOps and SRE are complementary

In this short 16 minute podcast, HPE’s Stephen Spector and I discuss how DevOps and SRE thinking overlaps and where are the differences.  We also discuss how Enterprises should be evaluating Site Reliability Engineering as a function and where it fits in their organization.

Beyond Expectations: OpenStack via Kubernetes Helm (Fully Automated with Digital Rebar)

RackN revisits OpenStack deployments with an eye on ongoing operations.

I’ve been an outspoken skeptic of a Joint OpenStack Kubernetes Environment (my OpenStack BCN presoSuper User follow-up and BOS Proposal) because I felt that the technical hurdles of cloud native architecture would prove challenging.  Issues like stable service positioning and persistent data are requirements for OpenStack and hard problems in Kubernetes.

I was wrong: I underestimated how fast these issues could be addressed.

youtube-thumb-nail-openstackThe Kubernetes Helm work out of the AT&T Comm Dev lab takes on the integration with a “do it the K8s native way” approach that the RackN team finds very effective.  In fact, we’ve created a fully integrated Digital Rebar deployment that lays down Kubernetes using Kargo and then adds OpenStack via Helm.  The provisioning automation includes a Ceph cluster to provide stateful sets for data persistence.  

This joint approach dramatically reduces operational challenges associated with running OpenStack without taking over a general purpose Kubernetes infrastructure for a single task.

sre-seriesGiven the rise of SRE thinking, the RackN team believes that this approach changes the field for OpenStack deployments and will ultimately dominate the field (which is already  mainly containerized).  There is still work to be completed: some complex configuration is required to allow both Kubernetes CNI and Neutron to collaborate so that containers and VMs can cross-communicate.

We are looking for companies that want to join in this work and fast-track it into production.  If this is interesting, please contact us at sre@rackn.com.

Why should you sponsor? Current OpenStack operators facing “fork-lift upgrades” should want to find a path like this one that ensures future upgrades are baked into the plan.  This approach provide a fast track to a general purpose, enterprise grade, upgradable Kubernetes infrastructure.

Closing note from my past presentations: We’re making progress on the technical aspects of this integration; however, my concerns about market positioning remain.

“Why SRE?” Discussion with Eric @Discoposse Wright

sre-series My focus on SRE series continues… At RackN, we see a coming infrastructure explosion in both complexity and scale. Unless our industry radically rethinks operational processes, current backlogs will escalate and stability, security and sharing will suffer.

ericewrightI was a guest on Eric “@discoposse” Wright of the Green Circle Community #42 Podcast (my previous appearance).

LISTEN NOW: Podcast #42 (transcript)

In this action-packed 30 minute conversation, we discuss the industry forces putting pressure on operations teams.  These pressures require operators to be investing much more heavily on reusable automation.

That leads us towards why Kubernetes is interesting and what went wrong with OpenStack (I actually use the phrase “dumpster fire”).  We ultimately talk about how those lessons embedded in Digital Rebar architecture.

(re)OpenStack for 2017 – board voting week starts this Monday

[1/19 Update: I placed 9th in the results (or 6th if you go only by popular vote instead of total votes).  There are 8 seats, so I was not elected.]

The OpenStack Project needs a course correction and I’m asking for your community vote to put me back on the 2017 Board to help drive it.  As a start-up CEO, I’m neutral, yet I also have the right technical, commercial and community influence to make this a reality.

Vote Now!Your support is critical because OpenStack fills a very real need and should have a solid future; however, it needs to adapt to market realities to achieve that.

I want the Board to acknowledge and adapt to stumbles in ecosystem success including being dropped or re-prioritized by key sponsors.  This should include tightening the mission so the project can collaborate more freely with both open and proprietary platforms.  In 2016, I’ve been deeply involved OpenStack alternatives including Kubernetes and hybrid Cloud automation with Amazon and Google.

OpenStack must adjust to being one of several alternatives including AWS, Google and container platforms like Kubernetes.

That means focusing on our IaaS strengths and being unambiguous about core function like SDN and storage integration.   It also means ensuring that commercial members of the ecosystem can both profit and compete.  The Board has both the responsibility and authority to make these changes if the members are willing to act.

What’s my background?  I’ve been an active and vocal member of the OpenStack community since the very beginning of the project especially around Operator and Product Management issues.  I was elected to the board four times and played critical roles including launching the DefCore efforts and pushing for more definition of the Big Tent concept (which I believe has hurt the project).

In a great field of candidates!  Like other years, there are many very strong candidates whom I have worked with in a number of roles.  I always recommend distributing your eight votes to multiple people and limited “affinity voting” for your own company or geography.   While all candidates would serve the board, this year, I’d like to call attention specifically to  Shamail Tahair as a candidate who has invested significant time in helping with Product Management and Enterprise Readiness for OpenStack.

2016 Infrastructure Revolt makes 2017 the “year of the IT Escape Clause”

Software development technology is so frothy that we’re developing collective immunity to constant churn and hype cycles. Lately, every time someone tells me that they have hot “picked technology Foo” they also explain how they are also planning contingencies for when Foo fails. Not if, when.

13633961301245401193crawfish20boil204-mdRequired contingency? That’s why I believe 2017 is the year of the IT Escape Clause, or, more colorfully, the IT Crawfish.

When I lived in New Orleans, I learned that crawfish are anxious creatures (basically tiny lobsters) with powerful (and delicious) tails that propel them backward at any hint of any danger. Their ability to instantly back out of any situation has turned their name into a common use verb: crawfish means to back out or quickly retreat.

In IT terms, it means that your go-forward plans always include a quick escape hatch if there’s some problem. I like Subbu Allamaraju’s description of this as Change Agility.  I’ve also seen this called lock-in prevention or contingency planning. Both are important; however, we’re reaching new levels for 2017 because we can’t predict which technology stacks are robust and complete.

The fact is the none of them are robust or complete compared to historical platforms. So we go forward with an eye on alternatives.

How did we get to this state? I blame the 2016 Infrastructure Revolt.

Way, way, way back in 2010 (that’s about bronze age in the Cloud era), we started talking about developers helping automate infrastructure as part of deploying their code. We created some great tools for this and co-opted the term DevOps to describe provisioning automation. Compared to the part, it was glorious with glittering self-service rebellions and API-driven enlightenment.

In reality, DevOps was really painful because most developers felt that time fixing infrastructure was a distraction from coding features.

In 2016, we finally reached a sufficient platform capability set in tools like CI/CD pipelines, Docker Containers, Kubernetes, Serverless/Lambda and others that Developers had real alternatives to dealing with infrastructure directly. Once we reached this tipping point, the idea of coding against infrastructure directly become unattractive. In fact, the world’s largest infrastructure company, Amazon, is actively repositioning as a platform services company. Their re:Invent message was very clear: if you want to get the most from AWS, use our services instead of the servers.

For most users, using platform services instead of infrastructure is excellent advice to save cost and time.

The dilemma is that platforms are still evolving rapidly. So rapidly that adopters cannot count of the services to exist in their current form for multiple generations. However, the real benefits drive aggressive adoption. They also drive the rise of Crawfish IT.

As they say in N’Awlins, laissez les bon temps rouler!

Related Reading on the Doppler: Your Cloud Strategy Must Include No-Cloud Options