Deep Thinking & Tech + Great Guests – L8ist Sh9y podcast relaunched

I love great conversations about technology – especially ones where the answers are not neatly settled into winners and losers (which, in IT, is ALL of them).  I’m excited that RackN has (re)launched the L8ist Sh9y (aka Latest Shiny) podcast around this exact theme.

Please check out the deep and thoughtful discussion I just had with Mark Thiele (notes) of Apcera, where we covered Mark’s thoughts on why public cloud will stay under 20% of IT and tackled culture issues head on.

Spoiler: we have David Linthicum coming next, SO SUBSCRIBE.

I’ve been a guest on some great podcasts (Cloudcast, gcOnDemand, Datanauts, IBM Dojo, HPEFoodfight) and have deep respect for the critical work they do in the industry.

We feel there’s still room for deep discussions specifically around automated IT operations in cloud, data center and edge; consequently, we’re branching out to include in-depth interviews alongside our initial stable of IT Ops technical topics like Terraform, Edge Computing, our GartnerSYM review, Kubernetes and, of course, our own Digital Rebar.

SoundCloud Subscription Information

 

Let’s DevOps IRL: my SRE postings on RackN!

I’m investing in these Site Reliability Engineering (SRE) discussions because I believe operations (and, by extension, DevOps) is facing a significant challenge in keeping up with development tooling.  The links below have been getting a lot of interest on Twitter and driving some good discussion.


Addressing this Ops debt is our primary mission at my company, RackN: we believe that integrated, system-level tooling is required.  We also believe that new tools should not disrupt environments, so we work very hard to adapt to the requirements of individual sites.

SRE is urgent because it provides a pragmatic path and rationale for investment.

Even if you don’t agree with Google’s term or all of their practices, I think the fundamental concepts of systems thinking, status/pay, automation investment and developer collaboration are essential.  It should come as no surprise that these are all Lean/DevOps concepts; however, SRE has the pragmatic advantage of being a job function.

Here are some recent relevant discussions I’ve been having about SREs with links to both the audio and my text show notes.

Of course, RackN is also doing a WEEKLY SRE update that captures general interest items.  Check that out and subscribe.

Packet Pushers 333: Orchestration v Automation < YES, this is what we are doing!

I highly recommend catching Packet Pushers 333, “Automation & Orchestration In Networking,” by Drew Conry-Murray and guests Pete Lumbis and Michael Damkot.

While the discussion is all about NETWORK DevOps, they do a good job of explaining WHY the current state of system orchestration is so sad – in a word: heterogeneity.  It’s not going away, because the alternative is lock-in.  They also do a good job of describing the difference between automation and orchestration; however, I think there’s a middle tier of resource “scheduling” that better describes OpenStack and Kubernetes (see the sketch below).
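To make that three-tier distinction concrete, here’s a minimal, hypothetical Python sketch (my own illustration – not from the podcast and not Digital Rebar code): automation does one task on one target, scheduling decides where work lands, and orchestration sequences many automated steps across targets with dependencies.

```python
# Hypothetical sketch of the three tiers: automation, scheduling, orchestration.
# Names (Node, automate, schedule, orchestrate) are illustrative, not a real API.
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpus: int
    packages: set = field(default_factory=set)


def automate(node: Node, package: str) -> None:
    """Automation: perform one well-defined task on one target."""
    node.packages.add(package)  # stand-in for "install and configure package X"
    print(f"installed {package} on {node.name}")


def schedule(nodes: list[Node], cpus_needed: int) -> Node:
    """Scheduling: decide *where* work runs (the OpenStack/Kubernetes tier)."""
    candidates = [n for n in nodes if n.free_cpus >= cpus_needed]
    chosen = max(candidates, key=lambda n: n.free_cpus)  # naive placement choice
    chosen.free_cpus -= cpus_needed
    return chosen


def orchestrate(nodes: list[Node]) -> None:
    """Orchestration: sequence many automation steps across systems,
    respecting ordering and dependencies (db before app, app before lb)."""
    db = schedule(nodes, cpus_needed=4)
    automate(db, "postgres")
    app = schedule(nodes, cpus_needed=2)
    automate(app, "myapp")    # depends on the db step above
    lb = schedule(nodes, cpus_needed=1)
    automate(lb, "haproxy")   # depends on the app step above


if __name__ == "__main__":
    fleet = [Node("rack1-n1", 8), Node("rack1-n2", 6), Node("rack2-n1", 4)]
    orchestrate(fleet)
```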

Around 5 minutes into the podcast, they effectively describe the composable design of Digital Rebar and the rationale for the way we’ve abstracted interfaces for automation.  If you guys really do want to cash in by consulting on it (around the 10-minute mark), just give me a call.

It’s great to hear acknowledgement of both the complexity of these problems and the need to solve them.  Thanks for the great podcast, Drew, Pete and Michael!

Oh… and I’m also going to be presenting at Interop ITX.  Hopefully, I’ll get a chance to talk 1:1 with Drew.

What does it take to Operate Open Platforms? Answers in Datanauts 72

Did I just let OpenStack ops off the hook…?  Kubernetes production challenges…?

I had a lot of fun in this wide-ranging Datanauts discussion with unicorn herders Chris Wahl and Ethan Banks.  I like the three-section format because it gives us a chance to deep dive into distinct topics and includes some out-of-band analysis by the hosts; however, that means you need to keep listening through the commercial breaks to hear the full podcast.

Three parts?  Yes, Chris and Ethan like to save the best questions for last.

In Part 1, we went deep into the industry operational and business challenges uncovered by the OpenStack project. In particular, Chris and I go into the “platform underlay” issues that I laid out in my “please stop the turtles” post. This was part of the build-up to my SRE series.

In Part 2, we explore my operations-focused view of the latest developments in container schedulers with a focus on Kubernetes. Part of the operational discussion goes into architecture “conceits” (or compromises) that allow developers to get the most from cloud native design patterns. I also make a pitch for using proven tools to run the underlay.

In Part 3, we go deep into DevOps automation topics of configuration and orchestration. We talk about the design principles that help drive “day 2” automation and why in-place upgrades should be an industry priority.  Of course, we do cover some Digital Rebar design too.

Take a listen and let me know what you think!

On Twitter, we’ve already started a discussion about how much developers should care about infrastructure. My opinion (posted here) is that the DevOps idea of developers “owning” infrastructure caused a partial rebellion toward containers.

Why we can’t move past installers to talk about operations – the underlay gap

20 minutes.  That’s the amount of time most developers are willing to spend installing a tool or platform that could become the foundation for their software.  I’ve watched our industry obsess over the “out of the box” experience, which usually translates into a single CLI command to get started (and then fails to scale up).

Secure, scalable and robust production operations are complex.  In fact, most of these platforms are specifically designed to hide that fact from developers.

That means that these platforms intentionally hide the very complexity that they themselves need to run effectively.  Adding that complexity, at best, undermines the utility of the platform and, at worst, causes distractions that keep us forever looping on “day 1” installation issues.

I believe that systems designed to manage ops processes and underlay are different from the platforms designed to manage the developer life-cycle.  This is distinct from the fidelity gap, which is about portability. Accepting that distinction allows us to focus on delivering secure, scalable and robust infrastructure for both audiences.

In a pair of DevOps.com posts, I lay out my arguments about the harm being caused by trying to blend these concepts in much more detail:

  1. It’s Time to Slay the Universal Installer Unicorn
  2. How the Lure of an ‘Easy Button’ Installer Traps Projects

5 Key Aspects of High Fidelity DevOps [repost from DevOps.com]

For all our cloud enthusiasm, I feel like ops automation is suffering as we increase choice and complexity.  Why is this happening?  It’s about loss of fidelity.

Nearly a year ago, I was inspired by a mention of “Fidelity Gaps” during a Cloud Foundry After Dark session.  With additional advice from DevOps leader Gene Kim, this narrative about the why and how of DevOps Fidelity emerged.

As much as we talk about how we should have shared goals spanning Dev and Ops, it’s not nearly as easy as it sounds. To fuel a DevOps culture, we also have to build robust tooling.

That means investing up front in five key areas: abstraction, composability, automation, orchestration, and idempotency.

Together, these concepts allow sharing work at every level of the pipeline. Unfortunately, it’s tempting to optimize work at one level and miss the true system bottlenecks.
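Of the five, idempotency is the one that’s easier to show than to tell, so here’s a minimal, hypothetical Python sketch (my illustration only, not RackN tooling; the config path and settings are made up): an idempotent step inspects the current state and only acts when something needs to change, which makes the whole pipeline safe to re-run.

```python
# Hypothetical illustration of an idempotent ops step: safe to run once or 100 times.
# The config path and settings below are made up for the example (point CONFIG at a
# writable location if you try it).
import json
from pathlib import Path

CONFIG = Path("/etc/myapp/config.json")   # assumed location, not a real product path
DESIRED = {"port": 8080, "tls": True}


def ensure_config(path: Path, desired: dict) -> bool:
    """Converge the config file toward the desired state.
    Returns True only if a change was actually made."""
    current = json.loads(path.read_text()) if path.exists() else {}
    if all(current.get(k) == v for k, v in desired.items()):
        return False  # already converged: do nothing
    current.update(desired)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(current, indent=2))
    return True  # changed: callers can decide whether a service restart is needed


if __name__ == "__main__":
    changed = ensure_config(CONFIG, DESIRED)
    print("changed" if changed else "no-op")
```

The changed-versus-no-op return value is what lets the orchestration layer above reason about a run instead of blindly re-executing everything.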

Creating production-like fidelity for developers is essential: We need it for scale, security and upgrades. It’s not just about sharing effort; it’s about empathy and collaboration.

But even with growing acceptance of DevOps as a cultural movement, I believe deployment disparities are a big unsolved problem. When developers have vastly different working environments from operators, it creates a “fidelity gap” that makes it difficult for the teams to collaborate.

Before we talk about the costs and solutions, let me first share a story from back when I was a bright-eyed OpenStack enthusiast…

Read the full article on DevOps.com, including my section about why OpenStack DevStack harms the project and five specific ways to improve DevOps fidelity.

Are VMs becoming El Caminos? Containers & Metal provide new choices for DevOps

I released this “VMS ARE DEAD” post two weeks ago on DevOps.com.  My point here is that Ops Automation (aka DevOps) is FINALLY growing beyond Cloud APIs and VMs.  This creates a much richer ecosystem of deployment targets instead of having to shoehorn every workload into the same platform.

In 2010, it looked as if virtualization had won. We expected all servers to virtualize workloads, and the primary question was which cloud infrastructure manager would dominate. Now in 2015, the picture is not as clear. I’m seeing a trend that threatens the “virtualize all things” battle cry.

Really, it’s two intersecting trends: metal is getting cheaper and easier while container orchestration is advancing on rockets. If metal can truck around the heavy, stable workloads while containers zip around like sports cars, that leaves VMs as a strange hybrid in the middle.

What’s the middle? It’s the El Camino, that notorious discontinued half car, half pick-up truck.

The explosion of interest in containerized workloads (I know, they’ve been around for a long time, but Docker made them sexy somehow) has been creating a secondary wave of container orchestration. Five years ago, I called that Platform as a Service (PaaS), but this new generation looks more like a CI/CD pipeline plus DevOps platform than our original PaaS concepts. These emerging pipelines obfuscate the operational environment differently than virtualized infrastructure (let’s call it IaaS). The platforms do not care about servers or application tiers; their semantics are about connecting services together. It’s a different deployment paradigm that’s more about SOA than resource reservation.

On the other side, we’ve been working hard to make physical ops more automated using the same DevOps tool chains. To complicate matters, the physics of silicon has meant that we’ve gone from scale-up to scale-out. Modern applications are so massive that they will exceed any single system, so economics drives us to lots and lots of small, inexpensive servers. If you factor in the operational complexity and cost of hypervisors/clouds, a small dedicated server is a cost-effective substitute for a comparable virtual machine.

I’ll repeat that: a small dedicated server is a cost-effective substitute for a comparable virtual machine.

I am not speaking against virtualized servers or clouds. They have a critical role in data center operations; however, I hear from operators who are rethinking the idea that all servers will be virtualized and are moving toward a more heterogeneous view of their data center. One where they have a fleet of trucks, sports cars and El Caminos.

Of course, I’d be disingenuous if I neglected to point out that trucks are used to transport cars too. At some point, everything is metal.

Want more metal-friendly reading?  See Packet CEO Zac Smith’s thinking on this topic.