DRP v3.11 PROVISIONS WITHOUT REBOOTING

Some features are worth SHOUTING about, so it’s with great pride that I get to announce DRP v3.11.

The latest Digital Rebar release (v3.11) does the impossible: PROVISION WITHOUT REBOOTING.  Combined with image-based deploy and our unique multi-boot workflows, this capability makes server operations 10x faster than traditional net install processes.

But it’s not enough to have a tiny golang utility that can drive any hardware and install any operating system (we added macOS netboot to this release).  RackN has been adding enterprise integrations to core platforms like Ansible Tower, Terraform, Active Directory, Remedy, Run Book and Logstash.

Oh!  And check out our open zero-touch, HA Kubernetes installer (KRIB) based on kubeadm.  We just added advanced Helm features for automatic Istio and Rook Ceph examples.

To see more: https://github.com/digitalrebar/provision/releases/tag/v3.11.0

Podcast – Jordan Rinke on Open Source, Kubernetes, and Edge Computing

Joining us this week is Jordan Rinke, Principal Software Engineer, Walmart Labs. Jordan offers his views on various technologies and open source projects as they relate to the scale and connectivity issues faced by Walmart.

Highlights

  • Technical Gaps in Kubernetes Technologies and Installer Issues
  • Tooling and Orchestration Focus for Kubernetes and Other Tools
  • Core OS Model for Bootstrapping Kubernetes
  • Discussion on Immutability: Middle Ground for Jordan
  • Edge Computing – Emerging markets lead to disconnected edge sites
  • Data location challenges in edge and cloud services
  • Skills issues for medium sized clusters

Topic                                                Time (Minutes.Seconds)

Introduction                                         0.00 – 1.08
Jet and Walmart Integration                          1.08 – 1.57
Open Source & Walmart                                1.57 – 3.18
Kubernetes Challenge & Opportunities                 3.18 – 6.25
Open Source Installation Tool Sprawl                 6.25 – 9.53
Kubernetes to Bootstrap Kubernetes (CoreOS Model)    9.53 – 12.28
Ephemeral Hardware and Immutability                  12.28 – 15.30
Edge Computing                                       15.30 – 20.18
Dynamic Data Locations                               20.18 – 22.44
Medium Scale Clusters (On-Prem Kubernetes)           22.44 – 26.39
Wrap Up (OpenStack Bus Tour)                         26.39 – END

Podcast Guest: Jordan Rinke

Technically inclined executive with 7 years of team leadership and startup growth experience:
Leading teams from 4 to 20 people in size on highly technical tactical and responsive issues. Managing the teams that have helped a number of startups secure funding from $50k to $1.5MM+ and effectively utilizing that investment to grow a sustainable energetic culture and product portfolio.

Before that I accrued 10 years of dev/eng experience (6 years of Fortune 50 company experience, 4 years at one of the world’s largest cloud providers) doing OS deployment (DevOps before it was a buzzword) and driver integration for environments with over 150,000 devices, giving me a unique perspective on large scale deployment scenarios.

Deep Thinking & Tech + Great Guests – L8ist Sh9y podcast relaunched

I love great conversations about technology – especially ones where the answer is not very neatly settled into winners and losers (which is ALL of them in IT).  I’m excited that RackN has (re)launched the L8ist Sh9y (aka Latest Shiny) podcast around this exact theme.

Please check out the deep and thoughtful discussion I just had with Mark Thiele (notes) of Aperca, where we covered Mark’s thoughts on why public cloud will be under 20% of IT and took on culture issues head on.

Spoiler: we have David Linthicum coming next, SO SUBSCRIBE.

I’ve been a guest on some great podcasts (Cloudcast, gcOnDemand, Datanauts, IBM Dojo, HPEFoodfight) and have deep respect for the critical work they do in the industry.

We feel there’s still room for deep discussions specifically around automated IT Operations in cloud, data center and edge; consequently, we’re branching out to start including deep interviews in addition to our initial stable of IT Ops deep technical topics like Terraform, Edge Computing, GartnerSYM review, Kubernetes and, of course, our own Digital Rebar.

Soundcloud Subscription Information

 

October 6 – Weekly Recap of All Things Digital Rebar and RackN

Welcome to the weekly post of the RackN blog recap of all things Digital Rebar, RackN, SRE, and DevOps. If you have any ideas for this recap or would like to include content please contact us at info@rackn.com or tweet Rob (@zehicle) or RackN (@rackngo).

Items of the Week

RackN

RackN Beta Program Launch

Blog Post: Fast, Simple, Open Provisioning – Rethinking Infrastructure w/ Cloud Centric-Automation 

Operating hardware is too hard today. And too expensive.  Let’s fix that.

The problem with physical ops is not that it’s hard, complex or fragile. Okay, it is and those ARE problems, but they are compounded by the lack of shared management software and practices at this layer.  When the RackN team set out to solve these physical challenges, we knew the software had to be very focused to replace the current Cobbler and Foreman environments. It also had to be flexible and composable for heterogeneous environments or we’d be right back into snowflake custom DevOps.

We’re talking about a platform that finally addresses full lifecycle control at the hardware layer with open software.  That’s complex stuff automated in a reusable way.

Read More

Podcast

To participate in the beta please email us at beta@rackn.com, add your email on the RackN Beta Program website, or contact us on Twitter at @rackngo.

Digital Rebar 

Next Week – Digital Rebar Community Meetup #2

October 10 at 11:00am PST

Proposed outline agenda:

  • Welcome and recap from v001 meetup
  • demo: Kubernetes deployment via DRP / packet.net
  • demo: Injecting passwords and SSH keys
  • demo: Content Loading – demo and information
  • Weekly / or every-other-weekly meetups? https://www.meetup.com/digitalrebar/polls/1255504/
  • Release planning and features for v3.2.0

More Information at https://www.meetup.com/digitalrebar/events/243490128/

New Digital Rebar Provision Videos:

UPCOMING EVENTS

Rob Hirschfeld and Greg Althaus are preparing for a series of upcoming events where they are speaking or just attending. If you are interested in meeting with them at these events please email info@rackn.com

If you are attending any of these events please reach out to Rob Hirschfeld to setup time to learn more about our solutions or discuss the latest industry trends.

OTHER NEWSLETTERS

What makes ops hard? SRE/DevOps challenge & imperative [from Cloudcast 301]

TL;DR: Operators (DevOps & SREs) have a hard job; we need to make time and room for them to redefine their jobs in a much more productive way.

The Cloudcast.net by Brian Gracely and Aaron Delp brings deep experience and perspective into their discussions based on their impressive technology careers and understanding of the subject matter.  Their podcasts go deep quickly with substantial questions that get to the heart of the issue.  This was my third time on the show (previous notes).

In episode 301, we go deeply into the meaning and challenges for Site Reliability Engineering (SRE) functions.  We also cover some popular technologies that are of general interest.

Author’s Note: For further information about SREs, listen to my discussion about “SRE vs DevOps vs Cloud Native” on the Datanauts podcast #89.  (transcript pending)

Here are my notes from Cloudcast 301, with bold added for emphasis:

  • 2:00 Rob defines SRE (more resources on RackN.com site).
    • 2:30 Google’s SRE book gave a name, even changed the definition, to what I’ve been doing my whole career. Evolved name from being just about sites to a full system perspective.  
    • 3:30 SRE and DevOps are aligned at the core.  While DevOps is about process and culture, SRE is more about the function and “factory.”
    • 4:30 Developers don’t want to be shoving coal into the engine, but someone, the SREs, has to make sure that everything keeps running.
  • 5:15 Brian asks about impedance mismatch between Dev and Ops.  How do we fix that?
    • 6:30 Rob talks about the crisis brewing around the operations innovation gap.  Digital Rebar is designed to create site-to-site automation so Operators can share repeatable best practices.
    • 7:30 OpenStack ran aground for Operators because we never created the practices that could be repeated.  “Managed service as the required pattern is a failure of building good operational software.”
    • 8:00 RackN decomposes operations into isolated units so that individual changes don’t break the software on top

  • 9:20 Brian talks about how the increasing rate of releases means that operations doesn’t have the skills to keep up with patching.
    • 10:10 That’s “underlay automation” and even scarier because software is composited with all sorts of parts that have their own release cycles that are not synchronized.
    • 11:30 We need to get system level patch/security update hygiene to be automatic.
    • 12:20 This is really hard!

  • 13:00 Brian asks what are the baby steps?
    • 13:20 We have to find baby steps where there are nice clean boundaries at every layer from the very most basic.  For RackN, that’s DHCP and PXE and then up to Kubernetes.
    • 15:15 Rob rants that renaming Ops teams as SRE is a failure because SRE has objectives like job equity that need to be included.
    • 16:00 Org silos get in the way of automation and have antibodies that make it difficult for SREs and DevOps to succeed.
    • 17:10 Those people have to be empowered to make change
    • 17:40 The existing tools must be pluggable or you are hurting operators.  There’s really no true greenfield, so we help people by making things work in existing data centers.
    • 19:00 Scripts may have technical debt but that does not mean they should just be disposed.
    • 19:20 New and shiny does not equal better.  For example, Container Linux (aka CoreOS) does not solve all problems.  
    • 20:10 We need to do better creating bridges between existing and new.
    • 20:40 How do we make Day 2 compelling?

  • 21:15 Brian asks about running OpenStack on Kubernetes.
    • 22:00 Rob is a fan of Kubernetes on Metal, but really, we don’t want metal and VMs to be different.  That means that Kubernetes can be a universal underlay, which is threatening to OpenStack.
    • 23:00 This is no longer a JOKE: “Joint OpenStack Kubernetes Environments”
    • 23:30 Running things on Kubernetes (or OpenStack) is great because the abstractions hide complexity of infrastructure; however, at the physical layer you need something that exposes that complexity (which is what RackN does).

  • 25:00 Brian asks at what point do you need to get past the easy abstractions
    • 25:30 You want to never care ever.  But sometimes you need the information for special cases.
    • 26:20 We don’t want to make the core APIs complex just to handle the special cases.
    • 27:00 There’s still a class of people who need to care about hardware.  These needs should not be embedded into the Kubernetes (or OpenStack) API.

  • 28:00 Brian summarizes that we should not turn 1% use cases into complexity for everyone.  We need to foster the skill of coding for operators
    • 28:45 For SREs, turning Operators toward coding & automation is essential.  That’s a key point in the 50% programming statement for SREs.
    • In the closing, Rob suggested checking out Digital Rebar Provision as a Cobbler replacement.

We’re very invested in talking about SRE and want to hear from you! How is your company transforming operations work to make it more sustainable, robust and human? We want to hear your stories and questions.

The mess and success of building open leadership (notes from Kubernetes Leadership Summit)

TL;DR: Working on building open governance that is both inclusive and able to make hard decisions.

Three weeks ago, Kubernetes leaders met for a very busy day to reflect and plan how the community was growing.  I was humbled to be part of the Kubernetes Leadership Summit due to my work as the Cluster Ops SIG co-chair.  Please join us every other Thursday at 1 PT to share stories about running or planning to run Kubernetes.

This event had to strike a delicate balance for an open project:  we needed to limit attendance to focus discussions while ensuring that the community was represented.  Our notes (captured in Google Docs) are being transcribed to markdown here.

Here are some key topics that shaped the day from my perspective:

  • A consensus that core needed to focus on paying down debt and getting smaller.  The core project is seen as a bottleneck to growth.  This comes from the number of people trying to interact in the repo and from having too much technical debt.  As a group, we agreed that paying this debt was very important; however, we did not define or authorize specific action to address it.  I felt that just acknowledging this focus by a show of hands was a positive action.
  • Moving forward on formation of a Steering Committee.  The bootstrapping committee reviewed their Steering Committee proposal.  The concept here is to design a governing body that intentionally delegates its authority.  I think it’s an interesting approach that will help to empower more people in the project.  This design is different than a corporate board that’s focused on supervision.  Here’s the draft document we reviewed as input into the next phase proposal.
  • Continue using SIGs to divide work.  A consequence of the governance design is that we are (ab)using special interest groups (SIGs) to organize the coding and feature work for Kubernetes.  They also carry the load for releases, product management and operations.  The push from the meeting was to give all SIGs specific deliverables.  I think that works well for some SIGs, but more user/operator focused groups (like Cluster Ops) will find it harder to settle on the right engagement models.

Overall, the event was very positive with lively group discussions.  This group is focused on building Kubernetes, so there was very little vendor, marketing, user or operator focus.  As the project grows, I believe these other focus areas will be important to manage.  Likely, those concerns cannot be addressed until the Steering Committee is formed.

RackN is committed to helping make Kubernetes operable and improve the operator experience.  I’m interested in hearing about your remote or local impressions of this event.  What items should have gotten more discussion?  What is the project missing?

June 16 – Weekly Recap of All Things Site Reliability Engineering (SRE)

Welcome to the weekly post of the RackN blog recap of all things SRE. If you have any ideas for this recap or would like to include content please contact us at info@rackn.com or tweet Rob (@zehicle) or RackN (@rackngo).

SRE Items of the Week

The Cloudcast #301 – SRE and Infrastructure Operations
http://www.thecloudcast.net/2017/06/the-cloudcast-301-sre-and.html

Description: Brian talks with Rob Hirschfeld (@zehicle, Founder/CEO of @RackN) about the concepts of SRE (Site Reliability Engineering), the challenges of maintaining infrastructure software, emerging tools and the next-generation of operations.

Show Notes:

  • Topic 1 – Welcome back to the show. Let’s start by talking about the concept of SRE (Site Reliability Engineering). Give us the basics and maybe explain how it differs from what people define in DevOps.
  • Topic 2 – Application development has been moving faster for quite a while (agile development, etc.). But now infrastructure/operations teams have to deal with faster software – especially around updates (e.g. Kubernetes releases every 3 months). How are companies managing this?
  • Topic 3 – Given that this pace of operations change may not slow down, how do you think about the challenge in terms of process/operations versus technology/tools?
  • Topic 4 – What are some of the steps that companies take to better prepare for this type of operational model? Tools, process, skills, etc.
  • Topic 5 – Do you see SRE as being a progression for existing infrastructure/operations people, or is this more focused on sysadmins or developers that want to get away from building applications?

_____________

DevOps Enterprise Summit London: Tales of courage and community
https://techbeacon.com/devops-enterprise-summit-london-tales-courage-community

After spending two amazing days with 700 of my closest DevOps cohorts from Europe, the Middle East, Africa, and beyond, I learned all about the latest and greatest IT and technology transformation reports at the DevOps Enterprise Summit London. With substantial growth in attendance from the first year, in 2016, the buzz around the show was palpable. And, what a location! From the venue, the QEII Centre, we had 360-degree views of central London, from Big Ben to the London Eye and beyond.

Read more from Steve Brodie, CEO of Electric Cloud @stbrodie
_____________

.IO! .IO! It’s off to a Service Mesh you should go [Gluecon 2017 notes]
http://bit.ly/2rjw4We  

Gluecon turned out to be all about a microservice concept called a “service mesh” which was being promoted by Buoyant with Linkerd and IBM/Google/Lyft with Istio.  This class of services is a natural evolution of the rush to microservices and something that I’ve written about in the past (microservice technical architecture on TheNewStack). READ MORE
_____________

A few things I’ve learned about Kubernetes
https://jvns.ca/blog/2017/06/04/learning-about-kubernetes/

I’ve been learning about Kubernetes at work recently. I only started seriously thinking about it maybe 6 months ago – my partner Kamal has been excited about Kubernetes for a few years (him: “julia! you can run programs without worrying what computers they run on! it is so cool!“, me: “I don’t get it, how is that even possible”), but I understand it a lot better now.

This isn’t a comprehensive explanation or anything, it’s some things I learned along the way that have helped me understand what’s going on.

Read more from Julia Evans @b0rk
_____________

UPCOMING EVENTS

Rob Hirschfeld and Greg Althaus are preparing for a series of upcoming events where they are speaking or just attending. If you are interested in meeting with them at these events please email info@rackn.com.

  • 2017 New York Venture Summit

OTHER NEWSLETTERS