Podcast – Install Kubernetes with Ansible, Kubespray and Digital Rebar Provision

Posted on September 7, 2017 by Rob H

Stephen Spector, HPE Cloud Evangelist, talks with Rob Hirschfeld, Co-Founder and CEO of RackN, about the installation process for Kubernetes using Kubespray, Ansible, and Digital Rebar Provisioning. The podcast also includes overview commentary on Kubernetes, containers, and installation.

More info on Digital Rebar Provisioning

Follow the RackN L8ist Sh9y Podcast

Home Page
RSS Feed
iTunes Feed Coming Soon!

Podcast – Terraform and Digital Rebar Provision Bare Metal

Posted on August 31, 2017 by Rob H

In this podcast, Stephen Spector, HPE Cloud Evangelist, and Greg Althaus, Co-Founder and CTO of RackN, talk about the integration point for Digital Rebar Provisioning with the Terraform solution. The specific focus is on delivering bare metal provisioning to users of Terraform.

About Terraform (LINK)

Terraform enables you to safely and predictably create, change, and improve production infrastructure. It is an open source tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
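
As a rough illustration of what "declarative" means here, the following Python sketch compares a desired state against the current state and lists the actions needed to reconcile them. It is a toy model under stated assumptions, not Terraform's actual planning algorithm or configuration language; the `plan` function, the machine names, and the `os` attribute are all hypothetical.

```python
from typing import Dict, List, Tuple

# A resource is modeled as a flat dict of attributes (hypothetical shape).
Resource = Dict[str, str]


def plan(desired: Dict[str, Resource],
         actual: Dict[str, Resource]) -> List[Tuple[str, str]]:
    """Return (action, name) pairs that would move `actual` to `desired`."""
    actions: List[Tuple[str, str]] = []
    # Anything declared but missing is created; anything that differs is updated.
    for name in sorted(desired):
        if name not in actual:
            actions.append(("create", name))
        elif desired[name] != actual[name]:
            actions.append(("update", name))
    # Anything that exists but is no longer declared is deleted.
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


# Hypothetical example: two declared machines versus what currently exists.
desired = {"web-1": {"os": "ubuntu-16.04"}, "web-2": {"os": "ubuntu-16.04"}}
actual = {"web-1": {"os": "centos-7"}, "db-1": {"os": "centos-7"}}

for action, name in plan(desired, actual):
    print(action, name)
```

Because the desired state is plain data, it can be shared, reviewed, and versioned like any other file, and the tool works out the steps rather than the operator scripting them by hand.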

August 25 – Weekly Recap of All Things Site Reliability Engineering (SRE)

Posted on August 25, 2017 by Rob H

Welcome to the weekly post of the RackN blog recap of all things SRE. If you have any ideas for this recap or would like to include content, please contact us at info@rackn.com or tweet Rob (@zehicle) or RackN (@rackngo).

SRE Items of the Week

Image from @lasseoe

What is “Site Reliability Engineering?”
https://landing.google.com/sre/interview/ben-treynor.html

In this interview, Ben Treynor shares his thoughts with Niall Murphy about what Site Reliability Engineering (SRE) is, how and why it works so well, and the factors that differentiate SRE from operations teams in industry. READ MORE

Podcast: A Nice Mix of Ansible and Digital Rebar
http://bit.ly/2vkBYEe

Follow our new L8ist Sh9y Podcast on SoundCloud at https://soundcloud.com/user-410091210.

Digital Rebar Mascot Naming

Next week the Digital Rebar community will be finalizing the name for our mascot.

Several possible names are listed on a recent blog post for your consideration. Please tweet to @DigitalRebar any ideas you have as we will be choosing a name next week via a Twitter poll.

Digital Rebar v3 Provision
http://rebar.digital/

Digital Rebar is the open, fast and simple data center provisioning and control scaffolding designed with a cloud native architecture.

Our extensible stand-alone DHCP/PXE/iPXE service has minimal overhead so it can be installed and provisioning in under 5 minutes on a laptop, RPi or switch. From there, users can add custom or pre-packaged workflows for full life-cycle automation using our API and CLI or a community UX.

A cloud native bare metal approach provides API-driven infrastructure-as-code automation without locking you into a specific hardware platform, operating system or configuration model.

For physical infrastructure provisioning, Digital Rebar replaces Cobbler, Foreman, MaaS or similar with the added bonus of being able to include simple control workflows for RAID, IPMI and BIOS configuration. We also provide event driven actions via websockets API and a simple plug-in model. By design, Digital Rebar is not opinionated about scripting tools so you can mix and match Chef, Puppet, Ansible, SaltStack and even Bash.

Next version: release of v3.1 is anticipated on 9/4/2017.

UPCOMING EVENTS

Rob Hirschfeld and Greg Althaus are preparing for a series of upcoming events where they are speaking or just attending. If you are interested in meeting with them at these events please email info@rackn.com.

DevOpsDays Dallas – August 29 – 30: Rob Hirschfeld Talk
OpenDev Conf – Sept 7 – 8: FAQ
DevOps Summit – Oct 31 – Nov 2: Rob Hirschfeld Talk

OTHER NEWSLETTERS

SRE Weekly (@SREWeekly) – Issue #85
The DevOps/WebOps Marketing Geek – LINK from @LukasHertig
Julia Evans Blog – LINK

August 11 – Weekly Recap of All Things Site Reliability Engineering (SRE)

Posted on August 11, 2017 by Rob H

SRE Items of the Week

Report: DevOps is still considered a new phenomenon
http://sdtimes.com/report-devops-still-considered-new-phenomenon

While companies have grasped that DevOps leads to an increase in innovation, DevOps adoption and implementation still remains a challenge for many. Logz.io, an AI-powered log analytics company, released its DevOps Pulse 2017 survey in time for today’s SysAdmin Day, highlighting some of the challenges and benefits to DevOps.

The DevOps Pulse report this year was based on data from a survey of 700 companies, with an additional section on DevOps culture because, according to Logz.io, it’s one topic that wasn’t researched enough. READ MORE

Immutable Infrastructure Deployment Challenges for DevOps
http://bit.ly/2vFAWq1

Rob Hirschfeld and Gareth Rushgrove (@garethr) discuss the issues.

DevOps vs SRE vs Cloud Native Talk at DevOps Summit
http://news.sys-con.com/node/4134816

In his session at @DevOpsSummit at 21st Cloud Expo, Rob Hirschfeld, CEO and co-founder of RackN, will explore this trend and discuss concrete ways to cope with the coming changes. He’ll look at the reasons why SRE is attractive and get specific about ways that teams can bootstrap their efforts and keep their DevOps Fu strong.

Meet the Digital Rebar Mascot
http://bit.ly/2fvnrT7

The Digital Rebar project is pleased to announce our new mascot; however, she doesn’t have a name. We are looking for ideas and you can reach us at @digitalrebar, @zehicle, or comment on this blog. READ MORE
_____________

UPCOMING EVENTS

DevOpsDays Dallas – August 29 – 30: Rob Hirschfeld Talk
DevOps Summit – Oct 31 – Nov 2: Rob Hirschfeld Talk

OTHER NEWSLETTERS

SRE Weekly (@SREWeekly) – Issue #84
The DevOps/WebOps Marketing Geek – LINK from @LukasHertig
Julia Evans Blog – LINK

Meet the Digital Rebar Mascot

Posted on August 10, 2017 by Rob H

The Digital Rebar project is pleased to announce our new mascot; however, she doesn’t have a name. We are looking for ideas and you can reach us at @digitalrebar, @zehicle, or comment on this blog.

Current ideas:

Digital Rebear
Rebear
Rebare
Baremetal
Bootstrap (the bear)
Skids ~ work boots
Beamer ~ tie-off point that is portable and affixes to a steel beam
Grand Pappy ~ (cable) large multi-conductor feeder cable
Lumberg ~ guy that walks around with a cup of coffee all day and does nothing
Hue Phi (for UEFI)

If you need help finding a “bear” name, try this interesting name generator based on animal type: http://www.fantasynamegenerators.com/pet-bear-names.php. I also found a site with construction worker slang: http://www.theunionbootpro.com/slang/.

We look forward to hearing your ideas and are quickly working on stickers and other assorted gear with the new mascot.

July 7 – Weekly Recap of All Things Site Reliability Engineering (SRE)

Posted on July 7, 2017 by Rob H

SRE Items of the Week

Presidential Campaigns & Immutable Infrastructure by @danielbryantuk
https://www.infoq.com/news/2017/06/presidential-infrastructure

At QCon New York 2017 Michael Fisher presented “Presidential Campaigns & Immutable Infrastructure” and discussed the implementation and challenges of provisioning infrastructure for the Hillary for America (HFA) campaign that ran during the 2015-2016 US regional and national elections. Immutable infrastructure was key to the technical success of the campaign – the team moved quickly, but were resilient against failure for the majority of the time. It can take more effort to apply the principle of immutability to everything being deployed, but it is beneficial and developers “like the handshake between SRE and dev”. READ MORE

So you want to be a SRE? by Ingo Averdunk @ingoa
https://hackernoon.com/so-you-want-to-be-an-sre-34e832357a8c

About 9 months ago I set out to leave my teaching career of six years to pursue a career as a Software Engineer. I attended a 3 month Programming Bootcamp called Hackbright Academy during which I not only learned the fundamentals of programming, but more importantly, the fundamentals of what type of work excites me. I realized that I loved design. I loved data-model design, user experience design, architectural design, system design… The list goes on, I love design. Because of this, I thought the best place for me would be as a Front End Engineer, boy was I wrong. READ MORE

LinkedIn Releases Open Source Tools
https://www.martechadvisor.com/news/search-social-ads/linkedin-releases-opensource-tools/

The social networking service for professionals, LinkedIn, has announced that it will be releasing a couple of key tools that will be available as open source projects. These have been primarily created to help businesses deal with issues regarding website outages. The new tools will also be enabling organizations to automatically connect with engineers whenever their applications fail. READ MORE
___________

Subscribe to our new daily DevOps, SRE, & Operations Newsletter https://paper.li/e-1498071701#/
_____________

UPCOMING EVENTS

2017 New York Venture Summit – LINK

OTHER NEWSLETTERS

SRE Weekly (@SREWeekly) – Issue #79
The DevOps/WebOps Marketing Geek – LINK from @LukasHertig
Julia Evans Blog – LINK

June 30 – Weekly Recap of All Things Site Reliability Engineering (SRE)

Posted on June 30, 2017 by Rob H

SRE Items of the Week

Site Reliability Engineering at Dropbox with Tammy Butow @tammybutow

The mess and success of building open leadership (notes from Kubernetes Leadership Summit)
http://bit.ly/2tMTzEy

Three weeks ago, Kubernetes leaders met for a very busy day to reflect on and plan for how the community was growing. I was humbled to be part of the Kubernetes Leadership Summit due to my work as the Cluster Ops SIG co-chair. READ MORE

Ops integration will be scary, proceed with haste
http://bit.ly/2u2Wfhq

As CEO of RackN, I talk to a lot of operations teams who have big aspirations for automation that are faltering due to internal resistance. Generally, we’re talking to the SREs on the team. Sadly, those SREs are often stymied by narrowly scoped teams and house-of-cards technical debt. READ MORE

The Case for Ops Engineering Pay Equity with Charity Majors
http://bit.ly/2tZBjYD

Charity Majors is one of my DevOps and SRE heroes* so it was great fun to be able to debate SRE with her at Gluecon this spring. Encouraged by Mike Maney to retell the story, we got to recapture our disagreement about “Is SRE a Good Term?” from the evening before. READ MORE

Datanauts #89 Dives Deep on SRE Approach and Urgency
http://bit.ly/2tqmbGl

In Datanauts 089, Chris Wahl and Ethan Banks help me break down the concepts from my “DevOps vs SRE vs Cloud Native” presentation from DevOpsDays Austin last spring. They do a great job exploring the tough topics and concepts from the presentation. It’s almost like an extended Q&A so you may want to review the slides or recording before diving into the podcast.

Here are my notes from the podcast READ MORE

5 Laws every aspiring DevOps engineer should know by @ChrisShort
https://opensource.com/open-organization/17/5/5-devops-laws

“A good engineer is a lazy engineer,” some will say. And to a certain extent, it’s true: Laziness is a great quality if you’re automating repetitive tasks. But laziness flies in the face of learning new technologies and getting new work done. Somewhere between Junior Systems Administrator and Senior DevOps Engineer, laziness no longer becomes an advantage.

Let’s discuss the five laws aspiring DevOps engineers should follow if they want to become great DevOps engineers. READ MORE

UPCOMING EVENTS

2017 New York Venture Summit – LINK

OTHER NEWSLETTERS

SRE Weekly (@SREWeekly) – Issue #78
The DevOps/WebOps Marketing Geek – LINK from @LukasHertig
Julia Evans Blog – LINK

What makes ops hard? SRE/DevOps challenge & imperative [from Cloudcast 301]

Posted on June 27, 2017 by Rob H

TL;DR: Operators (DevOps & SREs) have a hard job; we need to make time and room for them to redefine their jobs in a much more productive way.

The Cloudcast.net by Brian Gracely and Aaron Delp brings deep experience and perspective into their discussions based on their impressive technology careers and understanding of the subject matter. Their podcasts go deep quickly with substantial questions that get to the heart of the issue. This was my third time on the show (previous notes).

In episode 301, we go deeply into the meaning and challenges for Site Reliability Engineering (SRE) functions. We also cover some popular technologies that are of general interest.

Author’s Note: For further information about SREs, listen to my discussion about “SRE vs DevOps vs Cloud Native” on the Datanauts podcast #89. (transcript pending)

Here are my notes from Cloudcast 301, with bold added for emphasis:

2:00 Rob defines SRE (more resources on RackN.com site).

- 2:30 Google’s SRE book gave a name to, and even changed the definition of, what I’ve been doing my whole career. The name evolved from being just about sites to a full system perspective.
- 3:30 SRE and DevOps are aligned at the core. While DevOps is about process and culture, SRE is more about the function and “factory.”
- 4:30 Developers don’t want to be shoving coal into the engine, but someone (the SREs) has to make sure that everything keeps running.

5:15 Brian asks about impedance mismatch between Dev and Ops. How do we fix that?

- 6:30 Rob talks about the crisis brewing from the operations innovation gap (link). Digital Rebar is designed to create site-to-site automation so Operators can share repeatable best practices.
- 7:30 OpenStack ran aground with Operators because we never created practices that could be repeated. “Managed service as the required pattern is a failure of building good operational software.”
- 8:00 RackN decomposes operations into isolated units so that individual changes don’t break the software on top.

9:20 Brian talks about how the increasing rate of releases means that operations doesn’t have the skills to keep up with patching.

- 10:10 That’s “underlay automation” and even scarier because software is composed of all sorts of parts that have their own release cycles that are not synchronized.
- 11:30 We need to get system level patch/security update hygiene to be automatic.
- 12:20 This is really hard!

13:00 Brian asks: what are the baby steps?

- 13:20 We have to find baby steps where there are nice clean boundaries at every layer, starting from the most basic. For RackN, that’s DHCP and PXE and then up to Kubernetes.
- 15:15 Rob rants that renaming Ops teams as SRE is a failure because SRE has objectives like job equity that need to be included.
- 16:00 Org silos get in the way of automation; they have antibodies that make it difficult for SREs and DevOps to succeed.
- 17:10 Those people have to be empowered to make change.
- 17:40 The existing tools must be pluggable or you are hurting operators. There’s really no true greenfield, so we help people by making things work in existing data centers.
- 19:00 Scripts may have technical debt but that does not mean they should just be disposed of.
- 19:20 New and shiny does not equal better. For example, Container Linux (aka CoreOS) does not solve all problems.
- 20:10 We need to do better creating bridges between existing and new.
- 20:40 How do we make Day 2 compelling?

21:15 Brian asks about running OpenStack on Kubernetes.

- 22:00 Rob is a fan of Kubernetes on Metal, but really, we don’t want metal and VMs to be different. That means that Kubernetes can be a universal underlay, which is threatening to OpenStack.
- 23:00 This is no longer a JOKE: “Joint OpenStack Kubernetes Environments”
- 23:30 Running things on Kubernetes (or OpenStack) is great because the abstractions hide complexity of infrastructure; however, at the physical layer you need something that exposes that complexity (which is what RackN does).

25:00 Brian asks at what point you need to get past the easy abstractions.

- 25:30 You want to never care ever. But sometimes you need the information for special cases.
- 26:20 We don’t want to make the core APIs complex just to handle the special cases.
- 27:00 There’s still a class of people who need to care about hardware. These needs should not be embedded into the Kubernetes (or OpenStack) API.

28:00 Brian summarizes that we should not turn 1% use cases into complexity for everyone. We need to foster the skill of coding for operators.

- 28:45 For SREs, turning Operators toward coding & automation is essential. That’s a key point in the 50% programming statement for SREs.
- In the closing, Rob suggested checking out Digital Rebar Provision as a Cobbler replacement.

We’re very invested in talking about SRE and want to hear from you! How is your company transforming operations work to make it more sustainable, robust and human? We want to hear your stories and questions.

The mess and success of building open leadership (notes from Kubernetes Leadership Summit)

Posted on June 26, 2017 by Rob H

TL;DR: Working on building open governance that is both inclusive and able to make hard decisions.

Three weeks ago, Kubernetes leaders met for a very busy day to reflect on and plan for how the community was growing. I was humbled to be part of the Kubernetes Leadership Summit due to my work as the Cluster Ops SIG co-chair. Please join us every other Thursday at 1 PT to share stories about running or planning to run Kubernetes.

This event had to strike a delicate balance for an open project: we needed to limit attendance to focus discussions while ensuring that the community was represented. Our notes (captured in Google Docs) are being transcribed to markdown here.

Here are some key topics that shaped the day from my perspective:

A consensus that core needed to focus on paying down debt and getting smaller. The core project is seen as a bottleneck to growth. This comes from the number of people trying to interact in the repo and from having too much technical debt. As a group, we agreed that paying this debt was very important; however, we did not define or authorize specific action to address it. I felt that just acknowledging this focus by a show of hands was a positive action.
Moving forward on formation of a Steering Committee. The bootstrapping committee reviewed their Steering Committee proposal. The concept here is to design a governing body that intentionally delegates its authority. I think it’s an interesting approach that will help to empower more people in the project. This design is different than a corporate board that’s focused on supervision. Here’s the draft document we reviewed as input into the next phase proposal.
Continue using SIGs to divide work. A consequence of the governance design is that we are (ab)using special interest groups (SIGs) to organize the coding and feature work for Kubernetes. They also carry the load for releases, product management and operations. The push from the meeting was to give all SIGs specific deliverables. I think that works well for some SIGs, but more user/operator focused groups (like Cluster Ops) will find it harder to identify the right engagement models.

Overall, the event was very positive with lively group discussions. This group is focused on building Kubernetes, so there was very little vendor, marketing, user or operator focus. As the project grows, I believe these other focus areas will be important to manage. Likely, those concerns cannot be addressed until the Steering Committee is formed.

RackN is committed to helping make Kubernetes operable and improve the operator experience. I’m interested in hearing about your remote or local impressions of this event. What items should have gotten more discussion? What is the project missing?

June 23 – Weekly Recap of All Things Site Reliability Engineering (SRE)

Posted on June 23, 2017 by Rob H

SRE Items of the Week

Datanauts 089: SRE vs Cloud Native vs DevOps
http://bit.ly/2txPXWV

Rob Hirschfeld joins the Datanauts to talk about the term Site Reliability Engineer (SRE) and what it means for IT operations.

Rob explores how the SRE designation is an effort to put operations teams on a more equal footing with developers within an organization. Rob and the Datanauts also discuss how SREs line up with other industry trends such as the cloud native and DevOps movements. LISTEN HERE

Why Does DevOps Require a New Operating Model? By Mustafa Kapadia @MKapadiaTweets
https://devops.com/why-should-cios-redesign-their-organizations/

For many, redesigning the operating model is table stakes for a successful DevOps transformation. But have you ever wondered why? Popular wisdom will have you believe that the main reasons for operating model redesign are to…

“Improve collaboration between business and IT”
“Realign metrics”
“Take full advantage of the new tools”
“And even jump start culture change”

While these are all good reasons, frankly they miss the point. Experience suggests there is a more practical reason – match ownership with desired output.

What do we mean by that? Well first, let’s look at how the current model works. READ MORE

What can developers learn from being on call? By Julia Evans @b0rk
http://jvns.ca/blog/2017/06/18/operate-your-software/

We often talk about being on call as being a bad thing. For example, the night before I wrote this my phone woke me up in the middle of the night because something went wrong on a computer. That’s no fun! I was grumpy.

In this post, though, we’re going to talk about what you can learn from being on call and how it can make you a better software engineer! And to learn from being on call you don’t necessarily need to get woken up in the middle of the night. By “being on call”, here, I mean “being responsible for your code when it breaks”. It could mean waking up to issues that happened overnight and needing to fix them during your workday! READ MORE

Kargo Ansible Playbooks foster Collaborative Kubernetes Ops
http://bit.ly/2qENw3I

Why Kargo?
Making Kubernetes operationally strong is a widely held priority and I track many deployment efforts around the project. The incubated Kargo project is of particular interest for me because it uses the popular Ansible toolset to build robust, upgradable clusters on both cloud and physical targets. I believe using tools familiar to operators grows our community.

We’re excited to see the breadth of platforms enabled by Kargo and how well it handles a wide range of options like integrating Ceph for StatefulSet persistence and Helm for easier application uploads. Those additions have allowed us to fully integrate the OpenStack Helm charts (demo video). READ MORE

UPCOMING EVENTS

2017 New York Venture Summit – LINK

OTHER NEWSLETTERS

SRE Weekly (@SREWeekly) – Issue #77
The DevOps/WebOps Marketing Geek – LINK from @LukasHertig
Julia Evans Blog – LINK

Rob Hirschfeld

On Computing, Containers, Cloud & Tech Culture

Category Archives: Open Source
