Week in Review – Ansible Integration and Community Welcome Guide Released

Welcome to the RackN and Digital Rebar Weekly Review. You will find the latest news related to Edge, DevOps, SRE and other relevant topics.

Infrastructure Provisioning for Ansible with Digital Rebar Provision

Digital Rebar Provision (DRP) is the perfect technology partner for Ansible customers looking to automate their infrastructure setup before applying their orchestration tool of choice. DRP automatically generates a complete inventory of provisioned infrastructure which is a must have for Ansible to complete its orchestration. In addition, DRP is able to deploy Kubernetes clusters using Ansible and Kubespray as part of its standard Workflow automated process.

READ MORE

Digital Rebar Community Welcome Guide Released

This document contains information about the open source Digital Rebar community building Digital Rebar Provision. It is especially useful for new users, developers or other interested parties in understanding the community, its methodologies, communication channels, and community standards.

Read Welcome Guide


News

RackN

  • RackN Trial – 30-day access to RackN technology with support and training from the RackN team – Register Today
  • NEW YouTube Videos
    • Digital Rebar Inventory Stage (RackN Task-Library) – Listen (17 min 06 sec)
    • Digital Rebar Provision v3.8 Workflows – Listen (17 min 51 sec)
    • Terraform Digital Rebar Provider with Workflows – Listen (11 min 49 sec)
  • Summer Events
    • Still Working on Plan ~ Stay Tuned

Digital Rebar Community

L8ist Sh9y Podcast

Social Media

Infrastructure Provisioning for Ansible with Digital Rebar Provision

Digital Rebar Provision (DRP) is the perfect technology partner for Ansible customers looking to automate their infrastructure setup before applying their orchestration tool of choice. DRP automatically generates a complete inventory of provisioned infrastructure which is a must have for Ansible to complete its orchestration. In addition, DRP is able to deploy Kubernetes clusters using Ansible and Kubespray as part of its standard Workflow automated process.

Together, DRP and Ansible enhance the automation, repeatability, and transparency for DevOps teams.
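
As a rough illustration of the inventory hand-off described above, the sketch below converts a list of machine records into the JSON layout that Ansible expects from a dynamic inventory source (a group with "hosts" plus "_meta.hostvars"). The record fields (Name, Address, Params), the group name and the sample data are assumptions made for this example; they are not output captured from a DRP endpoint.

```python
import json


def build_inventory(machines, group="drp_provisioned"):
    """Turn machine records into Ansible's dynamic-inventory JSON layout."""
    inventory = {group: {"hosts": []}, "_meta": {"hostvars": {}}}
    for machine in machines:
        name = machine["Name"]
        inventory[group]["hosts"].append(name)
        # ansible_host tells Ansible which address to connect to.
        hostvars = {"ansible_host": machine["Address"]}
        # Any extra parameters on the record become per-host variables.
        hostvars.update(machine.get("Params", {}))
        inventory["_meta"]["hostvars"][name] = hostvars
    return inventory


# Hypothetical sample data standing in for what a provisioning service reports.
SAMPLE_MACHINES = [
    {"Name": "node1.example.com", "Address": "192.168.124.11", "Params": {"role": "master"}},
    {"Name": "node2.example.com", "Address": "192.168.124.12", "Params": {}},
]

if __name__ == "__main__":
    print(json.dumps(build_inventory(SAMPLE_MACHINES), indent=2))
```

Saved as an executable script that prints this JSON when called with --list, a file like this could be passed to ansible-playbook with the -i option.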

Learn more about DRP and Ansible in the demonstration video and podcasts below.

Demonstration Video of Kubespray with Digital Rebar Provision:

Learn more about Digital Rebar Provision and Ansible in this podcast:

Learn more about Digital Rebar Provision, Ansible and Kubernetes installation with Kubespray in this podcast:

Breaking the Silicon Floor – Digital Rebar v3.2 unlocks full life-cycle control for hardware provisioning

The difficulty in fully automating physical infrastructure environments, especially for distributed edge, adds significant cost, complexity and delay when building IT infrastructure. We’ve called this “underlay” or “ready state” in the past but “last mile” may be just as apt. The simple fact is that underlay is the foundation for everything you build above it so mistakes there are amplified.

Historically, simple systems still required manual or custom steps while complex systems were fragile and hard to learn. This dichotomy drives operators to add a cloud abstraction layer as a compromise because the cloud adds simple provisioning APIs at the price of hidden operational complexity.

What if we had those simple APIs directly against the metal? Without the operational complexity?

That’s exactly what we’ve achieved in the latest Digital Rebar release. In this release, the RackN team refined the Digital Rebar control flows introduced in v3.1 based on customer and field experience. These flows are simple to understand, composable to build and amazingly fast in execution.

For example, you can build workflows that discover machines, run burn-in and inventory stages, install SSH keys and automatically register the machines for Terraform consumption. Our Terraform provider can then take those machines and make new workflow requests like “install CentOS and tell me when it’s ready.” When you’re finished, another workflow will tear down the system and scrub the data. That’s very cloud-like behavior but directly on metal.

These workflows are designed to drive automatic behavior (like joining a Kubernetes cluster), simplify API requests (like target state for Terraform), or prepare environments for orchestration (like dynamic inventory for Ansible). They reflect our design goal to ensure that Digital Rebar integrates upstack easily.
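
To make the idea of composable workflows concrete, here is a minimal sketch of a workflow modeled as an ordered list of stages applied to a machine. Everything in it is hypothetical: the Machine class, the stage names and the placeholder actions are invented for illustration and do not reflect Digital Rebar's actual stage definitions or runner.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Machine:
    name: str
    facts: Dict[str, str] = field(default_factory=dict)
    completed: List[str] = field(default_factory=list)


# Stage actions are placeholders: each one only records that it ran.
def discover(machine: Machine) -> None:
    machine.facts["discovered"] = "true"


def burn_in(machine: Machine) -> None:
    machine.facts["burn_in"] = "passed"


def inventory(machine: Machine) -> None:
    machine.facts["inventory"] = "collected"  # a real stage would probe hardware


def install_ssh_keys(machine: Machine) -> None:
    machine.facts["ssh_keys"] = "installed"


def mark_ready(machine: Machine) -> None:
    machine.facts["terraform_available"] = "true"


Stage = Tuple[str, Callable[[Machine], None]]

# A workflow is just an ordered composition of stages (names are illustrative).
WORKFLOW: List[Stage] = [
    ("discover", discover),
    ("burn-in", burn_in),
    ("inventory", inventory),
    ("ssh-keys", install_ssh_keys),
    ("ready", mark_ready),
]


def run_workflow(machine: Machine, stages: List[Stage]) -> Machine:
    """Run each stage in order and record its name once it finishes."""
    for stage_name, action in stages:
        action(machine)
        machine.completed.append(stage_name)
    return machine


if __name__ == "__main__":
    print(run_workflow(Machine("node1"), WORKFLOW).completed)
```

Because a workflow is only a list, a teardown or "install CentOS" flow would be another list built from the same kind of stage functions.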

Our point with Digital Rebar is to drive full automation down into the physical layer. By fixing the underlay, our approach accelerates and simplifies orchestration and platform layers above. We’re excited about the progress and invite you to take 5 minutes to try our quick start.

Follow the Digital Rebar Community:

Digital Rebar Releases V3.2 – Stage Workflow

In v3.2, Digital Rebar continues to refine the groundbreaking provisioning workflow introduced in v3.1. Updates to the workflow make it easier to consume by external systems like Terraform. We’ve also improved the consistency and performance of both the content and service.

Note: we are accelerating the release schedule for Digital Rebar with a target of 4 to 6 weeks per release. The goal is to incrementally capture new features in stable releases so there is not a lengthy delay before fixes and features are available.

Here’s a list of features for the v3.2 release.

  • Promoted stage automation to release status in open source – these were RackN content during beta
  • Plugins now include content layers – they don’t require separate content and versioning is easier
  • Feature flags on endpoint and content – allows automation to detect if needed requirements are in place before attempting to use them
  • Improve exit codes from jobs – improves coordination and consistency in jobs
  • Allow runner to continue processing into new installed OS – helps with Terraform handoff and direct disk imaging
  • Add tooling for direct image deploy to sledgehammer – self explanatory
  • Change CLI to use Server models instead of swagger generated code – improves consistency and maintainability of the CLI
  • Machine Inventory (gohai utility) – collects machine information (in Golang!) so that automation can make decisions based on configuration
  • General bug fixes and performance enhancements – this was a release theme
  • Make it easier to export content from an endpoint – user requested feature
  • Improve how tokens and secrets are handled by the server – based on audit
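
The feature-flag item in the list above describes a check-before-use pattern. A minimal sketch of that pattern follows; the flag names and the way the advertised list is obtained are assumptions for illustration, not the actual flags published by a Digital Rebar endpoint or content pack.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def missing_features(required: Iterable[str], advertised: Iterable[str]) -> List[str]:
    """Return the required flags that the endpoint does not advertise."""
    return sorted(set(required) - set(advertised))


def run_if_supported(
    required: Iterable[str], advertised: Iterable[str], action: Callable[[], T]
) -> T:
    """Run the action only when every required flag is present."""
    missing = missing_features(required, advertised)
    if missing:
        raise RuntimeError("endpoint is missing features: " + ", ".join(missing))
    return action()


# Hypothetical flag names; a real client would read these from the endpoint.
ADVERTISED = ["stage-workflow", "content-export"]

if __name__ == "__main__":
    print(run_if_supported(["stage-workflow"], ADVERTISED, lambda: "workflow started"))
    print(missing_features(["stage-workflow", "direct-image-deploy"], ADVERTISED))
```

Failing early with the list of missing flags gives automation a clear reason to stop instead of a half-finished job.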

The release of workflow and the addition of inventory means that Digital Rebar v3 effectively replaces all key functions of v2 with a significantly smaller footprint, minimal learning curve and improved performance. One major v2 feature, multi-node coordination, is not on any roadmap for v3 because we believe those use cases are well served by upstack integrations like Terraform and Ansible.

Digital Rebar v3.1 Release Announcement

We’ve made open network provisioning radically simpler.  So simple, you can install in 5 minutes and be provisioning in under 30.  That’s a bold claim, but it’s also an essential deliverable for us to bridge the Ops execution gap in a way that does not disrupt your existing tool chains.

We’ve got a remarkable list of feature additions between Digital Rebar Provision (DRP) v3.0 and v3.1 that take it from basic provision into a powerful distributed infrastructure automation tool.

But first, we need to put v3.1 into a broader perspective: the new features are built from hard-learned DevOps lessons.  The v2 combination of integrated provisioning and orchestration meant we needed a lot of overhead like Docker, Compose, PostgreSQL, Consul and Rails.  That was needed for complex “one-click” cluster builds; however, it’s overkill for users of Ansible, Terraform and immutable infrastructure flows.

The v3 mantra is about starting simple and allowing users to grow automation incrementally.  RackN has been building advanced automation packages and powerful UX management to support that mission.

So what’s in the release?  The v3.0 release focused on getting core Provision infrastructure APIs, processes and patterns working as a standalone service. The v3.1 release targeted major architectural needs: streamlining content management, adding event notification and adding out-of-band actions.

Key v3.1 Features

  • New Mascot and Logo!  We have a cloud native bare metal bear.  DRP fans should ask about stickers and t-shirts. Name coming soon! 
  • Layered Storage System. DRP storage model allows for layered storage tiers to support the content model and a read only base layer. These features allow operators to distribute content in a number of different ways and make field upgrades and multi-site synchronization possible.
  • Content packaging system.  DRP contents API allows operators to manage packages of other models via a single API call.  Content bundles are read-only and versioned so that field upgrades and patches can be distributed.
  • Plug-in system.  DRP allows API extensions and event listeners that are in the same process space as the DRP server.  This enables IPMI extensions and slack notifiers.
  • Stages, Tasks & Jobs.  DRP has a simple work queue system in which tasks are stored and tracked on machines during stages in their boot sequences.  This feature combines server and DRP client actions to create fast, simple and flexible workflows that don’t require agents or SSH access.
  • Websocket API for event subscription.  DRP clients can subscribe to system events using a long term websocket interface.  Subscriptions include filters so that operators can select very narrow notification scopes.
  • Removal of the minimal embedded UI (moving to community hosted UX).   DRP decoupled the user interface from the service API.  This allows features to be added to the UX without having to replace the Service.  This also allows community members to create their own UX.  RackN has agreed to support community users at no cost on a limited version of our commercial UX.
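
The layered storage item in the list above can be pictured as a stack of key/value layers where lookups fall through to a read-only base and writes land only in the top layer. The sketch below uses Python's standard ChainMap to show that behavior. It is an analogy under stated assumptions, not DRP's storage implementation; the keys and values are invented.

```python
from collections import ChainMap
from types import MappingProxyType

# Read-only layers: a base shipped with the service and a versioned content pack.
base_layer = MappingProxyType({
    "bootenv/sledgehammer": "base-v1",
    "template/kickstart": "base-default",
})
content_pack = MappingProxyType({
    "template/kickstart": "pack-v2",
    "stage/inventory": "pack-v2",
})

# The only writable layer holds local operator changes.
local_layer = {}

# Lookups search local first, then the content pack, then the base.
store = ChainMap(local_layer, content_pack, base_layer)

if __name__ == "__main__":
    print(store["bootenv/sledgehammer"])  # falls through to the base layer
    print(store["template/kickstart"])    # content pack overrides the base
    store["template/kickstart"] = "site-override"  # written to local_layer only
    print(store["template/kickstart"], base_layer["template/kickstart"])
```

Replacing the content pack mapping with a newer version changes what every lookup sees without touching local overrides, which is the property that makes field upgrades of read-only layers practical.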

All of these features enable DRP to perform 100% of the hardware provision workflows that our customers need to run a fully autonomous, CI/CD enabled data center.  RackN has been showing examples of Ansible, Kubernetes, and Terraform to Metal integration as reference implementations.

Getting the physical layer right is critical to closing your infrastructure execution gaps.  DRP v3.1 goes beyond getting it right – it makes it fast, simple and open.  Take a test drive of the open source code or give RackN a call to see our advanced automation demos.

September 8 – Weekly Recap of All Things Site Reliability Engineering (SRE)

Welcome to the weekly post of the RackN blog recap of all things SRE. If you have any ideas for this recap or would like to include content please contact us at info@rackn.com or tweet Rob (@zehicle) or RackN (@rackngo)

SRE Items of the Week

Nora Jones on Establishing, Growing, and Maturing a Chaos Engineering Practice
https://www.infoq.com/podcasts/nora-jones-chaos-engineering

Nora Jones, a senior software engineer on Netflix’ Chaos Team, talks with Wesley Reisz about what Chaos Engineering means today. She covers what it takes to build a practice, how to establish a strategy, defines cost of impact, and covers key technical considerations when leveraging chaos engineering. Read more and listen to podcast

SRE Jobs

I ran a job search on LinkedIn to find the number of SRE positions currently open; there are 854 positions available as of this morning. Dice.com listed 30,665 positions based on a search. In comparison, DevOps only had 2,975 positions on Dice.com.

Podcast on Ansible, Kubernetes, Kubespray and Digital Rebar

Stephen Spector, HPE Cloud Evangelist, talks with Rob Hirschfeld, Co-Founder and CEO of RackN, about the installation process for Kubernetes using Kubespray, Ansible, and Digital Rebar Provision. The podcast also offers additional commentary on Kubernetes, containers, and installation.

_____________

Subscribe to our new daily DevOps, SRE, & Operations Newsletter https://paper.li/e-1498071701#/
_____________

UPCOMING EVENTS

Rob Hirschfeld and Greg Althaus are preparing for a series of upcoming events where they are speaking or just attending. If you are interested in meeting with them at these events please email info@rackn.com.

OTHER NEWSLETTERS

Podcast – Install Kubernetes with Ansible, Kubespray and Digital Rebar Provision

Stephen Spector, HPE Cloud Evangelist, talks with Rob Hirschfeld, Co-Founder and CEO of RackN, about the installation process for Kubernetes using Kubespray, Ansible, and Digital Rebar Provision. The podcast also offers additional commentary on Kubernetes, containers, and installation.

More info on Digital Rebar Provision

Follow the RackN L8ist Sh9y Podcast

 

August 25 – Weekly Recap of All Things Site Reliability Engineering (SRE)

Welcome to the weekly post of the RackN blog recap of all things SRE. If you have any ideas for this recap or would like to include content please contact us at info@rackn.com or tweet Rob (@zehicle) or RackN (@rackngo)

SRE Items of the Week


What is “Site Reliability Engineering?” 
https://landing.google.com/sre/interview/ben-treynor.html 

In this interview, Ben Treynor shares his thoughts with Niall Murphy about what Site Reliability Engineering (SRE) is, how and why it works so well, and the factors that differentiate SRE from operations teams in industry. READ MORE

Podcast: A Nice Mix of Ansible and Digital Rebar
http://bit.ly/2vkBYEe 

Follow our new L8ist Sh9y Podcast on SoundCloud at https://soundcloud.com/user-410091210.

Digital Rebar Mascot Naming 

Next week the Digital Rebar community will be finalizing the name for our mascot.

Several possible names are listed on a recent blog post for your consideration. Please tweet to @DigitalRebar any ideas you have as we will be choosing a name next week via a Twitter poll.

Digital Rebar v3 Provision
http://rebar.digital/

Digital Rebar is the open, fast and simple data center provisioning and control scaffolding designed with a cloud native architecture.

Our extensible stand-alone DHCP/PXE/iPXE service has minimal overhead so it can be installed and provisioning in under 5 minutes on a laptop, RPi or switch. From there, users can add custom or pre-packaged workflows for full life-cycle automation using our API and CLI or a community UX.

A cloud native bare metal approach provides API-driven infrastructure-as-code automation without locking you into a specific hardware platform, operating system or configuration model.

For physical infrastructure provisioning, Digital Rebar replaces Cobbler, Foreman, MaaS or similar with the added bonus of being able to include simple control workflows for RAID, IPMI and BIOS configuration. We also provide event driven actions via a websocket API and a simple plug-in model. By design, Digital Rebar is not opinionated about scripting tools so you can mix and match Chef, Puppet, Ansible, SaltStack and even Bash.
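
Event-driven actions of the kind mentioned above usually come down to matching incoming events against narrow subscription filters and calling a handler for each match. The sketch below shows that matching step only. The "type.action.key" pattern format, the "*" wildcard and the event fields are assumptions chosen for illustration and should not be read as the websocket API's actual filter syntax.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Event = Dict[str, str]
Handler = Callable[[Event], None]


def matches(pattern: str, event: Event) -> bool:
    """Compare a 'type.action.key' pattern to an event; '*' matches anything."""
    wanted = pattern.split(".", 2)
    if len(wanted) != 3:
        raise ValueError("pattern must look like type.action.key")
    actual = [event["type"], event["action"], event["key"]]
    return all(w == "*" or w == a for w, a in zip(wanted, actual))


def dispatch(events: Iterable[Event], subscriptions: List[Tuple[str, Handler]]) -> None:
    """Call every handler whose pattern matches the event."""
    for event in events:
        for pattern, handler in subscriptions:
            if matches(pattern, event):
                handler(event)


if __name__ == "__main__":
    seen: List[str] = []
    subscriptions = [("machines.update.*", lambda e: seen.append(e["key"]))]
    # Hypothetical events; a real client would receive these over the socket.
    events = [
        {"type": "machines", "action": "update", "key": "node1"},
        {"type": "machines", "action": "create", "key": "node2"},
        {"type": "jobs", "action": "update", "key": "job7"},
    ]
    dispatch(events, subscriptions)
    print(seen)
```

Keeping the filter narrow means a notifier plug-in only wakes up for the events it cares about.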

Next version: release of v3.1 is anticipated on 9/4/2017.

UPCOMING EVENTS

Rob Hirschfeld and Greg Althaus are preparing for a series of upcoming events where they are speaking or just attending. If you are interested in meeting with them at these events please email info@rackn.com.

OTHER NEWSLETTERS

Podcast – A Nice Mix of Ansible and Digital Rebar

Rob Hirschfeld, CEO and Co-Founder, RackN talks with Stephen Spector, HPE Cloud Evangelist about the recent uptake in Ansible news as well as how Digital Rebar Provision assists Ansible users.

Listen to the 9 minute podcast here:

As this is the launch of the L8ist Sh9y Podcast from RackN, we encourage you to visit our site at https://soundcloud.com/user-410091210 or subscribe to the RSS Feed. We will also be publishing on iTunes shortly.

Spiraling Ops Debt & the SRE coding imperative

This post is part of an SRE series grounded in the ideas inspired by the Google SRE book.

2/13 Update: You can hear an INTERACTIVE DISCUSSION based on this post with Eric Wright on his podcast, GC Online.

Every Ops team I know is underwater and doesn’t have the time to catch their breath.

Why does the load increase and leave Ops behind?  It’s because IT is increasingly fragmented and siloed by both new tech and past behaviors.  Many teams simply step around their struggling compatriots and spin up yet more Ops work adding to the backlog. Dashing off yet another Ansible playbook to install on AWS is empowering but ultimately adds to the Ops sustaining backlog.

Ops Tsunami

That terrifying observation two years ago led me to create this graphic showing how operations is getting swamped by new demand for infrastructure.

It’s not just the amount of infrastructure: we’ve got an unbounded software variation problem too.

It’s unbounded because we keep rapidly evolving new platforms and those platforms are built on rapidly evolving components.  For example, Kubernetes has a 3 month release cycle.  That’s really fast; however, it is built on other components like Docker, SDN and operating systems that also have fast release cycles.  That means that even your single Kubernetes infrastructure has many moving parts that may not be consistent in your own organization.  For example, cloud deploys may use CoreOS while internal ones use a corporate-approved CentOS.

And the problem will get worse because infrastructure is cheap and developer productivity is improving.

Since then, we’ve seen a container-fueled explosion in developer productivity and an AI-driven rise in new hardware-flavored instances. Both are powerful drivers of infrastructure consumption; however, we have not seen a matching leap in operations tooling (that’s a future post topic!).

That’s why Google caps the Ops work of its SRE teams at 50% of their time, leaving the rest for automation and engineering.

If Ops work takes more than 50% of a team’s time, the team slowly sinks under growing operational load.  If you are not actively decreasing the load via automation then your teams get underwater and basic ops hygiene fails.
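
The tipping-point argument above can be illustrated with a toy model. In the sketch below, new Ops work arrives every week and whatever time is not spent on Ops goes to automation, which removes Ops work. The model and its numbers are invented for illustration: the growth and payoff rates were picked so that the break-even point lands at 50%, and they are not measurements from Google or any real team.

```python
from typing import List


def simulate_ops_share(
    initial_ops_share: float,
    weekly_growth: float,
    automation_payoff: float,
    weeks: int,
) -> List[float]:
    """Return the fraction of team time spent on Ops work, week by week.

    initial_ops_share: starting fraction of time spent on Ops (0.0 to 1.0)
    weekly_growth: new Ops work arriving each week, as a fraction of team time
    automation_payoff: Ops work removed per unit of time spent automating
    """
    share = initial_ops_share
    history = [share]
    for _ in range(weeks):
        automation_time = max(0.0, 1.0 - share)
        share = share + weekly_growth - automation_payoff * automation_time
        share = min(1.0, max(0.0, share))  # a team cannot go below 0% or above 100%
        history.append(share)
    return history


if __name__ == "__main__":
    # With growth 0.02 and payoff 0.04 the break-even share is 1 - 0.02/0.04 = 0.5.
    for start in (0.4, 0.6):
        final = simulate_ops_share(start, 0.02, 0.04, weeks=100)[-1]
        print(f"start at {start:.0%} Ops -> after 100 weeks: {final:.0%} Ops")
```

In this model a team that starts below the break-even share automates its way toward zero Ops load, while a team that starts above it has less time to automate every week and ends up fully consumed by Ops work.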

This is not optional – if you are behind now then it will just get worse!

The escape from the cycle is to get help.  Stop writing automation that you can buy or re-use.  Get help running it.  Don’t waste time solving problems that other people have solved.  That may mean some upfront learning and investment but if you aren’t getting out of your own way then you’ll be run over.