September 8 – Weekly Recap of All Things Site Reliability Engineering (SRE)

Welcome to the weekly RackN blog recap of all things SRE. If you have any ideas for this recap or would like to contribute content, please contact us at info@rackn.com or tweet Rob (@zehicle) or RackN (@rackngo).

SRE Items of the Week

Nora Jones on Establishing, Growing, and Maturing a Chaos Engineering Practice
https://www.infoq.com/podcasts/nora-jones-chaos-engineering

Nora Jones, a senior software engineer on Netflix’s Chaos Team, talks with Wesley Reisz about what Chaos Engineering means today. She covers what it takes to build a practice and how to establish a strategy, defines cost of impact, and discusses key technical considerations when leveraging chaos engineering. Read more and listen to the podcast

SRE Jobs

I ran a job search on LinkedIn to find the number of SRE positions currently open; there are 854 positions available as of this morning. A search on Dice.com listed 30,665 positions. In comparison, DevOps had only 2,975 positions on Dice.com.

Podcast on Ansible, Kubernetes, Kubespray and Digital Rebar

Stephen Spector, HPE Cloud Evangelist, talks with Rob Hirschfeld, Co-Founder and CEO of RackN, about the installation process for Kubernetes using Kubespray, Ansible, and Digital Rebar Provisioning. The podcast also includes overviews of Kubernetes, containers, and installation.

_____________

Subscribe to our new daily DevOps, SRE, & Operations Newsletter https://paper.li/e-1498071701#/
_____________

UPCOMING EVENTS

Rob Hirschfeld and Greg Althaus are preparing for a series of upcoming events where they are speaking or just attending. If you are interested in meeting with them at these events please email info@rackn.com.

OTHER NEWSLETTERS

September 1 – Weekly Recap of All Things Site Reliability Engineering (SRE)

10 Essential Skills of a Site Reliability Engineer (SRE) by AppDynamics
https://cloud.kapostcontent.net/pub/1418185e-b325-49d3-b65c-de338e45cb6f/ebook-10-essential-skills-of-a-site-reliability-engineer-sre.pdf

Almost overnight, it seems that Site Reliability Engineer (SRE) has become one of the hottest job titles across the IT Industry. So why all the sudden buzz and momentum around the SRE role? READ MORE

DevOps Tool Market Size Applications 2017 to 2022
http://www.tradecalls.org/2017-08-31-devops-tool-market

Global DevOps Tool Market Research Report 2017 to 2022 presents an in-depth assessment of the DevOps Tool Market including enabling technologies, key trends, market drivers, challenges, standardization, regulatory landscape, deployment models, operator case studies, opportunities, future roadmap, value chain, ecosystem player profiles and strategies. The report also presents forecasts for DevOps Tool Market investments from 2017 till 2022.

READ REPORT

Don’t be ageist: In the DevOps era, experience matters by @Jenz514
https://techbeacon.com/dont-be-ageist-devops-era-experience-matters

When it comes to attitudes toward age, DevOps is a lot like IT in general, but possibly more so. Defenders of an IT workforce that skews young have always noted that technology changes quickly, skills must be updated rapidly, business demands evolve fast, and long workdays just don’t appeal to professionals who have families to go home to. All of that may ratchet up even higher in DevOps culture. READ MORE

L8ist Sh9y Podcast: Digital Rebar and Terraform Provisioning
Blog Link http://bit.ly/2xPILHb 

Stephen Spector, HPE Cloud Evangelist, talks with Greg Althaus, CTO and Co-Founder of RackN, about how the Digital Rebar Provisioning solution provides bare metal server support for HashiCorp Terraform.


August 25 – Weekly Recap of All Things Site Reliability Engineering (SRE)

SRE Items of the Week


What is “Site Reliability Engineering?” 
https://landing.google.com/sre/interview/ben-treynor.html 

In this interview, Ben Treynor shares his thoughts with Niall Murphy about what Site Reliability Engineering (SRE) is, how and why it works so well, and the factors that differentiate SRE from operations teams in industry. READ MORE

Podcast: A Nice Mix of Ansible and Digital Rebar
http://bit.ly/2vkBYEe 

Follow our new L8ist Sh9y Podcast on SoundCloud at https://soundcloud.com/user-410091210.

Digital Rebar Mascot Naming 

Next week, the Digital Rebar community will finalize the name for our mascot.

Several possible names are listed in a recent blog post for your consideration. Please tweet any ideas you have to @DigitalRebar, as we will choose a name next week via a Twitter poll.

Digital Rebar v3 Provision
http://rebar.digital/

Digital Rebar is the open, fast and simple data center provisioning and control scaffolding designed with a cloud native architecture.

Our extensible stand-alone DHCP/PXE/iPXE service has minimal overhead, so it can be installed and start provisioning in under 5 minutes on a laptop, RPi or switch. From there, users can add custom or pre-packaged workflows for full life-cycle automation using our API and CLI or a community UX.

A cloud native bare metal approach provides API-driven infrastructure-as-code automation without locking you into a specific hardware platform, operating system or configuration model.

For physical infrastructure provisioning, Digital Rebar replaces Cobbler, Foreman, MaaS or similar tools, with the added bonus of being able to include simple control workflows for RAID, IPMI and BIOS configuration. We also provide event-driven actions via a WebSocket API and a simple plug-in model. By design, Digital Rebar is not opinionated about scripting tools, so you can mix and match Chef, Puppet, Ansible, SaltStack and even Bash.

Next version: release of v3.1 is anticipated on 9/4/2017.

August 18 – Weekly Recap of All Things Site Reliability Engineering (SRE)

SRE Items of the Week

Beyond Google SRE: What is Site Reliability Engineering like at Medium?
https://blog.netsil.com/beyond-google-sre-what-is-site-reliability-engineering-like-at-medium-71c65bd35f4e


We had the opportunity to sit down with Nathaniel Felsen, DevOps Engineer at Medium and the author of “Effective DevOps with AWS”. We are happy to share some practical insights from Nathaniel’s extensive experience as a seasoned DevOps and SRE practitioner.

While we hear a lot about these experiences from Google, Netflix, etc., we wanted to gather perspectives on DevOps and SRE life with other easily relatable companies. From tech-stack challenges to organization structure, Nathaniel provides a wide range of practical insights that we hope will be valuable in improving DevOps practices at your organization. READ MORE

GitHub seeks to spur innovation with Kubernetes migration
http://www.zdnet.com/article/github-seeks-to-spur-innovation-with-kubernetes-migration/

GitHub on Wednesday is sharing the details of the massive technical endeavor its engineers went through to migrate the infrastructure that powers github.com and api.github.com — some of its most critical workloads — from a set of manually-configured physical servers to Kubernetes clusters that run application containers.

GitHub is confident the move will allow for faster innovation on the online code sharing and development platform. READ MORE

SRE Thinking: Reframing Dev + Ops
http://bit.ly/2w2I53F

Last month, Eric Wright and I were able to complete a discussion that inspired my guest post for CapitalOne, “How Platforms and SREs Change the DevOps Contract.” While our conversation ranged widely over the challenges of building and integrating IT processes, the key message is simple: we need to make investments in operations. READ MORE

Coal or Diamonds? Configuration Management is Under Pressure
http://bit.ly/2uTvADN

Cloud Native thinking is thankfully changing the way we approach traditional IT infrastructure. These profound changes in how we build applications with 12-factor design and containers have deep implications for how we manage configuration and the tools we use to do it. These are not cloud-only impacts: the changes affect every corner of IT data centers. READ MORE

SRE Thinking: Reframing Dev + Ops

Last month, Eric Wright and I were able to complete a discussion that inspired my guest post for CapitalOne, “How Platforms and SREs Change the DevOps Contract.” While our conversation ranged widely over the challenges of building and integrating IT processes, the key message is simple: we need to make investments in operations.

This podcast explains why I’ve been using Site Reliability Engineering (SRE) as a proxy for this DevOps inspired rethinking of operations.

I hope you’ll take the time to listen to this deep conversation about very real IT issues. Eric and I are not shy about expressing our opinions, but we’re also anti-shaming. The simple reality is that building infrastructure is hard and we all make difficult choices. My hope is that we can start sharing the fixes and helping each other out.

Podcast Episode 50 – SRE Revisited plus the Challenges of Ops and more with Rob Hirschfeld (@zehicle) 

Do these topics inspire you? Creating data center automation for SREs is our mission at RackN. We believe that well-run infrastructure requires building APIs from the ground up and keeping them simple. I hope that you’ll take 5 minutes to try our latest offering, Digital Rebar Provision, and join us on the quest to drive excellence in operations.

 

July 28 – Weekly Recap of All Things Site Reliability Engineering (SRE)

This week, we launched our new RackN website to provide more information on our solutions and services as well as provide customer examples. Click over to our new site and let us know your thoughts.

SRE Items of the Week

Site Reliability Engineer: Don’t fall victim to the bias blind spot
http://sdtimes.com/site-reliability-engineer-dont-fall-victim-to-the-bias-blind-spot/

To ensure websites and applications deliver consistently excellent speed and availability, some organizations are adopting Google’s Site Reliability Engineering (SRE) model. In this model, a Site Reliability Engineer (SRE), usually someone with both development and IT Ops experience, institutes clear-cut metrics to determine when a website or application is production-ready from a user performance perspective. This helps reduce the friction that often exists between the “dev” and “ops” sides of organizations. More specifically, metrics can eliminate the conflict between developers’ desire to “Ship it!” and operations’ desire to not be paged when they are on call. If performance thresholds aren’t met, releases cannot move forward. READ MORE
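
As a rough illustration of the threshold idea described above, here is a minimal Python sketch of a release gate. The metric names, limits, and helper functions are all hypothetical and are not taken from the article or from any particular tool; it only shows how "if thresholds aren't met, the release cannot move forward" might be expressed in code.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Hypothetical limit for one user-facing metric."""
    name: str
    limit: float
    higher_is_better: bool


def failed_checks(measured: dict[str, float], thresholds: list[Threshold]) -> list[str]:
    """Return the names of thresholds that the measurements do not meet."""
    failures = []
    for t in thresholds:
        value = measured.get(t.name)
        if value is None:
            # A missing measurement is treated as a failure (an assumption).
            failures.append(t.name)
        elif t.higher_is_better and value < t.limit:
            failures.append(t.name)
        elif not t.higher_is_better and value > t.limit:
            failures.append(t.name)
    return failures


def release_may_proceed(measured: dict[str, float], thresholds: list[Threshold]) -> bool:
    """A release moves forward only when every threshold is met."""
    return not failed_checks(measured, thresholds)


# Illustrative numbers only.
THRESHOLDS = [
    Threshold("p95_latency_ms", 300.0, higher_is_better=False),
    Threshold("availability_pct", 99.9, higher_is_better=True),
]
candidate = {"p95_latency_ms": 340.0, "availability_pct": 99.95}

print(failed_checks(candidate, THRESHOLDS))        # ['p95_latency_ms']
print(release_may_proceed(candidate, THRESHOLDS))  # False
```

Returning the list of failed metrics, rather than only a yes/no answer, gives both developers and operators the same concrete reason a release was held back.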

Episode 50 – SRE Revisited plus the Challenge of Ops and more with Rob Hirschfeld
http://podcast.discoposse.com/e/ep-50-sre-revisited-plus-the-challenges-of-ops-and-more-with-rob-hirschfeld-zehicle/

This fun chat expands on what we started talking about in episode 42 (http://podcast.discoposse.com/e/ep-42-spiraling-ops-debt-sre-solutions-and-rackn-chat-with-rob-hirschfeld-zehicle/) as we dive into the challenges and potential solutions for thinking and acting with the SRE approach. Big thanks to Rob Hirschfeld from @RackN for sharing his thoughts and experiences from the field on this very exciting subject. LISTEN HERE

Site Reliability Engineering – Operators and Developers Working Together
http://bit.ly/2u7eSmm 

Rob Hirschfeld, Co-Founder and CEO of RackN, provides his thoughts on how operators are equivalent to developers and how the two work together to accomplish the critical task of keeping the infrastructure running and available amid constant changes in the data center.

July 14 – Weekly Recap of All Things Site Reliability Engineering (SRE)

SRE Items of the Week

Teradata Acquires San Diego-based Start-up StackIQ to Strengthen Teradata Everywhere and IntelliCloud Capabilities
http://prn.to/2vicpUb

SAN DIEGO, July 13, 2017 /PRNewswire/ -- Teradata (NYSE: TDC), the leading data and analytics company, today announced the acquisition of StackIQ, developers of one of the industry’s fastest bare metal software provisioning platforms, which has managed the deployment of cloud and analytics software at millions of servers in data centers around the globe. The deal will leverage StackIQ’s expertise in open source software and large cluster provisioning to simplify and automate the deployment of Teradata Everywhere. Offering customers the speed and flexibility to deploy Teradata solutions across hybrid cloud environments allows them to innovate quickly and build new analytical applications for their business.

How Platforms and SREs Change the DevOps Contract on CapitalOne DevExchange
http://bit.ly/2uVXekf

DevOps struggles under a “fully shared responsibility” contract for Developers and Operations that drives a futile search for elusive “full-stack engineers.” It’s time to revisit how Dev and Ops are going to collaborate because these jobs often have different priorities.
READ MORE

RackN Introduction Video
Rob Hirschfeld, CEO and Co-Founder, introduces RackN in 48 seconds.

Kubernauts Worldwide Meetup
This video is from our first Kubernauts Worldwide Meetup. It covers the new features in Kubernetes 1.7, presented by Ihor Dvoretskyi; Kubernetes Pain Points and Upgrade, presented by Rob Hirschfeld; and Kubernauts Training, presented by Des Drury. Arash Kaffamanesh moderated the online meetup and gave a short overview of what Kubernauts are about.

Rob’s segment starts at 38 minutes 50 seconds.

Video Series w/ Packet.net
Three videos showing how to use Packet.net’s custom iPXE option with Digital Rebar iPXE provisioning

http://bit.ly/2t54J65 (Video 1 of 3)
http://bit.ly/2tO5WCy (Video 2 of 3)
http://bit.ly/2vi5dXZ (Video 3 of 3)

Let’s DevOps IRL: My SRE Postings on RackN by Rob Hirschfeld
http://bit.ly/2tzCvnj  

I’m investing in these Site Reliability Engineering (SRE) discussions because I believe operations (and by extension DevOps) is facing a significant challenge in keeping up with development tooling. The links below have been getting a lot of interest on Twitter and driving some good discussion. READ MORE
