Meet with RackN Next Week at GlueCon

Look for the RackN team next week in Colorado at GlueCon 2018. As a Bronze sponsor, we have a small booth for attendees to meet with our team and talk DevOps, Bare Metal, Immutability, Edge Computing and other topics. In addition, be sure to attend Rob Hirschfeld’s session on Wednesday.

If you are interested in meeting with us during the event please contact me to setup a meeting time.

Session

Making Bare Metal Go Cloud Native: The Power of Immutable Deploys
Speaker: Rob Hirschfeld, Co-Founder/CEO RackN
Wednesday May 16 from 2:50 – 3:30pm
Breakout 2 Track: DevOps

The benefits of automated cloud deployments for speed, reliability and security are undeniable.  The cornerstone of this approach, immutable deployment, promotes the idea of continuously rolling safe, stable images instead of trying to keep up with managing a fixed pool of machines,  If this pattern is so great, shouldn’t we bring it into the physical layer too?

In this talk, we’ll explore the immutable infrastructure pattern and how to use continuous deployment and continuous integration (CI/CD) process to build and manage server images for any platform.  Then we’ll show how automate deploying these images quickly and reliability with open DevOps tools like Terraform and Digital Rebar.  Not only is this approach fast, it’s also more secure and robust for operators.

If you are running physical infrastructure, this talk will change how you think about your job in profound ways.

The Case for Ops Engineering Pay Equity w/ Charity Majors

TL;DR: Operators need pay/status equity to succeed.

Charity Majors is one of my DevOps and SRE heroes* so it was great fun to be able to debate SRE with her at Gluecon this spring.  Encouraged by Mike Maney to retell the story, we got to recapture our disagreement about “Is SRE is Good Term?” from the evening before.

While it’s hard to fully recapture with adult beverages, we were able to recreate the key points.

First, we both strongly agree that we need status and pay equity for operators.  That part of the SRE message is essential regardless of the name of the department.

Then it get’s more nuanced. Charity, whose more of a Silicon Valley insider, believes that SRE is tainted by the “Google for Everyone” cargo cult.  She has trouble separating the term SRE from the specific Google practices that helped define it.  

As someone who simply commutes to Silicon Valley, I do not see that bias in the discussions I’ve been having.  I do agree that companies that try to simply copy Google (or other unicorns) in every way is a failure pattern.

Charity: “I don’t want get paid to keep someone else’s shit site alive”

I think Google did a good job with the book by defining the term for a broad audience. Charity believes this signals that SRE means you are working for a big org.  Charity suggested several better alternatives, Operations Engineer.  At the end, the danger seems to be when Dev and Ops create silos instead of collaborating.

Consensus: Job Title?  Who cares.  The need to to make operations more respected and equal.

What did you think of the video?  How is your team defining Operations titles and teams?

(*) yes, I’m working on an actual list – stay tuned.

June 16 – Weekly Recap of All Things Site Reliability Engineering (SRE)

Welcome to the weekly post of the RackN blog recap of all things SRE. If you have any ideas for this recap or would like to include content please contact us at info@rackn.com or tweet Rob (@zehicle) or RackN (@rack ngo)

SRE Items of the Week

The Cloudcast #301 – SRE and Infrastructure Operations
http://www.thecloudcast.net/2017/06/the-cloudcast-301-sre-and.html

Description: Brian talks with Rob Hirschfeld (@zehicle, Founder/CEO of @RackN) about the concepts of SRE (Site Reliability Engineering), the challenges of maintaining infrastructure software, emerging tools and the next-generation of operations.

Show Notes:

  • Topic 1 – Welcome back to the show. Let’s start by talking about the concept of SRE (Site Reliability Engineering). Give us the basics and maybe explain how it differs from what people define in DevOps.
  • Topic 2 – Application development has been moving faster for quite a while (agile development, etc.). But now infrastructure/operations teams have to deal with faster software – especially around updates (e.g. Kubernetes releases every 3 months). How are companies managing this?
  • Topic 3 – Given that this pace of operations change may not slow down, how do you think about the challenge in terms of process/operations versus technology/tools?
  • Topic 4 – What are some of the steps that companies take to better prepare for this type of operational model? Tools, process, skills, etc.
  • Topic 5 – Do you see SRE as being a progression for existing infrastructure/operations people, or is this more focused on sysadmins or developers that want to get away from building applications?

_____________

DevOps Enterprise Summit London: Tales of courage and community
https://techbeacon.com/devops-enterprise-summit-london-tales-courage-community

After spending two amazing days with 700 of my closest DevOps cohorts from Europe, the Middle East, Africa, and beyond, I learned all about the latest and greatest IT and technology transformation reports at the DevOps Enterprise Summit London. With substantial growth in attendance from the first year, in 2016, the buzz around the show was palpable. And, what a location! From the venue, the QEII Centre, we had 360-degree views of central London, from Big Ben to the London Eye and beyond.

Read more from Steve Brodie, CEO of Electric Cloud @stbrodie
_____________

.IO! .IO! It’s off to a Service Mesh you should go [Gluecon 2017 notes]
http://bit.ly/2rjw4We  

Gluecon turned out to be all about a microservice concept called a “service mesh” which was being promoted by Buoyant with Linkerd and IBM/Google/Lyft with Istio.  This class of services is a natural evolution of the rush to microservices and something that I’ve written microservice technical architecture on TheNewStack about in the past. READ MORE
_____________

A few things I’ve learned about Kubernetes
https://jvns.ca/blog/2017/06/04/learning-about-kubernetes/

I’ve been learning about Kubernetes at work recently. I only started seriously thinking about it maybe 6 months ago – my partner Kamal has been excited about Kubernetes for a few years (him: “julia! you can run programs without worrying what computers they run on! it is so cool!“, me: “I don’t get it, how is that even possible”), but I understand it a lot better now.

This isn’t a comprehensive explanation or anything, it’s some things I learned along the way that have helped me understand what’s going on.

Read more from Julia Evans @b0rk
_____________

UPCOMING EVENTS

Rob Hirschfeld and Greg Althaus are preparing for a series of upcoming events where they are speaking or just attending. If you are interested in meeting with them at these events please email info@rackn.com.

  • 2017 New York Venture Summit – LINK

OTHER NEWSLETTERS

 

 

.IO! .IO! It’s off to a Service Mesh you should go [Gluecon 2017 notes]

TL;DR: If you are containerizing your applications, you need to be aware of this “service mesh” architectural pattern to help manage your services.

Gluecon turned out to be all about a microservice concept called a “service mesh” which was being promoted by Buoyant with Linkerd and IBM/Google/Lyft with Istio.  This class of services is a natural evolution of the rush to microservices and something that I’ve written microservice technical architecture on TheNewStack about in the past.

servicemeshA service mesh is the result of having a dependency grid of microservices.  Since we’ve decoupled the application internally, we’ve created coupling between the services.  Hard coding those relationships causes serious failure risks so we need to have a service that intermediates the services.  This pattern has been widely socialized with this zipkin graphic (Srdan Srepfler’s microservice anatomy presentation)

IMHO, it’s healthy to find service mesh architecturally scary.

One of the hardest things about scaling software is managing the dependency graph.  This challenge is unavoidable from early days of Windows “DLL Hell” to the mixed joy/terror of working with Ruby Gem, Python Pip and Node.js NPM.  We get tremendous acceleration from using external modules and services, but we also pay a price to manage those dependencies.

For microservice and Cloud Native designs, the service mesh is that dependency management price tag.

A service mesh is not just a service injected between services.  It’s simplest function is to provide a reverse proxy so that multiple services can be consolidated under a single end-point.  That quickly leads to needing load balancers, discovery and encrypted back-end communication.  From there, we start thinking about circuit breaker patterns, advanced logging and A/B migrations.  Another important consideration is that service meshes are for internal services and not end-user facing, that means layers of load balancers.

It’s easy to see how a service mesh becomes a very critical infrastructure component.

If you are working your way through containerization then these may seem like very advanced concepts that you can postpone learning.  That blissful state will not last for long and I highly suggest being aware of the pattern before your development teams start writing their own versions of this complex abstraction layer.  Don’t assume this is a development concern: the service mesh is deeply tied to infrastructure and operations.

The service mesh is one of those tricky dev/ops intersections and should be discussed jointly.

Has your team been working with a service mesh?  We’d love to hear your stories about it!

Related Reading:

June 2 – Weekly Recap of All Things Site Reliability Engineering (SRE)

Welcome to the weekly post of the RackN blog recap of all things SRE. If you have any ideas for this recap or would like to include content please contact us at info@rackn.com or tweet Rob (@zehicle) or RackN (@rackngo)

SRE Items of the Week

RackN and our Co-Founder and CEO Rob Hirschfeld openly called for a significant change to the OpenStack and Kubernetes communities in his VMBlog.com post, How is OpenStack so dead AND yet so very alive to SREs?

“We’re going to keep solving problems in and around the OpenStack community.  I’m excited to see the Foundation embracing that mission.  There are still many hard decisions to make.  For example, I believe that Kubernetes as an underlay is compelling for operators and will drive the OpenStack code base into a more limited role as a Kubernetes workload (check out my presentation about that at Boston).  While that may refocus the coding efforts, I believe it expands the relevance of the open infrastructure community we’ve been building.

Building infrastructure software is hard and complex.  It’s better to do it with friends so please join me in helping keep these open operations priorities very much alive.”

To provide more information on this idea, Rob posted a new blog, OpenStack’s Big Pivot: our suggestion to drop everything and focus on being a Kubernetes VM management workload.

“Sometimes paradigm changes demand a rapid response and I believe unifying OpenStack services under Kubernetes has become an such an urgent priority that we must freeze all other work until this effort has been completed.”

This proposal has caused significant readership for a typical RackN blog as well as on social media so Rob has posted a 2nd post to further the proposal. (re)Finding an Open Infrastructure Plan: Bridging OpenStack & Kubernetes.

It’s essential to solve these problems in an open way so that we can work together as a community of operators.”

As you would expect, RackN is very interested in your thoughts on this proposal and its impact not only on the OpenStack and Kubernetes communities but also how it can transform the ability of IT infrastructure teams to deploy complex technologies in a reliable and scalable manner.

Please contact @zehicle and @rackngo to join the conversation.
_____________

Using Containers and Kubernetes to Enable the Development of Object-Oriented Infrastructure: Brendan Burns GlueCon Presentation

Is SRE a Good Term?
Interview with Rob Hirschfeld (RackN) and Charity Majors (Honeycomb) at Gluecon 2017


_____________

UPCOMING EVENTS

Rob Hirschfeld and Greg Althaus are preparing for a series of upcoming events where they are speaking or just attending. If you are interested in meeting with them at these events please email info@rackn.com.

Velocity : June 19 – 20 in San Jose, CA

OTHER NEWSLETTERS

SRE Weekly (@SREWeekly)Issue #74

May 26 – Weekly Recap of All Things Site Reliability Engineering (SRE)

Welcome to the weekly post of the RackN blog recap of all things SRE. If you have any ideas for this recap or would like to include content please contact us at info@rackn.com or tweet Rob (@zehicle) or RackN (@rackngo)

SRE Items of the Week

booth.PNG
Co-Founders of RackN Rob Hirschfeld and Greg Althaus at GlueCon

Reuven Cohen and Rob Hirschfeld Chat at GlueCon17
Reuven Cohen (@ruv) and Rob Hirschfeld discuss data center infrastructure trends concerning provisioning, automation and challenges. Rob highlights his company RackN and the open source project Digital Rebar sponsored by RackN.


_____________

Is SRE a Good Term?
Interview with Rob Hirschfeld (RackN) and Charity Majors (Honeycomb) at Gluecon 2017


_____________

How Google Runs its Production Systems – Get the Book
http://www.techrepublic.com/article/want-to-understand-how-google-runs-its-production-systems-read-this-free-book/

The book Site Reliability Engineering helps readers understand how some Googlers think: It contains the ideas of more than 125 authors. The four editors, Betsy Beyer, Chris Jones, Jennifer Petoff, and Niall Richard Murphy, managed to weave all of the different perspectives into a unified work that conveys a coherent approach to managing distributed production systems.

Site Reliability Engineering delivers 34 chapters—totaling more than 500 printed pages from O’Reilly Media—that encompass the principles and practices that keep Google’s production systems working. The entire book is available online at https://landing.google.com/sre/book.html, along with links to other talks, interviews, publications, and events.

UPCOMING EVENTS

Rob Hirschfeld and Greg Althaus are preparing for a series of upcoming events where they are speaking or just attending. If you are interested in meeting with them at these events please email info@rackn.com.

Velocity : June 19 – 20 in San Jose, CA

OTHER NEWSLETTERS

SRE Weekly (@SREWeekly)Issue #73