August 18 – Weekly Recap of All Things Site Reliability Engineering (SRE)

Welcome to the weekly post of the RackN blog recap of all things SRE. If you have any ideas for this recap or would like to include content please contact us at info@rackn.com or tweet Rob (@zehicle) or RackN (@rackngo)

SRE Items of the Week

Beyond Google SRE: What is Site Reliability Engineering like at Medium?
https://blog.netsil.com/beyond-google-sre-what-is-site-reliability-engineering-like-at-medium-71c65bd35f4e


We had the opportunity to sit down with Nathaniel Felsen, DevOps Engineer at Medium and the author of “Effective DevOps with AWS”. We are happy to share some practical insights from Nathaniel’s extensive experience as a seasoned DevOps and SRE practitioner.

While we hear a lot about these experiences from Google, Netflix, etc., we wanted to gather perspectives on DevOps and SRE life with other easily relatable companies. From tech-stack challenges to organization structure, Nathaniel provides a wide range of practical insights that we hope will be valuable in improving DevOps practices at your organization. READ MORE

GitHub seeks to spur innovation with Kubernetes migration
http://www.zdnet.com/article/github-seeks-to-spur-innovation-with-kubernetes-migration/

GitHub on Wednesday is sharing the details of the massive technical endeavor its engineers went through to migrate the infrastructure that powers github.com and api.github.com — some of its most critical workloads — from a set of manually-configured physical servers to Kubernetes clusters that run application containers.

GitHub is confident the move will allow for faster innovation on the online code sharing and development platform. READ MORE

SRE Thinking: Reframing Dev + Ops
http://bit.ly/2w2I53F

Last month, Eric Wright and I were able to complete a discussion the inspired my guest post for CapitalOne “How Platforms and SREs Change the DevOps Contract.” While our conversation ranged widely over the challenges of building and integration of IT processes, the key message is simple: we need to make investments in operations. READ MORE

Coal or Diamonds? Configuration Management is Under Pressure
http://bit.ly/2uTvADN

Cloud Native thinking is thankfully changing the way we approach traditional IT infrastructure.  These profound changes in how we build applications with 12-factor design and containers has deep implications on how we manage configuration and the tools we use to do it.  These are not cloud only impacts – the changes impact every corner of IT data centers. READ MORE

Subscribe to our new daily DevOps, SRE, & Operations Newsletter https://paper.li/e-1498071701#/

_____________

UPCOMING EVENTS

Rob Hirschfeld and Greg Althaus are preparing for a series of upcoming events where they are speaking or just attending. If you are interested in meeting with them at these events please email info@rackn.com.

OTHER NEWSLETTERS

Update a pull request from git command line

Hand up

Sometimes we just need to feed the SEO gods… in this case, I could not find the simple git command line to update a pull request that I had in flight.

I was looking for the following:

git push -f personal HEAD:[pull branch]

Github.com happily gave me instructions from the pull branch but not the CLI version of the command.  The trick is that you need to know your remote (git remote) for the command so it’s not perfectly generic.  In the example above, my personal repo is named “personal”.

Deconstructing the command: you are pushing your to your personal clone from the local HEAD commit against the branch created for the pull request.  That’s because the pull request creates a branch from your clone to be pulled into the upstream repo.  That’s why it’s a PULL request not a push.

Ultimately, this is pretty basic git.  My experience with git is that the definition of “pretty basic” is a binary function.  Once you know how git works, everything in git is pretty basic.  Until then it’s completely opaque.

Side note: this is my 301st post on this blog!

8/1/2013 Post Script from Crowbar Contributor Adam Spiers

He noticed that I should include the -f in the git push instruction.  Read more at about that on his blog.

Hadoop Crowbar released to open source! (plus AN HOUR of videos!)

I’m proud to announce that my team at Dell has open sourced our Apache Hadoop barclamps!  This release follows our Dell | Cloudera Hadoop Solution open source commitment from Hadoop World earlier this month.

As part of this release, we’ve created nearly AN HOUR of video content showing the Hadoop Barclamps in action, installing Crowbar (on CentOS), building Crowbar ISOs in the cloud and specialized developer focused builds.

If you want to talk to the Crowbar team.  We’re attending events in Boston 11/29, Seattle 11/30, and Austin 12/8.

Here are links to the videos:

More Hadoop perspectives from Dell:  Joseph George on what it means and  Barton George‘s backgrounder about barclamps.

Dell Crowbar to deploy OpenStack Diablo Cloud

Direction in the Cloud

Photo by JB George

This week, some of the Crowbar/Dell OpenStack-Powered Cloud team, plus Matt Ray from Opscode, have been working with our partners at Rackspace in San Antonio (see Opscode post about collaboration). Our target is to have Crowbar deliver a core Diablo deployment by the October 2011 design conference (sponsored in part by Dell). This is a collaborative effort and we invite community participation – we are trying to be open and communicative (via the Crowbar listserv) while also respecting that there is a mountain of work if we are to meet deadlines.

We are doing the work in the open on the Crowbar Github so you have access to the very latest capabilities and it also means that the head the Crowbar may be unstable while we add capabilities. We feel like this is an important trade off because it allows us to keep up with the rapid pace of development in OpenStack (and other projects). This is the motivation for the recent modularization work and will continue to be a feature driver for Crowbar enhancements because it allows Crowbar users to easily bring in updated bits.

 

Crowbar modularized: latest changes that make clouds even easier to create, update, and maintain

In the last week, my team at Dell completed a major refactoring of Crowbar that significantly improves our ability to bring in community contributions and field customizations.  Today, we merged it into Crowbar’s public repo(s).

From the very first versions, our objective for Crowbar was to create the fastest and most reliable cloud deployments. Along the way, we realized Crowbar’s true potential lay in embracing DevOps as an operational model for maintaining clouds. That meant building up cloud deployments in layers from pieces that we call barclamps (extensions of Chef cookbooks). Our first version, centered on OpenStack Cactus, leveraged barclamps but was still created as a single system. This unified system was a huge step forward in cloud deployments, but did not live up to our CloudOps vision of continuous delivery.

In this version, each Crowbar barclamp is an independent delivery unit that can be integrated before, while or after installing Crowbar.

The core of the change is each barclamp, including the most core ones, are stored in independent code repositories. Putting the code into distinct repos means that each barclamp can have its own life cycle, its own maintainer site and its own dependency tree. This modularization allows customers to manage their Crowbar deployments with a very fine brush: they may choose to customize parts of the system, they could lock components to specific tag and they can bring in barclamps from other vendors.

While the core barclamps are automatically integrated into the Crowbar build using git submodules; other barclamps are installed into the system as needed. This allows you to pull in the suite of OpenStack barclamps at build time or to wait until your Crowbar system is running before installing. Once you install a barclamp, you are able to retrieve an updated barclamp and reapply it to the system.

This feature gives you the ability to 1) choose exactly what you want to include and 2) perform field updates to a live Crowbar system.

Let’s look at some examples:

  1. The Cloud Foundry barclamp can be sourced Cloud Foundry instead of bundled into the Crowbar repository. This allows the team working on the cloud application to take ownership for their own deployment. As a continuous delivery proponent, I believe strongly that the development team should be responsible for ensuring that their code is deployable (refer to my OpenStack “Deployer API” blue print attempting to codify this).
  2. DreamHost, maintainers of Ceph Storage, can maintain their own local barclamp repos for OpenStack that are cloned from our community Swift barclamp. This allows them to innovate and customize OpenStack deployments for their business and choose which updates to merge back to the community.
  3. Rackspace Cloud Builders can work on the most leading edge OpenStack features and maintaining workable deployments on branches. As the code stabilizes, they simply merge in their changes.
  4. Dell BIOS and RAID barclamps only support the PowerEdge C line today. When we offer PowerEdge R support, you will be able to install or update the barclamps to add that capability. If another hardware vendor creates a barclamp for their hardware then you can install that into your existing system.

I believe that these changes to Crowbar are a huge step forwards on our journey of creating a community supportable Open Operations framework. I hope that you are as excited as I am about these changes.

I encourage you to take the first step by trying out Crowbar and, ultimately, writing your own barclamps.

Post Scripts:

  • In addition to the modularization, the updated code includes RHEL as a deployment platform. At present, you must choose to be either RHEL or Ubuntu at build time.
  • We have enhanced the network barclamp to describe connections as more abstract connections, called conduits, between nodes. This is a powerful change, but requires some understanding before you start making changes.
  • We have only begun testing the change as of 9/12, we expect the system to be fully stabilized by 10/3. If you are not willing to deal with bugs then I recommend building the Crowbar “v1.0” tag (or using the ISOs from our July launch).

Crowbar modularization work begins

I shared the following with the Crowbar listserv and wanted to post it for the larger audience.  If you want the latest on Crowbar then subscribe!

We’ve been getting questions and defects (thanks Matt Ray) about how we are going to allow you to update and add barclamps to Crowbar.  We’re working on that exact issue right now – you can watch me on the “modules” branch of the github.

NOTE TO CROWBAR FOLLOWERS: we are moving some items around in the repo!  There are “cactus” and “v1.0” tags in place so you can still build the current trees after we start the refactor.

We’ve got some big plans that I’ll outline on the list and earlier posts.

Right now, we’re working to modularize barclamps so that each one is in its own github repo.   This will allow you to pull in barclamps at build time or live on site. We’re also creating import/update routines that work for live systems to make it easier to develop barclamps.  Once again, that’s on the github modules branch. These will be exposed as rake barclamp:create[“foo”] and rake barclamp:install[“../foo”] type commands and I’ve committed to create some “how to make barclamps” videos.

That work is a prelude for a hard push on OpenStack Diablo before the design conference.  All that work will also be done in the github but the Diablo barclamps will be in independent repos from the Crowbar framework.

If you want to get started early.  80% of a barclamp effort is around the Chef Cookbooks.  Keith Hudgins with DTO did a great job writing up barclamps here: http://kb.dtosolutions.com/wiki/Deploying_the_cloudfoundry_barclamp.  We’re changing some of it to make it much easier and more modular.

Crowbar build, build, run notes on project Github Wiki

Now that Crowbar has a Dell sponsored listserv and Wiki, I’m encouraging people to use those resources.

We are still adding to the wiki, but it’s got the basics covered.

Here are the links to get started:

Crowbar build using Ubuntu 10.10 vm on Rackspace Cloud from Github Repo

Our OpenStack team at Dell (especially Victor Lowthor) has been working hard with the public Crowbar repos to make it possible for the community to build their own version of a Crowbar ISO.   When you build the ISO, you’ll be downloading a whole bunch (that’s the technical term) of open source licensed components to make it work: we’re trying to maintain a list of licenses on the Github wiki.

To make sure that it was possible for mortals, I signed up for a Ubuntu 10.10 VM (512 Mb RAM, $0.03/hr) at RackSpace Cloud.  I did this from a non-Dell to ensure that it was as independent from our source as possible.

Once I had my vm, there were just a few steps to follow (these are NOT verbatim):

  • apt-get install debootstrap, mkisofs, git, build-essential packages
  • git clone git://github.com/dellcloudedge/crowbar.git
  • Got the results from a sledgehammer build (a fresh sledgehammer tarball) and extracted it into $HOME/.crowbar-build-cache/tftpboot, which is where build_crowbar.sh expects to find it cached.
    • NOTE: I’m not ready to document sledgehammer builds yet, but I will tell you that you’d need a CentOS VM.
  • In the crowbar directory, ran ./build_crowbar.sh
  • The build will pull down all the packages that you need and cache them to the VM.  Subsequent builds will be much faster!

The end result of the build is an “openstack-dev.iso” that will install Crowbar with the OpenStack barclamps (here’s how to do it on VMs).  Just for fun, I copied _my build_ output ISO off the build VM and to my web server.

Please let me know if you have problems with this process, we want people to try Crowbar!

$$ Note: Turn off your VM when you’re done so you don’t incur extra expenses.  Since this process only took about 2 hours, the whole build cost me less than a dime.  Which is good, since I was building it on “my own dime” anyway.

Crowbar source released, includes OpenStack Cloud install

I’m delighted to announce (official version) that my team at Dell has opened the Crowbar source under the Apache 2 license. This action is part of the broader Dell OpenStack Cloud Solution which includes OpenStack install packages, Crowbar, reference hardware architectures, and services/consulting to support deployments.

There are two important components to this news:

  1. Dell is officially offering our OpenStack Solution and helping advance the community’s ability to implement OpenStack quickly and consistently.
  2. Dell is releasing the Crowbar code (which is included in the solution) as open source.

Both are significant items; however, my focus here is on the Crowbar release.

Crowbar started as a Dell OpenStack installer project and then grew beyond that in scope.  Now it can be extended to work with other vendors’ kits and other solutions bits.

We are contributing Crowbar to the community because we believe that everyone benefits by sharing in the operational practices that Crowbar embodies. These are rooted in Opscsode Chef (which Crowbar tightly integrates with) and the cloud & hyper-scale proven DevOps practices that are reflected in our deployment model.

Where to get it?

What’s included?

  • A comprehensive set of barclamps to set up an OpenStack cloud.
  • Crowbar UI and Remote APIs to make it easy to set up your cloud
  • Automated testing scripts for community members doing continuous integration with OpenStack.
  • Build scripts so you can create your own Crowbar install ISO
  • Switch discovery so you can create Chef Cookbooks that are network aware.
  • Open source Chef server that powers much of Crowar’s functionality

What’s not included?

  • Non-open source license components (BIOS+RAID config) that we could not distribute under the Apache 2 license.  We are working to address this and include them in our release.  They are available in the Dell Licensed version of Crowbar.
  • Dell Branded Components (skin + overview page).   Crowbar has an OpenSource skin with identical functionality.
  • Pre-built ISOs with install images (you must download the open source components yourself, we cannot redistribute them to you as a package)

Important notes:

  • Crowbar uses Chef Server as its database and relies on cookbooks for node deployments.  It is installed (using Chef Solo) automatically as part of the Crowbar install.
  • Crowbar has a modular architecture so individual components can be removed, extended, and added. These components are known individually as barclamps.
  • Each barclamp has its own Chef configuration, UI sub-component, deployment configuration, and documentation.

On the project roadmap:

  • Hadoop support
  • Additional operating system support (specifically RHEL)
  • Barclamp version repository
  • Network configuration
  • We’d like suggestions!  Please comment!

Sites for more information: Joseph George, Barton George (launch day), Dell