Unknown's avatar

About Rob H

A Baltimore transplant to Austin, Rob thinks about ways of building scale infrastructure for the clouds using Agile processes. He sat on the OpenStack Foundation board for four years. He co-founded RackN enable software that creates hyperscale converged infrastructure.

DNS is critical – getting physical ops integrations right matters

Why DNS? Maintaining DNS is essential to scale ops.  It’s not as simple as naming servers because each server will have multiple addresses (IPv4, IPv6, teams, bridges, etc) on multiple NICs depending on the systems function and applications. Plus, Errors in DNS are hard to diagnose.

Names MatterI love talking about the small Ops things that make a huge impact in quality of automation.  Things like automatically building a squid proxy cache infrastructure.

Today, I get to rave about the DNS integration that just surfaced in the OpenCrowbar code base. RackN CTO, Greg Althaus, just completed work that incrementally updates DNS entries as new IPs are added into the system.

Why is that a big deal?  There are a lot of names & IPs to manage.

In physical ops, every time you bring up a physical or virtual network interface, you are assigning at least one IP to that interface. For OpenCrowbar, we are assigning two addresses: IPv4 and IPv6.  Servers generally have 3 or more active interfaces (e.g.: BMC, admin, internal, public and storage) so that’s a lot of references.  It gets even more complex when you factor in DNS round robin or other common practices.

Plus mistakes are expensive.  Name resolution is an essential service for operations.

I know we all love memorizing IPv4 addresses (just wait for IPv6!) so accurate naming is essential.  OpenCrowbar already aligns the address 4th octet (Admin .106 goes to the same server as BMC .106) but that’s not always practical or useful.  This is not just a Day 1 problem – DNS drift or staleness becomes an increasing challenging problem when you have to reallocate IP addresses.  The simple fact is that registering IPs is not the hard part of this integration – it’s the flexible and dynamic updates.

What DNS automation did we enable in OpenCrowbar?  Here’s a partial list:

  1. recovery of names and IPs when interfaces and systems are decommissioned
  2. use of flexible naming patterns so that you can control how the systems are registered
  3. ability to register names in multiple DNS infrastructures
  4. ability to understand sub-domains so that you can map DNS by region
  5. ability to register the same system under multiple names
  6. wild card support for C-Names
  7. ability to create a DNS round-robin group and keep it updated

But there’s more! The integration includes both BIND and PowerDNS integrations. Since BIND does not have an API that allows incremental additions, Greg added a Golang service to wrap BIND and provide incremental updates and deletes.

When we talk about infrastructure ops automation and ready state, this is the type of deep integration that makes a difference and is the hallmark of the RackN team’s ops focus with RackN Enterprise and OpenCrowbar.

StackEngine Docker on Metal via RackN Workload for OpenCrowbar

6/19: This was CROSS POSTED WITH STACKENGINE

In our quest for fast and cost effective container workloads, RackN and StackEngine have teamed up to jointly develop a bare metal StackEngine workload for the RackN Enterprise version of OpenCrowbar.  Want more background on StackEngine?  TheNewStack.io also did a recent post covering StackEngine capabilities.

While this work is early, it is complete enough for field installs.  We’d like to include potential users in our initial integration because we value your input.

Why is this important?  We believe that there are significant cost, operational and performance benefits to running containers directly on metal.  This collaboration is a tangible step towards demonstrating that value.

What did we create?  The RackN workload leverages our enterprise distribution of OpenCrowbar to create a ready state environment for StackEngine to be able to deploy and automate Docker container apps.

In this pass, that’s a pretty basic Centos 7.1 environment that’s hardware and configured.  The workload takes your StackEngine customer key as the input.  From there, it will download and install StackEngine on all the nodes in the system.  When you choose which nodes also manage the cluster, the workloads will automatically handle the cross registration.

What is our objective?  We want to provide a consistent and sharable way to run directly on metal.  That accelerates the exploration of this approach to operationalizing container infrastructure.

What is the roadmap?  We want feedback on the workload to drive the roadmap.  Our first priority is to tune to maximize performance.  Later, we expect to add additional operating systems, more complex networking and closed-loop integration with StackEngine and RackN for things like automatic resources scheduling.

How can you get involved?  If you are interested in working with a tech-preview version of the technology, you’ll need to a working OpenCrowbar Drill implementation (via Github or early access available from RackN), a StackEngine registration key and access to the RackN/StackEngine workload (email info@rackn.com or info@stackengine.com for access).

exploring Docker Swarm on Bare Metal for raw performance and ops simplicity

As part of our exploration of containers on metal, the RackN team has created a workload on top of OpenCrowbar as the foundation for a Docker Swarm on bare metal cluster.  This provides a second more integrated and automated path to Docker Clusters than the Docker Machine driver we posted last month.

It’s really pretty simple: The workload does the work to deliver an integrated physical system (Centos 7.1 right now) that has Docker installed and running.  Then we build a Consul cluster to track the to-be-created Swarm.  As new nodes are added into the cluster, they register into Consul and then get added into the Docker Swarm cluster.  If you reset or repurpose a node, Swarm will automatically time out of the missing node so scaling up and down is pretty seamless.

When building the cluster, you have the option to pick which machines are masters for the swarm.  Once the cluster is built, you just use the Docker CLI’s -H option against the chosen master node on the configured port (defaults to port 2475).

This work is intended as a foundation for more complex Swarm and/or non-Docker Container Orchestration deployments.  Future additions include allowing multiple network and remote storage options.

You don’t need metal to run a quick test of this capability.  You can test drive RackN OpenCrowbar using virtual machines and then expand to the full metal experience when you are ready.

Contact info@rackn.com for access to the Docker Swarm trial.   For now, we’re managing the subscriber base for the workload.  OpenCrowbar is a pre-req and ungated.  We’re excited to give access to the code – just ask.

Leading vs. Directing: Digital Managers must learn the difference [post 5 of 8]

Fifth IN AN 8 POST SERIESBRAD SZOLLOSE AND ROB HIRSCHFELD INVITE YOU TO SHARE IN OUR DISCUSSION ABOUT FAILURES, FIGHTS AND FRIGHTENING TRANSFORMATIONS GOING ON AROUND US AS DIGITAL WORK CHANGES WORKPLACE DELIVERABLES, PLANNING AND CULTURE.

On the shouldersDigital Management has a challenging deep paradox: digital workers resist direct management but require that their efforts fit into a larger picture.

If you believe the next generation companies we discussed in post #4, then the only way to unlock worker potential is enable self-motivated employees and remove all management. In Zappos case, they encouraged 14% of their workers to simply leave the company because they don’t believe in extreme self-management.

Companies like W. L. Gore & Associates, the makers of GORE-TEX, operate and thrive very well in a team-driven environment… This apparently loosey-goosey management style has brought about hundreds of major multibillion-dollar ideas and made W. L. Gore a leading incubator of consistently great ideas and products for more than fifty years. To an outside observer it looks as though the focus is on having fun. But to the initiated, it is about hiring intense self-starters who contribute wholeheartedly to what they are doing and to the team, and most important, who can self-manage their time and skill sets.

— Liquid Leadership by Brad Szollose, page 154

Frankly, both of us—Brad and Robare skeptical. We believe that these tactics do enhance productivity, but gloss over the essential ingredient in their success: a shared set of goals.

Like our Jazz analogy, the performance is the sum of the parts and the players need to understand how their work fits into the bigger picture. A traditional management structure, with controlling leadership and über clear, micromanaged direction, backfires because it restricts the workers’ ability to interpret and adapt; however, that does not mean we are advocates of “no management whatsoever” zones.  

The trendy word is Holacracy.  That loosely translates into removal of management hierarchy and power while redistributing it throughout the organization.  Are you scared of that free-fall model?  If workers reject traditional management then what are the alternatives?

We need a way to manage today’s independent thinking workforce.

According to Forbes, digital workers have an even higher need to understand the purpose of their work than previous generations. If you are a Baby Boomer (Conductor of a Symphony), then this last statement may cause you to roll your eyes in disagreement.

Directing a Jazz ensemble requires a different type of leadership. One that hierarchy junkies —orchestra members who need a conductor—would call ambiguous…IF they didn’t truly know what was happening.

Great musicians don’t join mediocre bands; they purposely seek out other teams that are challenging them, a shared set of goals and standards that produce results and success. This may require a shift in mindset for some of our readers.

Freedom in jazz improvisation comes from understanding structure. When people listen to jazz, they often believe that the soloist is “doing whatever they want.” If fact, as experienced improvisers will tell you, the soloist is rarely “doing whatever they want”.  An improvisational soloist is always following a complicated set of rules and being creative within the context of those rules.  From Jazzpath.com

In the past generation, there was no need to communicate a shared vision: you either did what you were told, OR just told people what to do. And people obeyed. Mostly out of fear of losing your job. But, in the digital workforce, shared goals are what makes the work fit together. Players participate of their own will. Not fear.

Putting this into generational terms: if you were born after 1977 (aka Gen X to the Millennials) then you were encouraged to see ALL adults as peers.  In the public school system, this trend continued as the generation was encouraged to speak up, speak out and make as many mistakes as possible…after all, THAT is how you learn. And the fear of screwing up and making mistakes was actually encouraged, as teachers also became friends and mentors.  Video games simply reinforced the same iterative learning lessons at home.

Thousands of years of social programming were flipped over in favor of iterative learning and flattened hierarchy.  Those skills showed up just in time to enable us to survive the chaos of the digital work / social media revolution.

But survival is not enough, we are looking for a way to lead and win.

Since hierarchy is flat, it’s become critical to replace directing action with building a common mission.  In individual-centric digital work, there are often multiple right ways to accomplish the team objective (our topic for post 7).  While having a clear shared goals will not help pick the right option, it will help the team accept that 1) the team has to choose and 2) the team is still on track even if some some individuals have to change direction.

Just listen to the most complex work out there that has been influenced by Jazz; the late Jeff Porcaro, pop rock drummer and cofounder of Toto admits to being influenced by Bo Diddley for his drum riffs on the song Rosanna. Or if you are a RUSH fan you know that songs like La Villa Strangiato owe the syncopated rhythms, chord changes and drum riffs to Jazz.

Or the modern artist Piet Mondrian who invented neoplasticism, was inspired by listening incessantly to a particular type of jazz called “Boogie-Woogie.”

Participants in this type of performance do not tune out and wait for direction. They must be present, bring 100% of themselves to each performance, and let go of what they did in the last concert because each new performance is customized.

You have until our next post to cry in your beer while whining that digital managers have it too hard.  In the next post, we’ll lay out 12 very concrete actions that you should be taking as a leader in the digital workforce.

PS: Brad some important insights about how their childhood experience shapes digital natives’ behavior.  We felt that topic was important but external to the primary narrative so Rob included them here:

Continue reading

Curious about SDN & OpenStack? We discuss at Open Networking Summit Panel (next Thursday)

Next Thursday (6/18), I’m on a panel at the SJC Open Networking Summit with John Zannos (Canonical), Mark Carroll (HP), Mark McClain (VMware).  Our topic is software defined networking (SDN) and OpenStack which could go anywhere in discussion.
OpenStack is clearly driving a lot of open innovation around SDN (and NFV).
I have no idea of what other’s want to bring in, but I was so excited about the questions that I suggested that I thought to just post them with my answers here as a teaser.

1) Does OpenStack require an SDN to be successful?

Historically, no.  There were two networking modes.  In the future, expect that some level of SDN will be required via the Neutron part of the project.

More broadly, SDN appears to be a critical component to broader OpenStack success.  Getting it right creates a lock-in for OpenStack.

2) If you have an SDN for OpenStack, does it need to integrate with your whole datacenter or can it be an island around OpenStack?

On the surface, you can create an Island and get away with it.  More broadly, I think that SDN is most interesting if it provides network isolation throughout your data center or your hosting provider’s data center.  You may not run everything on top of OpenStack but you will be connecting everything together with networking.

SDN has the potential to be the common glue.

3) Of the SDN approaches, which ones seem to be working?  Why?

Overall, the overlay networking approaches seem to be leading.  Anything that requires central control and administration will have to demonstrate it can scale.  Anything that actually requires re-configuring the underlay networking quickly is also going to have to make a lot of progress.

Networking is already distributed.  Anything that breaks that design pattern has an uphill battle.

4) Are SDN and NFV co-dependent?  Are they driving each other?

Yes.  The idea of spreading networking functions throughout your data center to manage east-west or individual tenant requirements (my definition of NFV) requires a way to have isolated traffic (one of the uses for SDN).

5) Is SDN relevant outside of OpenStack?  If so, in what?

Yes.  SDN on containers will become increasingly important.  Also, SDN termination to multi-user systems (like a big database) also make sense.

6) IPv6?  A threat or assistance to SDN?

IPv6 is coming, really.  I think that IPv6 has isolation and encryption capabilities that compete with SDN as an overlay.  Widespread IPv6 adoption could make SDN less relevant.  It also does a better job for multi-cloud networking since it’s neutral and you don’t have to worry about which SDN tech your host is using.

Ceph in an hour? Boring! How about Ceph hardware optimized with advanced topology networking & IPv6?

This is the most remarkable deployment that I’ve had the pleasure to post about.

The RackN team has refreshed the original OpenCrowbar Ceph deployment to take advantage of the latest capabilities of the platform.  The updated workload (APL2) requires first installing RackN Enterprise or OpenCrowbar.

The update provides five distinct capabilities:

1. Fast and Repeatable

You can go from nothing to a distributed Ceph cluster in an hour.  Need to rehearse on VMs?  That’s even faster.  Want to test and retune your configuration?  Make some changes, take a coffee break and retest.  Of course, with redeploy that fast, you can iterate until you’ve got it exactly right.

2. Automatically Optimized Disc Configuration

The RackN update optimizes the Ceph installation for disk performance by finding and flagging SSDs.  That means that our deploy just works(tm) without you having to reconfigure your OS provisioning scripts or vendor disk layout.

3. Cluster Building and Balancing

This update allows you to place which roles you want on which nodes before you commit to the deployment.  You can decide the right monitor to OSD/MON ratio for your needs.  If you expand your cluster, the system will automatically rebalance the cluster.

4. Advanced Networking Topology & IPv6

Using the network conduit abstraction, you can separate front and back end networks for the cluster.  We also take advantage of native IPv6 support and even use that as the preferred addressing.

5. Both Block and Object Services

Building up from Ready State Core, you can add the Ceph workload and be quickly installing Ceph for block and object storage.
That’s a lot of advanced capabilities included out-of-the-box made possible by having a ops orchestration platform that actually understands metal.
Of course, there’s always more to improve.  Before we take on further automated tuning, we want to hear from you and learn what use-cases are most important.

The Matrix & Surrogates as an analogies for VMs, Containers and Metal

010312_1546_2012CloudOu1.jpgTrench coats aside, I used The Matrix as a useful analogy to explain visualization and containers to a non-technical friend today.  I’m interested in hearing from others if this is a helpful analogy.

Why does anyone care about virtual servers?

Virtual servers (aka virtual machines or VMs) are important because data centers are just like the Matrix.  The real world of data centers is a ugly, messy place fraught with hidden dangers and unpleasant chores.  Most of us would rather take the blue pill and live in a safe computer generated artificial environment where we can ignore those details and live in the convenient abstraction of Mega City.

Do VMs really work to let you ignore the physical stuff?

Pretty much.  For most people, they can live their whole lives within the virtual world.  They can think they are eating the steak and never try bending the spoons.

So why are containers disruptive?  

Well, it’s like the Surrogates movie.  Right now, a lot of people living in the Cloud Matrix are setting up even smaller bubbles.  They are finding that they don’t need a whole city, they can just live inside a single room.  For them, it’s more like Surrogates where people never leave their single room.

But if they never leave the container, do they need the Matrix?

No.  And that’s the disruption.  If you’ve wrapped yourself in a smaller bubble then you really don’t need to larger wrapper.

What about that messy “real world”?

It’s still out there in both cases.  Just once you are inside the inner bubble, you can’t really tell the difference.

@NextCast chat about DefCore, Metal Ops and OpenStack evolution

In Vancouver, I sat down with Scott Sanchez (EMC) and Jeff Dickey (Redapt) for a NextCast discussion.   We covered a lot of my favorite subjects including DefCore and Ready State bare metal operations.

One of the things I liked about this discussion was that we were able to pull together the seemly disparate threads that I’m work on around OpenStack.

10 ways to make OpenStack more Start-up Friendly [even more critical in wake of recent consolidation]

The Josh McKenty comment that OpenStack is “aggressively anti-startup” for Business Insider got me thinking and today’s news about IBM & Cisco acquiring startups Blue Box & Piston made me decide to early release this post.

2013-03-11_20-01-50_458I think there’s a general confusion about start-ups in OpenStack.  Many of the early (and now acquired) start-ups were selling OpenStack the platform.  Since OpenStack is community infrastructure, that’s a really hard place to differentiate.  Unfortunately, there’s no material install base (yet) to create an ecosystem of start-ups on top of OpenStack.

The real question is not how to make OpenStack start-up friendly, but how to create a thriving system around OpenStack like Amazon and VMware have created.

That said, here’s my list of ten ways that OpenStack could be more start-up friendly:

  1. Accept companies will have some closed tech – Many investors believe that companies need proprietary IP. An “open all things” company will have more trouble with investors.
  2. Stop scoring commits as community currency – Small companies don’t show up in the OpenStack committer economy because they are 1) small and 2) working on their product upstream ahead of OpenStack upstream code.
  3. Have start-up travel assistance – OpenStack demands a lot of travel and start-ups don’t have the funds to chase the world-wide summits and mid-cycles.
  4. Embrace open projects outside of OpenStack governance – Not all companies want or need that type of governance for their start-up code base.  That does not make them less valuable, it just makes them not ready yet.
  5. Stop anointing ecosystem projects as OpenStack projects – Projects that are allowed into OpenStack get to grab to a megaphone even if they have minimal feature sets.
  6. Be language neutral – Python is not the only language and start-ups need to make practical choices based on their objectives, staff and architecture.
  7. Have a stable base – start-ups don’t have time to troubleshoot both their own product and OpenStack.  Without core stability, it’s risky to add OpenStack as a product requirement.
  8. Focus on interoperability – Start-ups don’t have time evangelize OpenStack.  They need OpenStack to have large base of public and private installs because that creates an addressable market.
  9. Limit big companies from making big pre-announcements – Start-ups primary advantage is being a first/fast mover.  When OpenStack members make announcements of intention (generally without substance) it damages the market for start-ups.  Normally corporate announcements are just noise but they are given credibility when they appear to come from the community.
  10. Reduce the contribution tax and patch backlog – Start-ups must seek the path of least friction.  If needed OpenStack code changes require a lot of work and time then they are unlikely to look for less expensive alternatives.

While I believe these items would help start-ups, they would have negative consequences for the large corporate contributors who have fashioned OpenStack into the type of project that supports their needs.

I’d love to what items you think I’ve overlooked or incorrectly added.

OpenStack Vancouver six observations: partners, metal, tents, defore, brands & breakage

As always, OpenStack conferences/summits are packed with talks and discussions.  Any one of these six points could be a full post; however, I would rather post now and start discussions.  Let me know what you think!

1. Partnering Everywhere – it’s froth, not milk

Everyone is partnering with everyone! It’s a good way to appear to cover more around and appear more open. Right now, I believe these partnerships are for show and very shallow. There will be blood when money is flowing and both partners want the lion’s share.

2. Metal is Hot! attention on Ironic & MaaS

Metal is very hot topic. No surprise, but I do not think that either MaaS or Ironic have the right architecture to deal with the real complexity of automating metal in a generalized way. The consequence is that they are limited and hard to operate.

Container talks were also very hot and I believe are ultimately disruptive.  The very fact that all the container talks were overflowing is an indication of the challenges facing virtualization.

3. DefCore – Just in the Nick of Time

I think that the press and analysts were ready to proclaim that OpenStack was fragmenting and being unable to deliver the “one cloud, multiple vendors” vision. DefCore (presented as Interopability by Jonathan Bryce, DefCore shout out!) came in on the buzzer to buy us more time.

4. Big Tent Concerns – what is ecosystem & release?

Big Tent is shorthand for project governance changes that make it easier for new projects to become OpenStack projects and removes the concept of integrated releases.  The exact definition is still a work in progress.

The top concerns I have are:

  1. We cannot tell difference between community & ecosystem. We’re back to anointed projects because we’re now telling projects they have to join OpenStack to work with OpenStack.
  2. We’re changing the definition of the release but have not defined how it will change. I acknowledge that continuous release is ideal but we’re confusing people again.

5. Brands are battling – will they destroy the city?

OpenStack is hard for startups – read the full post here.  The short version is that big companies are taking up all the air.

While some are leading, others they are learning how to collaborate.  Those new to open source are slow to trust and uncertain about where to invest.  Unfortunately, we’ve created a visible contributions economy that does not reward doing the scut work so it’s no surprise that there are concerns that some of the bigger companies are free riding.

6. OpenStack is broken talks – could we reboot?  no.

It’s a sign of OpenStack’s age that Bias, Termie and others suggested we need clean slate.  Frankly, I think that OpenStack would be irrelevant by the time a rewrite was completed and it not helpful to suggest it.

What would I suggest?  I’d promote a strong core (doing!), ensure big companies collaborate on roadmap (doing!) and stop having a single node install as gate and dev reference (I’d happily help use OCB for this with partners)

PS: Apparently Neutron is not broken.

I’m very excited about the “just give me a network” work to make Neutron duplicate Nova-Net functionality.  Finally.