If Private Cloud is dead. Where did it go? How did it get there? [JOINT POST]

Posted on May 12, 2017 by Rob H

TL;DR: Hybrid killed IT.

I’m a regular participant on BWG Roundtable calls and often extend those discussions 1×1. This post collects questions from one of those follow-up meetings where we explored how data center markets are changing based on new capacity and also the impact of cloud.

We both believe in the simple answer, “it’s going to be hybrid.” We both feel that this answer does not capture the real challenges that customers are facing.

So who are we? Haynes Strader, Jr. comes at this from a real estate perspective via CBRE Data Center Solutions. Rob Hirschfeld comes at this from an ops and automation perspective via RackN. We are in very different aspects of the data center market.

Rob: I know that we’re building a lot of data center capacity. So far, it’s been really hard to move operations to new infrastructure and mobility is a challenge. Do you see this too?

Haynes: Yes. Creating a data center network that is both efficient and affordable is challenging. A couple of key data center interconnection providers offer this model, but few companies are in a position to truly leverage the node-cloud-node model, where a company leverages many small data center locations (colo) that all connect to a cloud option for the bulk of their computing requirements. This works well for smaller companies with a spread-out workforce, or brand new companies with no legacy infrastructure, but the Fortune 2000 still have the majority of their compute sitting in-house in owned facilities that weren’t originally designed to serve as data centers. Moving these legacy systems is nearly impossible.

Rob: I see many companies feeling trapped by these facilities and looking to the cloud as an alternative. You are describing a lot of inertia in that migration. Is there something that can help improve mobility?

Haynes: Data centers are physical presences to hold virtual environments. The physical aspect can only be optimized when a company truly understands its virtual footprint. IT capacity planning is key to this. System monitoring and usage analytics are critical to making growth and consolidation decisions. Why isn’t this being adopted more quickly? Is it cost? Is it the difficulty of implementing it in complex IT environments? Is it fear of the unknown?

Rob: I think that it’s technical debt that makes it hard (and scary) to change. These systems were built manually or assuming that IT could maintain complete control. That’s really not how cloud-focused operations work. Is there a middle step between full cloud and legacy?

Haynes: Creating an environment where a company maximizes the use of its owned assets (leveraging sale leasebacks and forward-thinking financing), rather than waiting until end of life and attempting to dispose of them, leads to opportunities to get capital injections early on and move to an OPEX model. This makes the transition to colo much easier, and avoids the large write-down that comes along with most IT transformations. Colocation is an excellent tool if it is properly negotiated because it can provide a flexible environment that can grow or shrink based on your utilization of other services. Sophisticated colo users know when it makes sense to pay top dollar for an environment that requires hyperconnectivity and when to save money for storage and day-to-day compute. They know when to leverage providers for services and when to manage IT tasks in-house. It is a daunting process, but the initial approach is key to getting to that place in the long term.

Rob: So I’m back to thinking that the challenge for accessing all these colo opportunities is that it’s still way too hard to move operations between facilities and also between facilities and the cloud. Until we improve mobility, choosing a provider can be a high stakes decision. What factors do you recommend reviewing?

Haynes: There is an overwhelming number of factors in picking new colos:

Location
Connectivity/Latency
Cloud Connectivity Options
Pricing
Quality of Services
Security
Hazard Risk Mitigation
Comfort with services/provider
Growth potential
Flexibility of spend/portability (this is becoming ever-more important)

Rob: Yikes! Are there minor operational differences between colos that are causing breaking changes in operations?

Haynes: We run into this with our clients occasionally, but it is usually because they created two very different environments with different providers. This is a big reason to use a broker. Creating identical terms, pricing models, SLAs and workflows gives clients a lot of leverage when they go to market. A select few of the top cloud providers do a really good job of this. They dominate the markets that they enter because they have a consistent, reliable process that is replicated globally. They also achieve some of the most attractive pricing and terms in the marketplace on a regular basis.

Rob: That makes sense. Process matters for the operators and consistent practices make it easier to work with a partner. Even so, moving can save a lot of money. Is that savings justified against the risk and interruption?

Haynes: This is the biggest hurdle that our enterprise clients face. The risk of moving is risking an IT leader’s job. How do we do this with minimal risk and maximum upside? Long-term strategic planning is one answer, but in today’s world, IT leadership changes often and strategies go along with that. We don’t have a silver bullet for this one – but are always looking to partner with IT leaders that want to give it a shot and hopefully save a lot of money.

Rob: So is migration practical?

Haynes: Migration makes our clients cringe, but the ones that really try to take it on and make it happen strategically (not once it is too late) regularly reap the benefits of saving their company money and making them heroes to the organization.

Rob: I guess that brings us back to mixing infrastructures. I know that public clouds have interconnect with colos that make it possible to avoid picking a single vendor. Are you seeing this too?

Haynes: Hybrid, hybrid, hybrid. No one is the best one-stop shop. We all love 7-11 and it provides a lot of great solutions on the run, but I’m not grocery shopping there. Same reason I don’t run into a Kroger every time I need a bottle of water. Pick the right solution for the right application and workload.

Rob: That makes sense to me, but I see something different in practice. Teams are too busy keeping the lights on to take advantage of longer-term thinking. They seem so busy fighting fires that it’s hard to improve.

Haynes: I TOTALLY agree. I don’t know how to change this. I get it, though. The CEO says, “We need to be in the cloud, yesterday,” and the CIO jumps. Suddenly everyone’s strategic planning is out the window and it is off to the races to find a quick-fix. Like most things, time and planning often reap more productive results.

Thanks for sharing our discussion!

We’d love to hear your opinions about it. We both agree that creating multi-site management abstractions could make life easier on IT and relatable to real estate and finance. With all of these organizations working in sync the world would be a better place. The challenge is figuring out how to get there!

Why IBM’s hybrid “no-single-way” is a good plan

Posted on April 23, 2017 by Rob H

I got to spend a few days hearing IBM’s cloud plans at IBM Interconnect including a presentation, dinner and guest blogging. Read below for links to that content.

As part of their CloudMinds group, we’re encouraged to look at the big picture of the conference and there’s a lot to take in. IBM has serious activity around machine learning, cognitive, serverless, functional languages, blockchain, platform and infrastructure as a service. Frankly, that’s a confusing array of technologies.

Does this laundry list of technologies fit into a unified strategy? No, and that’s THE POINT.

Anyone who thinks they can predict a definitive right mix of technologies to solve customer problems is not paying attention to the pace of innovation. IBM is listening to their customers and hearing that needs are expanding not consolidating. In this type of market, limiting choice hurts customers.

That means that a hybrid strategy with overlapping offerings serves their customers’ interests.

IBM has the luxury and scale of being able to chase multiple technologies to find winners. Of course, there’s also a danger of hanging on to losers too long. So far, it looks like they are doing a good job of riding that sweet spot. Their agility here may be the only way that they can reasonably find a chink in Amazon’s cloud armour.

While the hybrid story is harder to tell, it’s the right one for this market.

Four Posts For Deeper Reading

The posts below cover a broad range of topics! Chris Ferris and I did some serious writing about collaboration and my DevOps/Hybrid post has been getting some attention. It’s all recommended reading so I’ve included some highlights.

#CloudMinds tackle the future of cognitive in Las Vegas huddle

Rob is part of the IBM CloudMinds group that meets occasionally to discuss rising cloud, infrastructure and technology challenges.

“Cognitive cannot and will not exist without trust. Humans will not trust cognitive unless we can show that our cognitive solutions understand them.”

How open communities can hurt, and help, interoperability

“The days of using open software passively from vendors are past, users need to have a voice and opinion about project governance. This post is a joint effort with Rob Hirschfeld, RackN, and Chris Ferris, IBM, based on their IBM Interconnect 2017 “Open Cloud Architecture: Think You Can Out-Innovate the Best of the Rest?” presentation.”

When DevOps and hybrid collide (2017 trend lines)

“We’ve clearly learned that DevOps automation pays back returns in agility and performance. Originally, small-batch, lean thinking was counter-intuitive. Now it’s time to make similar investments in hybrid automation so that we can leverage the most innovation available in IT today.”

Open Source Collaboration: The Power of No & Interoperability

“Users and operators can put significant pressure on project leaders and vendors to ensure that the platforms are interoperable.”

Don’t Balkanize My Installer, Yo!

Posted on March 28, 2017 by Rob H

Last week, RackN announced our enterprise support for Kubernetes using nothing but upstream Ansible from the project itself. This work represents years of effort by the RackN founders to keep platforms interoperable via open and shareable operations automation.

That’s why our Digital Rebar approach targets underlay challenges and leverages existing automation tools instead of inventing yet another install path.

This week, we added Install Wizard templates to the DC/OS install automation we built in collaboration with Mesosphere last year. That makes it even easier to run DC/OS on physical infrastructure. Like our Kubernetes work, the Digital Rebar automation uses the same community dcos_install.sh that’s used in the community documentation. The difference is that we’re also driving all the underlay prep and configuration automatically.

If this approach appeals to you, contact RackN and join in the open Day 2 revolution.

Interested in seeing the DC/OS install in action? Here’s a demo video:

SRE role with DevOps for Enterprise [@HPE podcast]

Posted on February 21, 2017 by Rob H

My focus on SRE series continues… At RackN, we see a coming infrastructure explosion in both complexity and scale. Unless our industry radically rethinks operational processes, current backlogs will escalate and stability, security and sharing will suffer.

Yes, DevOps and SRE are complementary

In this short 16-minute podcast, HPE’s Stephen Spector and I discuss how DevOps and SRE thinking overlap and where they differ. We also discuss how enterprises should be evaluating Site Reliability Engineering as a function and where it fits in their organization.

Open Source as Reality TV and Burning Data Centers [gcOnDemand podcast notes]

Posted on May 24, 2016 by Rob H

During the OpenStack summit, Eric Wright (@discoposse) and I talked about a wide range of topics from scoring success of OpenStack early goals to burning down traditional data centers.

Why burn down your data center (and move to public cloud)? Because your ops processes are too hard to change. Rob talks about how hybrid provides a path if we can make ops more composable.

Here are my notes from the audio podcast (source):

1:30 Why “zehicle” as a handle? Portmanteau from electric cars… zero + vehicle

Let’s talk about OpenStack & Cloud…

OpenStack History
- 2:15 Rob’s OpenStack history from Dell and Hyperscale
- 3:20 Early thoughts of a Cloud API that could be reused
- 3:40 The practical danger of Vendor lock-in
- 4:30 How we implemented “no main corporate owner” by choice
About the Open in OpenStack
- 5:20 Rob decomposes what “open” means because there are multiple meanings
- 6:10 Price of having all open tools for “always open” choice and process
- 7:10 Observation that OpenStack values having open over delivering product
- 8:15 Community is great but a trade off. We prioritize it over implementation.
Q: 9:10 What if we started later? Would Docker make an impact?
- Part of challenge for OpenStack was teaching vendors & corporate consumers “how to open source”
Q: 10:40 Did we accomplish what we wanted from the first summit?
- Mixed results – some things we exceeded (like growing community) while some are behind (product adoption & interoperability).
13:30 Interop, Refstack and Defcore Challenges. Rob is disappointed on interop based on implementations.
Q: 15:00 Who competes with OpenStack?
- There are real alternatives. APIs do not matter as much as we thought.
- 15:50 OpenStack vendor support is powerful
Q: 16:20 What makes OpenStack successful?
- Big tent confuses the ecosystem & pushes the goal posts out
- “Big community” is not a good definition of success for the project.
18:10 Reality TV of open source – people like watching train wrecks
18:45 Hybrid is the reality for IT users
20:10 We have a need to define core and focus on composability. Rob has been focused on the link between hybrid and composability.
22:10 Rob’s preference is that OpenStack would be smaller. Big tent is really ecosystem projects and we want that ecosystem to be multi-cloud.

Now, about RackN, bare metal, Crowbar and Digital Rebar….

23:30 (re)Intro
24:30 VC market is not metal friendly even though everything runs on metal!
25:00 Lack of consistency translates into lack of shared ops
25:30 Crowbar was an MVP – the key is to understand what we learned from it
26:00 Digital Rebar started with composability and focus on operations
27:00 What is hybrid now? Not just private to public.
30:00 How do we make infrastructure not matter? Multi-dimensional hybrid.
31:00 Digital Rebar is orchestration for composable infrastructure.
Q: 31:40 Do people get it?
- Yes. Automation is moving to hybrid devops – “ops is ops” and it should not matter if it’s cloud or metal.
32:15 “I don’t want to burn down my data center” – can you bring cloud ops to my private data center?

Problems with the “Give me a Wookiee” hybrid API

Posted on April 13, 2016 by Rob H

Greg Althaus, RackN CTO, creates amazing hybrid DevOps orchestration that spans metal and cloud implementations. When it comes to knowing the nooks and crannies of data centers, his ops scar tissue has scar tissue. So, I knew you’d all enjoy this funny story he wrote after previewing my OpenStack API report.

“APIs are only valuable if the parameters mean the same thing and you get back what you expect.” Greg Althaus

The following is a guest post by Greg:

While building the Digital Rebar OpenStack node provider, Rob Hirschfeld tried to integrate with 7+ OpenStack clouds. While the APIs matched across instances, there are all sorts of challenges with what comes out of the API calls.

The discovery made me realize that APIs are not the end of interoperability. They are the beginning.

I found I could best describe it with a story.

I found an API on a service and that API creates a Wookiee!

I can tell the API that I want a tall or short Wookiee or young or old Wookiee. I test against the Kashyyyk service. I consistently get an 8ft Brown 300 year old Wookiee when I ask for a Tall Old Wookiee.

I get a 6ft Brown 50 Year old Wookiee when I ask for a Short Young Wookiee. Exactly what I want, all the time.

My pointy-haired emperor boss says I need to now use the Forest Moon of Endor (FME) Service. He was told it is the exact same thing but cheaper. Okay, let’s do this. It consistently gives me a 5 year old 4 ft tall Brown Ewok (called a Wookiee) when I ask for the Tall Young Wookiee.

This is a fail. I mean, yes, they are both furry and brown, but the Ewok can’t reach the top of my bookshelf.

The next service has to work, right? About the same price as FME, the Tatooine Service claims to be really good too. It passes tests. It hands out things called Wookiees. The only problem is that, while size is an API field, the service requires the use of petite and big instead of short and tall. This is just annoying. This time my tall (well big) young Wookiee is 8 ft tall and 50 years old, but it is green and bald (scales are like that).

I don’t really know what it is. I’m sure it isn’t a Wookiee.

And while she is awesome (better than the male Wookiees), she almost froze to death in the arctic tundra that is Boston.

My point: APIs are only valuable if the parameters mean the same thing and you get back what you expect.
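
Greg’s story can be turned into a small conformance check. The sketch below is only an illustration: the three functions are hypothetical stand-ins for the services in the story, the Kashyyyk result for a tall young Wookiee is an assumption (the story only gives tall/old and short/young), and the “at least 7 ft” threshold for tall is invented for the example. The point it tries to show is that a call can succeed and still fail the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Creature:
    species: str
    height_ft: float
    age_years: int


# Hypothetical stand-ins for the three services in the story.
def kashyyyk(size: str, age: str) -> Creature:
    height = 8 if size == "tall" else 6
    years = 300 if age == "old" else 50
    return Creature("wookiee", height, years)


def endor(size: str, age: str) -> Creature:
    # Accepts the same parameters but always hands back an Ewok.
    return Creature("ewok", 4, 5)


def tatooine(size: str, age: str) -> Creature:
    # Same field, different vocabulary: "big"/"petite", not "tall"/"short".
    if size not in ("big", "petite"):
        raise ValueError(f"unknown size: {size!r}")
    height = 8 if size == "big" else 6
    years = 300 if age == "old" else 50
    return Creature("unknown", height, years)


def conforms(service, size: str = "tall", age: str = "young") -> bool:
    """True only if the call works AND the result is what the caller meant."""
    try:
        result = service(size, age)
    except ValueError:
        return False  # the parameters do not mean the same thing
    # Assumed expectation: a tall Wookiee is a Wookiee of at least 7 ft.
    return result.species == "wookiee" and result.height_ft >= 7


results = {fn.__name__: conforms(fn) for fn in (kashyyyk, endor, tatooine)}
print(results)
```

Only the first service passes. The second accepts the request and returns the wrong thing, and the third rejects the vocabulary, which are two different failures that an API signature comparison alone would not reveal.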

Composability & Commerce: drivers for #CloudMinds Hybrid discussion

Posted on February 24, 2016 by Rob H

Last night, I had the privilege of being included in an IBM think tank group called CloudMinds. The topic for the night was accelerating hybrid cloud.

During discussion, I felt that key how and why aspects of hybrid computing emerged: composability and commerce.

Composability, the discipline of segmenting IT into isolated parts, was considered a primary need. Without composability, we create vertically integrated solutions that are difficult to make hybrid.

Commerce, the acknowledgement that we are building technology to solve problems, was considered a way to combat the dogma that seems to creep into the platform wars. That seems obvious, yet I believe it’s often overlooked and the group seemed to agree.

It’s also worth adding that the group strongly felt that hybrid was not a cloud discussion – it was a technology discussion. It is a description of how to maintain an innovative and disruptive industry by embracing change.

The purpose of the think tank is to create seeds of an ongoing discussion. We’d love to get your perspective on this too.

We need DevOps without Borders! Is that “Hybrid DevOps?”

Posted on February 9, 2016 by Rob H

The RackN team has been working on making DevOps more portable for over five years. Portable between vendors, sites, tools and operating systems means that our automation needs to be hybrid in multiple dimensions by design.

Why drive for hybrid? It’s about giving users control.

I believe that the application should drive the infrastructure, not the reverse. I’ve heard many times that the “infrastructure should be invisible to the user.” Unfortunately, a lack of abstraction and composability makes it difficult to code across platforms. I like the term “fidelity gap” to describe the cost of these differences.

What keeps DevOps from going hybrid? Shortcuts related to platform-entangled configuration management.

Everyone wants to get stuff done quickly; however, we make the same hard-coded ops choices over and over again. Big bang configuration automation that embeds sequence assumptions into the script is not just technical debt, it’s fragile and difficult to upgrade or maintain. The problem is not configuration management (that’s a critical component!), it’s the lack of system level tooling that forces us to overload the configuration tools.

What is system level tooling? It’s integrating automation that expands beyond configuration into managing sequence (aka orchestration), service orientation, script modularity (aka composability) and multi-platform abstraction (aka hybrid).

My ops automation experience says that these four factors must be solved together because they are interconnected.

What would a platform that embraced all these ideas look like? Here is what we’ve been working towards with Digital Rebar at RackN:

Mono-Infrastructure IT	“Hybrid DevOps”
Locked into a single platform	Portable between sites and infrastructures with layered ops abstractions.
Limited interop between tools	Adaptive to mix and match best-for-job tools. Use the right scripting for the job at hand and never force migrate working automation.
Ad hoc security based on site specifics	Secure using repeatable automated processes. We fail at security when things get too complex to change and adapt.
Difficult to reuse ops tools	Composable Modules enable Ops Pipelines. We have to be able to interchange parts of our deployments for collaboration and upgrades.
Fragile Configuration Management	Service Oriented simplifies API integration. The number of APIs and services is increasing. Configuration management is not sufficient.
Big bang: configure then deploy scripting	Orchestrated action is critical because sequence matters. Building a cluster requires sequential (often iterative) operations between nodes in the system. We cannot build robust deployments without ongoing control over order of operations.
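
To make the “sequence matters” row concrete, here is a minimal sketch of declaring order as data instead of embedding it in one big script. It is not how Digital Rebar is implemented; the step names are hypothetical and the ordering uses Python’s standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical cluster build: each step lists the steps that must finish first.
steps = {
    "provision_nodes": set(),
    "configure_network": {"provision_nodes"},
    "install_runtime": {"configure_network"},
    "elect_leader": {"install_runtime"},
    "join_workers": {"elect_leader"},
}


def run_order(graph):
    """Return one valid execution order that respects every dependency."""
    return list(TopologicalSorter(graph).static_order())


order = run_order(steps)
print(order)
```

Because the dependencies are explicit, a step can be swapped or inserted without rereading a whole script to find the sequence assumptions hidden in it, and a circular dependency is reported as an error (`graphlib.CycleError`) rather than discovered at run time.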

Should we call this “Hybrid Devops?” That sounds so buzz-wordy!

I’ve come to believe that Hybrid DevOps is the right name. More technical descriptions like “composable ops” or “service oriented devops” or “cross-platform orchestration” just don’t capture the real value. All these names fail to capture the portability and multi-system flavor that drives the need for user control of hybrid in multiple dimensions.

Simply put, we need devops without borders!

What do you think? Do you have a better term?

Post-OpenStack DefCore, I’m Chasing “open infrastructure” via cross-platform Interop

Posted on January 26, 2016 by Rob H

Like my previous DefCore interop windmill tilting, this is not something that can be done alone. Open infrastructure is a collaborative effort and I’m looking for your help and support. I believe solving this problem benefits us as an industry and individually as IT professionals.

So, what is open infrastructure? It’s not about running on open source software. It’s about creating platform choice and control. In my experience, that’s what defines open for users (and developers are not users).

I’ve spent several years helping lead OpenStack interoperability (aka DefCore) efforts to ensure that OpenStack cloud APIs are consistent between vendors. I strongly believe that effort is essential to build an ecosystem around the project; however, in talking to enterprise users, I’ve learned that their real interoperability gap is between the many platforms (AWS, Google, VMware, OpenStack and Metal) that they use every day.

Instead of focusing inward to one platform, I believe the bigger enterprise need is to address automation across platforms. It is something I’m starting to call hybrid DevOps because it allows users to mix platforms, service APIs and tools.

Open infrastructure in that context is being able to work across platforms without being tied into one platform choice even when that platform is based on open source software. API duplication is not sufficient: the operational characteristics of each platform are different enough that we need a different abstraction approach.

We have to be able to compose automation in a way that tolerates substitution based on infrastructure characteristics. This is required for metal because of variation between hardware vendors and data center networking and services. It is equally essential for cloud because of variation between IaaS capabilities and service delivery models. Basically, those minor differences between clouds create significant challenges in interoperability at the operational level.

Rationalizing APIs does little to address these more structural differences.
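
One way to picture automation that “tolerates substitution” is a deployment where a single logical step has a different implementation per infrastructure type while the other steps stay shared. The sketch below is a deliberately simplified illustration with hypothetical function and platform names; real deployments rarely split this cleanly, so treat the single substitution point as a simplification.

```python
from typing import Callable, Dict, List


# Hypothetical per-platform implementations of one logical step.
def metal_boot(node: str) -> str:
    return f"{node}: PXE boot and install OS"


def cloud_boot(node: str) -> str:
    return f"{node}: request instance from provider API"


# The substitution table: same logical step, different implementation.
BOOT_STEP: Dict[str, Callable[[str], str]] = {
    "metal": metal_boot,
    "cloud": cloud_boot,
}


def common_setup(node: str) -> str:
    # Shared step that does not care where the node came from.
    return f"{node}: install container runtime"


def deploy(node: str, platform: str) -> List[str]:
    """Run the platform-specific step, then the shared step."""
    boot = BOOT_STEP[platform]  # KeyError if the platform has no implementation
    return [boot(node), common_setup(node)]


print(deploy("node1", "metal"))
print(deploy("node1", "cloud"))
```

The two runs differ only in the first entry, which is where the infrastructure characteristics show up.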

The problem is compounded because the differences are not nicely segmented behind abstraction layers. If you work to build and sustain a fully integrated application, you must account for site specific needs throughout your application stack including networking, storage, access and security. I’ve described this as all deployments have 80% of the work common but the remaining 20% is mixed in with the 80% instead of being nicely layered. So, ops is cookie dough not vinaigrette.

Getting past this problem for initial provisioning on a single platform is a false victory. The real need is portable and upgrade-ready automation that can be reused and shared. Critically, we also need to build upon the existing foundations instead of requiring a blank slate. There is openness value in heterogeneous infrastructure so we need to embrace variation and design accordingly.

This is the vision the RackN team has been working towards with the open source Digital Rebar project. We are now able to showcase workload deployments (Docker, Kubernetes, Ceph, etc) on multiple cloud platforms that also translate to full bare metal deployments. Unlike previous generations of this tooling (some will remember Crowbar), we’ve been careful to avoid injecting external dependencies into the DevOps scripts.

While we’re able to demonstrate a high degree of portability (or fidelity) across multiple platforms, this is just the beginning. We are looking for users and collaborators who want to build open infrastructure from an operational perspective.

You are invited to join us in making open cross-platform operations a reality.

12 Predictions for ’16: mono-cloud ambitions die as containers drive more hybrid IT

Posted on December 31, 2015 by Rob H

I expect 2016 to be a confusing year for everyone in IT. For 2015, I predicted that new uses for containers are going to upset cloud’s apple cart; however, the replacement paradigm is not clear yet. Consequently, I’m doing a prognostication mix and match: five predictions and seven items on a “container technology watch list.”

TL;DR: In 2016, Hybrid IT arrives on Containers’ wings.

Considering my expectations below, I think it’s time to accept that all IT is heterogeneous and stop trying to box everything into a mono-cloud. Accepting hybrid as current state unblocks many IT decisions that are waiting for things to settle down.

Here’s the memo: “Stop waiting. It’s not going to converge.”

2016 Predictions

Container Adoption Seen As Two Stages: We will finally accept that Containers have strength for both infrastructure (first stage adoption) and application life-cycle (second stage adoption) transformation. Stage one offers value so we will start talking about legacy migration into containers without shaming teams that are not also rewriting apps as immutable microservice unicorns.
OpenStack continues to bump and grow. Adoption is up and open alternatives are disappearing. For dedicated/private IaaS, OpenStack will continue to gain in 2016 for basic VM management. Both competitive and internal pressures continue to threaten the project but I believe they will not emerge in 2016. Here’s my complete OpenStack 2016 post.
Amazon, GCE and Azure make everything else questionable. These services are so deep and rich that I’d question anyone who is not using them. At least one of them simply has to be part of everyone’s IT strategy for financial, talent and technical reasons.
Cloud API becomes irrelevant. Cloud API is so 2011! There are now so many reasonable clients to abstract various Infrastructures that Cloud APIs are less relevant. Capability, interoperability and consistency remain critical factors, but the APIs themselves are not interesting.
Metal aaS gets interesting. I’m a big believer in the power of operating metal via an API and the RackN team delivers it for private infrastructure using Digital Rebar. Now there are several companies (Packet.net, Ubiquity Hosting and others) that offer hosted metal.

2016 Container Tech Watch List

I’m planning posts about all these key container ecosystems for 2016. I think they are all significant contributors to the emerging application life-cycle paradigm.

Service Containers (& VMs): There’s an emerging pattern of infrastructure managed containers that provide critical host services like networking, logging, and monitoring. I believe this pattern will provide significant value and generate its own ecosystem.
Networking & Storage Services: Gaps in networking and storage for containers need to get solved in a consistent way. Expect a lot of thrash and innovation here.
Container Orchestration Services: This is the current battleground for container mind share. Kubernetes, Mesos and Docker Swarm get headlines but there are other interesting alternatives.
Containers on Metal: Removing the virtualization layer reduces complexity, overhead and cost. Container workloads are good choices to re-purpose older servers that have too little CPU or RAM to serve as VM hosts. Who can say no to free infrastructure?! While an obvious win to many, we’ll need to make progress on standardized scale and upgrade operations first.
Immutable Infrastructure: Even as this term wins the “most confusing” concept in cloud award, it is an important one for container designers to understand. The unfortunate naming paradox is that immutable infrastructure drives disciplines that allow fast turnover, better security and more dynamic management.
Microservices: The latest generation of service oriented architecture (SOA) benefits from a new class of distributed service registration platforms (etcd and consul) that bring new life into SOA.
Paywall Registries: The importance of container registries is easy to overlook because they seem to be version 2.0 of package caches; however, container layering makes these services much more dynamic and central than many realize. (more? Bernard Golden and I already posted about this)

What two items did not make the 2016 cut? 1) Special purpose container-focused operating systems like CoreOS or RancherOS. While interesting, I don’t think these deployment technologies have architectural level influence. 2) Container Security via VMs. I’m seeing patterns where containers may actually be more secure than VMs. The idea that containers need VMs for security is FUD created by people with a vested interest in virtualization.

Did I miss something? I’d love to know what you think I got right or wrong!

Rob Hirschfeld

On Computing, Containers, Cloud & Tech Culture