DC2020: Putting the Data back in the Data Center

For the past two decades, data centers have been more about compute than data, but the machine learning and IoT revolutions are changing that focus for the 2020 Data Center (aka DC2020). My experience at IBM Think 2018 suggests that we should be challenging our compute centric view of a data center; instead, we should be considering the flow and processing of data. Since data is not localized, that reinforces our concept of DC2020 as a distributed and integrated environment.

We have defined data centers by the compute infrastructure stored there. Cloud (especially equated with virtualized machines) has been an infrastructure as a service (IaaS) story. Even big data “lakes” are primary compute clusters with distributed storage. This model dominates because data sources locked in application silos control of the compute translates directly to control of the data.

What if control of data is being decoupled from applications? Data is becoming it’s own thing with new technologies like machine learning, IoT, blockchain and other distributed sourcing.

In a data centric model, we are more concerned with movement and access to data than building applications to control it. Think of event driven (serverless) and microservice platforms that effectively operate on data-in-flight. It will become impossible to actually know all the ways that data is manipulated as function as a service progresses because there are no longer boundaries for applications.

This data-centric, distributed architecture model will be even more pronounced as processing moves out of data centers and into the edge. IT infrastructure at the edge will be used for handling latency critical data and aggregating data for centralization. These operations will not look like traditional application stacks: they will be data processing microservices and functions.

This data centric approach relegates infrastructure services to a subordinate role. We should not care about servers or machines except as they support platforms driving data flows.

I am not abandoning making infrastructure simple and easy – we need to do that more than ever! However, it’s easy to underestimate the coming transformation of application architectures based on advanced data processing and sharing technologies. The amount and sources of data have already grown beyond human comprehension because we still think of applications in a client-server mindset.

We’re only at the start of really embedding connected sensors and devices into our environment. As devices from many sources and vendors proliferate, they also need to coordinate. That means we’re reaching a point where devices will start talking to each other locally instead of via our centralized systems. It’s part of the coming data avalanche.

Current management systems will not survive explosive growth.  We’re entering a phase where control and management paradigms cannot keep up.

As an industry, we are rethinking management automation from declarative (“start this”) to intent (“maintain this”) focused systems.  This is the simplest way to express the difference between OpenStack and Kubernetes. That change is required to create autonomous infrastructure designs; however, it also means that we need to change our thinking about infrastructure as something that follows data instead of leads it.

That’s exactly what RackN has solved with Digital Rebar Provision.  Deeply composable, simple APIs and extensible workflows are an essential component for integrated automation in DC2020 to put the data back in data center.

DC2020: Is Exposing Bare Metal Practical or Dangerous?

One of IBM’s major announcements at Think 2018 was Managed Kubernetes on Bare Metal. This new offering combines elements of their existing offerings to expose some additional security, attestation and performance isolation. Bare metal has been a hot topic for cloud service providers recently with AWS adding it to their platform and Oracle using it as their primary IaaS. With these offerings as a backdrop, let’s explore the role of bare metal in the 2020 Data Center (DC2020).

Physical servers (aka bare metal) are the core building block for any data center; however, they are often abstracted out of sight by a virtualization layer such as VMware, KVM, HyperV or many others. These platforms are useful for many reasons. In this post, we’re focused on the fact that they provide a control API for infrastructure that makes it possible to manage compute, storage and network requests. Yet the abstraction comes at a price in cost, complexity and performance.

The historical lack of good API control has made bare metal less attractive, but that is changing quickly due to two forces.

These two forces are Container Platforms and Bare Metal as a Service or BMaaS (disclosure: RackN offers a private BMaaS platform called Digital Rebar). Container Platforms such as Kubernetes provide an application service abstraction level for data center consumers that eliminates the need for users to worry about traditional infrastructure concerns.  That means that most users no longer rely on APIs for compute, network or storage allowing the platform to handle those issues. On the other side, BMaaS VM infrastructure level APIs for the actual physical layer of the data center allow users who care about compute, network or storage the ability to work without VMs.  

The combination of containers and bare metal APIs has the potential to squeeze virtualization into a limited role.

The IBM bare metal Kubernetes announcement illustrates both of these forces working together.  Users of the managed Kubernetes service are working through the container abstraction interface and really don’t worry about the infrastructure; however, IBM is able to leverage their internal bare metal APIs to offer enhanced features to those users without changing the service offering.  These benefits include security (IBM White Paper on Security), isolation, performance and (eventually) access to metal features like GPUs. While the IBM offering still includes VMs as an option, it is easy to anticipate that becoming less attractive for all but smaller clusters.

The impact for DC2020 is that operators need to rethink how they rely on virtualization as a ubiquitous abstraction.  As more applications rely on container service abstractions the platforms will grow in size and virtualization will provide less value.  With the advent of better control of the bare metal infrastructure, operators have real options to get deep control without adding virtualization as a requirement.

Shifting to new platforms creates opportunities to streamline operations in DC2020.

Even with virtualization and containers, having better control of the bare metal is a critical addition to data center operations.  The ideal data center has automation and control APIs for every possible component from the metal up.

Learn more about the open source Digital Rebar community:

DC2020: Skeptics Guide to Blockchain in the Data Center

At Think 2018, Machine Learning and Blockchain technologies are beyond pervasive, they are assumed to be beneficial to ROI in every situation. That type of hype begs for closer review. In this post, we’ll look at a potentially real use of blockchain for operations.

There is so much noise about blockchain that it can be difficult to find a starting point. I’m leaving background reading as an exercise for the reader; instead, I want to focus on how blockchain creates a distributed ledger with shared trust. That’s a lot of buzz words! Basically, we’re talking about a system where nodes share data in a way that they use consensus with their peer to determine if the information is trustworthy.

The key concept in blockchain is moving from a central authority to a distributed authority.

In the data center, administrative trust is essential. The premises, networks, and access credentials all rely on the idea that we have a centralized authoritative group. Even PKI, which is designed for decentralized trust, relies on a centralized trust to sign keys. Looking objectively at the bundle of passwords, certificates, keys and isolation layers, there are gaping risks in this model. It only takes getting the right access to flip administrative control from an asset into a liability.

Blockchain allows us to decentralize trust in the data center by requiring systems to collaboratively validate administrative instructions.

In this model, we’d still have administrative controls and management; however, the nodes would be able to validate configuration changes with their peers or other administrative sources. For example, an out of process change (potential hack?) on a single node would be confirmed via consensus with other nodes instead of automatically trusting the source. The body of nodes protects from a bad administrative request. It also allows operators to quickly propagate configurations peer-to-peer instead of relying on a central hub and spoke model.

This is even more powerful if configuration is composited from multiple sources in a pipeline. In a multiple author system, each contributor will be involved in verifying that changes to the whole configuration. This ensures that downstream insertions are both communicated and accepted by upstream steps.  This works because blockchain is a distributed ledger. Changes made to the chain are passed back to all parties. Just like in a decentralized supply chain model, this ensures both validation and transparency.

Blockchain’s ability to provide both horizontal and vertical integrity for operations is an intriguing possibility.

I’m interested in hearing your thoughts about this application for blockchain. From a RackN and Digital Rebar perspective, these capabilities are well aligned with our composable approach to configuration. We’d be happy to talk with operators who want to look more deeply into this type of integration.

Series Intro: A Focus on Sustaining Operations

When discussing the data center of the future, it’s critical that we start by breaking the concept of the data center as a physical site with guarded walls, raised floors, neat rows of servers and crash cart pushing operators. The Data Center of 2020 (DC2020) is a distributed infrastructure comprised of many data centers, cloud services and connected devices.

The primary design concept of DC2020 is integrated automation not actual infrastructures.

As an industry, we need to actively choose implementations that unify our operational models to create portability and eliminate silos. This means investing more in sustaining operations (aka Day 2 Ops) that ensure our IT systems can be constantly patched, updated and maintained. The pace of innovation (and discovered vulnerabilities!) requires that we build with the assumption of change. DC2020 cannot be “fire and forget” building that assumes occasional updates.

There are a lot of disruptive and exciting technologies entering the market. These create tremendous opportunities for improvement and faster innovation cycles. They also create significant risk for further fragmenting our IT operations landscape in ways to increase costs, decrease security and further churn our market.

It is possible to be for both rapid innovation and sustaining operations, but it requires a plan for building robust automation.

The focus on tightly integrated development and operations work is a common theme in both DevOps and Site/System Reliability Engineering topics that we cover all the time. They are not only practical, we believe they are essential requirements for building DC2020.

Over this week, I’m going to be using the backdrop of IBM Think to outline the concepts for DC2020. I’ll both pull in topics that I’m hearing there and revisit topics that we’ve been discussing on our blogs and L8ist Sh9y podcast. Ultimately, we’ll create a comprehensive document: for now, we invite you to share your thoughts about this content in it’s more raw narrative form.