I left the OpenStack OpenDev Edge Infrastructure conference with many concerns about how to manage geographically distributed infrastructure at scale. We’ve been asking similar questions at RackN as we work to build composable automation that can be shared and reused. The critical need is to dramatically reduce site-specific customization in a way that still accommodates required variation – this is something we’ve made surprising advances on in Digital Rebar v3.1.
These are very serious issues for companies like AT&T with thousands of local exchanges, Walmart with tens of thousands of in-store server farms, or Verizon with tens of thousands of coffee-shop Wi-Fi zones. These workloads are not moving into centralized data centers. In fact, with machine learning and IoT, we expect to see more and more distributed computing needs.
Running each site as a mini-cloud is clearly not the right answer.
While we do need the infrastructure to be easily API addressable, adding cloud without fixing the underlying infrastructure management moves us in the wrong direction. For example, AT&T’s initial 100+ OpenStack deployments were not field-upgradable, which led to their efforts to deploy OpenStack on Kubernetes; however, that may have simply moved the upgrade problem to a different platform because Kubernetes does not address the physical layer either!
There are multiple challenges here. First, any scale infrastructure problem must be solved at the physical layer first. Second, we must have tooling that brings repeatable, automated processes to that layer. It’s not sufficient to have deep control of a single site: we must be able to reliably distribute automation over thousands of sites with limited operational support and bandwidth. These requirements are outside the scope of cloud-focused tools.
Containers and platforms like Kubernetes have a significant part to play in this story, so I was surprised that they were present only in a minor way at the summit. The portability and light footprint of these platforms make them a natural fit for edge infrastructure. I suspect the lack of focus comes from an audience assumption (an incorrect one) that edge applications are not ready for container management.
With hardware-layer control (which is required for edge), there is no need for a virtualization layer to provide infrastructure management. In fact, “cloud” only adds complexity and cost for edge infrastructure when the workloads are containerized. Our current cloud platforms are designed neither to run in small environments nor to be managed in a repeatable way across thousands of data centers. This is a deep architectural gap and not easily patched.
OpenStack sponsoring the edge infrastructure event got the right people in the room, but it also got in the way of discussing how we should be solving these operational challenges. How should we be solving them? In the next post, we’ll talk about management models that we should be borrowing for the edge…
Read 1st Post of 3 from OpenStack OpenDev: OpenStack on Edge? 4 Ways Edge is Distinct from Cloud