Week in Review: Move Away from Virtualization Overhead with Bare Metal

Welcome to the RackN and Digital Rebar Weekly Review. You will find the latest news related to Edge, DevOps, SRE and other relevant topics.

Had Enough Virtualization Overhead? Time to Think Bare Metal

Software Defined Infrastructure (SDx) lets operators manage data centers in a more consistent and controlled way: teams define their environment as code and use automation to execute that definition in practice. To deliver this capability for physical (aka bare metal) servers, RackN has created a Digital Rebar Provision (DRP) plugin for users of HashiCorp’s Terraform.
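
To make the infrastructure-as-code idea concrete, here is a minimal Python sketch against DRP’s REST API, the same machine inventory the Terraform plugin allocates from. The endpoint URL, token, and field names are illustrative assumptions, not a verified client for any specific DRP release:

```python
# Minimal sketch: listing the DRP machine inventory that a Terraform plan
# would allocate from. Endpoint, token, and field names are placeholders.
import requests

DRP_ENDPOINT = "https://drp.example.com:8092"  # hypothetical DRP install
DRP_TOKEN = "replace-with-a-real-token"        # placeholder credential

def list_machines():
    """Fetch the machine inventory over DRP's REST API."""
    resp = requests.get(
        f"{DRP_ENDPOINT}/api/v3/machines",
        headers={"Authorization": f"Bearer {DRP_TOKEN}"},
        verify=False,  # demo only; use real certificates in production
    )
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    for machine in list_machines():
        print(machine.get("Name"), machine.get("Uuid"))
```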

More on RackN Terraform Opportunity

Observability into Bare Metal Provisioning with RackN

(Posted 5/15 on Honeycomb.io Blog)

At RackN, a core design principle is that operations should be easy to track and troubleshoot. We work hard to automate provisioning with observable processes because insight into the complex interactions within a modern data center is critical for success. It does not help if understanding where issues arise across disconnected processes itself requires complex technology. RackN and the open source Digital Rebar community wanted a simple, best-in-class way to observe provisioning operations within our system without adding complexity and overhead.
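
The pattern itself is simple: emit one structured event per provisioning step so a tool like Honeycomb can show where a workflow stalls. A minimal sketch using Honeycomb’s plain HTTP events API follows; the write key, dataset name, and field names are placeholder assumptions:

```python
# Emit one structured event per provisioning step via Honeycomb's HTTP
# events API. Write key, dataset, and field names are placeholders.
import time
import requests

HONEYCOMB_WRITE_KEY = "replace-me"  # placeholder credential
DATASET = "drp-provisioning"        # hypothetical dataset name

def emit_event(fields):
    """Send one structured event describing a provisioning step."""
    requests.post(
        f"https://api.honeycomb.io/1/events/{DATASET}",
        headers={"X-Honeycomb-Team": HONEYCOMB_WRITE_KEY},
        json=fields,
    )

def run_step(machine, step, action):
    """Time a provisioning step and record its outcome as an event."""
    start = time.time()
    status = "unknown"
    try:
        action()
        status = "ok"
    except Exception as exc:
        status = f"error: {exc}"
        raise
    finally:
        emit_event({
            "machine": machine,
            "step": step,
            "duration_ms": (time.time() - start) * 1000,
            "status": status,
        })
```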

Full Post on Honeycomb.io


News

RackN

Digital Rebar Community

L8ist Sh9y Podcast

Social Media

Week in Review: Provision Physical and Virtual from a Single Platform

Welcome to the RackN and Digital Rebar Weekly Review. You will find the latest news related to Edge, DevOps, SRE and other relevant topics.

RackN NOW Provisions Virtual Machines, Not Just Physical Machines

This expansion to virtual machines allows Digital Rebar Provision (DRP) users to provision not only physical infrastructure but virtual machines as well, both locally and in clouds. In this simple demo video we show how to connect a virtual platform to DRP and provision virtual machines alongside your bare metal infrastructure.
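
The “single platform” idea can be sketched in a few lines: one provisioning call that does not care whether the target is physical or virtual. The drivers below are hypothetical stand-ins for whatever connects DRP to your hypervisor or cloud:

```python
# Illustrative sketch: one provisioning workflow for both machine kinds.
# All class and function names here are hypothetical stand-ins.
class BareMetalDriver:
    def boot(self, name):
        print(f"PXE-booting physical node {name}")

class VirtualDriver:
    def boot(self, name):
        print(f"Creating virtual machine {name}")

DRIVERS = {"physical": BareMetalDriver(), "virtual": VirtualDriver()}

def provision(name, kind, image):
    """Provision a machine of either kind with the same workflow."""
    DRIVERS[kind].boot(name)
    print(f"Applying image {image} to {name}")  # same image pipeline for both

provision("rack1-node3", "physical", "ubuntu-18.04")
provision("dev-vm-7", "virtual", "ubuntu-18.04")
```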

Learn More


News

RackN

Digital Rebar Community

L8ist Sh9y Podcast

Social Media

DC2020: Is Exposing Bare Metal Practical or Dangerous?

One of IBM’s major announcements at Think 2018 was Managed Kubernetes on Bare Metal. This new offering combines elements of their existing offerings to expose additional security, attestation and performance-isolation capabilities. Bare metal has been a hot topic for cloud service providers recently, with AWS adding it to their platform and Oracle using it as their primary IaaS. With these offerings as a backdrop, let’s explore the role of bare metal in the 2020 Data Center (DC2020).

Physical servers (aka bare metal) are the core building block for any data center; however, they are often abstracted out of sight by a virtualization layer such as VMware, KVM, HyperV or many others. These platforms are useful for many reasons. In this post, we’re focused on the fact that they provide a control API for infrastructure that makes it possible to manage compute, storage and network requests. Yet the abstraction comes at a price in cost, complexity and performance.

The historical lack of good API control has made bare metal less attractive, but that is changing quickly due to two forces.

These two forces are Container Platforms and Bare Metal as a Service, or BMaaS (disclosure: RackN offers a private BMaaS platform called Digital Rebar). Container Platforms such as Kubernetes provide an application service abstraction for data center consumers that eliminates the need for users to worry about traditional infrastructure concerns. That means most users no longer rely on APIs for compute, network or storage, leaving the platform to handle those issues. On the other side, BMaaS provides VM-like infrastructure-level APIs for the actual physical layer of the data center, giving users who do care about compute, network or storage the ability to work without VMs.

The combination of containers and bare metal APIs has the potential to squeeze virtualization into a limited role.

The IBM bare metal Kubernetes announcement illustrates both of these forces working together.  Users of the managed Kubernetes service are working through the container abstraction interface and really don’t worry about the infrastructure; however, IBM is able to leverage their internal bare metal APIs to offer enhanced features to those users without changing the service offering.  These benefits include security (IBM White Paper on Security), isolation, performance and (eventually) access to metal features like GPUs. While the IBM offering still includes VMs as an option, it is easy to anticipate that becoming less attractive for all but smaller clusters.

The impact for DC2020 is that operators need to rethink how they rely on virtualization as a ubiquitous abstraction. As more applications rely on container service abstractions, the platforms will grow in size and virtualization will provide less value. With the advent of better control of the bare metal infrastructure, operators have real options to get deep control without adding virtualization as a requirement.

Shifting to new platforms creates opportunities to streamline operations in DC2020.

Even with virtualization and containers, having better control of the bare metal is a critical addition to data center operations.  The ideal data center has automation and control APIs for every possible component from the metal up.

Learn more about the open source Digital Rebar community.

Cloud Immutability on Metal in the Data Center

Cloud has enabled a create-destroy infrastructure process that is now seen as common, e.g. launching and destroying virtual machines and containers. This process is referred to as immutable infrastructure and, until now, has not been available to operators within a data center. RackN technology is now actively supporting customers in enabling immutability within a data center on physical infrastructure.

In this post, I will highlight the problems faced by operators in deploying services at scale and introduce the immutability solution available from RackN. In addition, I have added two videos providing background on this topic and a demonstration showing an image deployment of Linux and Windows on RackN using this methodology.

PROBLEM

Traditional data center operations provision and deploy services to a node before configuring the application. This post-deployment configuration introduces mutability into the infrastructure due to dependency issues such as operating system updates, library changes, and patches. Even worse, these changes make it incredibly difficult to rollback a change to a previous version should the update cause an issue.

Looking at patch management highlights the key problems operators face. Applying patches across multiple nodes may lead to inconsistent services, with dependency changes affected not just by the software but also by the hardware. Applying these patches also requires root access to the nodes, which leaves open a security vulnerability to unauthorized logins.

SOLUTION

Moving the configuration of a service ahead of deployment solves the problems discussed above by delivering a complete, runnable image for execution. Some initialization remains hardware dependent and should run only once (e.g., via cloud-init), which allows the same image to be used across a variety of hardware.

This new approach moves the patching stage earlier in the process, giving operators a consistent deployment image with no possibility of drift, fewer security issues since no root access is required, and a simple way to roll back quickly to a previously running image.
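
A toy sketch of this workflow, with hypothetical function names standing in for your image builder and DRP workflows, shows why rollback becomes trivial:

```python
# Sketch of the immutable pattern: patches are baked into a versioned image
# before deployment, nodes only receive whole images, and rollback is just
# redeploying the previous version. All names are hypothetical stand-ins.
def build_image(base, patches):
    """Bake patches into a new, versioned image; no root access needed later."""
    version = f"{base}+{len(patches)}patches"
    print(f"Building image {version} from {base}")
    return version

def deploy(node, image):
    """Write the complete image to the node; cloud-init handles the
    hardware-dependent, run-once initialization on first boot."""
    print(f"Deploying {image} to {node}")

history = []
history.append(build_image("ubuntu-18.04", ["kernel-cve-fix"]))
deploy("rack1-node3", history[-1])

history.append(build_image("ubuntu-18.04", ["kernel-cve-fix", "openssl-update"]))
deploy("rack1-node3", history[-1])

# Rollback: no in-place surgery, just redeploy the previous known-good image.
deploy("rack1-node3", history[-2])
```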

IMMUTABILITY OVERVIEW

In this presentation, Rob Hirschfeld makes the case for immutable infrastructure on bare metal within your data center using RackN technology. Rob delivers the complete story highlighted in this blog post.

DEMONSTRATION 

In this demonstration, Rob Hirschfeld and Greg Althaus do a complete immutable image deployment of a Linux server and a Windows server using the RackN Portal in less than 20 minutes.

Get started with RackN today to learn more about how you can change your model to this immutability approach.

  • Join the Digital Rebar Community to learn the basics of Digital Rebar Provision
  • Create an account on the RackN Portal to simplify DRP installation and management
  • Join the RackN Trial program to obtain access to advanced RackN features

Podcast – Year of the Crawfish Recap and 2018 Predictions for Bare Metal, Virtualization, Edge and Serverless

Welcome to the final L8istSh9y Podcast for 2017 with a recap of Rob Hirschfeld’s predictions for 2017 (2016 Infrastructure Revolt makes 2017 the “year of the IT Escape Clause”) as well as a look ahead into 2018. Key topics covered in the podcast:

  • Hybrid is Reality; How do I Cope with it?
  • Site Reliability Engineering; People are Just Doing it
  • Bare Metal to Immutable Images
  • Virtualization Decline with Bare Metal Growth
  • 2018 is not the Year of Serverless
  • Edge Computing Still Not Ready for Prime Time
  • OpenStack Foundation as Open Infrastructure Group

Topic                                    Time (Minutes.Seconds)

Introduction                             0.00 – 1.50
2017 ~ Year of Crawfish                  1.50 – 3.00 (Summary)
Hybrid Mainstream                        3.00 – 7.30
Site Reliability Engineering             7.30 – 12.45 (Cloud Native Infrastructure Book)
RackN Changed Focus to Bare Metal        12.45 – 13.50
Bare Metal to Immutable                  13.50 – 17.03
Decline of Virtualization                17.03 – 21.47 (ARM Servers)
Serverless – Not in 2018                 21.47 – 22.57
Edge Computing                           23.16 – 26.39
OpenStack Foundation                     26.39 – 32.55
Wrap Up                                  32.55 – END

Thank you for joining us over the past few months as we launched our new Podcast focused on DevOps, Site Reliability Engineering, Operators, Infrastructure, Edge Computing, Cloud Computing and other related topics. Please contact us if you are looking for information on a specific topic for a future podcast or if you are interested in participating as a guest.

Podcast Home Page – L8istSh9y Podcast
YouTube Videos of Audio Podcasts – Playlist

Virtualizing #OpenStack Nova: looking at the many ways to skin the CAcTus (#KVM v #XenServer v #ESX)

<service bulletin> Server virtualization is not cloud: it is a commonly used technology that creates convenient  resource partitions for cloud operations and infrastructure as a service providers. </service bulletin>

OpenStack claims support for nearly every virtualization platform on the market. While the basics of “what is virtualization” are common across all platforms, there are important variances in how these platforms are deployed, and understanding those variances is essential to making informed choices about virtualization platforms.

Your virtualization model choice will have deep implications on your server/networking choice, deployment methodology and operations infrastructure.

My focus is on architecture, not specific hypervisors, so I’m generalizing to just three to make each architecture description more concrete:

  1. KVM (open source) is highly used by developers and single host systems
  2. XenServer (open/freemium) leads public cloud infrastructure (Amazon EC2, Rackspace Cloud, and GoGrid)
  3. ESX/vCenter (licensed) leads enterprise virtualized infrastructure

Of course, there are many more hypervisors and many different ways to deploy the three I’m referencing.

This picture shows all three options as a single system.  In practice, only operators wishing to avoid exposure to RESTful recreational activities would implement multiple virtualization architectures in a single system.   Let’s explore the three options:

OS + Hypervisor (KVM) architecture deploys the hypervisor as a free-standing application on top of an operating system (OS). In this model, the service provider manages the OS and the hypervisor independently. This means the OS must be maintained, but it also allows the OS to be enhanced to better manage the cloud or to add other functions (such as shared storage). Because they are the least restricted, free-standing hypervisors lead the virtualization innovation wave.
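
A concrete taste of the free-standing model: KVM driven through libvirt as an application on a host OS that you also manage. A small sketch, assuming the libvirt Python bindings and a local qemu/KVM install:

```python
# Query the local KVM hypervisor through libvirt, an application running on
# an OS the operator manages independently. Requires libvirt-python.
import libvirt

conn = libvirt.open("qemu:///system")  # standard local qemu/KVM URI
try:
    for dom in conn.listAllDomains():
        state, _reason = dom.state()
        running = state == libvirt.VIR_DOMAIN_RUNNING
        print(dom.name(), "running" if running else f"state={state}")
finally:
    conn.close()
```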

Bare Metal Hypervisor (XenServer) architecture integrates the hypervisor and the OS as a single unit. In this model, the service provider manages the hypervisor as that one unit. This makes the hypervisor easier to support and maintain because the platform can be tightly controlled; however, it limits the operator’s ability to extend or multi-purpose the server. In this model, operators may add agents directly to the individual hypervisor but would not make changes to the underlying OS or resource allocation.
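
Here the operator talks to the hypervisor’s own API rather than to an underlying OS. A brief sketch using the classic XenAPI Python bindings, with host and credentials as placeholders:

```python
# Manage a XenServer host as a single unit through its XenAPI interface;
# the underlying OS is not touched directly. Host/credentials are placeholders.
import XenAPI  # pip install XenAPI

session = XenAPI.Session("https://xenserver.example.com")
session.xenapi.login_with_password("root", "secret")
try:
    for vm_ref in session.xenapi.VM.get_all():
        rec = session.xenapi.VM.get_record(vm_ref)
        if not rec["is_a_template"] and not rec["is_control_domain"]:
            print(rec["name_label"], rec["power_state"])
finally:
    session.xenapi.session.logout()
```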

Clustered Hypervisor (ESX + vCenter) architecture integrates multiple servers into a single hypervisor pool.  In this model, the service provider does not manage the individual hypervisor; instead, they operate the environment through the cluster supervisor.  This makes it easier to perform resource balancing and fault tolerance within the domain of the cluster; however, the operator must rely on the supervisor because directly managing the system creates a multi-master problem.  Lack of direct management improves supportability at the cost of flexibility.  Scale is also a challenge for clustered hypervisors because their span of control is limited to practical resource boundaries: this means that large clouds add complexity as they deal with multiple clusters.
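
For contrast, in the clustered model all management goes through the supervisor. A minimal sketch with the pyVmomi bindings, connecting to vCenter rather than to any individual ESX host (host name and credentials are placeholders):

```python
# Clustered model: the operator talks only to vCenter, the cluster
# supervisor, never to individual ESX hosts. Requires the pyvmomi package.
import ssl
from pyVim.connect import SmartConnect, Disconnect

ctx = ssl._create_unverified_context()  # demo only; verify certs in practice
si = SmartConnect(host="vcenter.example.com", user="administrator",
                  pwd="secret", sslContext=ctx)
try:
    content = si.RetrieveContent()
    for datacenter in content.rootFolder.childEntity:
        print("Datacenter:", datacenter.name)  # cluster-level view, not per-host
finally:
    Disconnect(si)
```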

Clearly, choosing a virtualization architecture is difficult with significant trade-offs that must be considered.  It would be easy to get lost in the technical weeds except that the ultimate choice seems to be more stylistic.

Ultimately, the choice of virtualization approach comes down to your capability to manage and support cloud operations. The Hypervisor+OS approach offers maximum flexibility and minimum cost but requires an investment to build a level of competence. Generally, this choice goes with an overall approach that embraces open cloud operations. Selecting more controlled models for virtualization reduces risk for operations and allows operators to leverage (at a price, of course) their vendor’s core competencies and mature software delivery timelines.

While all of these choices are seeing strong adoption in the general market, I have been looking at the OpenStack community in particular.  In that community, the primary architectural choice is an agent per host instead of clusters.  KVM is favored for development and is the hypervisor of NASA’s Nova implementation.  XenServer has strong support from both Citrix and Rackspace. 

Choice is good: know thyself.

Alert the villagers, it’s Frankencloud!

I’m growing more and more concerned about the preponderance of Frankencloud offerings that I see being foisted onto the marketplace (no, my employer, Dell, is not guiltless). Frankenclouds are “cloud solutions” created with duct tape, twine, wishful marketing brochures, and at least 4 marginally cloud-enabled products.

The official Frankencloud recipe goes like this:

  • Take 1 product that includes server virtualization (substitutions to VMware at your own risk)
  • Take 1 product that does storage virtualization (substitutions to SAN at your own risk)
  • Take 1 product that does network virtualization (substitutions to VLANs at your own risk)
  • Take 1 product that does IT orchestration (your guess is as good as any)
  • Take 1 product that does IT monitoring
  • Take 1 product that does Virtualization monitoring
  • Recommended: an unlimited Pizza budget for your IT Ops team

Combine the ingredients at high voltage in a climate-conditioned environment. Stir in seriously large amounts of consulting services, training, and Red Bull. At the end of this process, you will have your very own Frankencloud!

Frankenclouds are notoriously difficult to maintain because each part has its own version life cycle.  More critically, they also lack a brain.

Unfortunately, there are few alternatives to the Frankencloud today.  I think that the alternatives will rewrite the rules that Ops uses to create clouds.  Here are the rules that I think help drive a wooden stake through the heart of the Frankencloud (yeah, I mixed monsters):

  • don’t assume that server virtualization == cloud
  • simple, simple and simpler than that
  • focus on applications (need to write more about DevOps)
  • start with networking, not computation
  • assume that software containers are replaced, not upgraded

What do you think we can do to defeat Frankenclouds?