Redefining PXE Boot Provisioning for the Modern Data Center

Over the past 20 years, Linux admins have defined provisioning with a limited scope: PXE boot with Cobbler. This approach remains popular today even though it only installs an operating system, which limits operators’ ability to move beyond this outdated paradigm.

Digital Rebar is the answer operators have been looking for. Provisioning has taken on a new role within the data center that includes workflow management, infrastructure automation, bare metal, virtual machines inside and outside the firewall, and the coming need for edge IoT management. The active open source community is expanding the capabilities of provisioning, giving operators a new foundational technology to rethink how data centers can be managed to meet today’s rapid delivery requirements.

Digital Rebar was architected with the global Cobbler user-base in mind, not only to simplify the transition but also to offer a set of common packages that are shareable across the community to simplify and automate repetitive tasks. That frees operators to spend more time on key issues instead of, for example, finding new OS packages.

I encourage you to take 15 minutes and visit the Digital Rebar community to learn more about this technology and how you can up-level your organization’s capability to automate infrastructure at scale.

Virtual Toilet Backing Up? Internet Plumbers get the dirty jobs

The latest mantra in IT is to cleanly abstract away everything including hardware, software, management, processes, etc. Take “serverless” for example – there are still servers involved but much more hidden than before.  This abstraction obsession is rapidly changing the way that applications and services are developed and delivered.

However, the underlying abstractions hide, not remove, infrastructure; it is still there and, like plumbing, simply becomes someone else’s problem to deal with. At RackN, we are focused on solving these hidden plumbing problems at the physical infrastructure operations layer.

Working with physical hardware is viewed as messy and is not going to be a trending hashtag anytime soon. We are ok with that. In fact, we view ourselves as Internet Plumbers keeping the “pipes” open without any hesitancy of getting dirty.

Part of our mission is to standardize the processes in physical ops to provide site reliability engineers and DevOps teams with an automated, open, secure, scalable, and reliable solution. Our solution is built not only for today’s needs but also the coming Edge computing revolution whereby physical ops will move from hundreds of nodes to hundreds of thousands of endpoints.

We offer several ways to begin working with our technology immediately:

  • Digital Rebar Provision – Our open source DHCP/PXE/iPXE service with community or corporate plug-ins for additional features
  • RackN Trial – Get access to our solution built on Digital Rebar Provision; contact RackN sales

Based on a prior Rob Hirschfeld post: “Physical Ops = Plumbers of the Internet. Celebrating dirty IT jobs 8 bit style”

Cloud Native PHYSICAL PROVISIONING? Come on! Really?!

We believe Cloud Native development disciplines are essential regardless of the infrastructure.

Today, RackN announced very low entry level support for Digital Rebar Provision – the RESTful Cobbler PXE/DHCP replacement. Having a company actually standing behind this core data center function with support is a big deal; however…

We’re making two BIG claims with Provision: breaking DevOps bottlenecks and cloud native physical provisioning.  We think both points are critical to SRE and Ops success because our current approaches are not keeping pace with developer productivity and hardware complexity.

I’m going to post more about how Provision can help address the political struggles of SRE and DevOps that I’ve been watching in our industry. A hint is in the release, but the Cloud Native comment needs to be addressed.

First, Cloud Native is an architecture, not an infrastructure statement.

There is no requirement that we use VMs or AWS in Cloud Native.  From that perspective, “Cloud” is a useful but deceptive adjective.  Cloud Native is born from applications that had to succeed in hands-off, lower SLA infrastructure with fast delivery cycles on untrusted systems.  These are very hostile environments compared to “legacy” IT.

What makes Digital Rebar Provision Cloud Native?  A lot!

The following is a list of key attributes I consider essential for Cloud Native design.

Micro-services Enabled: The larger Digital Rebar project is a micro-services design.  Provision reflects a stand-alone bundling of two services: DHCP and Provision.  The new Provision service is designed to both stand alone (with embedded UX) and be part of a larger system.

Swagger RESTful API: We designed the APIs first based on years of experience.  We spent a lot of time making sure that the API conformed to spec and that includes maintaining the Swagger spec so integration is easy.

Remote CLI: We build and test our CLI extensively.  In fact, we expect that to be the primary user interface.

Security Designed In: We are serious about security even in challenging environments like PXE where options are limited by 20-year-old protocols. HTTPS is required, as is user or bearer token authentication. That means that even API calls from machines can be secured.
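To make the user-versus-machine distinction concrete, here is a minimal sketch of how a client might build its credentials. The helper names and arguments are hypothetical and are not taken from the Provision API; only the standard HTTP Basic and Bearer header formats are assumed.

```python
import base64


def auth_header(user=None, password=None, token=None):
    """Return an HTTP Authorization header for a user or for automation.

    Hypothetical helper: the argument names are illustrative only.
    """
    if token is not None:
        # Automation (for example a machine calling back) presents a bearer token.
        return {"Authorization": "Bearer " + token}
    if user is not None and password is not None:
        # Interactive users fall back to HTTP basic authentication.
        raw = f"{user}:{password}".encode("utf-8")
        return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
    raise ValueError("either a token or a user/password pair is required")


def require_https(url):
    """Refuse to send credentials over anything other than HTTPS."""
    if not url.lower().startswith("https://"):
        raise ValueError("credentials must only be sent over HTTPS: " + url)
    return url
```

Both kinds of caller end up with the same header shape, so one API can serve people and scripts without a separate unauthenticated path for machines.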

12 Factor & API Config: There is no file configuration for Provision. The system starts with command line flags or environment variables. Deeper configuration is done via API/CLI. That ensures that the system can be fully managed remotely and configured securely because credentials are required for configuration.
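As an illustration of file-free startup, the sketch below reads settings from command line flags and falls back to environment variables. The flag names, the EXAMPLE_* variables and the default values are all invented for this example; they are not Provision’s real options.

```python
import argparse
import os


def load_config(argv, environ=None):
    """Build settings from flags, falling back to environment variables.

    Every name and default here is hypothetical and exists only for illustration.
    """
    env = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(description="file-free configuration sketch")
    parser.add_argument("--api-port", type=int,
                        default=int(env.get("EXAMPLE_API_PORT", "8443")))
    parser.add_argument("--data-root",
                        default=env.get("EXAMPLE_DATA_ROOT", "/var/lib/example"))
    parser.add_argument("--disable-dhcp", action="store_true",
                        default=env.get("EXAMPLE_DISABLE_DHCP", "") == "true")
    # A flag given on the command line always wins over the environment.
    return vars(parser.parse_args(argv))
```

Because nothing is read from disk, the same binary can be started identically on a laptop, in a container or under a service manager, and everything beyond these bootstrap settings stays behind the authenticated API.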

Fast Start / Golang:  Provision is a totally self-contained golang app including the UX.  Even so, it’s very small.  You can run it on a laptop from nothing in about 2 minutes including download.

CI/CD Coverage: We committed to deep test coverage for Provision and have consistently increased coverage with every commit.  It ensures quality and prevents regressions.

Documentation In-project Auto-generated: On-boarding is important since we’re talking about small, API-driven units. A lot of Provision documentation is generated directly from the code into the actual project documentation. Also, the written documentation is in reStructuredText in the project with good indexes and cross-references. We regenerate the documentation with every commit.

We believe these development disciplines are essential regardless of the infrastructure.  That’s why we made sure the v3 Provision (and ultimately every component of Digital Rebar as we iterate to v3) was built to these standards.

What do you think?  Is this Cloud Native?  What did we miss?

RackN Ends DevOps Gridlock in Data Center [Press Release]

Today we announced the availability of Digital Rebar Provision, the industry’s first cloud-native physical provisioning utility.  We’ve had this in the Digital Rebar community for a few weeks before offering support and response has been great!

By releasing their API-driven provisioning tool as a stand-alone component of the larger Digital Rebar suite, RackN helps DevOps teams break automation bottlenecks in their legacy data centers without disrupting current operations. The stand-alone open utility can be deployed in under 5 minutes and fits into any data center design. RackN also announced a $1,000 starter support and consulting package to further accelerate transition from tools like Cobbler, MaaS or Stacki to the new Golang utility.

“We were seeing SREs suffering from high job turnover,” said Rob Hirschfeld, RackN founder and CEO. “When their integration plans get gridlocked by legacy tooling they quickly either lose patience or political capital. Digital Rebar Provision replaces the legacy tools without process disruption so that everyone can find shared wins early in large SRE initiatives.”

The first cloud-native physical provisioning utility

Data center provisioning is surprisingly complex because it’s caught between cutting edge hardware and arcane protocols and firmware requirements that are difficult to disrupt.  The heart of the system is a fickle combination of specific DHCP options, a firmware bootstrap environment (known as PXE), a very lightweight file transfer protocol (TFTP) and operating system specific templating tools like preseed and kickstart.  Getting all these pieces to work together with updated APIs without breaking legacy support has been elusive.
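To show why the DHCP piece is so fickle, here is a rough sketch of how a DHCP responder might choose the boot file name (option 67) for a client. It assumes the widely used iPXE chainloading convention, in which a client already running iPXE identifies itself through the user class option. The function name, the `default.ipxe` script name and the simplified architecture check are assumptions for illustration, not a description of how Digital Rebar Provision does it.

```python
def boot_filename(vendor_class, user_class, arch_code, http_base):
    """Pick the DHCP boot file name (option 67) for one client request.

    vendor_class: DHCP option 60, e.g. "PXEClient:Arch:00000:UNDI:002001"
    user_class:   DHCP option 77; iPXE sets this to "iPXE"
    arch_code:    DHCP option 93; 0 is legacy BIOS, and this sketch
                  treats every other value as UEFI
    http_base:    hypothetical base URL of the provisioning HTTP server
    """
    if user_class == "iPXE":
        # The client already runs iPXE, so it can fetch a script over HTTP.
        return http_base.rstrip("/") + "/default.ipxe"
    if not (vendor_class or "").startswith("PXEClient"):
        # Not a network-boot request; send no boot file at all.
        return None
    # Firmware PXE can only use TFTP, so hand it an iPXE binary to chainload.
    if arch_code == 0:
        return "undionly.kpxe"  # common iPXE build for legacy BIOS
    return "ipxe.efi"           # common iPXE build for UEFI
```

Even this toy version has to branch on three separate DHCP options, and real firmware adds more variation on top.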

By rethinking physical ops in cloud-native terms, RackN has managed to distill out a powerful provisioning tool for DevOps and SRE minded operators who need robust API/CLI, Day 2 Ops, security and control as primary design requirements. By bootstrapping foundational automation with Digital Rebar Provision, DevOps teams lay a foundation for data center operations that improves collaboration between operators and SRE teams: operators enjoy additional control and reuse and SREs get a doorway into building a fully automated process.

A pragmatic path without burning down the data center

“I’m excited to see RackN providing a pragmatic path from physical boot to provisioning without having to start over and rebuild my data center to get there,” said Dave McCrory, an early cloud and data gravity innovator. “It’s time for the industry to stop splitting physical and cloud IT processes because snowflaked, manual processes slow everyone down. I can’t imagine an easier on-ramp than Digital Rebar Provision.”

The RackN Digital Rebar team is making it easy for Cobbler, Stacki, MaaS and Foreman users to evaluate our RESTful, Golang, Template-based PXE Provisioning utility. Interested users can evaluate the service in minutes on a laptop or engage with RackN for a more comprehensive trial with expert support. The open Provision service works both independently and as part of Digital Rebar’s full life-cycle hybrid control.

See specific features at http://rackn.com/provision/drsa.

Want help starting on this journey?  Contact us and we can help.

Cloud-first Physical Provisioning? 10 ways that the DR is in to fix your PXE woes.

Why has it been so hard to untie from Cobbler? Why can’t we just REST-ify these 1990s Era Protocols? Dealing with the limits of PXE, DHCP and TFTP in wide-ranging data centers is tricky and Cobbler’s manual pre-defined approach was adequate in legacy data centers.

Now, we have to rethink Physical Ops in Cloud-first terms. DevOps and SRE minded operators need services that have real APIs, day-2 ops, security and control as primary design requirements.

The Digital Rebar team at RackN is hunting for Cobbler, Stacki, MaaS and Foreman users to evaluate our RESTful, Golang, Template-based PXE Provisioning utility. Deep within the Digital Rebar full life-cycle hybrid control was a cutting-edge bare metal provisioning utility. As part of our v3 roadmap, we carved out the Provisioner to also work as a stand-alone service.

Here are 10 reasons why DR Provisioning kicks aaS:

  1. Swagger REST API & CLI. Cloud-first means having a great, tested API. Years of provisioning experience went into this 3rd generation design and it shows. That includes a powerful API-driven DHCP.
  2. Security & Authenticated API. Not an afterthought, we require both HTTPS and user authentication for using the API. Our mix of basic and bearer token authentication recognizes that both users and automation will use the API. This brings a new level of security and control to data center provisioning.
  3. Stand-alone multi-architecture Golang binary. There are no dependencies or prerequisites, plus upgrades are drop-in replacements. That allows users to experiment in isolation on their laptop and then easily register it as a systemd service.
  4. Nested Template Expansion. In DR Provision, Boot Environments are composed of reusable template snippets. These templates can incorporate global, profile or machine specific properties that enable users to set services, users, security or scripting extensions for their environment.
  5. Configuration at Global, Group/Profile and Node level. Properties for templates can be managed in a wide range of ways that allows operators to manage large groups of servers in consistent ways.
  6. Multi-mode (but optional) DHCP. Network IP allocation is a key component of any provisioning infrastructure; however, DHCP needs are highly site dependent. DR Provision works as a multi-interface DHCP listener and can also resolve addresses from DHCP forwarders. It can even be disabled if your environment already has a DHCP service that can configure the “next boot” provider.
  7. Dynamic Provisioner templates for TFTP and HTTP. For security and scale, DR Provision builds provisioning files dynamically based on the Boot Environment Template system. This means that critical system information is not written to disk and files do not have to be synchronized. Of course, when you need to just serve a file that works too.
  8. Node Discovery Bootstrapping. Digital Rebar’s long-standing discovery process is enabled in the Provisioner with the included discovery boot environment. That process includes an integrated secure token sequence so that new machines can self-register with the service via the API. This eliminates the need to pre-populate the DR Provision system.
  9. Multiple Seeding Operating Systems. DR Provision comes with a long list of Boot Environments and Templates including support for many Linux flavors, Windows, ESX and even LinuxKit. Our template design makes it easy to expand and update templates even on existing deployments.
  10. Two-stage TFTP/HTTP Boot. Our specialized Sledgehammer and Discovery images are designed for speed with optimized install cycles that improve boot speed by switching from PXE TFTP to iPXE HTTP in a two stage process. This ensures maximum hardware compatibility without creating excess network load.
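Items 5 and 7 in the list above can be sketched together: properties are resolved across global, profile and machine levels, and the boot file is rendered in memory on each request instead of being written to disk. This is a simplified illustration using Python’s standard library; the function names and the `${name}` placeholder syntax are assumptions, not DR Provision’s actual template system.

```python
from collections import ChainMap
from string import Template


def resolve_params(global_params, profile_params, machine_params):
    """Merge properties so machine values override profile, which override global."""
    return dict(ChainMap(machine_params, profile_params, global_params))


def render_boot_file(template_text, global_params, profile_params, machine_params):
    """Render a boot file in memory for one machine; nothing is written to disk."""
    params = resolve_params(global_params, profile_params, machine_params)
    # substitute() raises KeyError if the template needs a property no level defines.
    return Template(template_text).substitute(params)
```

For example, a global `console` of `tty0` can be overridden to `ttyS1` for one profile of servers while each machine supplies only its own `hostname`. The rendered text exists only for the length of the request, so there are no per-machine files to keep synchronized.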

Digital Rebar Provision is a new generation of data center automation designed for operators with a cloud-first approach. Data center provisioning is surprisingly complex because it’s caught between cutting edge hardware and arcane protocols embedded in firmware requirements that are still very much alive.

We invite you to try out Digital Rebar Provision yourself and let us know what you think. It only takes a few minutes. If you want more help, contact RackN for a $1000 Quick Start offer.

Need PXE? Try out this Cobbler Replacement

Operators & SREs – we need your feedback on an open DHCP/PXE technical preview that will amaze you and can be easily tested right from your laptop.

We wanted to make open basic provisioning API-driven, secure, scalable and fast.  So we carved out the Provision & DHCP services as a stand alone unit from the larger open Digital Rebar project.  While this Golang service lacks orchestration, this complete service is part of Digital Rebar infrastructure and supports the discovery boot process, templating, security and extensive image library (Linux, ESX, Windows, … ) from the main project.

TL;DR: FIVE MINUTES TO REPLACE COBBLER?  YES.

The project APIs and CLIs are complete for all provisioning functions with good Swagger definitions and docs.  After all, it’s third generation capability from the Digital Rebar project.  The integrated UX is still evolving.

Here’s a video of the quick install process.

 

Here are some examples from the documentation:

[Images: core_services, install_discovered]

Apparently IT death smells like kickstart files. Six Reasons why.

Today, I’m sharing a parable about always being focused on adding value.

Recently, I was on a call with an IT Ops manager who insisted that his team had their on-premises operations under control with “python scripts and manual kickstart files” because they “really don’t change their infrastructure setup.” He explained that he and his team were comfortable with this because it was something they understood and did not require learning new systems. While I understand his position, I was sort of sad for him and his employer because…

No value is created for his company by maintaining custom kickstart, preseeds or boot files.

Maintaining kickstarts is fatal for many reasons. Is there a way to make it less fatal? Yes, and it involves investing in learning tools that let you move up stack.

Contrary to popular IT mythology, managing physical infrastructure is still a reality for many IT teams and will remain a part of best practices until every workload simply runs on Amazon and it becomes their problem.  Since that “Utopian” future is unlikely, let’s deal with some practical realities of hybrid IT.

Here are my six reasons why custom kickstarts (and other site-specific boot provisioning scripts) are dangerous:

1. Creating Site Unique Processes

Every infrastructure is unique and that’s a practical reality that we have to accept because otherwise we would never be able to make improvements and corrections without touching everything that is already deployed. However, we really want to work hard to minimize places where we inject variation into the environment. Server and site specific kickstarts with lots of post-provisioning steps force operators to maintain additional information about each server.

2. Building Server Specific Configurations

When we create server specific templates, it becomes nearly impossible to recreate server builds. That directly leads to fragile infrastructure because teams cannot quickly redeploy or automate refreshes. Static IT infrastructure is a known fail pattern and makes enterprises vulnerable to staff changes, hacking and inability to manage and patch.

3. Having Opaque Configurations

Kickstart is hard to understand (and even harder to troubleshoot). When teams take actions during the provisioning process they are often not tracked or managed like other operational scripting tools. Failures or injections can easily go undetected. Even if they are tracked, the number of operators who can read and manage these scripts is limited. That means that critical aspects of your operational environment happen outside of your awareness.

4. Being Less Secure

Kickstart processes generally include injecting SSH keys, certificates and other authentication credentials. These embedded credentials are often hard coded into the process with minimal awareness on the part of the operational team, leaving you vulnerable at the most foundational level. This is not an acceptable security process; however, teams who hack kickstarts often don’t want to consider the implications.
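One way to see how much is embedded is to scan existing kickstart files for credential-like content. The sketch below is illustrative only: the pattern list is a small, non-exhaustive assumption and the function name is hypothetical.

```python
import re

# Patterns that suggest a credential is embedded directly in a kickstart file.
# This list is illustrative, not exhaustive.
CREDENTIAL_PATTERNS = {
    # rootpw without --iscrypted or --lock carries a plaintext password.
    "plaintext root password": re.compile(
        r"^\s*rootpw\s+(?!\s|--iscrypted|--lock)", re.M),
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "inline SSH authorized key": re.compile(
        r"ssh-(?:rsa|ed25519)\s+AAAA[0-9A-Za-z+/]+"),
}


def find_embedded_credentials(kickstart_text):
    """Return a sorted list of the finding labels present in the text."""
    return sorted(label for label, pattern in CREDENTIAL_PATTERNS.items()
                  if pattern.search(kickstart_text))
```

Running a check like this across a repository of site-specific kickstarts gives a quick first inventory of where credentials have been hard coded.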

Security side note: most teams don’t have the expertise to integrate TPM or HSM into their kickstart processes; consequently, these key security technologies are generally unused and ignored. If you want to talk about this, please contact me!

5. Diverging Provisioning Patterns

Cloud does not use kickstarts. Provisioning variation increases when teams keep/add logic and configuration into server provisioning instead of doing it as post-provision automation. If your physical provisioning team is not rehearsing on cloud then you’re in a serious IT hole because all workloads should be managed as hybrid-ready. Deployment fidelity helps accelerate teams and reduces cost.

6. Not Reusing Community Practice

Finally, managing your own kickstarts makes it impossible to leverage community patterns and practices. Kickstarts are not exactly a hive of innovation so you are not creating any competitive advantage by adding variation there. In cases like that, reusing community tooling is a net benefit to your organization. Why have we not done this already? Until recently, provisioning tools were not API driven or focused on reusable shared practice.

While Kickstart or similar is pretty much required for physical, we have a solution for these issues.

One of the key design elements of Digital Rebar is a templated, API driven boot provisioner. Our approach uses kickstarts, preseeds and other tools; however, we’ve worked hard to minimize their span and decompose them into reusable components. That allows users to inject site specific code as snippets that are centrally managed and hardware neutral.
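The idea of decomposing a kickstart into reusable snippets can be sketched in a few lines. The `{{include NAME}}` marker and the snippet names used here are invented for illustration and are not Digital Rebar’s real template syntax.

```python
import re

# Hypothetical include marker, e.g. {{include network}}
INCLUDE = re.compile(r"\{\{include ([\w-]+)\}\}")


def expand(name, snippets, _stack=()):
    """Recursively expand include markers, rejecting include cycles."""
    if name in _stack:
        raise ValueError("include cycle: " + " -> ".join(_stack + (name,)))
    body = snippets[name]
    # Each marker is replaced by the fully expanded snippet it names.
    return INCLUDE.sub(
        lambda m: expand(m.group(1), snippets, _stack + (name,)), body)
```

With a snippet library such as `{"kickstart": "text\n{{include network}}\n%post\n{{include site-post}}\n%end\n", "network": "network --bootproto=dhcp", "site-post": "echo provisioned > /etc/motd"}`, calling `expand("kickstart", snippets)` yields one complete kickstart, while the site specific part lives only in `site-post` and can be changed centrally.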

Critically, our approach allows SRE and Ops teams to get out of the kickstart business and focus on provisioning workflow and automation. Yes, there’s some learning curve but there are a lot of benefits to moving up stack.

It’s not too late to “:q!” those kickstart edits and accelerate your infrastructure.