Crowbar 2 Status Update > I can feel the rumble of the engines

two

Crowbar Two

While I’ve been more muted on our Crowbar 2 progress since our pivot back to CB1 for Grizzly, it has been going strong and steady.  We took advantage of the extra time to do some real analysis about late-binding, simulated annealing, emergent services and functional operations that are directly reflected in Crowbar’s operational model (yes, I’m working on posted explaining each concept).

We’re planning Crowbar 2 hack-a-thon in Hong Kong before the OpenStack Ice House Summit (11/1-3).  We don’t expect a big crowd on site, but the results will be fun to watch remote and it should be possible to play along (watch the crowbar list for details).

In the mean time, I wanted to pass along this comprehensive status update by Crowbar’s leading committer, Victor Lowther:

It has been a little over a month since my last status report on
Crowbar 2.0, so now that we have hit the next major milestone
(installing the OS on a node and being able to manage it afterwards),
it is time for another status report.

Major changes since the initial status report:

* The Crowbar framework understands node aliveness and availability.
* The Network barclamp is operational, and can manage IPv4 and IPv6 in
  the same network.
* delayed_jobs + a stupidly thin queuing layer handle all our
  long-running tasks.
* We have migrated to postgresql 9.3 for all our database needs.
* DHCP and DNS now utilize the on_node_* role hooks to manage their
  databases.
* We support a 2 layer deployment tree -- system on top, everything
  else in the second layer.
* The provisioner can install Ubuntu 12.04 on other nodes.
* The crowbar framework can manage other nodes that are not in
  Sledgehammer.
* We have a shiny installation wizard now.

In more detail:

Aliveness and availability:

Nodes in the Crowbar framework have two related flags that control
whether the annealer can operate on them.

Aliveness is under the control of the Crowbar framework and
encapsulates the framework's idea of whether any given node is
manageable or not.  If a node is pingable and can be SSH'ed into as
root without a password using the credentials of the root user on
the admin node, then the node is alive, otherwise it is dead.
Aliveness is tested everytime a jig tries to do something on a node
-- if a node cannot be pinged and SSH'ed into from at least one of
its addresses on the admin network, it will be marked as
dead.  When a node is marked as dead, all of the noderoles on that
node will be set to either blocked or todo (depending on the state of
their parent noderoles), and those changes will ripple down the
noderole dependency graph to any child noderoles.

Nodes will also mark themselves as alive and dead in the course of
their startup and shutdown routines.

Availability is under the control of the Crowbar cluster
administrators, and should be used by them to tell Crowbar that it
should stop managing noderoles on the node.  When a node is not
available, the annealer will not try to perform any jig runs on a
node, but it will leave the state of the noderoles alone.

A node must be both alive and available for the annealer to perform
operations on it.

The Network Barclamp:

The network barclamp is operational, with the following list of
features:

* Everything mentioned in Architecture for the Network Barclamp in
  Crowbar 2.0
* IPv6 support.  You can create ranges and routers for IPv6 addresses
  as well as IPv4 addresses, and you can tell a network that it should
  automatically assign IPv6 addresses to every node on that network by
  setting the v6prefix setting for that network to either:
  * a /64 network prefix, or
  * "auto", which will create a globally unique RFC4193 IPv6 network
    prefix from a randomly-chosen 40 bit number (unique per cluster
    installation) followed by a subnet ID based on the ID of the
    Crowbar network.
  Either way, nodes in a Crowbar network that has a v6prefix will get
  an interface ID that maps back to their FQDN via the last 64 bits of
  the md5sum of that FQDN. For now, the admin network will
  automatically create an RFC4193 IPv6 network if it is not passed a
  v6prefix so that we can easily test all the core Crowbar components
  with IPv6 as well as IPv4.  The DNS barclamp has been updated to
  create the appropriate AAAA records for any IPv6 addresses in the
  admin network.

Delayed Jobs and Queuing:

The Crowbar framework runs all jig actions in the background using
delayed_jobs + a thin queuing layer that ensures that only one task is
running on a node at any given time.  For now, we limit ourselves to
having up to 10 tasks running in the background at any given time,
which should be enough for the immediate future until we come up with
proper tuning guidelines or auto-tuning code for significantly larger
clusters.

Postgresql 9.3:

Migrating to delayed_jobs for all our background processing made it
immediatly obvious that sqlite is not at all suited to handling real
concurrency once we started doing multiple jig runs on different nodes
at a time. Postgresql is more than capable of handling our forseeable
concurrency and HA use cases, and gives us lots of scope for future
optimizations and scalability.

DHCP and DNS:

The roles for DHCP and DNS have been refactored to have seperate
database roles, which are resposible for keeping their respective
server roles up to date.  Theys use the on_node_* roles mentioned in
"Roles, nodes, noderoles, lifeycles, and events, oh my!" along with a
new on_node_change event hook create and destroy DNS and DHCP database
entries, and (in the case of DHCP) to control what enviroment a node
will PXE/UEFI boot into.  This gives us back the abiliy to boot into
something besides Sledgehammer.

Deployment tree:

Until now, the only deployment that Crowbar 2.0 knew about was the
system deployment.  The system deployment, however, cannot be placed
into proposed and therefore cannot be used for anything other than
initial bootstrap and discovery.  To do anything besides
bootstrap the admin node and discover other nodes, we need to create
another deployment to host the additional noderoles needed to allow
other workloads to exist on the cluster.  Right now, you can only
create deployments as shildren of the system deployment, limiting the
deployment tree to being 2 layers deep.

Provisioner Installing Ubuntu 12.04:

Now, we get to the first of tqo big things that were added in the last
week -- the provisioner being able to install Ubuntu 12.04 and bring
the resulting node under management by the rest of the CB 2.0
framework.  This bulds on top of the deployment tree and DHCP/DNS
database role work.  To install Ubuntu 12.04 on a node from the web UI:

1: Create a new deployment, and add the provisioner-os-install role to
that deployment.  In the future you will be able to edit the
deployment role information to change what the default OS for a
deployment should be.
2: Drag one of the non-admin nodes onto the provisioner-os-install
role.  This will create a proposed noderole binding the
provisioner-os-install role to that node, and in the future you would
be able to change what OS would be installed on that node by editing
that noderole before committing the deployment.
3: Commit the snapshot.  This will cause several things to happen:
  * The freshly-bound noderoles will transition to TODO, which will
    trigger an annealer pass on the noderoles.
  * The annealer will grab all the provisioner-os-install roles that
    are in TODO, set them in TRANSITION, and hand them off to
    delayed_jobs via the queuing system.
  * The delayed_jobs handlers will use the script jig to schedule a
    reboot of the nodes for 60 seconds in the future and then return,
    which will transition the noderole to ACTIVE.
  * In the crowbar framework, the provisioner-os-install role has an
    on_active hook which will change the boot environment of the node
    passed to it via the noderole to the appropriate os install state
    for the OS we want to install, and mark the node as not alive so
    that the annealer will ignore the node while it is being
    installed.
  * The provisioner-dhcp-database role has an on_node_change handler
    that watches for changes in the boot environment of a node.  It
    will see the bootenv change, update the provisioner-dhcp-database
    noderoles with the new bootenv for the node, and then enqueue a
    run of all of the provisioner-dhcp-database roles.
  * delayed_jobs will see the enqueued runs, and run them in the order
    they were submitted.  All the runs sholuld happen before the 60
    seconds has elapsed.
  * When the nodes finally reboot, the DHCP databases should have been
    updated and the nodes will boot into the Uubntu OS installer,
    install, and then set their bootenv to local, which will tell the
    provisioner (via the provisioner-dhcp-database on_node_change
    hook) to not PXE boot the node anymore.
  * When the nodes reboot off their freshly-installed hard drive, they
    will mark themselves as alive, and the annealer will rerun all of
    the usual discovery roles.
The semi-astute observer will have noticed some obvious bugs and race
conditions in the above sequence of steps.  These have been left in
place in the interest of expediency and as learning oppourtunities for
others who need to get familiar with the Crowbar codebase.

Installation Wizard:

We have a shiny installation that you can use to finish bootstrapping
your admin node.  To use it, pass the --wizard flag after your FQDN to
/opt/dell/bin/install-crowbar when setting up the admin node, and the
install script will not automatically create an admin network or an
entry for the admin node, and logging into the web UI will let you
customize things before creating the initial admin node entry and
committing the system deployment.  

Once we get closer to releasing CB 2.0, --wizard will become the default.

Leave a comment