Podcast – Ash Young talks Everything in your PC is IoT

Joining us this week is Ash Young, Chief Evangelist of Cachengo and OPNFV Ambassador. Cachengo builds smart, predictive storage for machine learning.

NOTE – We had a microphone problem that is resolved at the 9 minute 19 second mark of the podcast. Start there if you find the clicking noise an issue.

Highlights

  • 1 min 34 sec: Time to Change Basic Storage Architecture
    • Converged Protocol Appliances & nothing has changed from the early 90s
  • 7 min 8 sec: Sounds like Hadoop?
    • Underlying hardware still used proprietary protocols
  • 9 min 19 sec: Single Drive Cluster – it’s built?
    • 24 Servers and 24 Drives in a 1U; has done 48 drives
    • Working on a new design for 96 drives in a 1U
  • 11 min 52 sec: Truly a Distributed Storage Array
    • Storage focused microservers
  • 13 min 24 sec: Limitations in Operations with Hardware
    • Hinders Innovation
  • 15 min 40 sec: Lessons Learned on Managing Devices
    • Over-dependence on tunneling protocols requiring full networking (e.g. VPN)
    • Move to peer-to-peer network slicing
  • 17 min 28 sec: Software Defined Networking Topology
    • Introduce devices to each other and get out of the way
  • 18 min 33 sec: Every Storage Node is Part of the Network
    • Moves into a world of networking challenges
    • IPv4 cannot support this model
  • 21 min 06 sec: Networking Magic in the Model
    • Peer to Peer w/ Broker Introduction and then Removal from Traffic
    • Scale out for Edge Computing Requires this New Model
    • 5G Energy Cost Savings are a Must
  • 27 min 28 sec: Issues of Powering On/Off Machines to Save Money
    • Creating a massive array of smaller GPUs for Machine Learning
    • Build a fast, cheap, lower power storage system to get started in the model
  • 34 min 09 sec: Doesn’t fit the model that Edge infrastructure will be Cloud patterned
    • Rob makes a point to listeners to consider various ideas in future Edge infrastructure
  • 36 min 48 sec: State of Open Source?
    • Consortiums and open source standards
    • Creating the lowest common denominator free thing so competitors can build differentiation on top of it for revenue
    • Not a fan of open core models
  • 41 min 44 sec: Does Open Source include Supporting Implementation?
    • Look at the old WINE project financing
    • You can’t just deploy people onsite for free
  • 48 min 24 sec: Wrap-Up

Podcast Guest: Ash Young, Chief Evangelist of Cachengo

Technology leader with over 20 years of experience, primarily in storage. Created the first open source NAS (network attached storage) stack, the first unified block/file storage stack for Linux, the first storage management software, and the list goes on.

Since 2012, I have been heavily involved in NFV (Network Functions Virtualization). I wrote a bunch of the standards and was editor for the Compute/Storage Domain in the Infrastructure Working Group for NFV. And then I started up the open source effort to close the gaps for achieving our vision of the NFVI. This was the precursor to OPNFV.

The best way to understand what I do is to imagine being a high-level marketing exec who comes up with a whiz bang product and business idea, including business plan, competitive analysis, MRD, everything, but now comes the hand-off to your engineering organization, only to hear a litany of nos. Well, I got tired of being told “No, it can’t be done” or “No, we don’t know how to do it”, so I started doing it myself. I call this skill “Rapid Prototyping”, and over the years I have found it to be a significant missing piece in the product development process. When Marketing comes up with ideas, we need a way to very efficiently validate the technology and business concepts before we commit to a lengthy engineering cycle.

I’m just one person, working in a company of over 180,000 people and in a very dynamic industry. Getting creative and influencing businesses means there is never a dull moment; and I will probably be 100 years old and still writing open source software.

Create your first CentOS 7 Machine on RackN Portal with Digital Rebar Provision

This is the third blog in a series demonstrating the steps required to complete a series of tasks in the RackN Portal using Digital Rebar Provision.

Prerequisite

You will need an account on the RackN Portal with an active Digital Rebar Provision endpoint running. In this How To, I am using Packet.net for my infrastructure as I have no local hardware available to build a local system.

For information on creating a Digital Rebar Provision endpoint and connecting it to a RackN Portal please see these two prior How To blogs:

Step 1: Create a new Machine on Packet.net

The RackN Portal needs a physical machine for Digital Rebar Provision (DRP) to discover and track in the Machine section of the UX. I am providing steps to create that machine on my Packet.net account:

  • Log in to your Packet.net account

In the image above, I show my DRP endpoint (spectordemo-drp-ewr1-00) and a machine (spectordemo-machines-ewr1-01) I created during the Deploy and Test DRP in less than 10 Minutes How To guide. Note – my machines are Type 0 which is about $0.07 an hour to run and the location is at the EWR1 Packet.net data center.

  • Select +Add New to create a new physical machine on Packet.net

Enter the following information for the entry fields on the “Deploy on Demand” page:

  • Hostname: Enter anything you want with a .com (e.g. spectortest.com)
  • Location: Choose the same location of your endpoint – see screen above (e.g. EWR1)
  • Type: Type 0 (cheapest machine ~ $.07 per hour)
  • OS: Custom iPXE; a new window will appear below that selection area after choosing Custom iPXE
    • Enter the http address of your Endpoint with “/default.ipxe” appended, so you get “http://#.#.#.#:8091/default.ipxe” (NOTE – the RackN Portal shows the endpoint address with :8092; be sure to use :8091 here). See the sketch after this list for a quick way to verify the URL.
  • Select the “User Data” button and a new pop-up screen will appear; select SAVE
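
Before saving, it can help to confirm that the endpoint actually serves the iPXE script. A minimal check from any machine with curl, assuming a hypothetical endpoint address of 147.75.123.45 (substitute your own endpoint IP):

curl -s http://147.75.123.45:8091/default.ipxe | head -n 5

If the first few lines of an iPXE script come back, the URL is safe to paste into the Custom iPXE field.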

Packet will then show the new machine as it is set up, with the status color going from yellow to green. If you click “View Progress” you can monitor the machine start.

Within a few minutes, the machine will switch from yellow to green at which point you will have created a new physical machine to provision with DRP.

Step 2: Provision a new CentOS 7 Machine from within the RackN Portal

  • Prepare the Global Workflow

The default Workflow available needs to be removed if you are working with Packet.net machines. If your screen does not look like the final Workflow image shown below, take the following steps:

  1. Remove the existing Workflow by clicking “Remove” on each step until all steps are gone
  2. Click the Workflow Wizard to create the 3 Stages shown below

The final Workflow page should look like the image below with three separate Stages and follow-on steps for processing.

  • Confirm new Machine is Visible to RackN Portal

The newly created machine on Packet.net should now be visible in your Bulk Actions page as shown below. The Stage will be set to “sledgehammer-wait” and the BootEnv to “sledgehammer.”

If the Stage for the new machine is not correct, reboot the machine using the Plugin Action -> powercycle option. The machine should then be set to the proper Stage and BootEnv as shown above.

  • Change the Stage and BootEnv to CentOS 7 Settings

Before this final step, check the machine's settings in Packet.net to confirm that PXE Boot is set to YES/ON.

In the Bulk Action page, you can change the Stage and BootEnv settings. Select the newly created machine, set the Stage to “centos-7-install” as shown below, and then click the 4-arrow button.

Once complete you will see the following setup on the Bulk Action page.
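
If you prefer a command line over the portal UX, the same change can likely be made with drpcli on the endpoint. This is only a sketch: <machine-uuid> is a placeholder for your machine's UUID in DRP, and the exact update syntax may differ between DRP versions.

drpcli machines list
drpcli machines update <machine-uuid> '{"Stage": "centos-7-install"}'

The first command lists the machines DRP knows about so you can find the UUID; the second applies the same Stage change that the Bulk Action page makes through the UX.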

  • Reboot the new Machine in Packet.net

The final step to provision this new machine from DRP is to change the Plugin Action option to “powercycle” and press the hand icon with the finger pointing down. Of course, make sure your machine is selected as shown in the image above.

Step 3: Monitor the Installation of CentOS 7 on the new Machine

To monitor the activity on your new machine you will need to ssh into that machine from a terminal window. To get the ssh address, I selected the new machine in the RackN Portal and grabbed the content from the >_packet/sos: line below. In this case I used 9a17d7d1-fa74-4757-8683-82b57e8e3ed2@sos.sjc1.packet.net.

In the same directory where you ran the “pkt-demo” How To from the first blog, you will see a file like “spectordemo-machines-ssh-key” (the name depends on what you chose in that blog). Run this command:

ssh -i spectordemo-machines-ssh-key 9a17d7d1-fa74-4757-8683-82b57e8e3ed2@sos.sjc1.packet.net

This will connect to the new machine so you can see activity. For the machine waiting at sledgehammer-wait you will see the following image:

Once the reboot from Step 2 (Reboot the new Machine in Packet.net) is executed, you will see the machine shut down and your session disconnect. Run the same ssh command and you will see this screen while the machine reboots:

The machine will then move into the CentOS 7 install and you will see a sequence of Linux install information such as the following:

This completes the provisioning of a new machine on Packet.net using the RackN Portal Workflow process.

Redefining PXE Boot Provisioning for the Modern Data Center

Over the past 20 years, Linux admins have defined provisioning with a limited scope: PXE boot with Cobbler. This approach remains popular today even though it only installs an operating system, limiting operators’ ability to move beyond this outdated paradigm.

Digital Rebar is the answer operators have been looking for as provisioning has taken on a new role within the data center: workflow management, infrastructure automation, bare metal and virtual machines inside and outside the firewall, and the coming need for edge IoT management. The active open source community is expanding the capabilities of provisioning, giving operators a new foundational technology to rethink how data centers can be managed to meet today’s rapid delivery requirements.

Digital Rebar was architected with the global Cobbler user base in mind, not only to simplify the transition but also to offer a set of common packages that are shareable across the community to simplify and automate repetitive tasks, freeing up operators to spend more time on key issues instead of, for example, hunting for new OS packages.

I encourage you to take 15 minutes and visit the Digital Rebar community to learn more about this technology and how you can up-level your organization’s capability to automate infrastructure at scale.

Crowbar 2.0 Design Summit Notes (+ open weekly meetings starting)

I could not be happier with the results Crowbar collaborators and my team at Dell achieved at the 1st Crowbar design summit. We had great discussions and even better participation.

The attendees represented major operating system vendors, configuration management companies, OpenStack hosting companies, OpenStack cloud software providers, OpenStack consultants, OpenStack private cloud users, and (of course) a major infrastructure provider. That’s a very complete cross-section of the cloud community.

I knew from the start that we had too little time and, thankfully, people were tolerant of my need to stop the discussions. In the end, we were able to cover all the planned topics. This was important because all these features are interlocked so discussions were iterative. I was impressed with the level of knowledge at the table and it drove deep discussion. Even so, there are still parts of Crowbar that are confusing (networking, late binding, orchestration, chef coupling) even to collaborators.

In typing up these notes, it becomes even more blindingly obvious that the core features for Crowbar 2 are highly interconnected. That’s no surprise technically; however, it will make the notes harder to follow because of knowledge bootstrapping. You need to take time and grok the gestalt and surf the zeitgeist.

Collaboration Invitation: I wanted to remind readers that this summit was just the kick-off for a series of open weekly design (Tuesdays 10am CDT) and coordination (Thursdays 8am CDT) meetings. Everyone is welcome to join in those meetings – information is posted, recorded, folded, spindled and mutilated on the Crowbar 2 wiki page.

These notes are my reflection of the online etherpad notes that were made live during the meeting. I’ve grouped them by design topic.

Introduction

  • Contributors need to sign CLAs
  • We are refactoring Crowbar at this time because we have a collection of interconnected features that could not be decoupled
  • Some items (Database use, Rails3, documentation, process) are not for debate. They are core needs but require little design.
  • There are 5 key topics for the refactor: online mode, networking flexibility, OpenStack pull from source, heterogeneous/multi operating systems, being CMDB agnostic
  • Due to time limits, we have to stop discussions and continue them online.
  • We are hoping to align Crowbar 2 beta and OpenStack Folsom release.

Online / Connected Mode

  • Online mode is more than simply internet connectivity. It is the foundation of how Crowbar stages dependencies and components for deployment. It’s required for heterogeneous O/S and pull from source, and it has dependencies on how we model networking so nodes can access resources.
  • We are thinking of using caching proxies to stage resources (see the sketch after this list). This would allow isolated production environments and preserve the ability to run everything from the ISO without a connection (that is still a key requirement for us).
  • SUSE’s Crowbar fork does not build an ISO; instead it relies on RPM packages for barclamps and their dependencies.
  • Pulling packages directly from the Internet has proven to be unreliable, so this method cannot rely on that alone.
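
To make the caching proxy idea concrete, here is a minimal sketch of how a node in an isolated environment might be pointed at a local cache instead of the public Internet; the proxy host and port are hypothetical, and the actual staging mechanism for Crowbar 2 is still under discussion.

export http_proxy=http://proxy.admin.local:3128
export https_proxy=http://proxy.admin.local:3128
yum install -y openstack-nova-compute

The first node to request a package pulls it through the proxy; later nodes are served from the cache, so a single upstream fetch (or a pre-seeded cache) covers the whole deployment.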

Install From Source

  • This feature is mainly focused on OpenStack, but it could be applied more generally. The principles that we are looking at could be applied to any application where the source code is changing quickly (all of them?!). Hadoop is an obvious second candidate.
  • We spent some time reviewing the use-cases for this feature. While this appears to be very dev and pre-release focused, there are important applications for production. Specifically, we expect that scale customers will need to run ahead of or slightly adjacent to trunk due to patches or proprietary code. In both cases, it is important that users can deploy from their repository.
  • We discussed briefly our objective to pull configuration from upstream (not just OpenStack, but potentially any common cookbooks/modules). This topic is central to the CMDB agnostic discussion below.
  • The overall sentiment is that this could be a very powerful capability if we can manage to make it work. There is a substantial challenge in tracking dependencies – current RPMs and Debs do a good job of this and other configuration steps beyond just the bits. Replicating that functionality is the real obstacle.

CMDB agnostic (decoupling Chef)

  • This feature is confusing because we are not eliminating the need for a configuration management database (CMDB) tool like Chef; instead we are decoupling Crowbar from a single CMDB to a pluggable model using an abstraction layer.
  • It was stressed that Crowbar does orchestration – we do not rely on convergence over multiple passes to get the configuration correct.
  • We had strong agreement that the modules should not be tightly coupled but did need a consistent way (API? Consistent namespace? Pixie dust?) to share data between each other. Our priority is to maintain loose coupling and follow integration by convention and best practices rather than rigid structures.
  • The abstraction layer needs to have both import and export functions
  • Crowbar will use attribute injection so that Cookbooks can leverage Crowbar but will not require Crowbar to operate. Crowbar’s database will provide the links between the nodes instead of having to wedge it into the CMDB.
  • In 1.x, networking was the most tightly coupled to Chef. This is a major part of the refactor and of the modeling for Crowbar’s database.
  • There are a lot of notes captured about this on the etherpad – I recommend reviewing them

Heterogeneous OS (bare metal provisioning and beyond)

  • This topic was the most divergent of all our topics because most of the participants were using some variant of their own bare metal provisioning project (check the etherpad for the list).
  • Since we can’t pack an unlimited set of stuff on the ISO, this feature requires online mode.
  • Most of these projects do nothing beyond OS provisioning; however, their simplicity is beneficial. Crowbar needs to consider users who just want a streamlined OS provisioning experience.
  • We discussed Crowbar’s late binding capability, but did not resolve how to reconcile that with these other projects.
  • Critical use cases to consider:
    • an API for provisioning (not sure if it needs to be more than the current one)
    • pick which Operating Systems go on which nodes (potentially with a rules engine?)
    • inventory capabilities of available nodes (like ohai and facter; see the sketch after this list) into a database
    • inventory available operating systems
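
As a reference point, here is a minimal sketch of the per-node facts such tools already surface; ohai ships with Chef and facter with Puppet, and the idea would be to capture this output into Crowbar's inventory database.

ohai            # Chef's fact gatherer; dumps hardware/OS details as JSON
facter --json   # Puppet's equivalent fact gatherer, also emitting JSON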

The real workloads begin: Crowbar’s Sophomore Year

Given Crowbar’s frenetic Freshman year, it’s impossible to predict everything that Crowbar could become. I certainly aspire to see the project gain a stronger developer community, and the seeds of this transformation are sprouting. I also see that community-driven work is positioning Crowbar to break beyond being a platform for the OpenStack and Apache Hadoop solutions that pay the bills for my team at Dell to invest in Crowbar development.

I don’t have to look beyond the summer to see important development for Crowbar because of the substantial goals of the Crowbar 2.0 refactor.

Crowbar 2.0 is really just around the corner so I’d like to set some longer range goals for our next year.

  • Growing acceptance of Crowbar as an in-data-center extension for DevOps tools (what I call CloudOps)
  • Deeper integration into more operating environments beyond the core Linux flavors (like virtualization hosts and closed or special purpose operating systems)
  • Improvements in dynamic networking configuration
  • Enabling more online network connected operating modes
  • Taking on production ops challenges of scale, high availability and migration
  • Formalization of our community engagement with summits, user groups, and broader developer contributions.

For example, Crowbar 2.0 will be able to handle downloading packages and applications from the internet. Online content is not a major benefit without being able to stage and control how those new packages are deployed; consequently, our goal remains tightly focused on improvements in orchestration.

These changes create a foundation that enables a more dynamic operating environment. Ultimately, I see Crowbar driving towards a vision of fully integrated continuous operations; however, Greg & Rob’s Crowbar vision is the topic for tomorrow’s post.

OpenStack discussion at 5/19 Central Texas Linux Users Group (CTLUG ATX)


Greg Althaus (@glathaus) and I will be leading a discussion about OpenStack at the May CTLUG on 5/19 at 7pm. The location is Mangia Pizza on Burnet and Duval (in the strip mall where Taco Deli is).

We’ll talk about how OpenStack works, where we see it going, and what Dell is doing to participate in the community.

OpenStack should be very interesting to the CTLUG because of the technologies being used AND the way that the community is engaged in helping craft the software.