Cloud Culture: Becoming L33T – Five ways to “go digital native” [Collaborative Series 7/8]

Subtitle: Five keys to earn Digital Natives’ trust

This post is #7 in a collaborative eight-part series by Brad Szollose and me about how culture shapes technology.

WARNING: These are not universal rules! These are two cultures. What gets high scores for Digital Natives is likely to get you sacked with Digital Immigrants.

How do Digital Natives do business?

You don’t sell! You collaborate with them to help solve their problems. They’ll discredit everything you say if you “go all marketing on them” and try to “sell them.”

Here are five ways that you can build a two-way collaborative relationship instead of one-way selling. These tips aren’t speculation: Brad has proven these ideas work in real-world business situations.


1) Share, don’t tell.

Remember the cultural response to Rob’s presentation discussed in the introduction to this series? The shift took place because Rob wanted to share his expertise instead of selling the awesomeness of his employer. That is what changed the dynamic.

In a selling situation, the sales pitch doesn’t address our client’s needs. It addresses what we want to tell them and what we think they need. It is a one-way conversation. And when someone is given a choice between “yes” and “no” in a sales meeting, the client can always say “no.”

Sharing draws our customers in so we can hear their problems and solve them. We can also get a barometer on what they know versus what they need. When Rob is presenting to a customer, he’s qualifying the customer too. Solutions are not one size fits all and Digital Natives respect you more for admitting this to them.

Digital Native business is about going for a long-term solution-driven approach instead of just positioning a product. If you’ve collaborated with customers and they agree you’ve got a solution for them then it’s much easier to close the sale. And over the long term, it’s a more lucrative way to do business.

2) Eliminate bottlenecks.

Ten years ago, IT departments were the bottleneck to getting products into the market. If customers resisted, it could take years to get them to like something new. Today, Apple introduces new products every six months with a massive adoption rate because Digital Natives don’t wait for permission from an authority.

The IT buyer has made that sales cycle much more dynamic because our new buyers are Digital Natives. Where Digital Immigrants stayed entrenched in a process or technology, Digital Natives are more willing to try something unproven. Amazon’s EC2 public cloud presented a huge challenge to the authority of IT departments because developers were simply bypassing internal controls. Digital Natives have been trained to look for out-of-the-box solutions to problems.

Time-to-market has become the critical measure for success.

We now have IT end-user buyers who adopt and move faster through the decision process than ever before! We interfere with their decision process if we still treat new buyers as if they can’t keep up and we have to educate them.

Today’s Digital Workers are smart self-starters who more than understand technology; they live it. Their intuitive grasp of technology and their capacity to use it without much effort have become a cultural skill set. They can also look up, absorb, and comprehend products with little effort. They did their homework before we walked in the door.

Digital Natives are impatient. They want to skip over what they know and get to real purpose and collaboration. You add bottlenecks when you force them back into a traditional decision process that avoids risk; instead, they are looking to business partners to help them iterate and accelerate.

 How did this apply to the Crowbar project?

Crowbar addresses a generation’s impatience to be up and running in record time. But there is more to it than that: we engage with customers differently too. Our open source collaboration and design flexibility mean that we can dialog with customers and partners to figure out the real wants and needs in record time.

3) Let go of linear.

Digital Natives do not want to be walked through detailed linear presentations. They do want the information but leave out the hand holding. The best strategy is to prepare to be a well-trained digital commando—plan a direction, be confident, be ready to respond, and be willing to admit knowledge gaps. It’s a strategy without a strategy.

Ask questions at the beginning of a meeting—this becomes a knowledge base “smell test.” Listening to what our clients know and don’t know gets us to the heart and purpose of why we are there. Take notes. Stay open to curve balls, tough questions, and—dare we say it—the client telling us we are off base. You should not be surprised at how much they know.

For open source projects at Dell (Rob’s employer), customers have often downloaded and installed the product before they have talked to the sales team. Rob has had to stop being surprised when they are better informed about our offerings than our well-trained internal teams. Digital Natives love collecting information and getting started independently. This completely violates the normal linear sales process; instead, customers enter more engaged and ready if you can be flexible enough to meet them where they already are.

4) Be attentively interactive.

No one likes to sit in one meeting after another. Why are meetings boring? Meetings should be engaging and collaborative; unfortunately, most meetings are simply one-way presentations or status updates. When Digital Natives interrupt a presentation, it may mean they are not getting what they want but it also means they are paying attention.

Aren’t instant messaging, texting, and tweeting attention-stealing distractions?

Don’t confuse IMing, texting, emailing, and tweeting as lack of attention or engagement.

Digital Natives use these “back channels” to speed up knowledge sharing while eliminating the face-to-face meeting inertia of centralized communication.

Of course, sometimes we do check out and stop paying attention.

Time and attention are valuable commodities!

With all the distractions and multi-tasking for speed and connectivity, giving someone undivided attention is about respect, and paying attention is not passive! When we ask questions, it shows that we’re engaged and paying attention. When we compile all the answers from those questions, our intention leads us to solutions. Solving our client’s problems is about getting to the heart of the matter and becomes the driving force behind every action and solution.

Don’t be afraid to stray from the agenda—our attention is the agenda.

5) Stay open to happy accidents.

In Brad’s book, Liquid Leadership, the chapter titled “Have Laptop. Will Travel” points out how Digital Natives have been trained in virtualized work habits because they are more effective.

Our customers are looking for innovative solutions to their problems and may find them in places that we do not expect. It is our job to stay awake and open to solution serendipity. Let’s take this statement out of our vocabulary: “That’s not how we do it.” Let’s try a new approach: “That isn’t traditionally how we would do it, but let us see if it could improve things.”

McDonald’s uses numbers for their combo meals to make sure ordering is predictable and takes no more than 30 seconds. It sounds simple, but changes like that come from listening to customers’ habits. We need to stop judging and start adapting. Imagine a company that adapts to the needs of its customers!

Sales guru Jeffrey Gitomer pays $100 in cash to any one of his employees who makes a mistake. The mistake is analyzed to figure out whether it is worth applying or should be discarded. He doesn’t pay $100 if they make the same mistake twice. Mistakes are where we can discover breakthrough ideas, products, and methods.

Making these kinds of leaps requires that we first let go of rigid rules and opinions and make it OK to make a few mistakes … as long as we look at them through a lens of possibility. Digital Natives have spent 10,000 hours playing games, learning to make mistakes, take risks, and reach mastery.

Boot me up! out-of-band IPMI rocks then shuts up and waits

It’s hard to get excited about re-implementing functionality from v1 unless v2 happens to also be freaking awesome.  It’s awesome because the OpenCrowbar architecture allows us to do it “the right way,” with real out-of-band controls against the open WSMAN APIs.

With out-of-band control, we can easily turn systems on and off using OpenCrowbar orchestration.  This means that it’s now standard practice to power off nodes after discovery & inventory until they are ready for OS installation.  This is especially interesting because many servers’ RAID and BIOS can be configured out-of-band without powering them on at all.

Frankly, Crowbar 1 (cutting edge in 2011) was a bit hacky.  All of the WSMAN control was done in-band but looped through a gateway on the admin server so we could access the out-of-band API.  We also used the vendor (Dell) tools instead of open API sets.

Because v2 uses open API sets, OpenCrowbar hardware configuration is truly multi-vendor.  I’ve got Dell & SuperMicro servers booting and out-of-band managed.  Want more vendors?  I’ll give you my shipping address.

OpenCrowbar does this out of the box and in the open so that everyone can participate.  That’s how we solve this problem as an industry and start to cope with hardware snowflaking.

And this out-of-band management gets even more interesting…

Since we’re talking to servers out-of-band (without the server being “on”) we can configure systems before they are even booted for provisioning.  Since OpenCrowbar does not require a discovery boot, you could pre-populate all your configurations via the API and have the Disk and BIOS settings ready before they are even booted (for models like the Dell iDRAC where the BMCs start immediately on power connect).
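Pre-populating configuration via the API can be sketched as assembling a request body before the node ever boots. This is a minimal illustration only: the field names and payload shape below are hypothetical, not the actual OpenCrowbar API.

```python
import json

# Hypothetical sketch: building a pre-boot hardware configuration payload.
# In a BMC-always-on model (e.g., Dell iDRAC), this could be pushed through
# the API before the node is powered on. Field names are invented.

def build_preconfig_payload(node_name, bios_settings, raid_level):
    """Assemble a JSON body describing desired BIOS and RAID state."""
    return {
        "name": node_name,
        "bios": bios_settings,          # e.g., {"boot_mode": "uefi"}
        "raid": {"level": raid_level},  # applied out-of-band via the BMC
    }

payload = build_preconfig_payload("node-01", {"boot_mode": "uefi"}, "raid10")
print(json.dumps(payload))
```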

Those are my favorite features, but there’s more to love:

  • The new design does not require a network gateway between the admin and BMC networks (v1 did, which was a security issue).
  • The configuration detects and preserves existing assigned IPs.  This is a big deal in lab configurations where you are reusing the same machines and have scripted remote consoles.
  • OpenCrowbar offers an API to turn machines on/off using the out-of-band BMC network.
  • The system detects if nodes have IPMI (VMs & containers do not) and skips BMC configuration, but still manages power control using SSH (and could use VM APIs in the future).
  • Of course, we automatically set up the BMC network based on your desired configuration.
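The IPMI-versus-SSH decision in the list above can be sketched as a simple selection function. The `ipmitool` invocation is the standard out-of-band form; the node record fields and the SSH fallback command are illustrative assumptions, and commands are returned rather than executed so the logic is easy to see.

```python
# Sketch: choose out-of-band IPMI power control when a node has a BMC,
# fall back to in-band SSH for VMs and containers that have no BMC.

def power_off_command(node):
    if node.get("has_ipmi"):
        # standard ipmitool out-of-band invocation over the BMC network
        return ["ipmitool", "-I", "lanplus",
                "-H", node["bmc_ip"], "-U", node["bmc_user"],
                "-P", node["bmc_pass"], "chassis", "power", "off"]
    # in-band fallback for nodes without IPMI
    return ["ssh", f"root@{node['ip']}", "poweroff"]

server = {"has_ipmi": True, "bmc_ip": "10.0.0.5",
          "bmc_user": "admin", "bmc_pass": "secret"}
vm = {"has_ipmi": False, "ip": "10.0.1.7"}
print(power_off_command(server)[0])  # ipmitool
print(power_off_command(vm)[0])      # ssh
```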

 

Ops Bridges > Building a Sharable Ops Infrastructure with Composable Tool Chain Orchestration

This post started from a discussion with Judd Maltin that he documented in a post about “wanting a composable run deck.”

I’ve had several conversations comparing OpenCrowbar with other “bare metal provisioning” tools that do things like serve golden images to PXE or iPXE servers to help bootstrap deployments.  While those are handy tools, they do nothing to really help operators drive system-wide operations; consequently, they have limited system impact and utility.

In building the new architecture of OpenCrowbar (aka Crowbar v2), we heard very clearly that operators want “less magic” in the system.  We took that advice very seriously: Crowbar is a system layer that works with, not as a replacement for, standard operations tools.

Specifically, node boot & kickstart alone is just not that exciting.  It’s a combination of DHCP, PXE, HTTP, and TFTP, or DHCP and an iPXE HTTP server.   It’s a pain to set up, but I don’t really get excited about it anymore.   In fact, you can pretty much use open ops scripts (Chef) to set up these services because it’s cut-and-dried operational work.

Note: Setting up the networking to make it all work is perhaps a different question and one that few platforms bother talking about.

So, if doing node provisioning is not a big deal then why is OpenCrowbar important?  Because sustaining operations is about ongoing system orchestration (we’d say an “operations model“) that starts with provisioning.

It’s not the individual services that are critical; it’s running them in a system-wide sequence that’s vital.

Crowbar does NOT REPLACE the services.  In fact, we go out of our way to keep your proven operations tool chain.  We don’t want operators to troubleshoot our iPXE code!  We’d much rather use the standard stuff and orchestrate the configuration in a predictable way.

In that way, OpenCrowbar embraces and composes the existing operations tool chain into an integrated system of tools.  We always avoid replacing tools.  That’s why we use Chef for our DSL instead of adding something new.

What does that leave for Crowbar?  Crowbar provides physical-infrastructure-targeted orchestration (we call it “the Annealer”) that coordinates this tool chain to work as a system.  It’s the system perspective that’s critical because it allows all of the operational services to work together.

For example, when a node is added, we have to create v4 and v6 IP address entries for it.  This is required because secure infrastructure requires reverse DNS.  If you change the name of that node or add an alias, Crowbar again needs to update the DNS.  This has to happen in the right sequence.  If you create a new virtual interface for that node then, again, you need to update DNS.   This type of operational housekeeping is essential and must be performed in the correct sequence at the right time.
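A toy model of that housekeeping makes the sequencing concrete: forward (A/AAAA) and reverse (PTR) records must move together, both on node creation and on rename. All names and addresses here are invented.

```python
# In-memory stand-in for the DNS services Crowbar would orchestrate.
forward = {}   # hostname -> list of IPs (A/AAAA records)
reverse = {}   # IP -> hostname (PTR records)

def add_node(name, ips):
    """Adding a node creates forward entries, then the matching PTRs."""
    forward[name] = list(ips)
    for ip in ips:
        reverse[ip] = name  # reverse DNS must follow the forward entry

def rename_node(old, new):
    """Renaming must update both sides, in sequence, or PTRs go stale."""
    ips = forward.pop(old)
    forward[new] = ips
    for ip in ips:
        reverse[ip] = new

add_node("node-01", ["10.0.0.5", "fd00::5"])
rename_node("node-01", "compute-01")
```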

The critical insight is that Crowbar works transparently alongside your existing operational services with proven configuration management tools.  Crowbar connects links in your tool chain but keeps you in the driver’s seat.

OpenCrowbar stands up 100 node community challenge

OpenCrowbar community contributors are offering a “100 Node Challenge” by volunteering to set up a 100+ node Crowbar system to prove out the v2 architecture at scale.  We picked 100* nodes since we wanted to firmly break the Crowbar v1 upper ceiling.

The goal of the challenge is to prove scale of the core provisioning cycle.  It’s intended to be a short action (less than a week), so we’ll need advance information about the hardware configuration.  The expectation is to do a full RAID/disk hardware configuration beyond the base IPMI config before laying down the operating system.

The challenge logistics start with an off-site prep discussion of the particulars of the deployment, then installing OpenCrowbar at the site and deploying the node century.  We will also work with you on using OpenCrowbar to manage the environment going forward.

Sound too good to be true?  Well, as community members are doing this on their own time, we are only planning one challenge candidate and want to find the right target.
We will not be planning custom code changes to support the deployment; however, we would be happy to work with you in the community to support your needs.  If you want help sustaining the environment or have longer-term plans, I have also been approached by community members who are willing to take on full- or part-time Crowbar consulting engagements.
Let’s get rack’n!
* we’ll consider smaller clusters but you have to buy the drinks and pizza.

You need a Squid Proxy fabric! Getting Ready State Best Practices

Sometimes solving a small problem well makes a huge impact for operators.  Talking to operators, it appears that automated configuration of Squid does exactly that.


If you were installing OpenStack or Hadoop, you would not find “setup a squid proxy fabric to optimize your package downloads” in the install guide.   That’s simply out of scope for those guides; however, it’s essential operational guidance.  That’s what I mean by open operations and creating a platform for sharing best practice.

Deploying a base operating system (e.g., CentOS) on a lot of nodes creates bit-tons of identical internet traffic.  By default, each node will attempt to reach internet mirrors for packages.  If you multiply that by even 10 nodes, that’s a lot of traffic and a significant performance impact if your connection is limited.

For OpenCrowbar developers, external package resolution means that each dev/test cycle with a node boot (up to 10+ times a day) is bottlenecked.  For QA and install, the problem is even worse!

Our solution was 1) to embed Squid proxies into the configured environments and 2) to automatically configure nodes to use the proxies.   By making this behavior the default, we improve the overall performance of a deployment.   This further improves the overall network topology of the operating environment while adding improved control of traffic.
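On the client side, step 2 amounts to dropping a proxy setting into each node’s package manager. The functions below render the standard apt and yum proxy settings; the Squid hostname is a placeholder, and in Crowbar this configuration is laid down by Chef rather than by hand.

```python
# Render package-manager proxy settings pointing at a local Squid cache.

def apt_proxy_conf(host, port=3128):
    # contents for a file such as /etc/apt/apt.conf.d/01proxy
    return f'Acquire::http::Proxy "http://{host}:{port}";\n'

def yum_proxy_conf(host, port=3128):
    # line appended to /etc/yum.conf
    return f"proxy=http://{host}:{port}\n"

# "squid.admin.local" is an assumed admin-network hostname for the sketch.
print(apt_proxy_conf("squid.admin.local"))
```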

This is a great example of how Crowbar uses existing operational tool chains (Chef configures Squid) in best practice ways to solve operations problems.  The magic is not in the tool or the configuration, it’s that we’ve included it in our out-of-the-box default orchestrations.

It’s time to stop fumbling around in the operational dark.  We need to compose our tool chains in an automated way!  This is how we advance operational best practice for ready state infrastructure.

OpenCrowbar Design Principles: Attribute Injection [Series 6 of 6]

This is part 6 of 6 in a series discussing the principles behind the “ready state” and other concepts implemented in OpenCrowbar.  The content is reposted from the OpenCrowbar docs repo.

Attribute Injection

Attribute Injection is an essential aspect of the “FuncOps” story because it helps clean boundaries needed to implement consistent scripting behavior between divergent sites.

It also allows Crowbar to abstract and isolate provisioning layers. This operational approach means that deployments are composed of layered services (see emergent services) instead of locked “golden” images. The layers can be maintained independently and allow users to compose specific configurations à la carte. This approach works if the layers have clean functional boundaries (FuncOps) that can be scoped and managed atomically.

To explain how Attribute Injection accomplishes this, we need to explore why search became an anti-pattern in Crowbar v1. Originally, being able to use server-based search functions in operational scripting was a critical feature. It allowed individual nodes to act as part of a system by searching for global information needed to make local decisions. This greatly aided Crowbar’s mission of system-level configuration; however, it also created significant hidden interdependencies between scripts. As Crowbar v1 grew in complexity, searches became more and more difficult to maintain because they were difficult to correctly scope, hard to centrally manage, and prone to timing issues.

Crowbar was not unique in dealing with this problem – the Attribute Injection pattern has become a preferred alternative to search in integrated community cookbooks.

Attribute Injection in OpenCrowbar works by establishing specific inputs and outputs for all state actions (NodeRole runs). By declaring the exact inputs needed and outputs provided, Crowbar can better manage each annealing operation. This control includes deployment scoping boundaries, time sequence of information plus override and substitution of inputs based on execution paths.
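A minimal sketch of that input/output contract, with invented names: the orchestrator injects only the declared inputs into a run and checks that the declared outputs come back, so nothing leaks in via global search.

```python
# Sketch of attribute injection for a state action (NodeRole run).
# The run sees only its declared inputs and must produce its declared outputs.

def run_noderole(declared_inputs, declared_outputs, attributes, action):
    # inject only the declared inputs -- nothing else is visible to the script
    scoped = {k: attributes[k] for k in declared_inputs}
    result = action(scoped)
    # enforce the output contract so later runs can depend on it
    missing = [k for k in declared_outputs if k not in result]
    if missing:
        raise ValueError(f"run did not provide outputs: {missing}")
    return {k: result[k] for k in declared_outputs}

out = run_noderole(
    declared_inputs=["dns_server"],
    declared_outputs=["resolv_conf"],
    attributes={"dns_server": "10.0.0.2", "unrelated": "hidden"},
    action=lambda a: {"resolv_conf": f"nameserver {a['dns_server']}"},
)
```

Because the `unrelated` attribute is never injected, the script cannot develop a hidden dependency on it, which is exactly the failure mode that search created in v1.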

This concept is not unique to Crowbar. It has become best practice for operational scripts. Crowbar simply extends the paradigm to the system and orchestration level.

Attribute Injection enabled operations to be:

  • Atomic – only the information needed for the operation is provided so risk of “bleed over” between scripts is minimized. This is also a functional programming preference.
  • Isolated & Idempotent – the risk of accidentally picking up changed information from previous runs is reduced by controlling the inputs. That makes it more likely that scripts can be idempotent.
  • Cleanly Scoped – information passed into operations can be limited based on system deployment boundaries instead of search parameters. This allows the orchestration to manage when and how information is added into configurations.
  • Easy to troubleshoot – since the information is limited and controlled, it is easier to recreate runs for troubleshooting. This is a substantial value for diagnostics.

OpenCrowbar Design Principles: Emergent services [Series 5 of 6]

This is part 5 of 6 in a series discussing the principles behind the “ready state” and other concepts implemented in OpenCrowbar.  The content is reposted from the OpenCrowbar docs repo.

Emergent services

We see data center operations as a duel between conflicting priorities. On one hand, the environment is constantly changing and systems must adapt quickly to these changes. On the other hand, users of the infrastructure expect it to provide stable and consistent services for consumption. We’ve described that as “always ready, never finished.”

Our solution to this duality is to expect that the infrastructure Crowbar builds is decomposed into well-defined service layers that can be (re)assembled dynamically. Rather than require any component of the system to be in a ready state, Crowbar’s design principles assume that we can automate the construction of every level of the infrastructure, from BIOS to network and application. Consequently, we can hold off (re)making decisions at the bottom levels until we’ve figured out what we’re doing at the top.

Effectively, we allow the overall infrastructure services configuration to evolve or emerge based on the desired end use. These concepts are built on computer science principles that we have appropriated for Ops use; since we also subscribe to Opscode’s “infrastructure as code,” we believe that these terms are fitting in a DevOps environment. In the next pages, we’ll explore the principles behind this approach, including concepts around simulated annealing, late binding, attribute injection, and emergent design.
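As a toy illustration of late binding, low-level choices can be derived from the top-level workload at plan time instead of being fixed up front. The workload-to-layer mappings here are invented purely for the sketch.

```python
# Late binding sketch: bottom-layer decisions (RAID layout, OS) emerge from
# the top-level intent (the workload) rather than being locked in advance.
# These mappings are illustrative, not real Crowbar policy.

LAYER_RULES = {
    "hadoop":    {"raid": "jbod",   "os": "centos"},
    "openstack": {"raid": "raid10", "os": "ubuntu"},
}

def plan(workload):
    """Resolve the full layer stack only once the end use is known."""
    rules = LAYER_RULES[workload]
    return [("bios", "configure"), ("raid", rules["raid"]),
            ("os", rules["os"]), ("app", workload)]

print(plan("hadoop"))
```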

Emergent (aka iterative or evolutionary) design challenges the traditional assumption that all factors must be known before starting:

  • Dependency graph – multidimensional relationships
  • High degree of reuse via abstraction and isolation of service boundaries
  • Increasing complexity of deployments means more dependencies
  • Increasing revision rates of dependencies but with higher stability of APIs