Podcast – Ian Rae talks Cloud, Innovation, and Updates from Google Next 2018

Joining us this week is Ian Rae, CEO and Founder of CloudOps, who recorded this podcast during the Google Next 2018 conference.

Highlights

  • 1 min 55 sec: Define Cloud from a CloudOps perspective
    • Business Model and an Operations Model
  • 3 min 59 sec: Update from Google Next 2018 event
    • Google is the “Engineer’s Cloud”
    • Google’s approach vs Amazon approach in feature design/release
  • 9 min 55 sec: Early Amazon ~ no easy button
    • Amazon educated the market as industry leader
  • 12 min 04 sec: What is the state of Hybrid? Do we need it?
    • Complexity of systems leads to private, public as well as multiple cloud providers
    • Open source enabled workloads to run on various clouds even if the cloud was not designed to support a type of workload
    • Google’s strategy is around open source in the cloud
  • 14 min 12 sec: IBM visibility in open source and cloud market
    • Didn’t build cloud services (e.g. open a ticket to remap a VLAN)
  • 16 min 40 sec: OpenStack tried to compete on service components
    • Couldn’t compete without Product Managers to guide developers
    • Missed last mile between technology and customer
    • Didn’t want to take on the operational aspects of the customer
  • 19 min 31 sec: Is innovation driven from listening to customers vs developers doing what they think is best?
    • OpenStack is seen as legacy as customers look for Cloud Native Infrastructure
    • OpenStack vs Kubernetes install time significance
  • 22 min 44 sec: Google announcement of GKE for on-premises infrastructure
    • Not really On-premise; more like Platform9 for OpenStack
    • GKE solves the end-user experience and operational challenges of delivering it
  • 26 min 07 sec: Edge IT replaces what is On-Premises IT
    • Bullish on the future with Edge computing
    • 27 min 27 sec: Who delivers control plane for edge?
      • Recommends Open Source in the control plane
  • 28 min 29 sec: Current tech hides the infrastructure problems
    • Someone still has to deal with the physical hardware
  • 30 min 53 sec: Commercial driver for rapid Edge adoption
  • 32 min 20 sec: CloudOps building software / next generation of BSS or OSS for telco
    • Meets the cloud provider’s need for flexibility in creating services, with the ability to change the backend service provider
    • Amazon is the new Win32
  • 38 min 07 sec: Can customers install their own software? Will people buy software anymore?
    • Compare payment models from Salesforce and Slack
    • Google allows customers to run its technology themselves or have Google manage it for them
  • 40 min 43 sec: Wrap-Up

Podcast Guest: Ian Rae, CEO and Founder of CloudOps

Ian Rae is the founder and CEO of CloudOps, a cloud computing consulting firm that provides multi-cloud solutions for software companies, enterprises and telecommunications providers. Ian is also the founder of cloud.ca, a Canadian cloud infrastructure as a service (IaaS) focused on data residency, privacy and security requirements. He is a partner at Year One Labs, a lean startup incubator, and is the founder of the Centre cloud.ca in Montreal. Prior to clouds, Ian was responsible for engineering at Coradiant, a leader in application performance management.

Podcast – Erica Windisch on Observability of Serverless, Edge Computing, and Abstraction Boundaries

Joining us this week is Erica Windisch, Founder/CTO at IOpipe, a high-fidelity metrics and monitoring service that lets you see inside AWS Lambda functions for better insight into the daily operations and development of serverless applications.

Highlights

  • Intro of AWS Lambda and IOpipe
  • Discussion of Observability and Opaqueness of Serverless
  • Edge Computing Definition and Vision
  • End of Operating Systems and Abstraction Boundaries

Topic                                                                                Time (Minutes.Seconds)

Introduction                                                                    0.0 – 1.16
Vision of technology future                                         1.16 – 3.04 (Containers ~ Docker)
Complexity of initial experience with new tech       3.04 – 5.38 (Devs don’t go deep in OS)
Why Lambda?                                                                5.38 – 8.14 (Deploy functions)
What IOpipe does?                                                        8.14 – 10.54 (Observability for calls)
Lambda and Integration into IOpipe                          10.54 – 13.48 (Overhead)
Observability definition                                                 13.48 – 17.25
Opaque system with Lambda                                     17.25 – 21.13
Serverless frameworks still need tools to see inside   21.13 – 24.20 (Distributed Issues Day 1)
Edge computing definition                                           24.20 – 26.56 (Microprocessor in Everything)
Edge infrastructure vision                                            26.56 – 29.32 (TensorFlow example)
Portability of containers vs functions                         29.32 – 31.00 (Linux is Dying)
Abstraction boundaries                                                31.00 – 33.50 (Immutable Infra Panel)
Is Serverless the portability unit for abstraction?     33.50 – 39.46 (Amazon Greengrass)
Wrap Up                                                                          39.46 – END


Podcast Guest: Erica Windisch, Founder/CTO at IOpipe

Erica Windisch is the founder and CTO of IOpipe, a company that builds, connects, and scales code. She was previously a software and security engineer at Docker. Before joining Docker, she worked as a principal engineer at Cloudscaling. She studied at the Florida Institute of Technology.


DC2020: Is Exposing Bare Metal Practical or Dangerous?

One of IBM’s major announcements at Think 2018 was Managed Kubernetes on Bare Metal. This new offering combines elements of their existing offerings to expose some additional security, attestation and performance isolation. Bare metal has been a hot topic for cloud service providers recently with AWS adding it to their platform and Oracle using it as their primary IaaS. With these offerings as a backdrop, let’s explore the role of bare metal in the 2020 Data Center (DC2020).

Physical servers (aka bare metal) are the core building block for any data center; however, they are often abstracted out of sight by a virtualization layer such as VMware, KVM, HyperV or many others. These platforms are useful for many reasons. In this post, we’re focused on the fact that they provide a control API for infrastructure that makes it possible to manage compute, storage and network requests. Yet the abstraction comes at a price in cost, complexity and performance.

The historical lack of good API control has made bare metal less attractive, but that is changing quickly due to two forces.

These two forces are Container Platforms and Bare Metal as a Service, or BMaaS (disclosure: RackN offers a private BMaaS platform called Digital Rebar). Container Platforms such as Kubernetes provide an application service abstraction for data center consumers that eliminates the need for users to worry about traditional infrastructure concerns.  That means most users no longer rely on APIs for compute, network or storage, allowing the platform to handle those issues. On the other side, BMaaS provides VM-style infrastructure APIs for the actual physical layer of the data center, giving users who do care about compute, network or storage the ability to work without VMs.
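
To make the abstraction shift concrete, here is a minimal sketch using the Python kubernetes client: the user declares an application, a replica count and resource requests, and never names a hypervisor, a VLAN or a server model. The image name, labels and resource numbers are placeholders, not a specific deployment.

```python
# Minimal sketch of the container-platform abstraction (Python "kubernetes" client).
# The platform decides which machines (VM or bare metal) actually run the pods.
from kubernetes import client, config

config.load_kube_config()  # use cluster credentials from ~/.kube/config

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="web"),
    spec=client.V1DeploymentSpec(
        replicas=3,
        selector=client.V1LabelSelector(match_labels={"app": "web"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "web"}),
            spec=client.V1PodSpec(containers=[
                client.V1Container(
                    name="web",
                    image="nginx:1.25",
                    resources=client.V1ResourceRequirements(
                        requests={"cpu": "500m", "memory": "256Mi"}),
                )
            ]),
        ),
    ),
)

# Nothing above references compute, network or storage infrastructure directly.
client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)
```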

The combination of containers and bare metal APIs has the potential to squeeze virtualization into a limited role.

The IBM bare metal Kubernetes announcement illustrates both of these forces working together.  Users of the managed Kubernetes service are working through the container abstraction interface and really don’t worry about the infrastructure; however, IBM is able to leverage their internal bare metal APIs to offer enhanced features to those users without changing the service offering.  These benefits include security (IBM White Paper on Security), isolation, performance and (eventually) access to metal features like GPUs. While the IBM offering still includes VMs as an option, it is easy to anticipate that becoming less attractive for all but smaller clusters.

The impact for DC2020 is that operators need to rethink how they rely on virtualization as a ubiquitous abstraction.  As more applications rely on container service abstractions the platforms will grow in size and virtualization will provide less value.  With the advent of better control of the bare metal infrastructure, operators have real options to get deep control without adding virtualization as a requirement.

Shifting to new platforms creates opportunities to streamline operations in DC2020.

Even with virtualization and containers, having better control of the bare metal is a critical addition to data center operations.  The ideal data center has automation and control APIs for every possible component from the metal up.

Learn more about the open source Digital Rebar community:

Podcast with The CTO Advisor on Edge vs Cloud, Compute vs Data Gravity, and Impact of Massive Scale

Joining us this week is Keith Townsend, The CTO Advisor, for a joint podcast of the L8ist Sh9y and CTO Advisor Podcast. Keith and Rob discuss the Edge Computing concept and several issues facing enterprise companies looking to move beyond the current cloud offerings. Key highlights from the podcast:

  • What is the Edge? 2 Separate Definitions are Discussed
  • Comparison of Edge and the Electricity Model
  • Building and Managing Apps for Edge at Massive Scale
  • Data vs Compute Gravity

Topic                                                        Time (Minutes.Seconds)

Introduction                                              0.00 – 1.30
What is Edge?                                           1.30 – 2.30
Hands-Off Edge Infrastructure              2.30 – 5.20   (Snowball Edge)
General Purpose App Stacks in Edge  5.20 – 6.38
AWS Predictions                                      6.38 – 7.32
Enterprise Model for Edge                     7.32 – 9.24
Bernard Golden and Death Cloud        9.24 – 15.15   (Edge vs Electric Market / AWS in China)
Will Edge be Transaction Model?        15.15 – 19.32  (Workloads Space Access?)
What is the Edge? (Second Pass)         19.32 – 21.55
Scale of Edge / Thousands of Nodes   21.55 – 25.27   (Building Apps for Massive Scale)
Centrally Managed Edge                       25.27 – 29.20   (Patch Management)
Cloud Outages                                         29.20 – 31.39
SAP Example                                            31.39 – 33.10
Scale and Automation from AWS         33.10 – 34.46
Edge not like Cloud for App Devs        34.46 –  37.04  (Control Plane too large for Edge)
Compute Now has Gravity                     37.04 – 40.00  (Data vs Compute Gravity)
Conclusion and Wrap-Up                      40.00 – END


Podcast Guest: Keith Townsend, CTO Advisor

Keith is a Principal CTO Advisor with 20 years of experience helping organizations achieve their mission through optimized IT infrastructures. Keith holds a Bachelor’s Degree in Computing and a Master’s in IT Project Management from DePaul University. Follow Keith on Twitter @CTOAdvisor

Sound and Fury as AWS Pulls Back Curtain for Bare Metal Offering

Yesterday, AWS confirmed that it actually uses physical servers to run its cloud infrastructure and, gasp, no one was surprised.  The actual news, the i3.metal instances announced by AWS Chief Evangelist Jeff Barr, shows that bare metal is being treated as just another AMI-managed instance type (see also Geekwire, Techcrunch, Venture Beat).  For AWS users, there’s no drama here because it’s an incremental add to processes they already know well.

Infrastructure as a Service (IaaS) is fundamentally about automation and APIs, not the type of infrastructure.

Lack of drama is a key principle at RackN: provisioning hardware should be as easy to automate as a virtual machine. The addition of bare metal to the AWS instance types validates two important parts of the AWS cloud automation story.  First, having control of the metal is valuable and, second, operators expect image (AMI) based deployments.
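
A minimal boto3 sketch makes the “no drama” point: from the API’s point of view, bare metal is just another instance type. This assumes AWS credentials are already configured; the AMI ID and key pair name below are placeholders.

```python
# Launching an i3.metal instance looks exactly like launching any other instance type.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder: an AMI prepared to boot on i3.metal
    InstanceType="i3.metal",           # the bare metal instance type
    KeyName="ops-key",                 # placeholder key pair name
    MinCount=1,
    MaxCount=1,
)
print(response["Instances"][0]["InstanceId"])
```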

There are interesting AWS-specific items to unpack around this bare metal announcement that show otherwise hidden details about AWS infrastructure.

It took Amazon a long time to create this offering because allowing users to access bare metal requires a specialized degree of isolation inside their massive data center.  It’s only recently possible in AWS data centers because of their custom hardware and firmware.  These changes provide AWS with a hidden control layer under the operating system abstraction.  This does not mean everyone needs this hardware – it’s an AWS specific need based on their architecture.

It’s not a surprise that AWS has built cloud-infrastructure-optimized hardware.  All the major cloud providers design purpose-built machines with specialized firmware to handle their scale, network, security and management challenges.

The specialized hardware may create challenges for users compared to regular virtualized servers.  There are already a few added requirements for AMIs before they can run on the i3.metal instances.  Any image-to-metal deployment process requires a degree of matching to the target server.  That’s the reason that Digital Rebar defaults to safer (but slower) kickstart and pre-seed processes.

Overall, this bare metal announcement is signifying nothing dramatic and that’s a very good thing.

Automating every layer of a data center should be the expected default.  Our mission has been to make metal just another type of automated infrastructure and we’re glad to have AWS finally get on the same page with us.

Podcast with Zach Smith talking Bare Metal and AWS Training Wheels

Joining this week’s L8ist Sh9y Podcast is Zach Smith, CEO of Packet and long-time champion of bare metal hardware. Rob Hirschfeld and Zach discuss the trends in bare metal, the impact of AWS changing the way developers view infrastructure, and issues between networking and server groups in IT organizations.

Topic                                                            Time (Minutes.Seconds)

Introduction                                                       0.0 – 0.43
History of Packet                                               0.43 – 1.38
Why Public Cloud Bare Metal                         1.38 – 2.10
Price Points Metal vs VM                                 2.10 – 3.08
Intro Compute to Non-Data Center People 3.08 – 4.27
RackN early Customer                                      4.27 – 5.41
Managing non-Enterprise Hardware             5.41 – 7.45
Cloud has forever changed IT Ops                 7.45 – 10.20
Making Hardware Easier                                 10.20 – 12.35
Continuous Integration (CI)                            12.35 – 14.37
Customer Story w/ Terraform                        14.47 – 16.08
SRE, DevOps and Engineering Thinking     16.08 – 16.49
Most extreme Metal Pipelines                        16.49 – 18.02
Coolest New Hardware in Use                        18.02 – 19.28
How to order metal and add it to a data center     19.28 – 22.47
RackN and the Switch                                       22.47 – 24.39
Edge Computing Breaks Enterprise IT           24.39 – 25.16
DevOps Highlights for Today                          25.16 – 27.01
Post Provision Control in Open Source          27.01 – 30.03
Data Centers in early 2000’s                            30.03 – 31.27
Nov 1 in NYC: Cloud Native in DataCenter   31.27 –  END

Podcast Guest: Zach Smith, CEO Packet

Zachary has spent the last 16 years building, running and fixing public cloud infrastructure platforms.  As the CEO of Packet, Zachary is responsible for the company’s strategic product roadmap and is most passionate about helping customers and partners take advantage of fundamental compute and avoid vendor lock-in.  Prior to founding Packet, Zachary was an early member of the management team at Voxel, a NY-based cloud hosting company that built software to automate all aspects of hosting data centers and was sold to Internap in 2011.  He lives in New York City with his wife and 2 young children. Twitter @zsmithnyc

August 18 – Weekly Recap of All Things Site Reliability Engineering (SRE)

Welcome to the weekly post of the RackN blog recap of all things SRE. If you have any ideas for this recap or would like to include content please contact us at info@rackn.com or tweet Rob (@zehicle) or RackN (@rackngo)

SRE Items of the Week

Beyond Google SRE: What is Site Reliability Engineering like at Medium?
https://blog.netsil.com/beyond-google-sre-what-is-site-reliability-engineering-like-at-medium-71c65bd35f4e


We had the opportunity to sit down with Nathaniel Felsen, DevOps Engineer at Medium and the author of “Effective DevOps with AWS”. We are happy to share some practical insights from Nathaniel’s extensive experience as a seasoned DevOps and SRE practitioner.

While we hear a lot about these experiences from Google, Netflix, etc., we wanted to gather perspectives on DevOps and SRE life with other easily relatable companies. From tech-stack challenges to organization structure, Nathaniel provides a wide range of practical insights that we hope will be valuable in improving DevOps practices at your organization. READ MORE

GitHub seeks to spur innovation with Kubernetes migration
http://www.zdnet.com/article/github-seeks-to-spur-innovation-with-kubernetes-migration/

GitHub on Wednesday is sharing the details of the massive technical endeavor its engineers went through to migrate the infrastructure that powers github.com and api.github.com — some of its most critical workloads — from a set of manually-configured physical servers to Kubernetes clusters that run application containers.

GitHub is confident the move will allow for faster innovation on the online code sharing and development platform. READ MORE

SRE Thinking: Reframing Dev + Ops
http://bit.ly/2w2I53F

Last month, Eric Wright and I were able to complete a discussion that inspired my guest post for CapitalOne, “How Platforms and SREs Change the DevOps Contract.” While our conversation ranged widely over the challenges of building and integrating IT processes, the key message is simple: we need to make investments in operations. READ MORE

Coal or Diamonds? Configuration Management is Under Pressure
http://bit.ly/2uTvADN

Cloud Native thinking is thankfully changing the way we approach traditional IT infrastructure.  These profound changes in how we build applications with 12-factor design and containers have deep implications for how we manage configuration and the tools we use to do it.  These are not cloud-only impacts – the changes reach every corner of IT data centers. READ MORE

Subscribe to our new daily DevOps, SRE, & Operations Newsletter https://paper.li/e-1498071701#/

_____________

UPCOMING EVENTS

Rob Hirschfeld and Greg Althaus are preparing for a series of upcoming events where they are speaking or just attending. If you are interested in meeting with them at these events please email info@rackn.com.


my 8 steps that would improve OpenStack Interop w/ AWS

I’ve been talking with a lot of OpenStack people about my frustrating attempts at hybrid work on seven OpenStack clouds [OpenStack Session Wed 2:40].  This post documents the behavior Digital Rebar expects from the multiple clouds that we have integrated with so far.  At RackN, we use this pattern for both cloud and physical automation.

Sunday, I found myself back in front of the Board talking about the challenge that implementation variation creates for users.  Ultimately, the question “does this harm users?” is answered by “no, they just leave for Amazon.”

I can’t stress this enough: it’s not about APIs!  The challenge is twofold: implementation variance between OpenStack clouds and variance between OpenStack and AWS.

The obvious and simplest answer is that OpenStack implementers need to conform more closely to AWS patterns (once again, NOT the APIs).

Here are the eight Digital Rebar node allocation steps [and my notes about general availability on OpenStack clouds], with a code sketch of the same flow after the list:

  1. Add node specific SSH key [YES]
  2. Get Metadata on Networks, Flavors and Images [YES]
  3. Pick correct network, flavors and images [NO, each site is distinct]
  4. Request node [YES]
  5. Get node PUBLIC address for node [NO, most OpenStack clouds do not have external access by default]
  6. Log into the system using the node SSH key [PARTIAL, the account name varies]
  7. Add root account with Rebar SSH key(s) and remove password login [PARTIAL, does not work on some systems]
  8. Remove node specific SSH key [YES]
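
For illustration, here is a rough sketch of that flow using the openstacksdk “cloud” layer. The cloud name, key file, and the flavor/image/network selection rules are placeholders; real automation has to carry per-site selection rules precisely because of the variance called out in step 3.

```python
# Rough sketch of the node allocation flow; step numbers match the list above.
import openstack

conn = openstack.connect(cloud="mycloud")          # credentials from clouds.yaml

# 1. Add a node-specific SSH key
conn.create_keypair("rebar-node-1", public_key=open("node1.pub").read())

# 2. Get metadata on networks, flavors and images
nets = conn.list_networks()
flavors = conn.list_flavors()
images = conn.list_images()

# 3. Pick the "correct" ones -- the step that varies per cloud, so these
#    selection rules are placeholders that would differ site to site.
net = next(n for n in nets if n.name == "public")
flavor = min(flavors, key=lambda f: f.ram)                    # smallest flavor
image = next(i for i in images if "ubuntu" in i.name.lower())

# 4. Request the node (and wait for it)
server = conn.create_server(
    "rebar-node-1",
    image=image, flavor=flavor, network=net,
    key_name="rebar-node-1",
    wait=True,
    auto_ip=True,   # ask the SDK to attach a floating IP where the cloud needs one
)

# 5. Get the node's public address (often empty on clouds without external access)
print(server.public_v4)

# 6-7. SSH in with the node key, install the root Rebar key(s) -- not shown here.

# 8. Remove the node-specific SSH key once the root key is installed
conn.delete_keypair("rebar-node-1")
```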

These steps work on every other cloud infrastructure that we’ve used.  And they are achievable on OpenStack – DreamHost delivered this experience on their new DreamCompute infrastructure.

I think that this is very achievable for OpenStack, but we’re going to have to drive conformance and figure out an alternative to the Floating IP (FIP) pattern; IPv6, port forwarding, or adding FIPs by default would all work as part of the solution.

For Digital Rebar, the quick answer is to simply allocate a FIP for every node.  We can easily make this a configuration option; however, it feels like a pattern fail to me.  It’s certainly not a requirement from other clouds.
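
Continuing the sketch above, that quick answer amounts to a per-site toggle (the flag name here is hypothetical, not an existing Digital Rebar option):

```python
# Hypothetical per-site setting: allocate a floating IP for every node on clouds
# with no public addresses by default. Works, but feels like a pattern fail.
always_allocate_fip = True

server = conn.create_server(
    "rebar-node-2",
    image=image, flavor=flavor, network=net, key_name="rebar-node-2",
    wait=True,
    auto_ip=always_allocate_fip,   # openstacksdk attaches the FIP when True
)
```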

I hope this post provides specifics about delivering a more portable hybrid experience.  What critical items do you want as part of your cloud ops process?

Fast Talk: Creating Operating Environments that Span Clouds and Physical Infrastructures

This short 15-minute talk pulls together a few themes around composability that you’ll see in future blogs where I lay out the challenges and solutions for hybrid DevOps practices.  Like any DevOps concept – it’s a mix of technology, attitude (culture) and process.

Our hybrid DevOps objective is simple: We need multi-infrastructure Amazon equivalence for ops automation.

Here’s the summary:

  • Hybrid Infrastructure is new normal
  • Amazon is the Ops benchmark
  • Embrace operations automation
  • Invest in making IT composable


Want to listen to it?  Here’s the voice over:


Smaller Nodes? Just the Right Size for Docker!

Container workloads have the potential to redefine how we think about scale and hosted infrastructure.

Last fall, Ubiquity Hosting and RackN announced a 200-node Docker Swarm cluster as phase one of our collaboration. Unlike cloud-based container workload demonstrations, we chose to run this cluster directly on the bare metal.

Why bare metal instead of virtualized? We believe that metal offers additional performance, availability and control.  

With the cluster automation ready, we’re looking for customers to help us prove those assumptions. While we could simply build on many VMs, our analysis is that a lot of smaller nodes will distribute work more efficiently. Since there is no virtualization overhead, lower-RAM systems can still give great performance.

The collaboration with RackN allows us to offer customers a rapid, repeatable cluster capability. Their Digital Rebar automation works on a broad spectrum of infrastructure, allowing our users to rehearse deployments on cloud, quickly change components and iteratively tune the cluster.

We’re finding that these dedicated metal nodes have much better performance than similar VMs in AWS. Don’t believe us? You can use Digital Rebar to spin up both and compare.  Since Digital Rebar is an open source platform, you can explore and expand on it.

The Docker Swarm deployment is just a starting point for us. We want to hear your provisioning ideas and work to turn them into reality.