Podcast – Mathew Lodge on Data Science as a Service in 20 Minutes from Scratch

Joining us this week is Mathew Lodge, SVP of Products & Marketing at Anaconda.

About Anaconda

Anaconda Distribution

With over 6 million users, the open source Anaconda Distribution is the fastest and easiest way to do Python and R data science and machine learning on Linux, Windows, and Mac OS X. It’s the industry standard for developing, testing, and training on a single machine.
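
For a sense of what that single-machine workflow looks like, here is a minimal, hedged sketch using packages the distribution ships (pandas and scikit-learn); the environment name, packages listed, and toy dataset are illustrative only, not a prescribed setup:

```python
# Illustrative environment, e.g.: conda create -n ds python numpy pandas scikit-learn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a toy dataset as a pandas DataFrame for exploration ("data wrangling and understanding").
iris = load_iris(as_frame=True)
df = iris.frame
print(df.describe())

# Develop, test, and train on one machine: split the data and fit a simple model.
X_train, X_test, y_train, y_test = train_test_split(
    df[iris.feature_names], df["target"], test_size=0.25, random_state=0
)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"held-out accuracy: {model.score(X_test, y_test):.2f}")
```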

Anaconda Enterprise 

Anaconda Enterprise is an AI/ML enablement platform that empowers organizations to develop, govern, and automate AI/ML and data science from laptop through training to production. It lets organizations scale from individual data scientists to collaborative teams of thousands, and to go from a single server to thousands of nodes for model training and deployment.

Highlights

  • 2 min 57 sec: What does Anaconda do?
    • Helping data scientists be productive and enabling enterprise AI / data science
  • 3 min 36 sec: How do you interact with Anaconda?
    • About 2.5 million downloads a month of Anaconda Distribution
    • Installing binary data science packages for Python
  • 5 min 55 sec: Who are data scientists?
    • Data wrangling and understanding
  • 9 min 12 sec: Data Science as a verb
    • Understand how to turn data into actionable insight
  • 10 min 47 sec: How do you learn to use the tools? Community!
    • Community around Anaconda’s open source projects for sharing packages, etc.
  • 13 min 26 sec: How does Anaconda change as AI/Machine Learning improves?
    • Python is the standard language for data science, with R close behind
  • 14 min 58 sec: Reproducibility in results
    • 16 min 01 sec: Model training issue?
  • 17 min 16 sec: Parking lot on Sam Charrington’s AI Bias Podcasts
  • 17 min 43 sec: Training models on limited data sets for reliability at the edge
    • Answered via the Google ImageNet example
    • 20 min 14 sec: Optimizations to reduce processing requirements
      • “Hey Siri” example of how the iPhone works
    • 22 min 03 sec: Do models improve over time? Transfer learning
  • 22 min 30 sec: Accelerative Learning in AI
    • Fashion example of layering learning
    • Issues around lack of data for training
  • 26 min 01 sec: Portability of models via Anaconda
  • 26 min 48 sec: Cloud Native Model of AI (no longer 2004)
    • Moved on from Java and distributed computing to Kubernetes
    • 29 min 05 sec: Giving up data locality (Hadoop) & specialized hardware?
    • 32 min 42 sec: Cloud model gives private and public options
  • 34 min 23 sec: How does Anaconda play into the Cloud Native data science model?
    • Data scientists are interested in data problems, not cloud architecture
    • Data Science as a Service
    • Kubernetes & Docker installed for you by Anaconda
  • 38 min 05 sec: WRAP UP
    • AnacondaCON videos

Podcast Guest: Mathew Lodge, SVP of Products & Marketing at Anaconda

Mathew has well over 20 years’ diverse experience in cloud computing and product leadership. Prior to joining Anaconda, he served as Chief Operating Officer at Weaveworks, the container and microservices networking and management start-up; and previously as Vice President in VMware’s Cloud Services group. At VMware he was co-founder of what became its vCloud Air IaaS service.

Early in his career, Mathew built compilers and distributed systems for projects like the International Space Station, helped connect six countries to the Internet for the first time, and managed a $630m router product line at Cisco. At start-up CPlane he attempted to do SDN 10 years too early. Prior to VMware, Mathew was Senior Director at Symantec in its $1Bn+ information management group.

Podcast – Blockchain Technology Partners on their Startup and Key Issues of Blockchain

Joining us this week is the team from Blockchain Technology Partners: Duncan Johnston-Watt, Kevin O’Donnell, and Mike Zaccardo, live from GlueCon 2018 in Colorado.

Blockchain Technology Partners is an Edinburgh-based technology startup:

  • Mission – to radically simplify the enterprise adoption of blockchain technologies
  • Goal – to reduce the cost and complexity of doing business through decentralization while ensuring trust, transparency and accountability in a distributed world
  • Focus – providing a production-ready blockchain platform and partnering with businesses to deliver blockchain-based solutions

Highlights

  • Who is Blockchain Technology Partners and Company Objectives
  • What is Blockchain: Distributed Transaction Log and Consensus
  • Decentralization of Ledgers and Centralization Weakness (e.g., Bitcoin)
  • Use Cases for Blockchain
  • Publication Components of Blockchain; its Middleware
  • Trusted Authorities and Broker Replacement (e.g., Shipping)
  • Edge Computing and Blockchain Examples
  • Data Responsibility and Local Blockchains
  • Blockchain Technology Partners Open Source Model and Technology

Topics by Time (Minutes.Seconds)

  • 0.00 – 0.47: Introduction
  • 0.47 – 1.52: Blockchain Technology Partners – What are you doing?
  • 1.52 – 5.17: What is Blockchain? (Proof of Elapsed Time)
  • 5.17 – 10.13: Common Thread of Decentralization
  • 10.13 – 11.55: Block of Use Cases
  • 11.55 – 14.05: Publication Component of Blockchain
  • 14.05 – 17.52: Trusted Authorities / Broker / Supply Chain
  • 17.52 – 19.01: Medical Waste Handling Example
  • 19.01 – 20.11: Edge Computing and Blockchain
  • 20.11 – 25.49: Multi-Vendor IoT Devices and Multiple Clouds
  • 25.49 – 27.51: Data at Edge Accumulation (Tokenizing Data)
  • 27.51 – 31.08: Description of Open Source Model for BTP
  • 31.08 – END: Wrap Up

Podcast Guests

  • Duncan Johnston-Watt (Chief Executive Officer)

Duncan is a serial entrepreneur and pioneer with thirty years’ experience in the software industry.

He founded Cloudsoft Corporation, creator of the Apache Brooklyn open source project, serving as its CEO for nine years. Previously he was founder and CTO of Enigmatec Corporation, which was sold to iWave and subsequently acquired by EMC.

Prior to this he held a number of fintech roles. At Instinet he led the delivery of their Fixed Income brokerage platform. In recognition of this work pioneering the use of Java enterprise technologies in Financial Services, he was nominated for a Computerworld Smithsonian Award.

  • Kevin O’Donnell (Chief Strategy Officer)

Kevin is co-founder of the DevOps platform Chill Code. He served as VP Infrastructure & Operations at JP Morgan from September 2012 until July 2016. Prior to this he served as VP Operations, Video Analytics at Nielsen following its acquisition of Glance Guide in March 2010, where he had been VP Operations from December 2007.

Kevin has worked with Duncan at both Instinet Corporation, where he was Director, Global Fixed Income Engineering, and Enigmatec Corporation Limited, where he served as VP, Product Management for four years.

Kevin has over 20 years’ experience delivering large-scale production systems, much of it gained on Wall Street, focusing on automation and operations. He holds a BA in Philosophy from Vanderbilt University.

  • Mike Zaccardo (Lead Blockchain Engineer)

Mike is a computer scientist, comedian, traveler, and dancer.

As a senior software engineer at Cloudsoft, he worked with Duncan deploying and managing bleeding edge open source distributed systems including Hyperledger Fabric.

Mike also enjoys performing improv comedy and traveling the world, making dance videos along the way.

He holds a B.S. in Computer Science from Johns Hopkins University.

Christine Yen on 2nd Wave of DevOps, Monitoring Containers, and Listening to Users at a Startup

Joining us this week is Christine Yen, Co-founder at Honeycomb, in a conversation recorded at SREcon Americas in March 2018 at the Santa Clara Convention Center Hyatt.

Highlights

  • Understanding of what developer tools are today
  • Observability vs Monitoring
  • Instrumenting Apps for Diagnostics to help Developers do More
  • A tool for building not just better engineers but better teams to support customers
  • Brief history of Honeycomb and where it came from (Parse and Facebook)
  • How do you debug containers that are most likely gone by the time a problem arises?
  • AI / Machine Learning – can it really help today?
  • 2nd Wave of DevOps
  • Impact of listening to users at a startup – people problems vs technology

Topics by Time (Minutes.Seconds)

  • 0.00 – 2.05: Introduction
  • 2.05 – 3.01: Integration of Honeycomb and Digital Rebar Provision (Plugin Info)
  • 3.01 – 5.15: Developer Tools – what is that category? (Not doing harm)
  • 5.15 – 7.45: Observability vs Monitoring (Doctor analogy)
  • 7.45 – 10.19: Instrumenting Applications for Diagnostics
  • 10.19 – 14.45: My View vs Team View (Build better engineers & teams)
  • 14.45 – 18.38: Why we built Honeycomb
  • 18.38 – 19.25: Centralized Logging in Distributed Containers
  • 19.25 – 21.35: Can AI / Machine Learning assist in Finding Issues? (7 Different Ways by Barry Schwartz)
  • 21.35 – 26.35: Team Specialties – 2nd Wave of DevOps (Teach Devs to Own Code)
  • 26.35 – 35.35: Listening to Users as a Startup (UI Issues)
  • 35.35 – 38.30: Who is Charity Majors? (Co-founder, Honeycomb)
  • 38.30 – END: Wrap Up

Podcast Guest:  Christine Yen, Co-founder at Honeycomb

Christine Yen is a cofounder of Honeycomb, a startup with a new approach to observability and debugging systems with data. Christine has built systems and products at companies large and small and likes to have her fingers in as many pies as possible. Previously, she built Parse’s analytics product (and leveraged Facebook’s data systems to expand it) and wrote software at a few now-defunct startups.


2017 Gartner IO & DC Wrap Up

Like other Gartner events, the Infrastructure and Operations (IO) show is all about enterprises maintaining systems.  There are plenty of hype-chasing sessions, but the vibe is distinctly around working systems and practical implementations.  Think: sports coats, not t-shirts.  In that way, it’s less breathless and wild-eyed than something like KubeCon (which is busy celebrating a bumper crop of 1.0 projects).  The very essence of this show is to project an aura of calm IT stewardship.

So what keeps these seasoned IT pros awake?  Lack of cross-vendor integration.

Terry Cosgrove of Gartner said this very clearly: “most components were not designed to work together.” This was not just a comment about the industry; it applies within vendor suites as well.  In today’s acquisitive and agile market, there’s no expectation that even products from a single vendor will integrate smoothly.  Why is integration so hard?  We’re innovating so quickly that legacy APIs and new architectures don’t align well.  For enterprises that cannot simply jump to the new-new thing, integrations drive considerable value.

Cosgrove went on to add that enterprises need to OWN the integrations – they can’t delegate that to vendors.

That advice resonated with me.  We’re clearly in a best-of-breed IT environment where hybrid and portability concerns dominate discussions.  This is not about vendor lock-in but about innovation.  That leads us back to the need for better integrations between products, platforms and projects.  Customers need to start rejecting products without great, documented APIs; otherwise, there is no motivation for vendors to focus on integration over adding features.

Sadly, it was left to the audience to infer the “use dollars to force vendors to integrate” message.

There were many other topics of interest at the show.  Here’s a very short synopsis of my favorites:

  • Edge is coming and will be a big deal.  We’re still having to explain what it is.  Check back next summit (or listen to our great podcasts to get ahead of the curve).
  • AI Ops is not really AI, it’s just smarter logging.  We’ll get there eventually, but it will take some time.
  • DevOps is still a thing and it’s still hard because of the culture change required.  We’re slowly getting to a point where “DevOps = Automated Processes” and that’s OK.  If you agree with that, then you’ve missed the point of systems thinking and lean.  We’re done trying to explain it to you for now.
  • No start-ups.  Sadly, disruptive innovation is antithetical to this show, and that may be OK.  The audience counts on the analysts to filter this for them instead of getting it raw.

In all these cases, it’s listener beware.  There’s more behind the curtain than you are allowed to see.

Exploring the Edge Series: “Edge is NOT just Mini-Cloud”

While the RackN team and I have been heads down radically simplifying physical data center automation, I’ve still been tracking some key cloud infrastructure areas.  One of the more interesting ones to me is Edge Infrastructure.

This once obscure topic has come front and center because of the coming computing stress from home video, retail machines, and distributed IoT.  It’s clear that these workloads cannot be served from centralized data centers.

While I’m posting primarily on the RackN.com blog, I like to take time to bring critical items back to my personal blog as a collection.  WARNING: some of these statements run counter to other industry positions.  Please let me know what you think!

Don’t want to read?  Here’s a summary podcast.

Post 1: OpenStack On Edge? 4 Ways Edge Is Distinct From Cloud

By far the largest issue of the Edge discussion was actually agreeing about what “edge” meant.  It seemed as if every session had a 50% mandatory overhead of defining terms.  Putting my usual operations spin on the problem, I chose to define edge infrastructure in data center management terms.  Edge infrastructure has very distinct challenges compared to hyperscale data centers.  Read the article for the list...

Post 2: Edge Infrastructure Is Not Just Thousands Of Mini Clouds

Running each site as a mini-cloud is clearly not the right answer.  There are multiple challenges here.  First, any at-scale infrastructure problem must be solved at the physical layer before anything else.  Second, we must have tooling that brings repeatable, automated processes to that layer.  It’s not sufficient to have deep control of a single site: we must be able to reliably distribute automation over thousands of sites with limited operational support and bandwidth.  These requirements are outside the scope of cloud-focused tools.

Post 3: Go CI/CD And Immutable Infrastructure For Edge Computing Management

If “cloudification” is not the solution, then where should we look for management patterns?  We believe that software development CI/CD and immutable infrastructure patterns are well suited to edge infrastructure use cases.  We discussed this at a session at the OpenStack OpenDev Edge summit.
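
To illustrate the pattern in the abstract (a hedged sketch only; every function and site name below is hypothetical and none of it reflects RackN's or OpenStack's actual tooling), an immutable, CI/CD-driven edge rollout builds one versioned artifact centrally and replaces what each site runs rather than patching sites in place:

```python
# Hypothetical sketch of CI/CD + immutable infrastructure applied to edge sites.
# These names are illustrative only; they do not correspond to any real API.

def build_site_image(release_tag: str) -> str:
    """Central CI/CD step: produce one versioned, immutable image per release."""
    return f"edge-node:{release_tag}"

def deploy_image(site: str, image: str) -> None:
    """Replace what the site runs wholesale; never patch a live site in place."""
    print(f"{site}: now running {image}")

def rollout(release_tag: str, sites: list[str]) -> None:
    image = build_site_image(release_tag)   # built once, centrally
    for site in sites:
        deploy_image(site, image)           # the identical artifact goes to every site
        # Rollback is simply another deploy of the previous image tag.

rollout("2018.06.1", ["store-0001", "store-0002", "cell-tower-17"])
```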

What do YOU think?  This is an evolving topic and it’s time to engage in a healthy discussion.