WhatTheBus fun with Cucumber and MemCacheD

Sometimes a problem has to kick you upside the head so you can learn an important lesson.  Tonight’s head slapper was an interaction between Cucumber and MemCacheD.

If you are using CUCUMBER AND MEMCACHE read this post carefully so you don’t get burned.  If you’re using MemCache and not writing tests then return to Jail, do not collect $200.

It’s important to note that Cucumber has the handy side effect of running each scenario in a database transaction.  The effect is that data from one scenario does not leak into the next scenario.  (note: you can pre-load data into cucumber using fixtures).

However, Cucumber does not do any rollback for Cache keys added into MemCache.  In fact, your MemCache entries will happily persist between your development and test systems.

WhatTheBus has a simple check to reduce database writes – it only writes to the database if there is no cache hit for the bus.  My thinking is that we only need to add a new bus if there is no key as shown in this partial snippet:

  cache = Rails.cache.read params[:id]
  if cache.nil?
     bus = Bus.find_or_create_by_xref :name => params[:name], :xref => params[:id]
  end

This works great for live testing, but fails in technicolor for Cucumber because tests with the same ID will not make it to the find_or_create.

To solve the problem, I had to add a pre-condition (‘given’ in Cucumber speak) to each scenario to make sure the cache was cleared.  It looks like this in the scenario feature:

  Given no cache for "1234"

And that’s translated as code in the steps like so:

  Given /^no cache for "([^\"]*)"$/ do |id|
   Rails.cache.delete id
  end
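The failure mode can be reproduced without Rails at all. What follows is a minimal plain-Ruby sketch, not the WhatTheBus code: FakeCache, BUSES, update_location and run_scenario are hypothetical stand-ins for Rails.cache, the buses table, the controller logic and a Cucumber scenario. The "database" is wiped between scenarios (the rollback) while the cache is left alone.

```ruby
# Hypothetical stand-in for Rails.cache: it survives between scenarios.
class FakeCache
  def initialize
    @store = {}
  end

  def read(key)
    @store[key]
  end

  def write(key, value)
    @store[key] = value
  end

  def delete(key)
    @store.delete(key)
  end
end

CACHE = FakeCache.new
BUSES = {} # hypothetical stand-in for the buses table

# Mirrors the snippet above: only touch the "database" on a cache miss.
def update_location(id, name, position)
  BUSES[id] ||= name if CACHE.read(id).nil?
  CACHE.write(id, position)
end

# One "scenario": the database is rolled back, the cache is not.
def run_scenario(id, name, clear_cache: false)
  BUSES.clear                      # simulates the transaction rollback
  CACHE.delete(id) if clear_cache  # the 'Given no cache for ...' step
  update_location(id, name, '32,-97')
  BUSES.key?(id)                   # did the bus record get created?
end

puts run_scenario('1234', 'lion')                    # cache miss, bus created
puts run_scenario('1234', 'lion')                    # stale cache hit, no bus
puts run_scenario('1234', 'lion', clear_cache: true) # cache cleared, bus created
```

The second run is the Cucumber failure described above: the bus row is gone, but the cache key from the previous run hides that fact.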

Ready to Fail

Or How Monty Python taught me to program

Sometimes you learn the most from boring conference calls.  In this case, I was listening to a deployment that was so painfully reference-example super-redundant by-the-book that I could have completed the presenter’s sentences.  Except that he kept complaining about the cost.  It turns out that our typical failure-proofed belt-and-suspenders infrastructure is really, really expensive.

Shouldn’t our applications be Monty Python’s Black Knight yelling “It’s just a flesh wound!  Come back and fight!”   Instead, we’ve grown to tolerate princess applications that throw a tantrum over skim milk instead of organic soy in their mochaito.

Making an application failure-ready requires a mindset change.  It means taking off our architecture space suit and donning our welding helmet.

Fragility is often born from complexity and complexity is the compounded interest from system design assumptions.

Let’s consider a transactional SQL database.  I love relational databases.  Really, I do.  Just typing SELECT * FROM or LEFT OUTER JOIN gives me XKCD-like goose bumps.  Unfortunately, they are as fragile as Cinderella’s glass slippers.  The whole concept of relational databases requires a complex web of sophisticated data integrity we’ve been able to take for granted.  The web requires intricate locking mechanisms that make data replication tricky.  We could take it for granted because our operations people have built up super-complex triple-redundant infrastructure so that we did not have to consider what happens when the database can’t perform its magic.

What is the real cost for that magic?

I’m learning about CouchDB.  It’s not a relational database; it’s a distributed JSON document warehouse with smart indexing.  And compared to some of the fine-grained features of SQL, it’s an arc welder.   The data in CouchDB is loosely structured (JSON!) and relationships are ad hoc.  The system doesn’t care (let alone enforce) whether you’ve maintained referential integrity within the document; it just wants to make sure that the documents are stored, replicated, and indexed.   The goodness here is that CouchDB allows you to distribute your data broadly so that it can be local and redundant.  Even better, weak structure allows you to evolve your schema agilely (look for a future post on this topic).

If you’re cringing about the lack of referential integrity then get over it – every SQL backed application I ever wrote required RI double-checking anyway!

If you’re cringing about possible dirty reads or race conditions then get over it – every SQL backed application I ever wrote required collision protection too!

I’m not pitching CouchDB (or similar) as a SQL replacement.   I’m holding it up as an example of a pragmatic approach to failure-ready design.   I’m asking you to think about the hidden complexity and consequential fragility that you may blindly inherit.

So cut off my arms and legs – I can still spit on your shoes.

WhatTheDB? Adding MySQL into WhatTheBus

Today’s WhatTheBus update added data persistence to the application. Ultimately, I am planning to use CouchDB for persistence; however, I wanted to show a SQL to document migration as part of this process. My objective is to allow dual modes for this application.

In the latest updates, I continued to show Test Driven Development (TDD) process using Cucumber. Before starting work, I ran the test suite and found a bug – spectacular failure if MemCacheD is not running. So my first check-in adds recovery and logging around that event. Next I wrote a series of tests for database persistence. These tests included checking a web page that did not exist at this time. I ran the tests – as expected, all failed.
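The post does not show the recovery code, so the following is only a sketch of the general idea: treat any cache failure as a cache miss and log it. DownCache and safe_cache_read are hypothetical names, and IOError is an assumed stand-in for whatever error the real memcached client raises.

```ruby
require 'logger'

# Hypothetical cache client that behaves as if memcached is not running.
class DownCache
  def read(_key)
    raise IOError, 'memcached not reachable'
  end
end

# Treat any cache failure as a miss, and leave a trace in the log.
def safe_cache_read(cache, key, logger)
  cache.read(key)
rescue StandardError => e
  logger.warn("cache read failed for #{key}: #{e.message}")
  nil
end

logger = Logger.new($stderr)
result = safe_cache_read(DownCache.new, '1234', logger)
puts result.inspect # nil: the caller falls through to its cache-miss path
```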

The persistence was very simple: models for bus and district. These minimal models are created dynamically when a bus location is updated. The data contract is that the first location update should include the bus name and district in the URL. After the first update, only ID and location (lat, lng) are expected. In addition to the model and migrations, I also updated the database.yml to use MySQL.

Creating a web page for the bus (bus/index/[xref id]) required the addition of a little infrastructure for the application. Specifically, I had to add an application layout and style sheet. Just because I have a style sheet does not mean there is any style (I’ve got style, brother. I’ve got million dollar charm, sister. I’ve got headaches and toothaches and bad times too).

To preserve simplicity, I am not storing the location information in the database. Location is so time sensitive that I don’t want to create any storage burden and I’m using cache expiration to ensure that we don’t keep stale locations around.
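To make the expiration idea concrete, here is a small plain-Ruby sketch of a cache whose entries disappear after a time limit. ExpiringCache is a hypothetical stand-in, not the MemCacheD client; the 300 seconds mirror a five minute expiry, and the injectable clock exists only so the example does not have to wait.

```ruby
# Hypothetical in-memory cache with per-entry expiration.
class ExpiringCache
  # clock is any callable returning the current Time (injectable for tests).
  def initialize(clock = -> { Time.now })
    @clock = clock
    @store = {}
  end

  def write(key, value, expires_in:)
    @store[key] = [value, @clock.call + expires_in]
  end

  # Returns the value, or nil when the key is unknown or has gone stale.
  def read(key)
    value, expires_at = @store[key]
    return nil if expires_at.nil?

    if @clock.call >= expires_at
      @store.delete(key)
      return nil
    end
    value
  end
end

now = Time.at(0)
cache = ExpiringCache.new(-> { now })
cache.write('1234', '32,-97', expires_in: 300)
puts cache.read('1234').inspect # still fresh
now += 301
puts cache.read('1234').inspect # stale location is gone
```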

Up next…. I’m going to add a simulator (in rake) to make it easier to work on the application.

WhatTheBus, Day1: MemCacheD roundtrip

Today I got the very basic bus data collection working using Cucumber TDD.  That means that I wrote the basic test I wanted to prove BEFORE I wrote the code that operates the test.

The Cucumber feature test looks like this:

  Feature: Mobile Access
    In order to ensure that location updates are captured
    School Bus Location providers
    want to have data they send stored on the site

    Scenario: Update Location
      When bus named "lion" in the "eanes" district with a id of "1234" goes to "32,-97"
      When I go to the bus "1234" page
      Then json has an object called "buses"
      And json has a record "1234" in "buses" with "lat" value "32"
      And json has a record "1234" in "buses" with "lng" value "-97"

There is some code behind this feature that calls the web page and gets the JSON response back.  The code that actually does the work in the bus controller is even simpler.

The at routine, which takes location updates, just parses the parameters and stuffs them into our cache.  For now, we’ll ignore the name and district data.

  def at
    # Store "lat,lng,name,district" as one raw string; the entry expires after 5 minutes.
    Rails.cache.write params[:id], "#{params[:lat]},#{params[:lng]},#{params[:name]},#{params[:district]}", :raw => true, :unless_exist => false, :expires_in => 5.minutes
    render :nothing => true
  end

The code that returns the location (index) pulls the string out of the cache and returns the value as simple JSON.

  def index
    # Check for a cache miss before splitting: calling split on nil would raise.
    raw = Rails.cache.read(params[:id], :raw => true)
    if raw.nil?
      render :nothing => true
    else
      data = raw.split(',')
      render :json => {:buses => { params[:id].to_sym => { :lat => data[0], :lng => data[1] } } }
    end
  end

Not much to it!  It’s handy that Rails has memcache support baked right in!  I just had to add a line to the environment.rb file and start my memcached server.
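The string round trip above can be tried outside Rails. This is a plain-Ruby sketch under assumptions: pack_location and location_payload are hypothetical helper names that mirror what the at and index actions do, and the standard json library stands in for Rails’ JSON rendering.

```ruby
require 'json'

# Pack a location the way the at action does: one comma-separated string.
def pack_location(lat, lng, name = nil, district = nil)
  [lat, lng, name, district].join(',')
end

# Build the hash the index action renders; nil signals a cache miss.
def location_payload(id, raw)
  return nil if raw.nil?

  lat, lng = raw.split(',')
  { buses: { id.to_sym => { lat: lat, lng: lng } } }
end

raw = pack_location('32', '-97', 'lion', 'eanes')
puts JSON.generate(location_payload('1234', raw))
# prints {"buses":{"1234":{"lat":"32","lng":"-97"}}}
```

Note that the values come back as strings, which is why the feature test compares against "32" and "-97" rather than numbers.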

Cloud Reference App, “What The Bus” intro

Today I started working on an application to demonstrate “Cloud Scale” concepts.  I had planned to do this using the PetShop application; unfortunately, the 1995 era PetShop Rails migration would take more repair work than a complete rewrite (HTML tables, no CSS, bad forms, no migrations, poor session architecture).

If I’m considering a fresh start, I’d rather do it with one of my non-PetShop pet projects called “WhatTheBus.”  The concept combines inbound live data feeds and geo mapping with a hyper-scale use target.  The use case is to allow parents to see when their kids’ bus is running late using the phone from the bus stop.

I’m putting the code in git://github.com/ravolt/WhatTheBus.git and tracking my updates on this blog.

My first sprint is to build the shell for this application.  That includes:

  • the shell RAILS application
  • Cucumber for testing
  • MemCacheD
  • Simple test that sets the location of a bus (using a GET, sorry) in the cache and checks that it can retrieve that update.

This sprint does not include a map or any database.  I’ll post more as we build out this app.

Note: http://WhatTheBus.com is a working name for this project because it appeals to my warped sense of humor.  It will likely appear under the sanitary ShowBus moniker: http://showb.us.

Recovering Outlook Autocomplete, just in the NK2 of time

I managed to head off a reverse upgrade today by renaming a file.  A couple of days ago, Microsoft Outlook 2007 ate my wife’s email configuration.  The error message was “Unable to open the Outlook Window” and Microsoft’s support article led me to remove the registry key that linked her email configuration to her email data!

I bet you still can hear echoes of her molars grinding.

I had to start from a blank configuration.  Luckily, I was able to find and put back her email and archive PST files.  Birds were singing and dust motes scintillated in the morning sun.

Unfortunately, my work was not done.  The file (user.nk2) that Outlook uses to cache email addresses for its autocomplete feature had been reset to an empty file!  Major travesty: as for most Outlook users, this file was my wife’s primary contact list.   She would rather abandon her frilly “artist” laptop for her ancient desktop just to recover use of that file!

Luckily, Outlook did not stomp on the earlier version of the NK2 file.  I closed Outlook, found the two NK2 files in C:\Users\Wife\AppData\Local\Microsoft\Outlook, and swapped the old file for the new file.  Of course, I also made a BACKUP of the beloved NK2 file.  When I re-opened Outlook, the cache recovered.

Moral: go back up your NK2 file right away.  Apparently, Outlook 2007 is known to corrupt its configuration regularly!

Useful link about autocomplete on TimeAtlas.

Making Cloud Applications RAIN, part 1

An application that runs “in the cloud” is designed fundamentally differently than a traditional enterprise application.  Cloud apps live on fundamentally unreliable, oversubscribed infrastructure; consequently, we must adopt the same mindset that drove the first RAID storage systems to create a Redundant Array of Inexpensive Nodes (RAIN).

The drivers for RAIN are the same as RAID.  It’s more cost effective and much more scalable to put together a set of inexpensive units redundantly than build a single large super-reliable unit.  Each node in the array handles a fraction of the overall workload so application design must partition the workloads into atomic units.
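As a rough illustration of partitioning work across inexpensive nodes, the sketch below hashes a work key to pick a node and skips nodes that are known to be down. The node names and the node_for helper are hypothetical. This is plain modulo hashing, so most keys move whenever the set of live nodes changes; consistent hashing would be the usual refinement.

```ruby
require 'zlib'

# Pick a node for a unit of work by hashing its key; nodes listed in down are skipped.
def node_for(key, nodes, down = [])
  live = nodes - down
  raise ArgumentError, 'no live nodes' if live.empty?

  live[Zlib.crc32(key) % live.size]
end

nodes = %w[node-a node-b node-c]
primary = node_for('bus-1234', nodes)
fallback = node_for('bus-1234', nodes, [primary])
puts "primary=#{primary} fallback=#{fallback}"
```

Because each key is an atomic unit of work, losing one node only means its keys are reassigned to the survivors; nothing else in the array has to stop.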

I’ve attempted to generally map RAIN into RAID style levels.  Not a perfect fit, but helpful.

  • RAIN 0 – no redundancy.  If one part fails then the whole application dies.  Think of a web server handing off to a backend system that fronts for the database.  You may succeed in subdividing the workload to improve throughput, but a failure in any component breaks the system.
  • RAIN 1 – active-passive clustering.   If one part fails then a second steps in to take over the workload.  Simple redundancy yet expensive because half your resources are idle.
  • RAIN 2 – active-active clustering.  Both parts of the application perform work so resource utilization is better, but now you’ve got a data synchronization problem.
  • RAIN 5 – multiple nodes can process the load. 
  • RAIN 6 – multiple nodes with specific dedicated stand-by capacity.  Sometimes called “N+1” deployment, this approach works well with failure-ready designs.
  • RAIN 5-1 or 5-2 – multiple front end nodes (“farm”) backed by a redundant database.
  • RAIN 5-5 – multiple front end nodes with a distributed database tier.
  • RAIN 50 – mixed use nodes where data is stored local to the front end nodes.
  • RAIN 551 or 552 – geographical distribution of an application so that nodes are running in multiple data centers with data synchronization
  • RAIN 555 – nirvana (no, I’m not going to suggest a 666).

Unlike RAID, there’s an extra hardware dimension to RAIN.  All our careful redundancy goes out the window if the nodes are packed onto the same server and/or network path.  We’ll save that for another post. 

I hope you’ll agree that Clouds create RAINy apps.

Petshop, Updated Day 1

As part of a Cloud computing project, I’ve taken on updating the Rails port of the JPetShop project to Rails 2.0 and have the project on SourceForge.  This port dates back to 2005 so many of the latest conventions (e.g. CSS) were not in vogue.

My ultimate objective is to show scale out techniques on a very simple base app.  Before we can get there, I’ve got some clean-up work to do.  I’d also like to add a test framework (Cucumber?).  I’ll document the progress through this exercise here.

My first check-in provided the base level of function.  Currently, none of the forms are working but the catalog is visible.

Today’s update was to fix the login page:

  • Change the view to use the form_tag helper.  This let us put protect_from_forgery into the code base again!
  • Remove the extra login form (not sure why that was there)
  • Clean-up all the references to use symbols (:field) instead of strings (‘field’)
  • Change the controller to handle both the initial request (GET) and form processing (POST)
  • Update the layout and other pages to direct users to the correct login page

I’ve been resisting:

  • removing the tables in favor of definition lists (DL)
  • adding CSS
  • changing the session to store an ID instead of the full account object

Won to End (1..N) Ranking

Or I can buy a new toothbrush in Newark

Our team had a breakthrough moment last week: we changed our MRD* into a prioritized (1 to N, no ties) list and immediately identified some artificial work grouping that would have jeopardized the whole release.  Within 5 minutes, we’d done more to rescue our release than in the 4 weeks prior.

Working on a traditional MRD is like packing to go on a vacation.  You’ve made great plans, you’ve read all the brochures, and you can visualize yourself relaxing in a beach chair on sparkling white sand, next to your partner under gently swaying palms.

Paradise sounds great, but the cold fact of today is that you’re late for your flight and you have to pack everything into a single suit case. 

Remember the McCallister household in Home Alone?   They were so busy packing and rushing out that they left behind sweet defenseless Macaulay Culkin.  Treating an MRD like a suitcase makes you so focused on the little details and the deadlines that you and your team will likely miss the big picture.  And once you’ve started your team may not react to market changes because they’ve already packed their mental bags:  “honey, we can’t have dinner with the President tonight because you promised that we’d play tennis.”

Another frustration I have with MRDs is that they are classically adversarial.  “Why did you pack that?  If you’re bringing that yoga mat then I’ll need a polo helmet!”  The fault is not in the people; the very nature of the document creates antagonism because the document is considered a monolithic whole.

It does not have to be like this.

Creating product direction should be more like planning a road trip than preparing for a transoceanic flight.  We know where we want to go, the essential things we need and who’s in the car: that’s enough to get started.  We’ll discuss the roadmaps as we go so we can explore the interesting sites (“geysers, cool!”) and toss out the bone headed detritus (“infrared beachcombing metal detector?”) along the way.

It takes trust and discipline to free fall into this type of planning.

When my team ranked our traditional MRD items into a 1 to N list, the conversation immediately became more focused, birds began to sing, and we each lost 5 pounds.  Several things happened when we changed the list. 

First, we looked at the deliverable as a system, not just a collection of features.   “Yeah, mayo is not much good without the bread, keep those together”

Then we started talking about what the customer needed (not what we had to deliver).  “Rob, we know you like sweet hot banana peppers but those are really optional compared to slicing the bread.”

Finally, we found it much easier to compare items against each other (“yeah, the ham feature is more important than tofu, move that higher” instead of “we can’t ship this product without both ham and tofu!”)

Once we had the list ranked, it was obvious to everyone which features were required for the release.   Our discussion focused on the top priorities and engineering was able to focus on the most important items first.

Spoonful of Zeitgeist

Do you want to have a winning team?  Bring a spoonful of zeitgeist to your next meeting!

For me, zeitgeist is about how group dynamics influence how we feel about technology and make decisions.  It’s like meme, but I like zeitgeist more because of its lower vowel ratio (Hirschfeld = 2:10).

Yesterday, my release team meeting had negative zeitgeist.  Locally everyone was checking email while the remote speaker flipped through dense PowerPoint slides.  It was like watching my divorced aunt’s family vacation slides via Webex.  We needed a spoonful of zeitgeist!  That’s how I found myself explaining some of our challenges with the phrase “turd in the punchbowl” and getting people to pay more attention to the real work.  A small positive spark and faked enthusiasm changed the momentum.  Yeah, it was fake at first and then became real zeitgeist when the other attendees picked up on the positive vibe.

The idea of seeding zeitgeist is critical for everyone on teams.  It’s like the William James expression, “feeling follows action”: if you act happy then you’ll shake off the blues and start to feel happy.  Yes, this is 100% real.  The same applies for groups.  We can choose to ride or steer the zeitgeist.

There’s no reason to endure low energy meetings when you can get out your spoon and stir things up.