Neustar at re:Invent 2014

On Thursday, Nov. 13th I was given the opportunity to speak at AWS re:Invent 2014 about the ad analytics products and platforms we’ve built on Redshift. My slides are available here along with what I said in the speaker’s notes.

I covered how to implement four classical ad tech queries:

  • Frequency – how many ads should I show each user?
  • Attribution – how much should I pay for each ad?
  • Overlap – where can I find these people for cheaper?
  • Ad-hoc – how can I empower my users to run their own custom queries?
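
To give a flavor of the first of these, here is a minimal sketch of what a frequency query can look like in Redshift SQL. The impressions table and its columns (user_id, campaign_id, event_time) are illustrative assumptions, not our actual schema:

    -- Hypothetical schema: impressions(user_id, campaign_id, event_time)
    -- Frequency: for each campaign, how many users saw exactly N ads?
    SELECT campaign_id,
           impression_count,
           COUNT(*) AS num_users
    FROM (
        SELECT campaign_id,
               user_id,
               COUNT(*) AS impression_count
        FROM impressions
        GROUP BY campaign_id, user_id
    ) per_user
    GROUP BY campaign_id, impression_count
    ORDER BY campaign_id, impression_count;

Attribution and overlap follow the same basic shape: per-user rollups joined against conversion or audience data.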

I also discussed how to take maximum advantage of Redshift’s simple orchestration and provisioning to scale to as many workloads and platforms as your heart desires, and to save money in the process!

It was a pleasure, as usual, to present to a packed room and to discuss all the great things the team has been up to.

Bring it on!

Recently I did a bit of show and tell on some of the volumes that we’re dealing with on our new infrastructure. It’s less than a month later and we’re handling over 3.5 billion events per day.

(Chart: events per second, Nov 15, 2011)

(The three spikes that you see are there because we’re still trying to find the right balance between event archive compression speed and size. We should have this figured out over the next few weeks. Hopefully I can convince our Ops folks to do a post on what they’ve learned.)

Never trust a profiler

A week or so ago I mentioned to Timon that, for the first time, a profiler had actually pointed me in a direction that directly led to a performance improvement. Initially Timon gave me that “you’re just a crotchety old man” look (which, in most cases, is the correct response). I pointed him to Josh Bloch’s Performance Anxiety presentation, which dives into why it is so hard (in fact “impossible” in Josh’s words) to benchmark modern applications. It also references the interesting paper “Evaluating the Accuracy of Java Profilers”.

Just last week I was trying to track down a severe performance degradation in my snapshot recovery code. I was under some time pressure, so I turned to my profiler to point me in the right direction. The result it gave was clear, repeatable and unambiguous: it pointed to the linear-probing algorithm of the hash table that I am using. Since I had recently moved to a new hash table (one that allowed for rehashing and resizing), it was possible that this was in fact the root of my performance problem, but I had my doubts. (More on this in a future post.) I swapped out my hash table and re-profiled. The profiler again gave me a clear, repeatable and unambiguous result: my performance woes were solved, so I moved on. When we were able to test the snapshot recovery code on production snapshots, we found that the performance problems still existed.

My profiler lied to me. Never trust your profiler.