For the bit hacker in you

My colleagues and I are wrapping up version 2 of our unique counting algorithms. The last pieces are the database tools that our analysts use — some custom functions in Postgres9 to manipulate our custom types. One step in our algorithm is to use the least-significant bit of a bit array. In Java we use a lookup technique (a 256-entry lookup table that you use to look up each byte) and in C we can use the bsf assembly instruction. (Both techniques can been researched further on Sean Eron Anderson’s awesome Bit Twiddling Hacks.)

Since I’m a hacker to the core I wanted to do a little more research to see what other folks were doing. Here are some additional references:

In the last reference, there is a response that mentions the fascinating technique of using the floating-point representation to extract the exponent:

unsigned GetLowestBitPos(unsigned value)
   double d = value ^ (value - !!value); 
   return (((int*)&d)[1]>>20)-1023; 

This response refers you to another post that explains a bit more the technique. In that response, Tony Lee mentions one of my favorite constants 0x5f3759df. I spent many years doing graphics and gaming programming (some of my first “real” programs in college where doing visual simulations on SGIs) and I remember the first time that I came across 0x5f3759df in the Quake source code. I’m always amazed at the bit techniques that people will come up with to solve a problem.

I’ll end this post with a link to one of my favorite articles on the internet: a history of the origins of 0x5f3759df in the Quake source. It’s a wonderful walk down memory lane.