My colleagues and I are wrapping up version 2 of our unique counting algorithms. The last pieces are the database tools that our analysts use — some custom functions in Postgres9 to manipulate our custom types. One step in our algorithm is to use the least-significant bit of a bit array. In Java we use a lookup technique (a 256-entry lookup table that you use to look up each byte) and in C we can use the bsf assembly instruction. (Both techniques can been researched further on Sean Eron Anderson’s awesome Bit Twiddling Hacks.)
Since I’m a hacker to the core I wanted to do a little more research to see what other folks were doing. Here are some additional references:
- Find most significant bit (left-most) that is set in a bit array
- Position of least significant bit that is set
In the last reference, there is a response that mentions the fascinating technique of using the floating-point representation to extract the exponent:
unsigned GetLowestBitPos(unsigned value) { double d = value ^ (value - !!value); return (((int*)&d)[1]>>20)-1023; }
This response refers you to another post that explains a bit more the technique. In that response, Tony Lee mentions one of my favorite constants 0x5f3759df. I spent many years doing graphics and gaming programming (some of my first “real” programs in college where doing visual simulations on SGIs) and I remember the first time that I came across 0x5f3759df in the Quake source code. I’m always amazed at the bit techniques that people will come up with to solve a problem.
I’ll end this post with a link to one of my favorite articles on the internet: a history of the origins of 0x5f3759df in the Quake source. It’s a wonderful walk down memory lane.
Leave a Reply