Thursday, August 19, 2010

Thinking about the Birthday Problem on my Birthday, as it applies to my Birthday Present

My birthday is upon me.

My birthday present was an iPhone 4. Yeah, I got it early, but it was nice to have for my just-finished vacation drive. I noticed that when I'd reshuffle the 1763 songs on there, I'd more often than not hit a collision with a song I swear I'd heard during the previous shuffling. Time for some math...

The Birthday Problem (or Birthday Paradox, not because it's a real paradox, but because it's counterintuitive) shows that it only takes 23 people to be in the same room before the chances that two of them share a birthday are equivalent to a coin flip. The link above shows how one derives this. Basically, as you keep adding people, the probability of there NOT being a shared birthday decreases. That probability hits near-enough to 50% at 23 people.

I figured if I would have listened/remembered 30 songs from a previous shuffle. That's 2-3 hours of music, not a lot when you're driving all day. So if I accidentally shook my iPhone and reshuffled the songs, how many would I need to hear until I had a coinflip's chance of hearing a repeat from the previous 30?

Basically, the probability of NOT hearing a previously-heard song was (1763-30) / 1763. If that wasn't a repeat, the probability of another non-collision would be (1762 - 30) / 1762. Note that unlike the birthday problem, I'm decrementing the denominator as well. This is because I'm not going to hear the same song twice in a random shuffle.

I wrote a C program (because I hack way too much ON code) to compute things. Here it is:

#include <stdio.h>

int
main(int argc, char *argv[])
{
double p;
int i, listened, total, tries;

if (argc != 4) {
fprintf(stderr,
"usage: ipod [listened-songs] [total-songs] [tries]\n");
return (1);
}

p = 1.0;
listened = atoi(argv[1]);
total = atoi(argv[2]);
tries = atoi(argv[3]);

for (i = 0; i < tries; i++)
p *= (double)(total - listened - i) / (double)(total - i);

printf("P(NO repeat for %d on the second playthough): %lf%%\n", tries,
p * 100.0);
printf("P(Repeat for %d on the second playthough): %lf%%\n", tries,
(1 - p) * 100.0);
return (0);
}


Turns out, I need to hear 40 songs to have a coinflip's chance of hearing one of the previous 30 songs I heard before reshuffling the 1763 total songs.

mactavish(~/sources)[0]% ./a.out 30 1763 40
P(NO repeat for 40 on the second playthough): 49.942794%
P(Repeat for 40 on the second playthough): 50.057206%
mactavish(~/sources)[0]%

The above program should work for any sized iPod/iPhone collection, or any sized song-memory/patience. I really hope I got the math/derivation right. Any probability wizards in the audience can feel free to school me in the comments section.