wu :: forums (http://www.ocf.berkeley.edu/~wwu/cgi-bin/yabb/YaBB.cgi)
riddles >> medium >> balls and bins probability... no hard
(Message started by: puzzlecracker on Oct 5th, 2005, 9:48pm)

Title: balls and bins probability... no hard
Post by puzzlecracker on Oct 5th, 2005, 9:48pm
n balls are toosed into n bins. Each toss is independent and the ballis likely to end up in any bin. what is the expected number of empty bins?
what is the expected number of bins with one ball?

Title: Re: balls and bins probability... no hard
Post by ChunkTug on Oct 6th, 2005, 5:29am
For the expected number of bins with a single ball I get (for k balls):
[hideb][(2k-3)*nCr(2k-4,k-1) + 2*nCr(2k-3,k-1)] / nCr(2k-1,k)[/hideb]
...and for empty bins I get:
[hideb][(2k-2)*nCr(2k-3,k) + 2*nCr(2k-2,k)] / nCr(2k-1,k)[/hideb]
Where nCr(n,r) = n!/[r!*(n-r)!]

Title: Re: balls and bins probability... no hard
Post by puzzlecracker on Oct 6th, 2005, 9:31pm
care to explain?

Title: Re: balls and bins probability... no hard
Post by Barukh on Oct 6th, 2005, 11:27pm
For the empty bins case, ChunkTag's formula boils down (if I didn't make a mistake) to k(k-1)/(2k-1). That means for n sufficiently large approximatialy 50% of bins are expected to be empty.

Hmm... I think this ratio should be [hide]1/e[/hide].
:-/

Title: Re: balls and bins probability... no hard
Post by ChunkTug on Oct 7th, 2005, 2:48am
Now that I'm not so sleepy I can explain. This is also why I got an A- in my Discrete Math class; once I'd solved a problem I was happy and would turn in the homework only if convenient  :P

Summary: Count the number of ways to throw N balls into N bins, call this K. Assuming each configuration is equally probable, we have a probability of 1/K for each. Now count the number of empty bins (or single ball bins) in each case and multiply this value by the cases probability. Sum the resulting values, and you get the expected value. (Since each configuration is equally likely, this is just the average number of empty/single bins.)

Now for the actual counting...

First count the total number of ways to throw N balls into N bins. This can be done with the "stars and bars" approach. Let the balls be represented by 0's (stars) and the division between bins be represented by 1's (bars). Then, what we need to count is the number of bit strings of length 2N-1 that contain N 0's and N-1 1's. There are 2n-1 positions in the string, choose N positions to place the 0's. Which gives us...

"2N-1 choose N" = nCr(2N-1,N)

...which is the denominator(s) in the previously posted formulas.

Now we'll start counting the occurrences of empty bins. Recall, each outcome can be represented by a bit string of length 2N-1 with N 0's and N-1 1's.

Occurrences of an empty bin show up as the following patterns in our bit string representation:
1...      (first bin is empty = no leading 0's)
...1      (last bin is empty = no trailing 0's)
...11... (internal bin is empty = two 1's set with no 0 between them)

N.B. These patterns are not mutually exclusive. However this does not lead to overcounting as a single bit string may represent a outcome with multiple empty bins.

Start with the first bin empty case (1...). We have to determine number of possible remainders of the bit string. This a bit string of length 2N-2 with N 0's and N-2 1's. Which gives us....

nCr(2N-2,N)

...remaining possiblities. By symmetry, this is the number of occurrences of the last bin being empty.

Now for the internal empty bins. Considering bit strings of length 2N-1, there are 2N-2 positons where the substring "11" can appear. For each position, we need to compute the possible remaining bit strings which are of length 2N-3 and have N 0's and N-3 1's. This gives us a total count for the occurrences of empty internal bins as...

(2N-2)*nCr(2N-3,N)

So the total occurrences for empty bins:
internal + first + last
or...
(2N-2)*nCr(2N-3,N) + 2*nCr(2N-2,N)

A similar argument can be made for bins containing a single ball. Except now our patterns are...
01...      (first bin)
...10      (last bin)
...101... (internal bin)

Which gives a total occurrence of single ball bins:
(2N-3)*nCr(2N-4,N-1) + 2*nCr(2N-3,N-1)


To get our expected values, we just divide by the number of ways to throw N balls into N bins. This gives the two formulas previously posted.

Title: Re: balls and bins probability... no hard
Post by Barukh on Oct 7th, 2005, 4:10am
That's interesting! It seems that ChunkTug considers a distribution different from the classical occupancy model.

Let me ask you this: assume n = 2. Are the experiments:

a) 1st  ball to 1st bin; 2nd ball to 2nd bin, and
b) 1st  ball to 2nd bin; 2nd ball to 1st bin

distinguishable?  ;)

Title: Re: balls and bins probability... no hard
Post by ChunkTug on Oct 7th, 2005, 4:28am
In the end the order the balls were thrown doesn't matter. Imagine you threw all N balls in the air simultaeneously. Should the expected number empty bins resulting change? We can label the balls any way we want, but when we count them we really don't care what the label is.

Title: Re: balls and bins probability... no hard
Post by Sjoerd Job Postmus on Oct 7th, 2005, 4:32am

on 10/07/05 at 04:28:45, ChunkTug wrote:
In the end the order the balls were thrown doesn't matter. Imagine you threw all N balls in the air simultaeneously. Should the expected number empty bins resulting change?

Yes, it should change. At least, if the balls are long enough to fit 100000000 balls, but not wide enough to fit 2... (cilindrical with exact area of ball on front)... you'd have two balls trying to enter the same one. That'd work if you trew them one by one, but goes wrong if you trow at the same time. Well, anyway, some bounce back.

Title: Re: balls and bins probability... no hard
Post by Barukh on Oct 7th, 2005, 5:04am
ChunkTag, I am not saying you are wrong; I am just saying that you have chosen a distribution model different from the classical one.

When dealing with discrete probabilities, it’s very important to define what constitutes an elementary event. It is beneficial to choose them so that every elementary event has an equal probability.

Back to our problem: you seem to count as elementary event a distinguishable arrangement of balls into bins.  Every such arrangement has an equal probability to appear. This model is used in physics of elementary particles (it is called Bose-Einstein Statistics (http://en.wikipedia.org/wiki/Bose-Einstein_statistics)).

In the classical occupancy problem, however, balls are distinguishable, not their arrangements. This corresponds to a different elementary event. The two experiments described by me are different and of course equally probable. In physics, this corresponds to the Maxwell-Boltzmann Statistics.

It is interesting, what model puzzlecracker had in mind when asking the question?

Title: Re: balls and bins probability... no hard
Post by ChunkTug on Oct 7th, 2005, 5:54am
Now I do see the difference. I don't think I've answered the intended question.

Title: Re: balls and bins probability... no hard
Post by Neelesh on Oct 7th, 2005, 6:08am

on 10/07/05 at 04:10:57, Barukh wrote:
That's interesting! It seems that ChunkTug considers a distribution different from the classical occupancy model.

Let me ask you this: assume n = 2. Are the experiments:

a) 1st  ball to 1st bin; 2nd ball to 2nd bin, and
b) 1st  ball to 2nd bin; 2nd ball to 1st bin

distinguishable?  ;)


I guess that  in the original quetsion, when  it was stated that
"n balls are tossed into n bins"
what was implicitly understood was that n identical balls are tossed into n distinct bins.

If the question is rephrased to deal with distinct balls and distinct bins, then the number of possible arrangements changes to nn. As rightly said by Barukh, we need to get the question  in more explicit form.

Also, I am not clear whether the events of throwing the balls are to be assumed independent of each other, or there are some restrictions on the container shape/size etc.

Title: Re: balls and bins probability... no hard
Post by Barukh on Oct 12th, 2005, 12:30am
Here's my analysis for the classical occupancy problem (that is, both balls and bins are distinguishable, and the tosses are independent).

Let Z be a random variable that equals the number of empty bins after n tosses, and we are going to find E(Z), the expected value of Z.

Clearly, Z = X1 + ... + Xn, where Xi is a binary random variable that equals 1 or 0, depending on whether the i-th bin remained empty or not. Because the expectation of sum equals the sum of expectations, we have:

E(Z) = [sum]E(Xi) = [sum]Prob(Xi = 1)

(why?) But the probability Xi = 1 does not depend on I and equals (1-1/n)n. Therefore, the sought expectation equals n(1-1/n)n. For big n, it’s approximately n/e.



Powered by YaBB 1 Gold - SP 1.4!
Forum software copyright © 2000-2004 Yet another Bulletin Board