## Tuesday, July 6, 2010

### Rounding error

Here is an easy math puzzle. Suppose we have three random variables a, b and c with c = a + b. We round a to A, b to B and c to C, and want to know the chance that C = A + B. We need some more information for a unique answer, so for definiteness let a and b be independent real random variables uniformly distributed between 0 and 50. Let c = a + b, and let A, B and C be a, b and c rounded to the nearest integer (it won't matter how the .5 cases are rounded). What is the probability that C = A + B?

This problem was inspired by the recent furor over the Research 2000 polling operation, in which it has been suggested that Research 2000 was saving money by making up its poll results. However, it is unclear whether the above problem actually has any connection to the anomalies that have been found in the Research 2000 results.
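For readers who would rather check an answer empirically, the puzzle lends itself to a quick Monte Carlo estimate. The following is only a minimal sketch in Python, not part of the original puzzle: the names `round_nearest` and `estimate_match_probability`, the trial count and the seed are all illustrative assumptions, and a and b are assumed to be drawn independently.

```python
import math
import random


def round_nearest(x):
    # Round to the nearest integer, sending .5 cases upward.
    # The tie rule is an arbitrary choice; ties have probability zero here.
    return math.floor(x + 0.5)


def estimate_match_probability(trials=200_000, upper=50.0, seed=0):
    # Hypothetical helper: estimates the chance that C == A + B
    # when a and b are independent and uniform on [0, upper].
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a = rng.uniform(0.0, upper)
        b = rng.uniform(0.0, upper)
        c = a + b
        if round_nearest(c) == round_nearest(a) + round_nearest(b):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(estimate_match_probability())
```

Changing `upper` or `seed` is a cheap way to see how stable the estimate is before trusting any pencil-and-paper answer.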

1. Probability = 0.75. Let X be the fractional part of a. X is uniformly distributed on [0,1).
For X < 0.5, probability = 0.5 + X. For X > 0.5, probability = X.

2. Jonathan, shouldn't the probability for X < 0.5 be 1-X, not 0.5+X? This doesn't change the overall answer, but it does reflect the symmetry in the puzzle (the probability is 1 minus the distance from X to the nearest integer, which obviously averages 0.75).

3. Yes, that's right.

4. As I noted, this problem is pretty easy. A related problem which is a bit harder is the following. Suppose we randomly pick two points (uniformly and independently distributed) between 0 and 100. This divides the interval from 0 to 100 into 3 pieces, say a, b and c. Let A, B and C be the values of a, b and c rounded to the nearest integer. What is the chance that A, B and C add to 100?

5. I *think* A+B+C=100 iff a, b and c don't all round-to-nearest in the same direction.

So, graphing the fractional part of a vs the fractional part of c, we'd have A+B+C=100 throughout the unit square *except* in the upper-right half of the lower-left quadrant and the lower-left half of the upper-right quadrant. So Pr{A+B+C=100} = 7/8 (with a likelihood of, say, 2/3:-)

6. Er, scratch the 7/8 and make it 3/4. (Keep the 2/3 unchanged:-)

7. Yes, 3/4 is again correct. I found it interesting that the answer does not depend on the interval length being 100 rather than some other integer. I believe this is no longer the case if you look at divisions into more than 3 parts.

8. That's interesting. It's certainly not obvious to me why n=4 (>=4?) would be qualitatively different from n=3 or n=2 in this regard; my guess would have been that the probability was independent of the upper bound for all n.

9. I now see where the dependence on the upper end of the range comes in. This problem has more moving parts than I originally appreciated. I need to think about it more (and more carefully:-)
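The three-piece variant, and the open question of what happens with more pieces or a different interval length, can also be explored numerically. The sketch below is only an illustration in Python under stated assumptions: the function name `rounded_pieces_sum_probability` and its parameters `parts`, `total`, `trials` and `seed` are hypothetical, the cut points are drawn independently and uniformly, and .5 cases are rounded upward.

```python
import math
import random


def rounded_pieces_sum_probability(parts=3, total=100, trials=200_000, seed=0):
    # Hypothetical helper: cut [0, total] at parts - 1 independent uniform
    # points, round each piece to the nearest integer, and estimate the
    # chance that the rounded pieces still add up to total.
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        cuts = sorted(rng.uniform(0.0, total) for _ in range(parts - 1))
        edges = [0.0] + cuts + [float(total)]
        pieces = [hi - lo for lo, hi in zip(edges, edges[1:])]
        # floor(p + 0.5) rounds to nearest, with .5 cases going up.
        if sum(math.floor(p + 0.5) for p in pieces) == total:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for n in (3, 4, 5):
        for upper in (2, 10, 100):
            estimate = rounded_pieces_sum_probability(parts=n, total=upper)
            print(n, upper, estimate)
```

With `parts=3` the estimate can be compared against the 3/4 discussed above; varying `parts` and `total` is one way to probe the dependence on the upper bound raised in the last few comments.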