While working on a problem, I derived an interesting result around sums of uniforms random variables. I wanted to record it here so I don't forget it (I haven't solved the more general problem yet!). Here's the summary of the result:

Let $$S_n = \sum_{i=1}^n U_i$$ be the sum of $$n$$ Uniform random variables. Let $$N$$ be the index of the first time the sum exceeds 1 (so $$S_{N-1} < 1$$ and $$S_{N} \ge 1$$). The distribution of $$U_N$$ is $$e - e^{1-u}$$.
This result is related to the really astounding result: $$N$$ (the same $$N$$ defined above) has expected value equal to $$e$$. Yes, that's right, the average number of uniform random variables needed to exceed 1 is exactely $$e$$. I just love this surprising result, but have still not found a good, intuitive reason as to why this is true. Of course it is true mathematically, and the proof is not too hard, but I just can't see why. Anyways, below is the proof of the first result.

### Break the problem into smaller pieces

We'll first use the total law of probability to break the distribution into smaller pieces: $$P(U_N \in du) = \sum_{n=2}^{\infty} P(U_N \in du \;|\; N=n)P(N=n)$$

### Numerical results first

For the below part, we'll be conditioning on a given $$N$$, so let's first look at the distibution of $$U_N \;|\; N=n$$. To do this, we'll just use Monte Carlo to sample from this distribution. Here are the results: So this is our target. If we can model these distributions, we can plug them into the above formula.

### Focusing on a particular $$N$$

Let's put our attention on the case $$N=2$$. We want to solve: \begin{align*} P(U_2 \in du \;|\; N=2) & = P(U_2 \in du \;|\; U_1 < 1, U_1 + U_2 \ge 1) \\ & = \frac{P(U_2 \in du, U_1 < 1, U_1 + U_2 \ge 1)}{P(U_1 < 1, U_1 + U_2 \ge 1)} \\ & = \frac{P(U_1 < 1, U_1 + U_2 \ge 1 \;|\; U_2 \in du)P(U_2 \in du)}{P(U_1 < 1, U_1 + U_2 \ge 1)} \\ \end{align*} Because $$U_2$$ is uniform, $$P(U_2 \in du) = du$$, so $$\frac{P(U_1 < 1, U_1 + U_2 \ge 1 \;|\; U_2 \in du) du}{P(U_1 < 1, U_1 + U_2 \ge 1)}$$ The denominator, while it can be worked out, is equal to the term $$P(N=n)$$ in the first equation, so those will cancel out later. It's equal to $$\frac{1}{n! + (n-1)!}$$, and I'll keep it in to compare our results to the distributions above. So we just have the numerator term to compute. I'll use a slight change in the problem statement here, so as to make the algebra more clear: \begin{align} P(U_1 < 1, U_1 + u \ge 1 \;|\; U_2 = u) & \\ & = P(U_1 < 1, U_1 \ge 1 - u \;|\; U_2 = u)\\ & = u \end{align} So we arrive at $$f_{U_2 \; | \; N=2}(u) = \frac{u}{1/2} = 2u$$, which nice aligns with our empirical distribution above.

### Slightly harder problem

Let's try $$N=3$$, and it gets more interesting here. We'll skip ahead to solving: \begin{align*} P(U_1 + U_2 < 1, U_1 + U_2 + u \ge 1 \;|\; U_3 = u) = \\ P(U_1 + U_2 < 1, U_1 + U_2 \ge 1 - u \;|\; U_3 = u) \end{align*} To solve this, we'll take geometric point of view: the two variables ($$U_1, U_2)$$ can be represented as the unit square in the first quadrant. The inequality $$U_1 + U_2 < 1$$ is an area in this square, bordered by the unit square and the line $$U_1 + U_2 = 1$$. Ditto with $$U_1 + U_2 \ge 1 - u$$. The probability is then just the area of the space between the lines. After computing triangles, the area comes out to $$\frac{1}{2}(1 - (1-u)^2)$$.

### In general

This geometric argument can be extended, with the help of the Irwin–Hall distribution, and in general we get: \begin{align*} f_{U_n \;|\; N = n}(u) &= \frac{1}{(n-1)! + (n-2)!} \int_{1-u}^{1} \frac{1}{(n-2)!}z^{n-2}dz \\ & =\frac{(n-1)! + (n-2)!}{(n-1)!}(1 - (1-u)^{n-1}) \end{align*} And the results are spot on: ### Near the finish line

Going back to our original equation: \begin{align*} f_{U_N}(u) &= \sum_{n=2}^{\infty} f_{U_N | N=n}(u)P(N=n)\\ &= \sum_{n=2}^{\infty} \frac{(n-1)! + (n-2)!}{(n-1)!}(1 - (1-u)^{n-1}) \frac{1}{(n-1)! + (n-2)!} \\ &= \sum_{n=2}^{\infty} \frac{1}{(n-1)!}(1 - (1-u)^{n-1})\\ &= \sum_{n=2}^{\infty} \frac{1}{(n-1)!} - \sum_{n=2}^{\infty} \frac{(1-u)^{n-1}}{(n-1)!}\\ &= \sum_{n=1}^{\infty} \frac{1}{n!} - \sum_{n=1}^{\infty} \frac{(1-u)^{n}}{n!}\\ &= e - e^{1-u}\\ \end{align*} Does this empirically line up? Again, I created samples via MC and plotted our derived curve over it: Terrific! This confirms that no mistake was made in the derivation.

### Conclusion

This worked out so nicely mostly because the probabilities were monomials with simple coefficients - then we could easily work out the solution to the infinite series. However, if I wanted a similar result for exceeding 2 (instead of 1), we get polynomials and much more complex coefficients. I haven't made much progress on this part. I think a solution is to condition the sum on some value, $$N'$$, such that the sum at $$N'$$ exceeds 1, and then I can subtract 1 (+ a smaller term) from 2, and I arrive back at a sum less than 1.