# Poissonization of Multinomials

Posted by Cameron Davidson-Pilon

## Introduction

I've seen some really interesting numerical solutions that use a strategy called Poissonization, but Googling for it turned up very few resources (just some references in textbooks I don't have access to). So here they are: my notes and repository for Poissonization.

Theorem: Let $$N \sim \text{Poi}(\lambda)$$, and suppose that, given $$N=n$$, $$(X_1, X_2, ..., X_k) \sim \text{Multi}(n, p_1, p_2, ..., p_k)$$. Then, marginally, $$X_1, X_2, ..., X_k$$ are independent Poisson random variables, with $$X_i \sim \text{Poi}(p_i \lambda)$$. [1]

The proof is as follows. By the law of total probability:

\begin{align*} P(X_1=x_1, ..., X_k = x_k) \\ & = \sum_{n=0}^{\infty} P(X_1=x_1, ..., X_k = x_k \;|\; N=n) \frac{e^{-\lambda}\lambda^n}{n!} \\ & = \sum_{n=0}^{\infty} \frac{(x_1 + ... + x_k)!}{x_1!x_2! \cdots x_k!}p_1^{x_1}\cdots p_k^{x_k}\frac{e^{-\lambda}\lambda^n}{n!}I_{x_1 + ... + x_k = n} \\ & = e^{-\lambda}\lambda^{x_1} \cdots \lambda^{x_k}p_1^{x_1} \cdots p_k^{x_k}\frac{1}{x_1!x_2! \cdots x_k!}\\ & = e^{-\lambda}(\lambda p_1)^{x_1} \cdots (\lambda p_k)^{x_k}\frac{1}{x_1!x_2! \cdots x_k!}\\ & = \prod_{i=1}^k \frac{e^{-\lambda p_i}(\lambda p_i)^{x_i}}{x_i!} \end{align*}

where the last equality uses $$p_1 + \cdots + p_k = 1$$, so that $$e^{-\lambda} = e^{-\lambda p_1} \cdots e^{-\lambda p_k}$$.
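As a quick numerical check of the theorem (my own sketch, standard library only; the values of $$\lambda$$ and $$p_i$$ are arbitrary choices): the indicator in the sum over $$n$$ kills every term except $$n = x_1 + \cdots + x_k$$, so the joint probability reduces to a single multinomial term times a Poisson term, which should equal the product of Poisson pmfs:

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    return exp(-mu) * mu**x / factorial(x)

def multinomial_pmf(xs, n, ps):
    # P(X_1=x_1, ..., X_k=x_k) for Multi(n, p_1, ..., p_k)
    if sum(xs) != n:
        return 0.0
    coef = factorial(n)
    for x in xs:
        coef //= factorial(x)
    p = float(coef)
    for x, pi in zip(xs, ps):
        p *= pi**x
    return p

lam, ps = 3.0, [0.2, 0.3, 0.5]   # arbitrary example values
xs = (1, 2, 4)
n = sum(xs)                       # the only surviving term in the sum over n

# Left side: multinomial conditional times the Poisson weight for N = n
lhs = multinomial_pmf(xs, n, ps) * poisson_pmf(n, lam)

# Right side: product of independent Poi(p_i * lam) pmfs
rhs = 1.0
for x, pi in zip(xs, ps):
    rhs *= poisson_pmf(x, pi * lam)

print(abs(lhs - rhs) < 1e-12)  # True
```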

Let's see how we can use this. Let $$(Y_1, Y_2, ..., Y_k)$$ be $$\text{Multi}(n, p_1, p_2, ..., p_k)$$, and suppose we are interested in the probability that the realized values fall into some set $$A$$. Let $$X_i$$ be defined as above, that is, $$X_i \sim \text{Poi}(p_i \lambda)$$, independently. Conditioning on $$N$$ (note that the multinomial probability inside the sum depends on $$n$$), we can write:

\begin{align*} P((X_1, ... X_k) \in A) & = \sum_{n=0}^\infty \frac{e^{-\lambda}\lambda^n}{n!} P((Y_1, ... Y_k) \in A)\\ & = e^{-\lambda} \sum_{n=0}^\infty \frac{\lambda^n}{n!} P((Y_1, ... Y_k) \in A)\\ e^{\lambda}P((X_1, ... X_k) \in A) & = \sum_{n=0}^\infty \frac{\lambda^n}{n!} P((Y_1, ... Y_k) \in A) \end{align*}

Both sides of the last line are power series in $$\lambda$$, and thus their coefficients must be equal. Hence, for $$(Y_1, ..., Y_k) \sim \text{Multi}(n, p_1, ..., p_k)$$, $$P((Y_1, ... Y_k) \in A)$$ is $$n!\, c(n)$$, where $$c(n)$$ is the coefficient of $$\lambda^n$$ in the expansion of $$e^{\lambda}P((X_1, ... X_k) \in A)$$.
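As a tiny sanity check of this recipe (my own example, not from the original notes): take $$k=2$$ and $$A = \{x_1 = 0\}$$. Then $$P(X_1 = 0) = e^{-\lambda p_1}$$, so

\begin{align*} e^{\lambda} P(X_1 = 0) = e^{\lambda(1 - p_1)} = e^{\lambda p_2} = \sum_{n=0}^{\infty} \frac{p_2^n}{n!} \lambda^n \end{align*}

giving $$c(n) = p_2^n / n!$$ and $$n!\,c(n) = p_2^n$$, which is indeed $$P(Y_1 = 0) = (1-p_1)^n$$ when $$(Y_1, Y_2) \sim \text{Multi}(n, p_1, p_2)$$.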

## Example

Let's try an example:

$$n$$ defects are randomly distributed amongst $$k$$ integrated-circuit chips produced by a factory (any number of defects may be found on a chip, and each defect is independent of the other defects). If $$n = 4$$ and $$k = 7$$, find the probability that there is a chip with at least 3 defects.

This is a classic case where it's easier to compute the complementary probability: what is the probability that all chips have fewer than 3 defects? We then subtract this from 1 to answer the original question.

Under Poissonization, the number of defects on each chip is an independent $$\text{Poi}(\lambda/7)$$ random variable, so:

\begin{align*} P(\text{all chips have 2 or fewer defects}) & = \Big(P(\text{Chip has 0 defects}) + P(\text{Chip has 1 defect}) + P(\text{Chip has 2 defects})\Big)^7 \\ & = \Big(e^{-\lambda/7} + \frac{e^{-\lambda/7}(\lambda/7)^1}{1!} + \frac{e^{-\lambda/7}(\lambda/7)^2}{2!}\Big)^7\\ & = e^{-\lambda}\Big(1 + \frac{\lambda}{7} + \frac{\lambda^2}{2!\,7^2}\Big)^7 \end{align*}

Here's the Poissonization step. We multiply the above by $$e^{\lambda}$$ (oh nice, it cancels) and expand to get the coefficient of $$\lambda^4$$. For this it's useful to know a bit about multinomial expansions: basically, you are looking for solutions to

$$k_0 + k_1 + k_2 = 7\\ 0 k_0 + 1 k_1 + 2 k_2 = 4$$

The first equation is the constraint that the powers in a multinomial expansion sum to the outer exponent. The second comes from our need to find powers of $$\lambda$$ that sum to 4 (the 0, 1, 2 are the powers of $$\lambda$$ already attached to each term). Two equations, three unknowns; solving for the values:

\begin{align*} k_2 &= k_0 - 3 \\ k_1 &= 10 - 2k_0\\ 3 &\le k_0 \le 5 \end{align*}

Computationally, we can write this in Python:
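Here is a minimal sketch of that computation (my own code, not necessarily what the repository does): raise the polynomial $$1 + \lambda/7 + \lambda^2/98$$ to the 7th power by repeated convolution of coefficient lists, then read off the coefficient of $$\lambda^4$$ and multiply by $$4!$$.

```python
from math import factorial

# From the example: k = 7 chips, n = 4 defects.
# Poissonization gives P(all chips have <= 2 defects)
#   = e^{-lam} * (1 + lam/7 + lam^2/(2*7^2))^7,
# so after multiplying by e^{lam} we need the coefficient of lam^4
# in (1 + lam/7 + lam^2/98)^7, times 4!.

k, n = 7, 4

# Coefficients of the base polynomial 1 + lam/7 + lam^2/98, indexed by power
base = [1.0, 1 / 7, 1 / 98]

# Raise the polynomial to the k-th power by repeated convolution
poly = [1.0]
for _ in range(k):
    new = [0.0] * (len(poly) + len(base) - 1)
    for i, a in enumerate(poly):
        for j, b in enumerate(base):
            new[i + j] += a * b
    poly = new

prob = factorial(n) * poly[n]   # P(all chips have <= 2 defects) for n = 4
print(round(prob, 5))           # 0.92711
print(round(1 - prob, 5))       # 0.07289
```

Summing the three multinomial terms from the $$(k_0, k_1, k_2)$$ enumeration above gives the same coefficient, $$371/9604$$.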

We get the answer 0.92711; its complement, 0.07289, is the probability asked for in the original question.

Looking for more examples?

[1] DasGupta, A. *Probability for Statistics and Machine Learning*. Springer. http://link.springer.com/book/10.1007%2F978-1-4419-9634-3