Bayesian M&M Problem in PyMC 2
This Bayesian problem is from Allen Downey's Think Bayes book. I'll quote the problem here:
M&M’s are small candy-coated chocolates that come in a variety of colors. Mars, Inc., which makes M&M’s, changes the mixture of colors from time to time.
In 1995, they introduced blue M&M’s. Before then, the color mix in a bag of plain M&M’s was 30% Brown, 20% Yellow, 20% Red, 10% Green, 10% Orange, 10% Tan. Afterward it was 24% Blue , 20% Green, 16% Orange, 14% Yellow, 13% Red, 13% Brown.
Suppose a friend of mine has two bags of M&M’s, and he tells me that one is from 1994 and one from 1996. He won’t tell me which is which, but he gives me one M&M from each bag. One is yellow and one is green. What is the probability that the yellow one came from the 1994 bag?
This is a very simple Bayesian problem, on paper! See Allen's book for the solution (link above). What if we wanted to use PyMC to solve it? Below is the solution (originally from CrossValidated).
Caution: Please use with PyMC 2.3.2
It looks like there is a bug in 2.3.4 (the most recent version) that is causing the wrong inference. This took me a while to discover, but it was solved when I downgraded to PyMC 2.3.2.
Model:
import pymc as pm
p = [
#brown, yellow, red, green, orange, tan, blue
[.3, .2, .2, .1, .1, .1, .0 ], # 1994 bag
[.13, .14, .13, .2, .16, .0, .24] # 1996 bag
]
bag = pm.Categorical('which_bag', [0.5, 0.5]) # prior on bag selected
@pm.deterministic
def first_bag_selection(p=p, bag=bag):
return p[bag]
@pm.deterministic
def second_bag_selection(p=p, bag=bag):
return p[1-bag]
# observe yellow in bag 1
obs_1 = pm.Categorical('bag_1', first_bag_selection, value=1, observed=True)
#observe green in bag 2
obs_2 = pm.Categorical('bag_2', second_bag_selection, value=3, observed=True)
Inference
mcmc = pm.MCMC([bag, p, first_bag_selection, second_bag_selection, obs_1, obs_2])
mcmc.sample(15000,3000)
bag_trace = mcmc.trace('which_bag')[:]
# what is the probability the bag chosen is from 1996?
print (bag_trace == 1).sum()*1./bag_trace.shape[0]
# should be about .26