Exploring Human Psychology with Mechanical Turk Data
Posted by Cameron DavidsonPilon at
This blog post is a little different: it's a whole data collection and data analysis story. I become interested in some theories from behavioural economics, and wanted to verify them. So I used Mechanical Turkers to gather data, and then did some exploratory data analysis in Python and Pandas (bonus: I recorded my data analysis and visualization, see below).
Prospect Theory and Expected Values
It's clear that humans are irrational, but how irrational are they? After some research into behavourial economics, I became very interested in Prospect Theory, a modern theory about human behaviours. Part of Prospect Theory is determining how our brains value different outcomes under uncertainty, that is, how expected values are mentally calculated. A very interesting part of Prospect theory is that it is not probabilities that are used in the calculation of expected value:
Here, the q's are not the probabilities of outcome z, but it is from another probability measure called decision weights that humans actually use to weigh outcomes. Using a change of measure, we can observe the relationship between the actual probabilities and the decision weights:
My interest is in this change of measure.
The Setup
Suppose you have two choices:
 Lottery A: have a 1% chance to win $10 000,
 Lottery B: have a 99% chance to win $101
Which would you prefer?
Well, under the real world probabilty measure, these two choices are equal: .99 * 101 = .01 * 10000. Thus a rational human would be indifferent to either option. But an actualhuman would have a preference: they would see one more valuable than the other. Thus:
rewritten:
and dividing:
What's left to do is determine the direction of the first inequality.
Mechanical Turk it.
So I created combinations of probabilities and prizes, all with equal realworld expected value, and asked Mechanical Turkers to pick which one they preferred:
The original HIT data and the python scripts that generate are below, plus the data that I just now recieved back from MTurk. Each pair of lotteries received 10 turkers votes.
Note: The Turking cost me $88.40, if you'd like to give back over Gittip, that would be great =)
Note: I called the first choice Lottery A and the second choice Lottery B.
Analysis
Below is a slightly inappropriate heatmap of the choices people made. If everyone was rational, and hence indifferent to the two choices, the probabilities should hover around 0.5. This is clearly not the case.
What else do we see here?
 As expected, people are loss averse: every point in the lowerdiagonal is where lottery A had a high probability of success than B. The matrix shows that most points in here are greater than 50%, thus people chose the safer bet more often.
 The exception to the above point is the fact that 1% is choosen more favourably over 2%. This is an instance of the possibility effect. People are indifferent between 1% and 2%, as they are both so rare, thus will pick the one with larger payoff.
Bonus: video of data munging in Pandas
I recorded my steps to take the raw Turker data, and create the visualization above:
FAQ

Why did I ask the Turkers to deeply imagine winning $50 dollars before answering the question? This was to offset a potential anchoring effect: if a Turkers first choice had prize $10 000, then any other prize would have looked pitiful, as the anchor had been set at $10 000. By having them imagine winning $50 (lower than any prize), then any prize they latter saw would appear better than this anchor.

Next steps? I'd like to try this again, with more control over the Turkers (have a more diverse set of Turkers on it).

Can I see the code and raw data?Sure, here it is on Github!
 Tags: data collection, matplotlib, pandas, python
Latest Data Science screencasts available
Comments