8 The equity premium puzzle

The equity premium is the gap between the expected return on the stock market and the expected return on a portfolio of fixed-income securities (e.g. bonds). Since 1926 the annual real return on stocks in the United States has been about 7%, while the real return on Treasury bills has been less than 1%. (A Treasury bill is a short-term bond, issued by the United States Treasury, that pays its face value at maturity. It is bought at a discount to the face value, which creates a positive yield.)

This difference in returns might appear to be justified by the greater riskiness of stocks: if stocks are riskier, investors should demand higher returns to hold them.

However, Mehra and Prescott (1985) examined this premium and argued that a premium of this size could only be explained by an implausibly high level of risk aversion. A reasonable level of risk aversion would result in an equity premium of around 0.1%. The size of the observed premium is hence a puzzle.

Two explanations for the equity premium puzzle are ambiguity aversion and loss aversion.

8.1 Ambiguity aversion

Ambiguity aversion is a preference for known risks over unknown risks. Investors in stocks are not only uncertain about what the return will be; they also do not know the distribution from which returns are drawn. They therefore require a greater premium for holding stocks than would be expected from risk aversion alone.

8.2 Loss aversion

8.2.1 Samuelson’s bet

Consider whether you would accept either of the following bets.

A 50% chance to win $200 and a 50% chance to lose $100?
A sequence of 100 bets with a 50% chance to win $200 and a 50% chance to lose $100?

These bets relate to a famous exchange between Paul Samuelson and some lunch colleagues. Samuelson offered them a $200 to $100 bet that the side of a coin they specified would not appear at the first toss. One “distinguished scholar” responded:

I won’t bet because I would feel the $100 loss more than the $200 gain. But I’ll take you on if you promise to let me make 100 such bets.

Samuelson (1963) showed that if a person would reject the first bet at any level of wealth, this pair of choices was not consistent with expected utility theory. The logic of his argument was as follows:

The sequence of 100 bets could be thought of as a sequence of 99 bets, plus a decision as to whether to accept one further bet.
Given that this person will reject the single bet at any level of wealth, they will not accept this 100th bet.
That leaves them with a sequence of 99 bets, which could also be thought of as a sequence of 98 bets, plus a decision as to whether to accept one further bet.
Again, the additional bet is rejected.
This logic is repeated until all bets are rejected.

Accordingly, Samuelson suggested that his colleague, if truly an expected utility maximiser, was making a mistake in accepting the bet of 100 flips given that he refused the single flip. This is despite the fact that the sequence of 100 flips has an expected return of $5,000, with less than a 0.05% chance of losing any money and less than a 0.002% chance of losing more than $1,000.
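These figures can be checked directly from the binomial distribution. The following is a minimal Python sketch, assuming a fair coin and independent flips; the function names are illustrative and do not come from Samuelson's paper. With k wins out of n flips, the total payoff is 200k - 100(n - k).

```python
from math import comb

WIN, LOSS = 200, 100  # payoffs of a single bet, in dollars


def total_payoff(wins: int, n: int) -> int:
    """Total payoff from n bets of which `wins` were won."""
    return WIN * wins - LOSS * (n - wins)


def prob_total_below(threshold: int, n: int = 100) -> float:
    """Probability that the total payoff from n fair-coin bets is strictly below `threshold`."""
    # Count the equally likely win/lose sequences whose total falls below the threshold.
    favourable = sum(
        comb(n, k) for k in range(n + 1) if total_payoff(k, n) < threshold
    )
    return favourable / 2**n


def expected_total(n: int = 100) -> float:
    """Expected total payoff from n bets."""
    return n * (0.5 * WIN - 0.5 * LOSS)


if __name__ == "__main__":
    print(f"Expected total payoff: ${expected_total():,.0f}")
    print(f"P(losing any money):   {prob_total_below(0):.4%}")
    print(f"P(losing over $1,000): {prob_total_below(-1000):.4%}")
```

Any loss requires 33 or fewer wins out of 100, and a loss of more than $1,000 requires 29 or fewer, which is why both probabilities are so small.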

Samuelson’s conclusion would not change if someone had to accept the 100 bets as a single bet with that range of possible outcomes. Although the probability of loss is small, the variance of possible outcomes increases with the number of bets. This means the bet remains unattractive for a risk averse expected utility maximiser. And someone who would reject a win $200, lose $100 bet at any level of wealth would need to be absurdly risk averse, so much so that this same person would also reject a 50:50 bet to win $20,000, lose $200 (Rabin and Thaler, 2001).
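One way to see this result numerically is to use a utility function under which the attitude to a given bet does not depend on wealth. The sketch below assumes constant absolute risk aversion (CARA) utility, u(w) = -exp(-a w), with a hypothetical coefficient a = 0.005 chosen only so that the single bet is rejected. Neither the functional form nor the coefficient comes from Samuelson (1963); they are illustrative assumptions. Under CARA, a gamble X is rejected at every wealth level exactly when E[exp(-a X)] > 1, and for independent bets this expectation multiplies, so rejecting the single bet implies rejecting the whole sequence.

```python
from math import exp

WIN, LOSS = 200, 100  # payoffs of a single bet, in dollars


def single_bet_factor(a: float) -> float:
    """E[exp(-a * X)] for one 50:50 bet under the assumed CARA utility."""
    return 0.5 * exp(-a * WIN) + 0.5 * exp(a * LOSS)


def sequence_factor(a: float, n: int) -> float:
    """E[exp(-a * S_n)] for n independent bets: the single-bet factor to the power n."""
    return single_bet_factor(a) ** n


def rejects(factor: float) -> bool:
    """A CARA agent rejects a gamble when its factor exceeds 1."""
    return factor > 1.0


if __name__ == "__main__":
    a = 0.005  # hypothetical risk aversion coefficient, not from the source
    print("Rejects single bet:", rejects(single_bet_factor(a)))
    print("Rejects 100 bets:  ", rejects(sequence_factor(a, 100)))
```

Because the factor for the sequence is the single-bet factor raised to the power n, it moves further above 1 as n grows: for this agent, adding more bets makes the sequence less attractive, not more.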

However, the response of Samuelson’s colleague points to an alternative explanation for this pair of choices, that of loss aversion.

What if a person is loss averse? Suppose they have the following value function:

v(x)=\bigg\{\begin{matrix} x & x \geq 0\\ 2.5x & x <0 \end{matrix}

where x is a change in wealth relative to the status quo.

This loss averse person will turn down a 50:50 bet to win $200, lose $100:

\begin{align*} V(x)&=0.5v(\$200)+0.5v(-\$100) \\ &=0.5 \times 200-0.5 \times 2.5 \times 100 \\ &=-25 \end{align*}

However, they would accept a sequence of two such bets evaluated as a single gamble, which offers a 25% chance of winning $400, a 50% chance of winning $100 and a 25% chance of losing $200:

\begin{align*} V(x)&=0.25v(\$400)+0.5v(\$100)+0.25v(-\$200) \\ &=0.25 \times 400+0.5 \times 100-0.25 \times 2.5 \times 200 \\ &=25 \end{align*}

Any longer sequence of bets has an even higher positive value.
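The same calculation can be repeated for longer sequences. The following Python sketch assumes the value function given earlier (losses weighted by 2.5), a fair coin, and that the whole sequence is evaluated once on its total payoff; the function names are illustrative.

```python
from math import comb

LAMBDA = 2.5  # loss aversion coefficient from the value function in the text


def v(x: float) -> float:
    """Value of a change in wealth x: gains count once, losses count 2.5 times."""
    return x if x >= 0 else LAMBDA * x


def sequence_value(n: int, win: float = 200, loss: float = 100) -> float:
    """Value of n 50:50 bets evaluated as a single gamble on the total payoff."""
    total = 0.0
    for wins in range(n + 1):
        prob = comb(n, wins) / 2**n             # binomial probability of `wins` wins
        payoff = win * wins - loss * (n - wins)  # total change in wealth
        total += prob * v(payoff)
    return total


if __name__ == "__main__":
    for n in (1, 2, 3, 5, 10, 100):
        print(f"{n:>3} bets: value = {sequence_value(n):.2f}")
```

For n = 1 and n = 2 this reproduces the two calculations in the text, and the value continues to rise as further bets are added.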

Note, however, that the sequence only has positive value if the person evaluates it as a whole and does not have to watch each bet being played out. If they had to watch each consecutive flip, they would reject the sequence, as every individual flip has negative value.

8.2.2 Explaining the equity premium puzzle

This story captures the intuition behind Benartzi and Thaler’s (1995) explanation of the equity premium puzzle.

Suppose an investor has a choice between risky stocks, with an expected annual return of 7% and a standard deviation of 20%, and a sure return of 1%. As with Samuelson’s bet, the attractiveness of stocks to a loss averse investor will depend on both the time horizon of the investor and the frequency with which they evaluate the returns. If they monitor their portfolio frequently, they will often observe losses from stocks, which they feel with greater force than gains.

Suppose that one loss averse investor examines their portfolio every day. Since on a daily basis stocks go down almost as often as they go up, this investor will experience a lot of pain, making the stocks unattractive. Another loss averse investor only checks in on their portfolio once a decade. At that horizon, stocks have only a small probability of losing money, so will be much more attractive to someone who is loss averse.
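The effect of the evaluation period can be illustrated with a rough calculation. The sketch below is not Benartzi and Thaler's method. It assumes, purely for illustration, that stock returns over a horizon of T years are normally distributed with mean 0.07T and standard deviation 0.20√T (independent returns, no compounding), that the investor uses the piecewise-linear value function from Samuelson's bet with losses weighted by 2.5, and that gains and losses are measured relative to zero. All function names are hypothetical.

```python
from math import erf, exp, pi, sqrt

MU, SIGMA = 0.07, 0.20  # annual mean and standard deviation of stock returns (from the text)
SAFE = 0.01             # sure annual return (from the text)
LAMBDA = 2.5            # assumed loss aversion coefficient, borrowed from the earlier example


def norm_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def norm_pdf(z: float) -> float:
    """Standard normal density."""
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)


def prob_loss(years: float) -> float:
    """Probability that stocks show a loss when evaluated after `years`."""
    m, s = MU * years, SIGMA * sqrt(years)
    return norm_cdf(-m / s)


def stock_value(years: float) -> float:
    """E[v(X)] for X ~ N(m, s^2), using v(x) = x + (LAMBDA - 1) * min(x, 0)."""
    m, s = MU * years, SIGMA * sqrt(years)
    z = m / s
    expected_negative_part = m * norm_cdf(-z) - s * norm_pdf(z)  # E[min(X, 0)]
    return m + (LAMBDA - 1.0) * expected_negative_part


def safe_value(years: float) -> float:
    """Value of the sure return, which is never a loss."""
    return SAFE * years


if __name__ == "__main__":
    for label, years in (("daily", 1 / 252), ("yearly", 1.0), ("per decade", 10.0)):
        print(f"{label:>10}: P(loss) = {prob_loss(years):.3f}, "
              f"stocks = {stock_value(years):+.4f}, safe = {safe_value(years):+.4f}")
```

Under these assumptions the investor who evaluates daily values stocks below the sure return, while the investor who evaluates once a decade values them above it, even though both face the same underlying returns.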

It is a combination of loss aversion and a short evaluation period that will drive an investor to require a large premium for holding the risky option. Benartzi and Thaler call this myopic loss aversion.