by David D. Friedman
May 13, 1984
One conclusion of Gary Becker's classic essay on the economic analysis of crime (Becker 1968) was that criminals would be risk preferrers under an efficient criminal justice system. The demonstration depended on a simplifying assumption--that the cost of imposing a punishment is a fixed fraction of the amount of the punishment, however large the punishment. In this note I shall show, first, that Becker's conclusion results from an incorrect specification of the social loss function; second, that if the social loss function is correctly specified, Becker's argument can be preserved only by transforming his simplifying assumption into something neither simple nor plausible; third, that if the social loss function is correctly specified and the simplifying assumption is kept in its original form, the argument implies that criminals are risk preferrers or risk averse depending on whether the cost of punishment is more or less than the amount of the punishment; and fourth, that all of the results on risk aversion are artifacts produced by the simplifying assumption, and disappear if that assumption is replaced with a more plausible alternative.
The relevant part of Becker's argument (Becker 1968, p. 181) begins by suggesting that optimality conditions for the punishment of criminals could be based on a social welfare function, L(D,C,bf,O), measuring the social loss from offenses, where O is the number of offenses occurring[1], D the resulting damage (net of the gain to the criminals), p the probability that a criminal committing an offense will be caught and punished, C(p,O) the cost of police and courts necessary to impose probability p on O offenses, f the punishment imposed, and b the ratio between the social cost of punishment and the cost of the punishment to the criminal. Becker then writes:
"It is more convenient and transparent, however, to develop the discussion at this point in terms of a less general formulation, namely, to assume that the loss function is identical with the total social loss in real income from offenses, convictions, and punishments, as in
L = D(O) + C(p,O) + bpfO (Eqn. 1)
The term bpfO is the total social loss from punishments, since bf is the loss per offense punished and pO is the number of offenses punished (if there are a fairly large number of independent offenses). ...the coefficient b is assumed in this section to be a given constant greater than zero."
Becker goes on to derive the first order optimality conditions and use them to show that for optimal values of p and f the magnitude of the elasticity of O with respect to p must be greater than with respect to f, which implies that criminals are risk preferrers. A somewhat more transparent form of the argument is sketched later in the section and provides an easy way of explaining my objections. It may be stated as follows:
Suppose criminals are risk neutral, so that all that matters to them is pf, the expected value of the punishment. Consider any pair p,f>0 which purports to be optimal. The pair p/2, 2f implies the same expected punishment, hence the same value of O, as the pair p,f. Examining Equation 1, we observe that D depends only on O and bpfO depends on b (assumed constant), pf=(p/2)(2f), and O, hence both are the same for both pairs. C(p,O) is an increasing function of p; it costs more to catch a larger fraction of criminals, hence C(p/2,O)<C(p,O), hence social loss is less for the pair p/2, 2f than for the pair p,f. Since the argument does not depend on the values of p and f, it follows that there is no optimal pair; we are driven into the corner p=0, f=∞.
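The halving argument can be checked numerically. The following is a minimal sketch, not Becker's model: the functional forms chosen for O, D, and C, the value b=0.5, and the starting pair are all assumptions made only for illustration. With risk neutral criminals and constant b, each step leaves O, D, and the punishment cost unchanged and lowers only the enforcement cost.

```python
# Illustrative sketch only: the functional forms below are assumptions,
# not taken from Becker (1968).

def offenses(p, f):
    """Risk-neutral criminals respond only to the expected punishment p*f."""
    return 100.0 / (1.0 + p * f)

def social_loss(p, f, b=0.5):
    """Eqn. 1: L = D(O) + C(p,O) + b*p*f*O, with assumed forms for D and C."""
    o = offenses(p, f)
    damage = 2.0 * o             # D(O), assumed linear in O
    enforcement = 50.0 * p * o   # C(p,O), assumed increasing in p
    punishment = b * p * f * o   # b*p*f*O
    return damage + enforcement + punishment

p, f = 0.4, 10.0
for _ in range(5):
    print(f"p={p:.4f}  f={f:7.1f}  O={offenses(p, f):.2f}  L={social_loss(p, f):.2f}")
    p, f = p / 2, 2 * f   # same p*f, so same O, D, and punishment cost
```

Under these assumed forms the printed loss falls at every step, which is the sense in which no interior pair can be optimal.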
Suppose the criminal is risk averse. Decreasing the probability and increasing the punishment proportionally now reduces O (since it increases the undesirability of the punishment lottery from the standpoint of the criminal) while keeping pf the same; since all the terms in L are increasing functions of O, the result of the previous paragraph holds a fortiori. The only way out of the corner is to assume that when p and f have their optimal values criminals are, on the margin, risk preferrers. In that case a proportional increase in f and decrease in p increases O; the increase in L due to the increase in O can, with appropriate elasticities, balance the decrease resulting from the effect of a decrease in p on C, thus satisfying the first order conditions for a minimum of L. Hence for optimal--and finite--values of p and f criminals must be risk preferrers.
Two things are wrong with this argument. First is its general form. Becker shows that certain assumptions (constant b plus risk neutral or averse criminals) generate a corner solution; he concludes that one of the assumptions (risk neutral or averse criminals) is false. This makes sense only if the process of moving towards the corner solution somehow forces us out of a region where the assumption applies--otherwise we have to consider the possibility that the social loss due to criminal activity really does decrease as we lower p and raise f, however low p may be already, or alternatively consider dropping some other assumption. The fact that sitting in a corner makes us uncomfortable does not entitle us to assume corners away. There is no obvious reason to expect that criminals who are risk neutral or risk averse when confronted with large probabilities of small punishments will become risk preferrers when confronted with sufficiently small probabilities of sufficiently large punishments[2], still less to assume that criminals choose their attitude towards risk with the convenience of models of the criminal justice system in mind.
The second defect in the argument is Becker's specification of the loss function, specifically his assumption that it "is identical with the total social loss in real income from offenses, convictions, and punishments... ." (italics mine). To see where the problem lies, it is useful to think of all punishments as taking the form of a money fine f paid by the criminal and a smaller fine r received by the court system, with the difference representing the punishment cost. In the limiting case of a fine collected costlessly, f=r. For a punishment that imposes a cost on the criminal but yields no gain to the court system, such as execution (ignoring court costs), r=0 . If the punishment imposes costs on the court system as well as the criminal then r<0. Becker's f'=bf=f-r.
In Becker's specification of L, the cost of punishment is bpfO = pf'O. So long as we are dealing with risk neutral criminals, that is perfectly reasonable. Once we consider risk preferring or risk averse criminals, pf'O is no longer equal to the cost of punishment because it does not include the cost (or benefit) to the criminal of the risk inherent in the punishment lottery. Suppose, for example, that criminals are risk preferrers, and consider again the effect of doubling f and halving p. Further suppose (implausibly) that, for the particular values of p and f we are considering, O and C are perfectly inelastic with respect to p. Since D, C, and bpfO are all unaffected by replacing p,f with p/2, 2f, so is L. But note that under these assumptions everyone except the criminal is just as well off after the change as before, and the criminal, who by assumption is a risk preferrer, is better off. We have here a social loss function which has the same value for two situations, one of which is Pareto superior to the other!
The problem is that Becker has assumed that the cost of punishment to the criminal is pf. That is appropriate if the criminal is risk neutral, but a probability p of a fine f is a cost to a risk preferrer of less than pf, and to a risk averse criminal of more than pf. The loss function specified by Becker and used by him to evaluate situations involving risk preference and aversion omits costs and benefits associated with risk preference and risk aversion.[3]
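A minimal numeric sketch of this point follows. The wealth level and the three utility functions are assumptions chosen only for illustration; nothing in the argument depends on these particular forms. The sketch computes the sure loss E that a criminal regards as equivalent to a fine f imposed with probability p, for two lotteries with the same expected fine pf.

```python
import math

WEALTH = 100.0  # hypothetical initial wealth of the criminal

def certainty_equivalent(p, f, u, u_inv):
    """Sure loss E that the criminal finds as bad as losing f with probability p."""
    expected_utility = p * u(WEALTH - f) + (1 - p) * u(WEALTH)
    return WEALTH - u_inv(expected_utility)

# Three assumed utility functions of wealth, each paired with its inverse.
attitudes = {
    "risk neutral":    (lambda w: w,            lambda v: v),
    "risk averse":     (lambda w: math.sqrt(w), lambda v: v * v),
    "risk preferring": (lambda w: w * w,        lambda v: math.sqrt(v)),
}

for name, (u, u_inv) in attitudes.items():
    for p, f in [(0.5, 32.0), (0.25, 64.0)]:   # both lotteries have p*f = 16
        e = certainty_equivalent(p, f, u, u_inv)
        print(f"{name:16s} p={p:.2f} f={f:4.0f}  p*f={p * f:.1f}  E={e:.2f}")
```

In this sketch pf is the same for both lotteries, so a loss function built on pf cannot distinguish them, while E falls for the risk preferrer and rises for the risk averter when p is halved and f doubled.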
A specification of the loss function that takes account of gains and losses associated with risk preference is developed in Friedman (1981). It goes as follows:
Let p and f be defined as they were above. For any pair p,f define E as the certainty equivalent to the criminal of a fine f imposed with probability p. Call F=E/p the amount of punishment. Define F'=F-r=BF. We now reproduce Becker's specification of L, replacing f with F and b with B, to give us
L = D(O(E)) + C(p,O(E)) + BpFO(E). (Eqn. 2)
Since the amount of crime O is determined by the cost that the criminal justice system imposes on criminals, it is a function of E=pF. Since the final term in L equals BEO(E) and E has been defined to include costs or benefits associated with risk, two different pairs p,f that lead to the same E and hence the same value for the final term in L are also equivalent from the standpoint of the criminals being punished--as they were not for Becker's specification of L.
If we parallel the earlier argument by assuming that B is a constant, the argument given above to demonstrate that risk neutral criminals would imply a corner solution can now be applied whether criminals are risk neutral or not. Consider a pair p,f which purports to be optimal. Consider a new pair, p/2, f*, where f* is chosen so that E(p,f)=E(p/2,f*). The new pair results in the same O and the same punishment cost as the old but lower enforcement cost C, hence it is superior. The argument applies to any initial p,f>0, so we are driven to the corner solution p=0, f=∞; this is still an unconvincing picture of an optimal enforcement system, and there is no way we can get out of it by assumptions about the tastes of criminals with regard to risk.
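One step of this argument can be sketched numerically for a risk averse criminal. Everything specific here is an assumption made for illustration: the square-root utility function, the wealth level, the forms of O, D, and C, and the value B=0.5. Because wealth is bounded in this toy setup, it shows a single halving of p rather than the full march to the corner.

```python
import math

WEALTH = 100.0  # hypothetical initial wealth; utility of wealth assumed to be sqrt

def certainty_equivalent(p, f):
    """E for a risk-averse criminal with u(w) = sqrt(w) (an assumed form)."""
    eu = p * math.sqrt(WEALTH - f) + (1 - p) * math.sqrt(WEALTH)
    return WEALTH - eu * eu

def matching_fine(p_new, target_e, lo=0.0, hi=WEALTH):
    """Bisect for the fine f* with certainty_equivalent(p_new, f*) == target_e."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if certainty_equivalent(p_new, mid) < target_e:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def social_loss(p, f, B=0.5):
    """Eqn. 2 with constant B; the forms of O, D, and C are assumptions."""
    e = certainty_equivalent(p, f)
    o = 100.0 / (1.0 + e)                        # O(E)
    return 2.0 * o + 50.0 * p * o + B * e * o    # D + C + B*E*O

p, f = 0.5, 36.0
f_star = matching_fine(p / 2, certainty_equivalent(p, f))
print(f"f* = {f_star:.2f}, compared with 2f = {2 * f:.0f}")
print(f"L(p, f)    = {social_loss(p, f):.2f}")
print(f"L(p/2, f*) = {social_loss(p / 2, f_star):.2f}")
```

Since E is matched by construction, O, D, and the punishment term are unchanged, and only the enforcement term differs between the two pairs.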
We can get out of it, however, by looking more carefully at the assumption that B is a constant. The corresponding assumption in Becker's analysis was that b is constant. Since b = f'/f = (f-r)/f = 1 - r/f, this corresponds to assuming that the ratio of fine collected to fine paid is constant; while the assumption becomes implausible for large values of f, it is at least a natural way of simplifying the problem. The corresponding assumption for B, however, is not merely implausible but virtually impossible. Since B = F'/F = (F-r)/F = 1 - r/F, assuming B is constant is equivalent to assuming r/F is constant. But F is only equal to the fine paid in the case of risk neutral criminals; otherwise it is the certainty equivalent of the lottery p,f divided by p. For a risk preferring criminal, F is less than f and increases more slowly than f, the exact relation between them depending on the details of the criminal's taste for risk (corresponding to the details of his von Neumann-Morgenstern utility function for income); for a risk averter F is greater than f, and again the details of the relation depend on the details of the utility function. In order for r/F, and hence B, to be a constant the fine collected must vary with p and f in such a way as just to cancel the varying punishment costs associated with imposing different lotteries on criminals who are not risk neutral. In other words, the fraction of the fine that can be collected from a criminal must depend, in a particular detailed way, on his taste for risk. In the case of changes in p and f that leave pf fixed, r/f must increase with increasing f if criminals are risk averse and decrease with increasing f if criminals are risk preferring!
What happens if we retain the assumption in Becker's original form? Suppose that b=1-(r/f) is constant. Consider a pair p,f which purport to minimize L as specified in equation 2. In order to avoid the corner solution, we require B to increase as p decreases and f increases, E constant. But B=1- (1-b)f/F. Since b is assumed constant, B increasing corresponds to f/F decreasing if b<1 and to f/F increasing if b>1. But f/F=pf/pF=pf/E. So if f/F decreases as f increases with E constant, the ratio of the certainty equivalent of the lottery to its expected value is increasing with increasing risk, which means that the criminal is risk averse; if f/F increases with f, the criminal is risk preferring. It follows that if b is a constant other than 1, avoiding the corner solution requires criminals to be risk averse if b<1 and risk preferring if b>1.[4]
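The direction of the change in B can be illustrated with a small sketch. The square-root utility function and the wealth level are assumptions, as are the two lotteries, which were picked because they give this hypothetical risk averse criminal the same certainty equivalent E=19. The sketch evaluates B = 1 - (1-b)f/F at both lotteries for one value of b below 1 and one above.

```python
import math

WEALTH = 100.0  # hypothetical initial wealth

def amount_of_punishment(p, f, u, u_inv):
    """F = E/p, where E is the certainty equivalent of losing f with probability p."""
    e = WEALTH - u_inv(p * u(WEALTH - f) + (1 - p) * u(WEALTH))
    return e / p

def big_b(b, p, f, u, u_inv):
    """B = 1 - (1 - b) * f / F, which follows from r = (1 - b) * f and B = 1 - r / F."""
    return 1.0 - (1.0 - b) * f / amount_of_punishment(p, f, u, u_inv)

averse = (math.sqrt, lambda v: v * v)   # assumed u(w) = sqrt(w) and its inverse

# Two lotteries with the same certainty equivalent E = 19 for this criminal:
# (p, f) = (0.5, 36) and (0.25, 64).
for b in (0.5, 1.5):
    before = big_b(b, 0.5, 36.0, *averse)
    after = big_b(b, 0.25, 64.0, *averse)
    print(f"b={b}: B goes from {before:.3f} to {after:.3f}")
```

For this risk averse criminal f/F falls as p is lowered along constant E, so B rises when b<1 and falls when b>1; a risk preferring utility function would reverse both directions.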
It appears that I have restored Becker's conclusion in a slightly modified form; criminals are either risk preferrers or risk averters according to whether the cost to the court system of imposing a punishment is positive (imprisonment) or negative (fine). But earlier I claimed that Becker's argument is wrong in form as well as in substance, and while correcting the substance I have retained the form. I have shown that if b is constant and greater than 1 (less than 1) and criminals are not risk preferrers (averters) there is no pair p>0, f that minimizes L. In order to conclude that criminals are risk preferrers (averters) I would have to show that as we move towards the corner at p=0, f=∞ criminals who are not initially risk preferrers (averters) must become so. That I cannot do. It is time to look more carefully at the other assumption--that b is constant.
Consider the choice of punishments available to the court system for different values of f, the equivalent fine. If f is small it can be imposed as a fine, with r>0 and possibly r=f. As f becomes larger more and more criminals become judgment proof; the fine must be replaced or supplemented by less efficient punishments such as execution (r=0) or imprisonment (r<0). Hence we would expect b=1-(r/f) to increase as f increases; higher punishments are less efficient.[5]
The argument can be made more rigorous if we allow the punishment itself to be a lottery. Suppose there exist two punishments f and g, g>f, for which inefficiency does not increase with increasing punishment; b(g)<b(f). Since g>f>0, there exists some lottery p',g, such that the criminal is indifferent between p',g and a certainty of f. Since, for reasons discussed in Friedman (1980), risk can be generated almost costlessly, the court system can impose the punishment p',g with b(p',g)=b(g) instead of f.[6] It follows (if we neglect the cost of installing a roulette wheel in the courthouse) that as long as the court system chooses the most efficient way of imposing any level of punishment, b is a non-decreasing function of f.
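The probability p' in this construction can be computed directly once a utility function is given. The sketch below assumes a wealth level and three illustrative utility functions, none of which come from the text. It solves u(W-f) = p'u(W-g) + (1-p')u(W) for p'.

```python
import math

WEALTH = 100.0  # hypothetical initial wealth

def equivalent_probability(f, g, u):
    """Probability p' at which losing g is exactly as bad as losing f for certain.

    Solves u(W - f) = p' * u(W - g) + (1 - p') * u(W) for p'.
    """
    return (u(WEALTH) - u(WEALTH - f)) / (u(WEALTH) - u(WEALTH - g))

f, g = 36.0, 64.0
utilities = {
    "risk neutral": lambda w: w,
    "risk averse": math.sqrt,
    "risk preferring": lambda w: w * w,
}
for name, u in utilities.items():
    print(f"{name:16s} p' = {equivalent_probability(f, g, u):.4f}   (f/g = {f / g:.4f})")
```

A p' strictly between 0 and 1 exists whenever g>f>0 and utility is increasing in wealth, which is all the construction requires; the risk attitude only moves p' above or below f/g.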
Once we replace the assumption that b is constant with the more plausible assumption that it increases with f, the entire argument for why criminals should be risk preferring or averse collapses. If we decrease p and increase f the cost of catching criminals goes down while the cost of punishing them goes up; for marginal changes about the optimal pair p,f the two effects just cancel. If criminals are risk preferring the optimal values will be different, ceteris paribus, than if they are risk averse or risk neutral, but as long as b increases sufficiently fast when f gets large there will always be an interior solution.
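A minimal sketch of such an interior solution follows. It assumes risk neutral criminals, a linear b(f)=f/F0, an enforcement cost proportional to p, and a fixed expected punishment E=pf; all of these forms and numbers are hypothetical. Because E is held fixed, O and D are constant and are left out, so the sketch only searches along one compensated curve rather than solving the full optimization.

```python
# Illustrative sketch: risk-neutral criminals and assumed functional forms.
E = 4.0          # expected punishment p*f, held fixed so that O is fixed
C_PER_P = 50.0   # assumed enforcement cost per offense per unit of p
F0 = 8.0         # assumed scale: b(f) = f / F0 rises with f

def b(f):
    """Inefficiency of punishment, assumed to increase linearly with f."""
    return f / F0

def loss_per_offense(p):
    """Enforcement plus punishment cost per offense when f = E / p."""
    f = E / p
    return C_PER_P * p + b(f) * p * f

grid = [i / 1000 for i in range(1, 1001)]   # candidate probabilities in (0, 1]
best_p = min(grid, key=loss_per_offense)
print(f"optimal p = {best_p:.3f}, optimal f = {E / best_p:.1f}")
```

With these assumptions the cost per offense reduces to 50p + 2/p, which is minimized at an interior p: lowering p further saves enforcement cost but raises punishment cost by more.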
The conclusion of my argument is that the analysis of optimal punishment tells us nothing about whether criminals will exhibit risk preference, aversion, or neutrality under an optimal system. In fairness to Becker, I must add that at one point in Becker (1968) he considers the possibility that the loss function might be increased by a "compensated" reduction in p, and that if so that could provide a different way out of the corner. What he does not seem to realize is that if criminals have preferences with regard to risk those preferences must be included in the social loss function in order to make the rankings it implies consistent with those implied by Pareto superiority, and that doing so automatically makes the loss function increase or decrease with compensated reductions in p, according to the risk preferences of the criminals. Similarly, Becker notes that b=0 for fines and b>1 for many other punishments, but he does not consider the consequence of that for his argument on risk aversion.
References
Becker, Gary, "Crime and Punishment: An Economic Approach," 76 JPE 169 (1968).
Block, Michael K. and Lind, Robert C., "Crime and Punishment Reconsidered," JLS 1975
Carr-Hill, R. A. and Stern, N. H., "Theory and Estimation in Models of Crime and its Social Control and Their Relations to Concepts of Social Output," in The Economics of Public Services, M. S. Feldstein and R. P. Inman, editors. Macmillan, 1977.
Friedman, David D., "Why There Are No Risk Preferrers," 89 JPE 600 (1981).
Friedman, David D., "Reflections on Optimal Punishment, or: Should the Rich Pay Higher Fines," Research in Law and Economics 3, 185 (1981).
Polinsky, A. Mitchell and Shavell, Steven, "The Optimal Tradeoff Between the Probability and Magnitude of Fines," AER 69 (1979).