Should the Characteristics of Victims and Criminals Count?

Payne v Tennessee and Two Views of Efficient Punishment

Boston College Law Review

XXXIV No.4

pp.731-769

(July 1993)

David D. Friedman

Olin Fellow in Law and Economics

The University of Chicago Law School

Chicago, IL 60637[1]

The purpose of this paper is to investigate two interrelated issues. One is the question of how to use economic theory to construct an efficient set of criminal punishments: I will argue that a simple rule (set expected punishment [2] equal to the damage done by the crime) provides a useful first approximation, but only a first approximation, to the correct answer. The other question is how, if at all, punishment should be affected by the characteristics of criminal and victim. In answering that question, I hope to demonstrate both the usefulness and the limitations of the simple version of the economic theory of punishment, and the simple rule it implies, for dealing with several of the issues raised in a recent and controversial Supreme Court case, Payne v. Tennessee.[3]

Part I of the article attempts to work out the economics of efficient punishment. Part II applies the analysis to the question of whether punishment ought to be affected by characteristics of the criminal: whether, for example, rich criminals should pay larger fines than poor criminals for the same crimes. Part III applies it to the parallel question raised in Payne: whether punishment ought to be affected by characteristics of the victim. Part IV expands, from an economic viewpoint, on one issue raised by Payne: the possibility of varying punishment according to consequences, in order to selectively deter criminals who have some but not perfect knowledge of what the consequence of their crime will be. Part V considers a constitutional issue raised by Payne and by the analysis of this article: whether making punishment depend on the characteristics of the victim violates the requirement of equal protection, applied not to criminals but to victims. Part VI considers a problem in moral philosophy raised by Payne and this article: whether it is just to make punishment depend on consequences of the crime that the criminal may not have anticipated.

Part I: The Economics of Efficient Punishment

A legal system may be evaluated in a variety of ways by economists, legal scholars, or moral philosophers. Through most of this article, however, I shall assume that it has only one purpose: economic efficiency.[4] I view a legal system as a set of rules designed to affect behavior; a change in legal rules is an improvement if the summed benefits to those affected, measured by their money equivalent, are larger than the summed losses, where the money equivalent of a benefit or loss is the largest sum the affected party would pay to receive the benefit or avoid the loss.

Seen from this perspective, what is wrong with crimes is that they occur even if they are inefficient. My willingness to buy a television set demonstrates that it is worth more to me than to its present owner, so a voluntary sale is an improvement and should be permitted. But I may be willing to steal a television set even if it is worth much less to me than to its present owner. So inefficient theft may occur, and should be prevented.[5]

It seems to follow from this argument that we want to prevent only inefficient theft. If my stealing your television set produced a net benefit, even after allowing for associated costs (my time burgling, your expenses on burglar alarms), then changing the legal system to permit me to steal it would be an improvement.[6] One way of doing so is to set the expected punishment equal to the damage done. The criminal will commit the crime only if the value to him is greater than his expected punishment, hence greater than the damage, so efficient crimes and only efficient crimes will prove worth committing. Many discussions of efficient punishment argue either that this is what our legal system does or that it is what it should do.[7]

According to this view, a punishment for a crime[8] is simply a Pigouvian tax; like an emission fee for pollution, it forces an actor to bear the cost of his action. If a criminal commits a crime even though he knows that he will suffer an expected punishment equal to the damage done by the crime, that demonstrates that his benefit is greater than the victim's loss; the crime is efficient and ought not to be deterred.[9]

If this is right, the optimal expected punishment is simply equal to the damage done. All potential offenders whose benefit from committing the offense is less than the damage done will be deterred; they will face a punishment greater than their benefit, making the return from the offense (benefit minus punishment) a net loss. Potential offenders for whom the benefit is greater than the damage done will face a punishment less than their benefit, making the offense a net gain, so they will commit it. Inefficient offenses are deterred, efficient offenses are not deterred, so we have the efficient outcome.
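The logic of this simple rule can be illustrated with a minimal sketch. The function names and the four benefit figures are hypothetical, chosen only for illustration; the rule itself (an offender commits the offense only if his benefit exceeds the expected punishment) is the one stated above.

```python
def commits_offense(benefit: float, expected_punishment: float) -> bool:
    """An offender goes ahead only if benefit minus punishment is a net gain."""
    return benefit > expected_punishment


def is_efficient(benefit: float, damage: float) -> bool:
    """An offense is efficient if the offender gains more than the victim loses."""
    return benefit > damage


DAMAGE = 1000.0
EXPECTED_PUNISHMENT = DAMAGE  # the simple rule: punishment equal to damage done

# Hypothetical benefits to four potential offenders.
hypothetical_benefits = [400.0, 900.0, 1200.0, 2500.0]

committed = [b for b in hypothetical_benefits
             if commits_offense(b, EXPECTED_PUNISHMENT)]
efficient = [b for b in hypothetical_benefits if is_efficient(b, DAMAGE)]

# Under the simple rule the two lists coincide.
print(committed == efficient)
```

With expected punishment set equal to damage, the offenses committed are exactly the efficient ones; the sketch deliberately leaves out any cost of imposing the punishment.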

If we are talking about speeding tickets, this sounds plausible enough. Presumably one reason we do not confiscate the cars of convicted speeders is that that might be too effective a punishment; we are not sure we want everyone always to keep to the speed limit. When applied to offenses such as rape or murder, however, the efficient crime paradigm of enforcement strikes many legal scholars, especially those who are not economists, as both unrelated to the real legal system and morally bizarre. It implies, among other things, that the reason we do not impose stiffer penalties on convicted murderers is that we are afraid of having too few murders.[10] It also seems to imply that the optimal punishment for murder is an increasing function of the value of the victim-an issue that was central to the recent Supreme Court case of Payne v. Tennessee.

The Inefficiency of Preventing All and Only Inefficient Crimes

The argument given above for setting expected punishment equal to damage done is wrong. The reason it is wrong is that it ignores the cost of preventing crime.[11] In order to impose a given expected punishment, we must catch some fraction of offenders and punish them. Both activities are costly. Typically, the cost per offense increases with both probability of apprehension and severity of punishment.[12]

It is obvious why the cost per offense increases with probability of apprehension; it takes more police to catch fifty murderers out of a hundred than to catch only twenty-five, and it takes more prosecutors and court time to convict them. To see why it also increases with the severity of the punishment, it is worth thinking a little about what, from an economic standpoint, the "cost of punishment" means.

Suppose the punishment for an offense simply consists of the convicted offender paying a thousand dollar fine to the state. The cost to the criminal, which is what gives the punishment its deterrent effect, is a thousand dollars. But the net cost, what economists call "social cost," is zero. Every dollar the criminal loses the state collects. In this case punishment cost, defined as the difference between the cost the punishment imposes on the criminal and the benefit it provides to others,[13] is zero.

What if the criminal cannot pay a fine high enough to provide the amount of deterrence we want to impose? In that case, instead of (or in addition to) fining him, we imprison him, say for a year. Suppose a year's imprisonment is equivalent, from his standpoint, to a ten thousand dollar fine.[14] The cost the punishment imposes on him is ten thousand dollars, but the enforcement system receives none of that. Instead, the enforcement system must spend money, say another ten thousand dollars, to pay the cost of his imprisonment. So the net cost of the punishment, the criminal's loss plus the enforcement system's loss, is twenty thousand dollars.

As we increase the size of the punishment we wish to impose, the number of offenders who can pay it as a fine decreases, forcing us to shift to more costly punishments such as imprisonment. So increasing the severity of the punishment typically increases the punishment cost per offense punished.[15]
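The fine and imprisonment figures above can be checked with a short sketch. The function name is hypothetical; the definition (cost imposed on the criminal minus benefit provided to others) and the dollar amounts are taken from the text.

```python
def punishment_cost(cost_to_criminal: float, benefit_to_others: float) -> float:
    """Social cost of a punishment: what the criminal loses minus what others gain."""
    return cost_to_criminal - benefit_to_others


# A $1,000 fine: the state collects every dollar the criminal loses.
fine_cost = punishment_cost(cost_to_criminal=1_000, benefit_to_others=1_000)

# A year in prison valued by the criminal at $10,000: the state collects
# nothing and spends another $10,000, so its "benefit" is negative.
prison_cost = punishment_cost(cost_to_criminal=10_000, benefit_to_others=-10_000)

print(fine_cost, prison_cost)
```

Treating the state's outlay on imprisonment as a negative benefit lets a single definition cover both the fine and the prison term.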

It is inefficient for me to steal a television set that is worth five hundred dollars to you and only four hundred dollars to me. But it is still more inefficient to prevent me from stealing the set if the cost of doing so is two hundred dollars additional expenditure on police, courts, and prisons. The rule "prevent all inefficient offenses and only inefficient offenses" is correct only if doing so is costless. The economically correct rule is to prevent an offense if and only if the net cost from the offense occurring is greater than the cost of preventing it. It follows that if there is a positive cost to preventing an offense, an efficient legal system will let some inefficient offenses occur.
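The television example can be written as a one-line decision rule. This is only a sketch: the function name is hypothetical, and the $50 prevention cost in the second call is an invented contrast case, not a figure from the text.

```python
def should_prevent(victim_loss: float, offender_gain: float,
                   prevention_cost: float) -> bool:
    """Prevent an offense only if its net cost exceeds the cost of preventing it."""
    net_cost_of_offense = victim_loss - offender_gain
    return net_cost_of_offense > prevention_cost


# The set is worth $500 to the owner and $400 to the thief;
# preventing the theft costs $200 in police, courts, and prisons.
tv_case = should_prevent(victim_loss=500, offender_gain=400, prevention_cost=200)

# Hypothetical contrast: the same theft when prevention costs only $50.
cheap_case = should_prevent(victim_loss=500, offender_gain=400, prevention_cost=50)

print(tv_case, cheap_case)
```

The theft is inefficient in both cases (net cost $100), but only in the second is it worth preventing.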

We now have an answer to one of the criticisms of the economic approach. The reason we do not increase the punishment for murder need not be that we are afraid we would then have too few murders. It may be, and probably is, that although we would like to prevent more murders than we do prevent (indeed, we might like to prevent all murders), the cost of doing so is more than we are willing to pay.[16]

The cost of preventing an offense may sometimes be negative. While cost per offense increases with increases in expected punishment, number of offenses decreases, since the higher expected punishment deters some offenses that would otherwise have been committed. The fewer offenses occur, the less must be spent to apprehend and punish offenders. If this second effect outweighs the increase in cost per offense, then raising the expected punishment lowers the total enforcement and punishment cost-a system with higher punishments (and fewer offenses) costs less than a system with lower punishments (and more offenses). In such a situation, the additional cost of deterring one more offense is negative, so it is efficient to prevent not only all inefficient offenses but some efficient ones as well. In the extreme, one could imagine a society where the penalty for shoplifting was death, with the result that there were no shoplifters and nobody ever had to be caught, convicted, and executed.

As a less extreme example of a situation where the cost of preventing an offense is negative, consider an offense with the following characteristics:

Cost to victim: $1000 per offense

Cost per offense (enforcement plus punishment costs) of imposing an expected punishment of P: P/2 ($500 if P=$1000, $550 if P=$1100)

Number of offenses if P=$1000: 100/year

Number of offenses if P=$1100: 50/year

To simplify the exposition, let the probability of conviction be one, making expected punishment P equal to actual punishment.

We begin with a penalty of $1000; a hundred offenses are occurring each year. They are all efficient offenses; the fact that they are committed despite the penalty means that the offenders are getting more than $1000 by committing them, so the offenders gain more than the victims lose.

If we raise the penalty to $1100 we will deter fifty efficient offenses a year. Each would have harmed the victim by $1000 and benefitted the criminal by something between $1000 and $1100. We know that the benefit to the criminal is at least $1000 because he still commits the offense even when the expected punishment is $1000. We know it is no more than $1100 because he does not commit it when the expected punishment is $1100. The net gain from each of those offenses is between zero and $100, so the loss from deterring fifty of them is between zero and $5000.

But by deterring those offenses, we save the cost of catching and punishing the offenders. Imposing a $1000 punishment on 100 criminals costs us $50,000. Imposing an $1100 punishment on 50 criminals costs $27,500. By raising the punishment we have saved $22,500 in punishment and enforcement cost. On net we are better off.[17]

The situation is shown by Table 1; everything except punishment is per year. Cost to victims is $1000 (the injury per victim) times the number of offenses. Gain to criminals is the value to them of committing the offenses. Net cost is the loss to victims plus enforcement and punishment cost minus the gain to the criminals; our objective is to minimize it.

X is the total gain to the criminals if the punishment is $1000 and 100 offenses occur each year. The value of the offense to the 50 offenders who would be deterred if we raised the punishment to $1100 is between $1000 and $1100. For simplicity, assume it is $1050, making the total value to the criminals of the 50 offenses equal to $52,500. So the total gain to the criminals falls to X-$52,500 when we raise the punishment to $1100, as shown in the table.

Table 1

Expected Punishment	Number of Offenses	Cost to Victims	Gain to Criminals	Enforcement and Punishment (E&P) Cost	Net Cost not Including E&P Cost	Net Cost
$1000	100	$100,000	X	$50,000	$100,000-X	$150,000-X
$1100	50	$50,000	X-$52,500	$27,500	$102,500-X	$130,000-X

If we increase the punishment from $1000 to $1100, gain to criminals falls by more than cost to victims, since we are deterring efficient offenses, so net cost not including enforcement and punishment cost is higher with the higher punishment, as shown in the next to last column. But that is more than balanced by the drop in enforcement and punishment costs, so net cost, the final column, is lower with the higher punishment.
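The arithmetic of the example can be reproduced with a short sketch. Two things here are assumptions rather than figures from the text: the cost per offense of P/2, which is what the stated totals ($50,000 for 100 offenses at $1000, $27,500 for 50 offenses at $1100) imply, and the value X = $120,000, a hypothetical number used only so the code can run. The comparison between the two punishments does not depend on X.

```python
def yearly_costs(punishment: float, offenses: int, gain_to_criminals: float,
                 damage_per_offense: float = 1000.0) -> dict:
    """Per-year totals for one level of expected punishment."""
    cost_to_victims = damage_per_offense * offenses
    # Assumed cost per offense of P/2, inferred from the totals in the text.
    enforcement_cost = (punishment / 2) * offenses
    net_excluding_enforcement = cost_to_victims - gain_to_criminals
    return {
        "cost_to_victims": cost_to_victims,
        "enforcement_cost": enforcement_cost,
        "net_excluding_enforcement": net_excluding_enforcement,
        "net_cost": net_excluding_enforcement + enforcement_cost,
    }


X = 120_000.0  # hypothetical total gain to criminals at the $1000 punishment

low = yearly_costs(punishment=1000.0, offenses=100, gain_to_criminals=X)
# The 50 deterred offenders are assumed to value the offense at $1050 each.
high = yearly_costs(punishment=1100.0, offenses=50,
                    gain_to_criminals=X - 50 * 1050.0)

saving_from_higher_punishment = low["net_cost"] - high["net_cost"]
print(saving_from_higher_punishment)
```

Because X enters both rows with the same sign, it cancels in the difference: the higher punishment lowers net cost by $20,000 whatever X is.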

The table does not show the cost to the criminals of paying the punishment. If included, it would appear twice, once as a cost and once as a benefit, and so have no effect on the net cost. It is a cost to the criminals; if the punishment is $1000, the net gain to the criminals is only X-$100,000, since they are paying $100,000 in fines as punishment for their offense. It is a benefit to the enforcement system that collects the fines. Punishment cost is the difference between what the criminal pays and what the enforcement system receives. With a punishment of $1000, for example, the enforcement system receives $100,000, $50,000 of which goes to pay the cost of catching and punishing criminals.

We have now seen an example of a situation in which it is efficient to set an expected punishment higher than the damage done by the offense, thus deterring some efficient offenses. Generalizing the argument, we can show that the level of expected punishment should be set equal to the damage done by an offense only if the marginal cost of deterring one more offense is zero. If the marginal cost of deterring one more offense is positive, then expected punishment should be less than damage done; offenses that are only slightly inefficient, that injure the victim by only a little more than they benefit the criminal, are not worth the cost of deterring. We expect marginal cost of deterrence to be positive for crimes for which an increase in expected punishment deters only a small fraction of offenses, so that we end up with almost as many offenses as before the increase and a substantially larger enforcement and punishment cost per offense. Such crimes are described, in economic terms, as in very inelastic supply.

If, on the other hand, the marginal cost of deterring one more offense is negative, if the sum of enforcement and punishment costs decreases as we increase the level of punishment, due to the decrease in the number of offenses to be punished, then the level of punishment should be more than damage done. In such a situation, as in the example of Table 1, we are willing to deter a few efficient offenses in order to avoid the cost of punishing them. We would expect that situation to occur for crimes for which a small increase in expected punishment produces a large reduction in offenses, that is, crimes in very elastic supply. For such crimes the large reduction in number of offenses as we increase the punishment outweighs the increase in enforcement and punishment cost per offense.[18]

This solution to the problem of setting optimal punishments combines elements of two different intuitions: punishment equal to damage done and enough punishment to deter. If imposing punishment is inexpensive, the optimum is about equal to damage done-enforcement and punishment costs are unimportant, so we simply design our system to deter all inefficient and only inefficient crimes. If the supply of offenses is highly elastic at some particular level of punishment, so that below that level there are many offenses and above it very few, the optimal punishment is at the point where any further increase would have very little deterrent effect to balance its cost-just enough punishment to deter most offenses.

Efficient Punishment: A Formal Treatment

The same argument can be put in a more precise mathematical form as follows. We define:

\(o(b)\): the density of offenses per year as a function of the gain \(b\) to the offender of committing the offense.[19]

\(O(P) = \int_P^\infty o(b)\,db\): the number of offenses per year whose perpetrators gain more than \(P\) by committing them. Since an offense will be committed only if the gain is at least as great as the expected punishment, \(O(P)\) is the number of offenses that occur annually if the expected punishment is \(P\).

\(C(P)\): the cost per offense of imposing an expected punishment \(P\), using the least costly combination of actual punishment and probability. I assume that this does not depend on the number of offenses.

\(D\): the damage done per offense. For simplicity this too is assumed independent of the number of offenses.

We wish to find \(P^*\), the expected punishment which minimizes a social cost function:

\[ SC(P) = O(P)\,[D + C(P)] - \int_P^\infty b\,o(b)\,db \qquad \text{(Equation 1)} \]

The first term on the right hand side is the cost of crime: number of offenses multiplied by damage per offense plus enforcement cost (the cost of catching, convicting, and punishing offenders) per offense. The second term is the benefit of offenses to the offenders. The integral starts at \(b=P\) because only crimes for which benefit to the criminal is at least equal to expected punishment will be committed.

Setting the derivative of \(SC(P)\) with regard to \(P\) equal to 0, we have, for \(P\) equal to its optimum value \(P^*\):

\[ 0 = -D\,o(P^*) + \left.\frac{d}{dP}\big[O(P)C(P)\big]\right|_{P=P^*} + P^*\,o(P^*) = o(P^*)\,[P^* - D] + \left.\frac{d}{dP}\big[O(P)C(P)\big]\right|_{P=P^*} \]

Solving for the optimal punishment we have:

\[ P^* = D - \frac{1}{o(P^*)}\left.\frac{d}{dP}\big[O(P)C(P)\big]\right|_{P=P^*} \qquad \text{(Equation 2)} \]

Equation 2 is the mathematical equivalent of the result derived in the earlier verbal argument. \(O(P)C(P)\) is the total cost of imposing an expected punishment of \(P\) on \(O(P)\) offenses. Deterring one more offense requires an increase in \(P\) of \(1/o(P)\), so \(\frac{1}{o(P)}\frac{d}{dP}[O(P)C(P)]\) is the cost of deterring one more offense. If \(\frac{d}{dP}[O(P)C(P)] > 0\) at \(P = P^*\), then total enforcement cost is increasing with increasing punishment, and, as can be seen from Equation 2, the optimal punishment is less than the damage done. If \(\frac{d}{dP}[O(P)C(P)] < 0\) at \(P = P^*\), then total enforcement cost is decreasing with increasing punishment (due to the decrease in the number of offenses) and the optimal punishment is more than the damage done.[20]
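The first-order condition can be checked numerically. What follows is a sketch under assumptions that are not in the text: an exponential density of offender gains, written here as o(b) = (N/s)exp(-b/s), so that the number of offenses is O(P) = N exp(-P/s), and an enforcement cost per offense of C(P) = cP. The constants N, s, and c are hypothetical. A small s makes the supply of offenses elastic around the damage level; a large s makes it inelastic. The code minimizes social cost directly and then evaluates the right-hand side of the condition P = D - (d[O(P)C(P)]/dP)/o(P) at the minimum.

```python
import math

N = 1000.0        # hypothetical scale of the offender population
D = 1000.0        # damage per offense
COST_SHARE = 0.5  # hypothetical: C(P) = COST_SHARE * P


def o(b: float, s: float) -> float:
    """Assumed density of offenses by offender gain b (exponential, scale s)."""
    return (N / s) * math.exp(-b / s)


def O(P: float, s: float) -> float:
    """Offenses per year with gain above P: the integral of o(b) from P upward."""
    return N * math.exp(-P / s)


def C(P: float) -> float:
    """Assumed enforcement and punishment cost per offense."""
    return COST_SHARE * P


def offender_gains(P: float, s: float) -> float:
    """Closed form of the integral of b * o(b) from P to infinity."""
    return N * math.exp(-P / s) * (P + s)


def social_cost(P: float, s: float) -> float:
    """Damage plus enforcement cost on offenses committed, minus offenders' gains."""
    return O(P, s) * (D + C(P)) - offender_gains(P, s)


def argmin(f, lo: float, hi: float, tol: float = 1e-4) -> float:
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    ratio = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        x1 = b - ratio * (b - a)
        x2 = a + ratio * (b - a)
        if f(x1) < f(x2):
            b = x2
        else:
            a = x1
    return (a + b) / 2


def first_order_rhs(P: float, s: float, h: float = 1e-3) -> float:
    """D minus (1/o(P)) times a central-difference derivative of O(P)C(P)."""
    d_total = (O(P + h, s) * C(P + h) - O(P - h, s) * C(P - h)) / (2 * h)
    return D - d_total / o(P, s)


p_elastic = argmin(lambda P: social_cost(P, 400.0), 0.0, 5000.0)
p_inelastic = argmin(lambda P: social_cost(P, 1500.0), 0.0, 5000.0)

print(p_elastic, first_order_rhs(p_elastic, 400.0))
print(p_inelastic, first_order_rhs(p_inelastic, 1500.0))
```

Under these assumptions the condition has the closed form P = (D - cs)/(1 - c). That gives roughly 1600, above the damage of 1000, in the elastic case (s = 400), and roughly 500, below the damage, in the inelastic case (s = 1500), matching the signs described in the text.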

Wrong Argument, Right Answer?

My analysis so far implies that the simple description of efficient punishment is wrong. If our objective were economic efficiency, we would not, even if we could, choose to punish all inefficient offenses and only inefficient offenses.

Although this way of looking at efficient punishment is wrong, it is also useful. It provides a simple model that can be applied to a wide range of legal regulation of behavior. At some extremes, the model's description is a deceptive one, as when it implies that we are concerned about not deterring too many murders. But for much behavior (speeding tickets, pollution charges, library fines, arguably most of civil law), preventing inefficient behavior is a fairly good, although somewhat oversimplified, description of our objective.

Even in cases, such as murder, where the literal application of the model may seem absurd, it still contains a considerable element of truth. The limiting factor in how many murders we deter is not our fear of deterring efficient murders. But, seen from the standpoint of economic efficiency, the reason we are willing to bear substantial costs in order to deter murder is that we believe it is (very) inefficient-that the gain to the murderer is typically much less than the loss to his victim.

Even those who reject economic efficiency as a complete description of the objectives of our legal system should not reject it as a partial description. It may well be true that we would want to deter all murders (supposing we could do so costlessly) even if we believed that some were, in the strict economic sense, efficient.[21] But we would be a great deal less concerned with deterring murders if we did not believe that the costs of murder were large compared to the benefits.

Consider the following as evidence for that claim. There have been several famous shipwreck cases involving murder and cannibalism.[22] People who write and think about such cases find the punishment of such behavior much more troubling than the punishment of ordinary murder. That suggests that if the benefit of committing murder were much higher relative to the cost, if situations where an individual could preserve his own life only at the cost of someone else's were common, we might have substantially different attitudes toward murder.

Alternatively, imagine a society where everyone regarded life after death (perhaps reincarnation) as a proven fact. To the members of that society the cost imposed by a murderer on his victim would seem much lower than it does to most of us. I conjecture that in such a society murder would be considered less serious relative to other crimes (more nearly comparable to, say, grand larceny) than it is in ours.[23]

So one reason the efficient crime model is useful is that it provides a simple picture that helps unify our view of legal sanctions. Its simplicity is an advantage, especially for expository purposes, over the more complicated, more correct, and more general model that I have set out above. It is also an advantage over models that treat specific moral judgements, such as our opposition to murder or theft, as givens, rather than as conclusions to be derived from more general considerations such as economic efficiency. Its generality is an advantage over the alternative of considering separately offenses that we do not deter because they are efficient (some speeding) and offenses that we do not deter because it would cost too much to do so (some murders).

A second reason why the model is a useful one is that although it is quantitatively wrong, it is, for a wide range of cases, qualitatively right. It does not tell us what the punishment for any particular offense should be. But it does tell us, in most cases correctly, in what direction changes in the characteristics of the offense will move the optimum punishment.

If we actually used our theory to pick out offenses that should or should not be deterred, the two models would give different results. But in practice that is not how we use our theory, because we usually do not have an accurate measure of the benefit to the offender or the cost to the victim. Rather, we use the theory to produce qualitative conclusions, to argue, for instance, that certain offenses or certain offenders will, in an efficient system, be punished more severely than others.[24] As we will see, arguments of this sort can be transferred intact from the first model (efficient crimes) to the second. The quantitative conclusions change and additional factors become relevant, but the qualitative argument remains.

I have spoken in the abstract of how moving from one model to the other affects the conclusions. The rest of this essay provides a series of examples, showing both how an argument formulated in terms of the prevention of inefficient crimes remains relevant under the more sophisticated analysis and how the change introduces additional factors that might change the conclusion. I start with the question of how punishment should be affected by the income of the offender, and then go on to consider how it should be affected by the characteristics of the victim-the central issue in Payne v. Tennessee.

Part II: Should the Rich Pay Higher Fines or Receive Shorter Sentences?

In an article published a few years ago in the Journal of Political Economy,[25] John Lott argued that the tendency of our legal system to produce lower probabilities of conviction for higher income defendants is evidence for, not against, the economic efficiency of the criminal justice system.[26] His analysis used a model of efficient law enforcement in which expected punishment was set at a level designed to deter only inefficient crimes. His argument may be summarised as follows:

A month in jail, or a week in court, represents a larger dollar cost to someone with a higher income; measured in money, his time is more valuable. If rich defendants receive the same jail sentences with the same probability as poor defendants, then they are actually paying a higher (dollar) penalty. If the efficient penalty is equal to the damage done, it should be the same for rich and poor. It follows that an efficient legal system will either impose lower (non-money) penalties on richer defendants or impose them with lower probability. Our legal system does in fact impose lower expected (non-money) punishments on richer defendants; that is evidence in favor of the thesis that our system is economically efficient.
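The step from "equal dollar penalty" to "unequal jail time" can be made concrete with a small sketch. The function name and all of the figures (a $10,000 penalty, and a day in jail costing the offender $50 or $500) are hypothetical, and the sketch assumes the dollar cost of jail is simply proportional to the number of days.

```python
def jail_days_for_penalty(dollar_penalty: float, dollar_cost_per_day: float) -> float:
    """Days in jail that impose a given dollar-equivalent penalty on an offender."""
    return dollar_penalty / dollar_cost_per_day


PENALTY = 10_000.0  # hypothetical efficient penalty, equal to the damage done

# Hypothetical dollar value each offender places on a day of his time.
poor_days = jail_days_for_penalty(PENALTY, dollar_cost_per_day=50.0)
rich_days = jail_days_for_penalty(PENALTY, dollar_cost_per_day=500.0)

print(poor_days, rich_days)
```

On these numbers the same dollar penalty corresponds to 200 days for the lower income offender and 20 days for the higher income one, which is the pattern the argument predicts for an efficient system.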

How does the inclusion of punishment costs affect the conclusion that richer people should receive lower expected jail sentences? In the simplest case, it does not. If all the relevant functions (cost of apprehension, cost of punishment, and elasticity of the supply of offenses) are the same for rich and poor, then Lott's argument goes through in this more complicated case. The optimal expected punishment is a particular amount of money, hence fewer days in jail (or an equal fine) for people with higher incomes.

Intuitively, that result makes sense. In Lott's model, imposing equal jail terms on rich and poor would mean either that rich people were being charged more than the damage done by their offenses (and hence that some efficient crimes were being deterred) or that poor people were being charged less than the damage done (and hence that some inefficient crimes were occurring). In my model, equal jail terms would mean that the marginal offense committed by a rich man, while perhaps inefficient, would be less inefficient than the marginal offense committed by a poor man-hence less worth the cost of deterring. Both models imply equal fines for rich and poor, or unequal jail sentences.

The assumption that the functions are independent of income is, however, an implausible one, for several reasons.[27] One, at least, brings us back to one of the intuitions of those who believe that rich and poor should receive the same jail sentences-and that the rich should pay higher fines.[28] The supply function for offenses shows the number of offenses as a function of the expected punishment. If punishments are in money, and rich and poor people have different values for money, we would expect the deterrent effect of a given punishment to vary with income.[29]

To make the argument more rigorous, it is worth distinguishing between two sorts of offense-those that have a roughly equal payoff in utility for rich and poor and those that have a roughly equal payoff in money. Stealing $100 provides the same amount of money to a rich man as to a poor man, so we would expect that the same fine would deter it. Indeed, since the time of the rich man is worth more dollars per hour than that of the poor, we would expect that if they are equally good thieves, so that it takes each the same amount of time to steal $100, the rich man would be deterred by a lower fine than the poor.

Consider, however, an offense whose payoff, measured in money, is higher for richer offenders. One example would be saving ten minutes by speeding; another would be slugging someone you were mad at. The money value of the offense is higher to the richer offender, so it will require a higher (money) punishment to deter him.

Whether this implies a higher efficient punishment depends, in a somewhat complicated way, on the shape of the supply function for offenses and the related cost functions for deterrence.[30] Where the efficient rule comes close to "impose just enough punishment to deter all offenders," then the efficient system would impose higher (dollar) punishments on higher income offenders, since higher punishments are needed to deter them. The opposite result occurs if imposing the high expected punishment necessary to deter high income offenders is so costly that it is not worth deterring those crimes.

So far, the only difference between high and low income offenders I have considered is in the supply function for offenses. There is a second difference with less ambiguous implications. A fine is a more efficient punishment than a prison term, and richer offenders can pay higher fines. Even if neither offender can pay a sufficiently high fine, imposing a given dollar punishment via imprisonment requires fewer days in jail for a higher income offender, and is therefore cheaper. So punishment costs (per dollar of punishment) should decrease as income rises, which implies a higher efficient dollar level of punishment for richer offenders.[31]

Part III: Payne v Tennessee: Does the Value of the Victim's Life Matter?

"Today's majority has obviously been moved by an argument that has strong political appeal but no proper place in a reasoned judicial opinion. Because our decision in Lockett ... recognizes the defendant's right to introduce all mitigating evidence that may inform the jury about his character, the Court suggests that fairness requires that the State be allowed to respond with similar evidence about the victim. ... This argument is a classic non sequitur: The victim is not on trial; her character, whether good or bad, cannot therefore constitute either an aggravating or mitigating circumstance."

(Justice STEVENS, dissenting in Payne v. Tennessee)

On the face of it, Justice Stevens' argument seems compelling. Permitting the character of the victim, like the character of the defendant, to be introduced in evidence may be fair as between victim and defendant, but the victim in a criminal case is not a party to the suit. Insofar as fairness is relevant in that context, it is fairness between the defendant and the state. And, as pointed out elsewhere in the dissent, the usual policy in criminal law is to try to tilt in favor of the defendant, in order to balance the superior power of the state.

There is, however, a sense in which the Court's position is correct. If, as I have been assuming, criminal law is intended to produce an efficient outcome, then decisions such as whether to impose the death penalty involve balancing costs and benefits. One of the benefits is saving the lives of potential victims by deterring crimes that might have been committed against them.[32] One of the costs is executing criminals. The value of saving lives depends on the value of the lives saved; the cost of execution depends on the value of the life ended. So a correct decision requires the jury to balance the value of the victim's life against the value of the defendant's life.[33] To that extent, the Court is right and Justice Stevens is wrong.[34]

This does not mean that murderers should be executed if and only if their lives are deemed by the jury less valuable than their victims' lives. Executing a particular murderer will not save his victim's life-that is already lost. The jury's willingness to execute a particular murderer for killing a particular sort of victim may, however, affect how many similar murders occur in the future. So there is a tradeoff between murderers' lives and victims' lives, but not necessarily at a rate of one for one.[35]

To the extent that potential murderers know the value of the lives of their potential victims, the rule announced by the Court means that expected punishment as perceived by the offenders is an increasing function of the damage done by the offense, as efficiency requires. The murderer in Payne was aware of the fact-that his victim was a mother with two small children-that the prosecution used in persuading the jury to sentence him to death.[36] In such cases the Court's rule will tend, cæteris paribus, to increase the protection that the law provides to mothers of small children, and to other victims whose death will impose large costs on their survivors. Someone contemplating killing such a person will expect a more severe penalty, and thus be more likely to be deterred.

The rule established by Payne would also permit such evidence to be introduced in cases where the offender was not aware of the relevant facts at the time of the murder. The dissent argued that this feature of the rule violated the eighth amendment, since it could make the application of the death penalty depend on something irrelevant to the wickedness of the murderer's act.[37] A similar argument could be made from an economic standpoint. If the murderer does not know the value of his victim's life, then selective punishment will not provide selective deterrence. Even if the murderer knows that he will be punished more severely for killing certain kinds of victims, he does not know whether his potential victim is one of them.[38]

In this case, however, the Court's position can be defended in a slightly different way. Even if all victims are identical, so that the issue of selective deterrence does not arise, we still have the problem of deciding what the penalty for murder should be. A more severe penalty imposes larger costs on convicted murderers in order to deter crimes and reduce the cost to potential victims. Where the decision is whether to impose capital punishment for murder, the jury is deciding whether to sacrifice the lives of murderers in order to save the lives of (generic) victims. In choosing a penalty, the jury is implicitly balancing those costs and benefits.

If the legal rules present the defendant as a living, breathing human being with parents who care about him, while presenting the victim as a shadowy abstraction, the result will be to overstate, in the minds of the jury, the cost of capital punishment relative to the benefit. So the rule announced in Payne can be interpreted, not as a way of giving the jury information about the special value of one victim relative to other victims, but as a way of reminding the jury that victims, like criminals, are human beings with parents and children, lives that matter to themselves and others. That seems relevant information, if the jury is to decide whether the benefit of deterring some murders is worth the cost of executing some murderers.[39]

So far in this section I have not distinguished between the simple version of the efficient punishment model and the correct version. The reason is that both lead to the same conclusion. If our objective is to prevent all inefficient murders by setting punishment equal to damage done, then the punishment for destroying a life should be higher the more valuable the life. If our objective is to prevent murders whenever the cost of prevention is less than the net damage done by the murder, then we should be willing to impose higher punishment costs-execution rather than imprisonment, for example-for murders that do more damage. So the result in Payne v. Tennessee makes sense in terms of both the simple and the complicated versions of the model.

One feature of the decision that does not seem to fit either version of the efficient punishment model, however, is the Court's discussion of what "value of life" means. The Court explicitly rejected the idea of comparing the value of one life to the value of another, and seemed to reject the idea of evaluating lives on any economic basis.[40] The dissent responded by arguing that without such a comparison evidence about the victims would tell the jury members nothing they did not already know, and would introduce "such illicit considerations as ... the status of the victim in the community." (Justice Marshall).[41]

One way of making sense out of the Court's position has already been suggested. If the objective of victim impact statements is not to give the jury special information about why one victim is more deserving than another, but rather to remind the jury of the value of the lives of victims, then no comparative judgement among victims is required. The comparative judgement is rather between the lives of victims and the lives of their murderers. This interpretation seems more consistent with what the Court actually said than the alternative, in which victim impact statements are intended to provide the information necessary for selective deterrence.[42]

A second possible justification for the Court's position was implied by the Attorneys General of Tennessee and the U.S. in oral argument.[43] Even if juries cannot compare the life of one victim to the life of another, the victim is not the only injured party. If the effect of one murder is simply to kill the victim, while the effect of another is to kill the victim and orphan her three small children, one can argue that the latter is a more serious offense even though the lives of the victims themselves are equally valuable.[44]

If either of these interpretations is correct, then the Court, like the dissent, is rejecting one of the implications of the economic approach to criminal law. Where criminals are, or might be, aware of characteristics that affect the value of the lives of their victims, selective punishment would provide selective deterrence and thus make the criminal law more efficient. The result in Payne v. Tennessee will allow that to happen but only, to judge by the Court's dicta, as an unintended consequence.

Part IV: Punishment by Consequences:

The Selective Deterrence of Imperfectly Informed Criminals

One argument made repeatedly in both Payne and the prior literature is that it is unjust to make the punishment of the criminal depend on factors, such as characteristics of his victim, of which he was unaware when he committed the crime. A similar argument applies if one's concern is efficient deterrence.

In most real cases, however, criminals are neither perfectly informed nor perfectly ignorant. Even someone who murders a stranger in the course of a robbery is likely to have some idea of the age and sex of his victim-which is relevant to the probability that the victim is a mother with small children. In less anonymous cases, the criminal is likely to have more information. In the actual case of Payne, the only relevant pieces of information the criminal did not have when he committed the murder were the fact that one of his victims would survive and the details of how that victim would react to the death of his mother and sister.

This raises the question of how the economic analysis of selective deterrence applies to a criminal with some, but imperfect, information.[45] The answer to that question provides another example of the general thesis of this essay-that the simple version of the economic analysis of optimal punishment gives a first approximation, but only a first approximation, to the result of the correct model.

In order to see that, let us consider a simple case. There are two types of victims-low value victims and high value victims.[46] The total damage done to everyone affected by a murder-the victim, survivors, other members of society-is H for a low value victim and 2H for a high value victim. Each potential murderer i has a probability p_i that his victim is a low value victim and 1-p_i that his victim is a high value victim.[47] Each actual murderer has a .5 probability of being apprehended and convicted.[48] What is the consequence of making the punishment of a convicted murderer depend on the value of his victim? How does that legal rule compare, from the standpoint of economic efficiency, with the alternatives of either imposing the same punishment on all murderers or making the punishment depend upon what the court believes the murderer knew at the time of the crime-the court's estimate of p_i?

We first consider this question in the context of the simple model. We assume there are no costs of punishment[49] or apprehension; our objective is therefore to set the expected punishment equal to the damage done, deterring all inefficient crimes and only inefficient crimes. We do so by setting the punishment at 2H for killing a low value victim (expected punishment = probability of conviction × 2H = H = damage done) and 4H for killing a high value victim (expected punishment = probability of conviction × 4H = 2H = damage done).

Consider a potential criminal i. The expected harm his offense will do is the probability his victim is low value times H plus the probability his victim is high value times 2H, which is:

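$$\langle \text{Harm} \rangle = p_i \times H + (1 - p_i) \times 2H$$

If he commits the murder, his expected punishment is the probability his victim is low value (p_i) times the expected punishment for killing a low value victim plus the probability his victim is high value (1-p_i) times the expected punishment for killing a high value victim, giving:

$$\langle \text{Punishment} \rangle = p_i \times (.5 \times 2H) + (1 - p_i) \times (.5 \times 4H) = p_i \times H + (1 - p_i) \times 2H$$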

So expected punishment equals expected harm, whatever p_i may be.

To put the same analysis verbally, expected damage is a weighted average of actual damage, expected punishment is a weighted average of actual punishment,[50] the weights (p_i and 1-p_i) are the same in both cases, so if actual punishment equals actual damage, expected punishment will equal expected damage. Selective punishment thus results in the schedule of expected punishments that the court would impose if it knew p_i and could calculate the expected damage imposed by each murder and adjust the punishment accordingly.[51] That is a more efficient result than could be imposed directly by a court with anything short of perfect information about what each criminal knew when he committed his crime.[52]
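The equality can be checked numerically. The following is a minimal sketch of the simple model only, taking the conviction probability (.5), the damages (H and 2H) and the punishments (2H and 4H) from the text; the value H = 1.0 and the function names are assumptions made for illustration.

```python
# Simple model from the text: conviction probability .5, damage H or 2H,
# punishment on conviction 2H or 4H. H = 1.0 is an arbitrary unit (assumption).
H = 1.0
P_CONVICT = 0.5
HARM = {"low": H, "high": 2 * H}
PUNISHMENT = {"low": 2 * H, "high": 4 * H}


def expected_harm(p_low):
    """Expected damage when the victim is low value with probability p_low."""
    return p_low * HARM["low"] + (1 - p_low) * HARM["high"]


def expected_punishment(p_low):
    """Expected punishment as the criminal sees it before the crime."""
    on_conviction = p_low * PUNISHMENT["low"] + (1 - p_low) * PUNISHMENT["high"]
    return P_CONVICT * on_conviction


# The two columns agree for every p_i, as the weighted-average argument says.
for p in (0.0, 0.25, 0.5, 0.9, 1.0):
    print(p, expected_harm(p), expected_punishment(p))
```

The court never needs to know p_i for this to hold: the criminal supplies the weights himself when he forms his expectation.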

Consider the limiting case where potential criminals know nothing about their potential victims; p_i is the same for all i, say .5. Each criminal faces an expected punishment of (3/2)H, equal to the expected harm done by his crime. He has one chance in four of being convicted of killing a low value victim (punishment 2H) and one chance in four of being convicted of killing a high value victim (punishment 4H). Since the criminals are assumed to be risk neutral, this is equivalent to a system where all criminals who were convicted (probability one half) received a punishment of 3H.

So in the worst case for selective punishment (criminals have no information about the victims) or the best case for punishment based on criminal's knowledge (the court has perfect information about the criminals) punishment according to outcome (what sort of victim actually got killed) is no worse than the alternatives; in any other situation it is better.
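The comparison among the three rules can be illustrated with a small simulation of the simple model. This is a sketch under assumptions that are not in the article: each potential murderer gets a private benefit drawn uniformly between 0 and 3H, p_i is drawn uniformly between 0 and 1, and the court's estimate of p_i is the true value plus random noise. The function names are hypothetical. A risk-neutral criminal offends only if his benefit exceeds his expected punishment, and the outcome is scored by benefit minus expected harm, summed over the crimes that occur.

```python
import random

H = 1.0          # damage from killing a low value victim (arbitrary unit)
P_CONVICT = 0.5  # probability of apprehension and conviction, as in the text


def expected_harm(p_low):
    return p_low * H + (1 - p_low) * 2 * H


def net_gain(criminals, expected_punishment):
    """Sum of (benefit - expected harm) over the crimes that are not deterred."""
    total = 0.0
    for p_low, benefit, court_guess in criminals:
        if benefit > expected_punishment(p_low, court_guess):
            total += benefit - expected_harm(p_low)
    return total


def selective(p_low, court_guess):
    # Punishment by outcome: 2H or 4H on conviction, weighted by the criminal's own p_i.
    return P_CONVICT * (p_low * 2 * H + (1 - p_low) * 4 * H)


def uniform(p_low, court_guess):
    # The same punishment, 3H, for every convicted murderer.
    return P_CONVICT * 3 * H


def by_court_estimate(p_low, court_guess):
    # Punishment set at twice the expected harm implied by the court's estimate of p_i.
    return P_CONVICT * 2 * expected_harm(court_guess)


rng = random.Random(0)
criminals = []
for _ in range(10_000):
    p_low = rng.random()
    benefit = rng.uniform(0, 3 * H)  # hypothetical distribution of benefits
    # Hypothetical noisy court: true p_i plus an error, clipped to [0, 1].
    court_guess = min(1.0, max(0.0, p_low + rng.gauss(0, 0.3)))
    criminals.append((p_low, benefit, court_guess))

results = {
    "selective": net_gain(criminals, selective),
    "uniform": net_gain(criminals, uniform),
    "court estimate (noisy)": net_gain(criminals, by_court_estimate),
}
for rule, gain in results.items():
    print(rule, round(gain, 1))
```

Under these assumptions punishment by outcome scores at least as well as either alternative, and a court whose estimate equals p_i exactly only matches it. The sketch omits punishment and enforcement costs, so it says nothing about the more sophisticated model.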

What happens to this result in the more sophisticated model, where we include in our calculations the cost of catching and punishing criminals? The answer is that the argument carries over in a qualitative but not a quantitative sense. It is still true that selective punishment results in a higher expected punishment for criminals whose victims are more likely to be of high value, and that a higher punishment for those criminals is desirable. But it is no longer true that selective punishment produces the optimal result, nor that it is better than the alternatives as long as criminals have some information, however little, about their victims, and courts have less than perfect information about criminals.

This is true for two reasons. The first is that, although an efficient system will, cæteris paribus, impose higher punishments on offenses that do more damage, the relation is no longer one of simple proportionality between damage done and efficient punishment. The optimal expected punishment, for reasons explained in an earlier part of this essay, is damage minus the cost of adjusting the schedule of punishments (and enforcement) to reduce the number of offenses by one.[53] That cost will generally be different at different levels of punishment. There is no reason to expect that an offense doing twice as much damage should be punished exactly twice as severely. The optimal punishment might be three times as large, or only one and a half times.

The criminal, in calculating the expected punishment he faces, averages the punishments for killing the two different kinds of victims, using as weights the relevant probabilities. But the optimal punishment calculated using the more sophisticated model is not simply the weighted average of the two punishments. So the expected punishments that criminals calculate will vary in the right qualitative way-they will be higher for criminals who have a higher probability of killing high value victims and thus doing more damage-but they may well be different from the optimal punishments that would be set by a court that had all the information the criminals had and used that knowledge to make punishment depend on what the criminal knew at the time of his offense.
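The point can be stated compactly. In the sketch below, F^*(D) is notation introduced here (it does not appear in the article) for the optimal expected punishment of an offense known to do damage D, and c stands for the marginal cost of deterrence at the optimum. The sketch further assumes that a fully informed court would care only about the expected damage of each offense.

```latex
% F^*(D): optimal expected punishment for an offense known to do damage D
% (notation assumed for illustration).
\begin{align*}
\text{selective punishment, as seen by criminal } i:\quad
  & p_i\,F^*(H) + (1-p_i)\,F^*(2H) \\
\text{court that knows } p_i:\quad
  & F^*\bigl(p_i H + (1-p_i)\,2H\bigr)
\end{align*}
% Simple model: F^*(D) = D, so the two expressions coincide for every p_i.
% Sophisticated model: F^*(D) = D - c, where c varies with the level of
% punishment; the two expressions coincide for every p_i only if F^* is
% affine on the interval [H, 2H].
```

Because both expressions rise as p_i falls whenever F^* is increasing, the qualitative ranking of criminals is preserved even when the two quantities differ.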

The second reason why selective punishment is no longer necessarily optimal is that, once we introduce punishment costs, different patterns of punishment that are equivalent from the standpoint of the criminal may no longer be equivalent from the standpoint of the rest of society-they may have different costs. To see the relevance of this, again consider the case where criminals have no information about their victims, with p_i=.5 for all i.

With selective punishment, a criminal who is convicted faces a .5 chance of the punishment for killing a high value victim plus a .5 chance of the punishment for killing a low value victim. Even if this punishment lottery happens to produce the right expected punishment, it may not be the least expensive way of doing so. It might be less expensive to choose an intermediate punishment and impose it on all offenders.

Consider the following example. It may well be that execution, because of the repulsion towards killing in our society, is a much less efficient punishment than imprisonment-one that imposes a larger cost per unit of deterrence.[54] Suppose that, from the standpoint of the criminal, life imprisonment is exactly equivalent to-exerts the same deterrent effect as-a fifty percent chance of execution combined with a fifty percent chance of a ten year sentence. If so, and if the social cost of the latter alternative is higher than the social cost of the former, then selective punishment of completely ignorant criminals (execution for killing a high valued victim and ten years for killing a low valued victim) provides the same deterrence as unselective punishment (life for all murderers), but at a higher cost.[55]
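The example can be made concrete with invented numbers. Every figure in the sketch below is hypothetical and chosen only so that life imprisonment deters exactly as much as the fifty-fifty lottery; none comes from the article or from any empirical source, and the names are assumptions.

```python
# All numbers are hypothetical, chosen only to illustrate the argument.
# DISUTILITY: how bad each punishment is from the criminal's standpoint.
# SOCIAL_COST: cost to the rest of society of imposing each punishment.
DISUTILITY = {"execution": 100.0, "life": 60.0, "ten_years": 20.0}
SOCIAL_COST = {"execution": 150.0, "life": 50.0, "ten_years": 15.0}


def expected(values, lottery):
    """Expected value of `values` under a lottery {punishment: probability}."""
    return sum(prob * values[punishment] for punishment, prob in lottery.items())


# Criminals with no information about their victims (p_i = .5 for all i).
selective = {"execution": 0.5, "ten_years": 0.5}  # punishment by outcome
unselective = {"life": 1.0}                       # same sentence for everyone

for name, lottery in (("selective", selective), ("unselective", unselective)):
    print(name, expected(DISUTILITY, lottery), expected(SOCIAL_COST, lottery))
```

With these numbers the two schemes deter equally, but the selective scheme costs more per conviction. The ranking would reverse if execution happened to be the cheaper way of producing deterrence, which is why the result depends on the facts rather than on the model alone.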

In this case, as in the case discussed earlier where punishment might vary with the income of the criminal, the simple model of deterring all inefficient crimes and only inefficient crimes gives us an approximation of the right answer, but only an approximation. The argument and the conclusion carry over to the sophisticated model, but only approximately. If criminals know a good deal about their potential victims (p_i varies substantially with i), and courts do not know much about what criminals know (courts do not have good information about p_i), selective punishment based on victim characteristics is probably more efficient than either unselective punishment (all murderers get treated equally) or selective punishments based on the court's estimate of the criminal's knowledge at the time of the crime. If criminals are badly informed, or if courts are well informed about what criminals know, selective punishment based on victim characteristics is still superior in the simple model, but not in the sophisticated model.[56]

I have discussed this question in the context of capital punishment for murder, since that was the issue raised by Payne, but the analysis applies more generally. The argument of this section provides both an economic justification for making the severity of the punishment imposed for a crime (or the amount of damages awarded for a tort) vary with the damage done and a qualification to that justification in situations where offenders are badly informed about the consequences of their acts and courts are well informed about the minds of offenders. In the context of tort law, the same argument provides a justification for the familiar rule that the tortfeasor takes his victim as he finds him.[57]

Throughout the discussion, I have assumed that after the offense has occurred it is possible to measure the consequences, and that it is therefore at least possible, although not necessarily desirable, to make the punishment depend on the damage done. There are some interesting cases where that is not possible. Many, such as pollution, are handled through the regulatory system. The emission of a particular pollutant at a particular place and time may do no damage at all, or it may result in someone dying who would otherwise have lived. Punishment is based on some ex ante estimate of expected cost, since actual cost can usually not be measured. A similar problem occasionally arises in tort, as in the DES cases,[58] where it was impossible to assign liability for particular injuries to particular defendants.

Part V: Payne, McCleskey, and Equal Protection for Victims

Neither the Court nor the dissent discussed in detail the reasons for rejecting comparative judgements among victims, and hence selective deterrence. One obvious candidate is the general norm of equal protection, as embodied in the Fourteenth Amendment to the U.S. Constitution.[59] This possibility is suggested by the evidence offered by the defense in another case, McCleskey v. Kemp.[60] From the standpoint of our present discussion, one striking feature of that case is the failure of either the majority or minority opinions to consider the application of the principle of equal protection to the protection of potential victims. To see why one might have expected that issue to arise, it is worth reviewing the evidence offered:

"In support of his claim, McCleskey proffered a statistical study performed by Professors David C. Baldus, Charles Pulaski, and George Woodworth (the Baldus study) that purports to show a disparity in the imposition of the death sentence in Georgia based on the race of the murder victim and, to a lesser extent, the race of the defendant. The Baldus study is actually two sophisticated statistical studies that examine over 2,000 murder cases that occurred in Georgia during the 1970's. The raw numbers collected by Professor Baldus indicate that defendants charged with killing white persons received the death penalty in 11% of the cases, but defendants charged with killing blacks received the death penalty in only 1% of the cases. The raw numbers also indicate a reverse racial disparity according to the race of the defendant: 4% of the black defendants received the death penalty, as opposed to 7% of the white defendants.

...Baldus subjected his data to an extensive analysis, taking account of 230 variables that could have explained the disparities on nonracial grounds. One of his models concludes that, even after taking account of 39 nonracial variables, defendants charged with killing white victims were 4.3 times as likely to receive a death sentence as defendants charged with killing blacks. According to this model, black defendants were 1.1 times as likely to receive a death sentence as other defendants."[61]

The defense argued that this evidence showed an unconstitutional discrimination against black defendants. On the evidence presented, the direction of the discrimination is ambiguous. Black murderers appear slightly more likely to receive a death sentence than white murderers, all other things held constant-including the race of the victim. But black murderers, on average, kill black victims[62]-with the result that actual black murderers are substantially less likely than actual white murderers to receive a death sentence-4% vs 7%.[63]

What is unambiguous is the discrimination against black victims.[64] The evidence suggests that, all other things held constant, the murderer of a white victim is more than four times as likely as the murderer of a black victim to receive a death sentence. If we take murders as they occur, rather than trying to use statistical methods to control for factors that correlate with race, the actual murderer of a white victim (in Georgia) was about eleven times as likely as the murderer of a black victim to receive the death penalty.[65]

The fourteenth amendment to the constitution provides that: "... nor shall any State ... deny any person within its jurisdiction the equal protection of the laws."[66] Part of the protection I receive from the law, arguably the most important part, is the protection provided by a legal system that punishes crimes committed against me. One important argument in favor of the death penalty is that it deters more effectively than lesser punishments. If so, then the evidence presented in McCleskey strongly suggests that blacks in Georgia get substantially less protection of the law from murder than do whites. It seems odd that neither the Court nor (with one partial exception) the minority in the case discussed that issue.[67]

The Court in McCleskey neither explicitly accepted nor rejected the proposition that a judicial system whose policies resulted in less protection for blacks than for whites was in violation of the fourteenth amendment. If they had rejected that proposition, they might still have accepted the weaker claim that features of a legal system deliberately designed to provide different levels of protection to different potential victims were unconstitutional. Even if it is obvious that the law does not, in practice, protect everyone equally, it may still be improper to make stronger protection for more valuable lives an explicit justification for a legal rule.[68] If so, that would provide an explanation of the Court's unwillingness to base its defense of victim impact statements on their ability to provide selective deterrence.

A slightly different reason is hinted at by the dissent, and was raised explicitly in the briefs and in oral argument.[69] If it is appropriate to impose especially high punishments on the murderers of especially valuable victims, then it would seem equally appropriate to impose especially low punishments on the murderers of especially worthless victims. This raises the specter of a system where sufficiently unpopular people-prostitutes, drug users, members of unpopular religious, racial, or political groups-could be killed with impunity.[70]

One way that a court might try to deal with this problem would be by creating a legal rule that permitted victim impact statements by the prosecution but not by the defense-a possibility discussed in the oral argument.[71] Since the prosecution is presumably trying to get as high a punishment as possible, only evidence favorable to the victim would be introduced.[72] Legal problems aside,[73] this raises interesting difficulties of a game-theoretic nature.

Will Prosecutors Tell All?

Suppose we have a legal system in which the prosecution, but not the defense, may introduce evidence on characteristics of the victim. Further assume that the objective of each prosecutor is to get as severe a sentence as possible in the case currently being prosecuted, and that juries are fully rational and aware of how prosecutors behave. Finally, assume that the characteristics of victims can be ranked by their potential effect on the jury, and that prosecutors are aware of the ranking; they know how juries will react to the facts about particular victims. How will prosecutors behave?

Suppose a prosecutor follows a policy of only introducing evidence on the characteristics of a victim if the victim is "above average"-if the information will lead the jury to impose a more severe sentence than if the jury knew nothing at all about the victim. The problem with that policy is that having the jury know nothing at all about the victim is not one of the prosecutor's options, since the jury can get information not only from what the prosecutor says but from what he does not say. When the prosecutor chooses not to introduce evidence on the characteristics of the victim, a rational jury will deduce that the victim must be below average. The jury will therefore treat the victim about whom it has been told nothing not as an average victim but as an average unattractive victim-and reduce its sentence accordingly.

To make the argument more precise, imagine that we rank the victims on a percentile scale, with the most attractive victim rated 1.00, the median victim 0.50, and the least attractive 0.00. Prosecutors pick some X between 0 and 1 and introduce evidence on the victim's characteristics if and only if the victim rates above X. If the prosecutor does not introduce such evidence, a rational jury aware of how prosecutors behave will conclude that the victim ranks between 0 and X, and will base the verdict on an average victim, ranking about X/2.[74]

Consider a prosecutor in a case where the victim ranks slightly below X but above X/2. The prosecutor can expect to get a more severe sentence if he reveals the victim's characteristics to the jury than if he does not. So a strategy of only introducing evidence for victims who rank above X is unstable-it pays a prosecutor to break the rule by introducing evidence on any victim slightly below the cutoff. The argument applies as long as X is greater than zero. The only stable strategy is for prosecutors to introduce evidence on the characteristics of all save the least attractive victims.
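The unraveling can be traced step by step. The sketch below assumes, as in the percentile example, that victim ranks are spread uniformly between 0 and 1, so that a rational jury reads silence as a rank of X/2; the function names and the starting cutoff of 0.5 are assumptions made for illustration.

```python
def jury_inference_when_silent(cutoff):
    """Rank a rational jury assigns to an undisclosed victim.

    With ranks uniform on [0, 1] and disclosure only above `cutoff`,
    silence means the rank lies in [0, cutoff], whose mean is cutoff / 2.
    """
    return cutoff / 2


def best_response_cutoff(cutoff):
    """Cutoff an individual prosecutor prefers, given what silence implies.

    Disclosing pays for any victim ranked above the rank the jury would
    otherwise assume, so the new cutoff is the jury's inference itself.
    """
    return jury_inference_when_silent(cutoff)


def unravel(initial_cutoff, rounds):
    """Sequence of cutoffs as prosecutors repeatedly undercut the old rule."""
    path = [initial_cutoff]
    for _ in range(rounds):
        path.append(best_response_cutoff(path[-1]))
    return path


path = unravel(0.5, 10)
print(path)
```

Each round halves the cutoff, so it shrinks toward zero without ever stabilizing at a positive value, which is the sense in which only full disclosure is stable.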

This analysis assumes that prosecutors, in deciding whether to introduce evidence in a case, consider only the effect on the outcome of that case. If there were a single prosecutor controlling all cases, he would realize that lowering X in order to get a better result in one case would produce a worse result in cases where he chose not to reveal the information, since a lower X would result in a lower estimate by the jury of the attractiveness of victims whose characteristics were not revealed.

In such a situation, the two effects on the average verdict of changes in X tend to balance each other. For each victim, there is a penalty the jury fully informed as to the characteristics of that victim would give to his killer. If the penalty chosen by a jury ignorant of the victim's characteristics is simply its desired penalty averaged over the characteristics the victim might have had,[75] the balancing is exact; with more complicated jury preferences it is not.[76] In the simple case, at least, a single prosecutor controlling all cases would be indifferent to the level of X, unless he himself, like the jury, was in favor of giving more severe punishments to defendants who had killed more attractive victims, in which case he would set X=0 and reveal all.
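The exact balancing in the simple case can be written out. In the sketch below, s(r) is notation introduced here (not the article's) for the penalty a fully informed jury would choose for a victim of rank r, with r uniform on [0,1] and X > 0 the disclosure cutoff; an uninformed jury is assumed to choose the average of s over the ranks consistent with silence.

```latex
% s(r): penalty chosen by a fully informed jury for a victim of rank r
% (notation assumed for illustration); X: disclosure cutoff, 0 < X <= 1.
\begin{align*}
\bar{s}(X)
  &= \underbrace{\int_X^1 s(r)\,dr}_{\text{victims disclosed}}
   + \underbrace{X \cdot \frac{1}{X}\int_0^X s(r)\,dr}_{\text{victims not disclosed}} \\
  &= \int_0^1 s(r)\,dr .
\end{align*}
```

The average penalty does not depend on X, so a prosecutor who cared only about that average would have no reason to prefer one cutoff to another. If the uninformed jury did anything other than average, the second term would no longer cancel in this way.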

In a system with many prosecutors, each more concerned with the severity of his verdicts than with the severity of the verdicts gotten by other prosecutors, the incentive for a prosecutor to lower X is stronger.[77] Most of the undesirable effect of lowering X in a particular case is borne by other prosecutors in other cases, so the gain to a prosecutor from lowering X for his own cases will almost always be greater than the loss he bears himself, making the only stable situation one in which X=0 and juries are fully informed of the characteristics of victims.

So there is reason to believe that rational prosecutors, dealing with rational juries, would find themselves driven to tell all-to reveal the characteristics of all but the least attractive victims. Rational juries would then deduce that any case in which the prosecution did not provide a victim impact statement involved an extraordinarily unattractive victim, and set the sentence accordingly. In such a situation, giving only the prosecution the power to introduce evidence on victim characteristics would not be sufficient to protect unattractive victims.[78]

If this argument is correct, there may be no way of achieving the efficiency gains of selective deterrence without the cost of permitting juries to effectively nullify the law against murdering unpopular people.[79] This would be a strong argument against the result in Payne if there were no other way in which juries could achieve that undesirable result. Unfortunately, that is not the case. One of the costs of a jury system is the potential for jury nullification of good laws as well as bad ones. The experience of German Americans during World War I, Japanese Americans during World War II, and Black Americans during much of the past century demonstrates that our legal system provides very limited protection to sufficiently unpopular minorities.

Part VI: Punishment by Desert or Punishment by Consequence:

The Problem of Moral Luck

A second issue suggested by Payne v. Tennessee raises some philosophical puzzles about what ought to determine punishment. The dissent argued that, at least in the case of capital punishment, the only thing that matters is the blameworthiness of the particular defendant. If, as may often be the case, the murderer does not know much about the victim when he decides to kill her, then her characteristics are irrelevant to how wicked he is and should be irrelevant to his punishment. A victim impact statement, in those cases where the criminal did not know the victim, makes the murderer's punishment depend on morally irrelevant factors, and should therefore be prohibited.

In order to fit this argument into the discussion of this article, I now drop the assumption that economic efficiency is the only value by which legal rules ought to be judged. Suppose we assume, instead, that efficiency and justice are distinct goals, both desirable; we may be willing to accept some reduction in efficiency in order to make our system more just, and we may be willing to accept some reduction in justice in order to make the system more efficient.[80]

How will this change in our assumptions alter our conclusions so far as the question of punishing by results is concerned? If it is just to impose a more severe punishment on a criminal who has done more damage, then that reinforces any efficiency arguments in favor of punishment by consequences. If, on the other hand, the only just basis of punishment is the nature of the act as perceived by the criminal when he committed it, that is an argument for basing punishment on the best estimate the court can make of ex ante expected injury rather than on the court's observation of ex post actual injury-even if we conclude that the latter rule would be more efficient.

On the face of it, the moral argument against basing punishment on actual consequences seems to apply to all crimes and punishments, not merely murder and execution.[81] If punishment ought to be a function of how blameworthy the criminal is, then punishment should never be affected by factors that the criminal did not know about or could not control. That sounds persuasive but, as the Court points out, it does not describe how our system actually works.[82]

Indeed, it probably does not describe how any legal system actually works. A drunk driver who runs into a tree is subject to considerably less severe sanctions than one who runs into a pedestrian. A gunman whose victim survives is guilty of attempted murder; one whose victim dies is guilty of actual murder. The blameworthiness is the same, but the penalty is different.[83] In a wide range of civil and criminal cases, the sanction visited upon an offender depends in part on things that have little or nothing to do with how bad a person he is.[84]

This paradox-that punishment does, and that to most people it seems that punishment should, depend on factors unrelated to how wicked the crime shows the perpetrator to have been-has long concerned philosophers writing about both moral desert and legal punishment. Current discussions often include it in the more general category of "Moral Luck."[85] The case in favor of ignoring luck in moral judgements was made by Kant, who wrote:

"The good will is not good because of what it effects or accomplishes or because of its adequacy to achieve some proposed end; it is good only because of its willing, i.e. it is good of itself. ... Usefulness or fruitlessness can neither diminish nor augment this worth."

Kant does not go on to apply the argument to bad will. Adam Smith, however, made a strong argument against the moral relevance of luck, for good or ill:

"Whatever praise or blame can be due to any action, must belong, either, first, to the intention or affection of the heart, from which it proceeds; or, secondly, to the external action or movement of the body, which this affection gives occasion to; or, lastly, to the good or bad consequences, which actually, and in fact, proceed from it. These three different things constitute the whole nature and circumstances of the action, and must be the foundation of whatever quality can belong to it."

"...To the intention or affection of the heart, therefore, to the propriety or impropriety, to the beneficence or hurtfulness of the design, all praise and blame, all approbation or disapprobation of any kind, which can justly be bestowed upon any action, must ultimately belong."[86]

"That the last two of these three circumstances cannot be the foundation of any praise or blame, is abundantly evident; nor has the contrary ever been asserted by any body. ... The consequences which actually, and in fact, happen to proceed from any action, are, if possible, still more indifferent either to praise or to blame, than even the external movement of the body. As they depend, not upon the agent, but upon fortune, they cannot be the proper foundation for any sentiment, of which his character and conduct are the objects."

"... yet, when we come to particular cases, the actual consequences which happen to proceed from any action, have a very great effect upon our sentiments concerning its merit or demerit, ... . Scarce, in any one instance, perhaps, will our sentiments be found, after examination, to be entirely regulated by this rule, which we all acknowledge ought entirely to regulate them."[87]

The next two chapters of Smith's discussion of the paradox[88] provide an explanation of why we feel this way-an explanation of moral sentiments not moral facts. He concludes with a consequentialist argument, designed to show that our feelings, although irrational, are useful, and thus evidence of the divine wisdom:

"Sentiments, designs, affections, though it is from these that according to cool reason human actions derive their whole merit or demerit, are placed by the great Judge of hearts beyond the limits of every human jurisdiction, and are reserved for the cognizance of his own unerring tribunal. That necessary rule of justice, therefore, that men in this life are liable to punishment for their actions only, not for their designs and intentions, is founded upon this salutary and useful irregularity in human sentiments concerning merit or demerit, which at first sight appears so absurd and unaccountable. But every part of nature, when attentively surveyed, equally demonstrates the providential care of its Author; and we may admire the wisdom and goodness of God even in the weakness and folly of men."[89]

Smith's argument is that, since we can observe outcomes but not intentions, it is sensible to base human punishments on outcomes and leave the punishing (and rewarding) of intentions to God.[90] It is equivalent, in a less mathematical form, to my earlier discussion of punishing imperfectly informed criminals. If the court had the information that God has, it would know the ex ante probabilities facing the criminal; by basing its punishment on that information, it could (in a world of costly punishment) do better than if it based punishment on actual outcome. Since courts do not have that information, they are better off basing punishment on outcome.

For Smith, and similarly for Beccaria,[91] this provides a moral as well as a prudential argument for punishment according to outcome. It is the best that man can do, and God will take care of correcting the inevitable errors in both directions. For those of us who are concerned with providing justice without divine assistance, however, the prudential argument still leaves a moral problem.[92] Even if it is prudent to use selective punishment to provide selective deterrence, is it just to punish differently offenders who may be equally wicked, merely because one had the good luck to miss his intended target or to choose a less attractive victim?

Thomas Nagel discusses this problem at considerable length.[93] His analysis, applied to the sort of situation considered here, implies not only equal punishment for the murderer who succeeds and the murderer who fails, but also equal punishment for the person who, yielding to a particular temptation, commits murder and the person who would have committed murder if faced by the same temptation, but had the good luck never to be so tempted. After discussing the clash between such conclusions and our moral intuitions, he concludes that:

"I believe that in a sense the problem has no solution, because something in the idea of agency is incompatible with actions being events, or people being things. But as the external determinants of what someone has done are gradually exposed, in their effect on consequences, character, and choice itself, it becomes gradually clear that actions are events and people things. Eventually nothing remains which can be ascribed to the responsible self, and we are left with nothing but a portion of the larger sequence of events, which can be deplored or celebrated, but not blamed or praised."

...

"The problem of moral luck cannot be understood without an account of the internal conception of agency and its special connection with the moral attitudes as opposed to other types of value. I do not have such an account. ..." (pp. 37-38)

One possible answer to these problems is that Nagel and others are too quick to assume that what people deserve depends only on what they are. They, following Smith and Kant, take it for granted that differences in outcome due to factors beyond the agent's control cannot be morally relevant. To put it differently, they take it for granted that the answer to the question "what ought to happen to you" can depend only on the answer to the question "what sort of a person are you" and not on such extraneous issues as what consequences your actions have caused.

My point here is closely related to one raised by Robert Nozick in a different context. In discussing the problem of defining a just society,[94] he distinguished between ethics of desert and ethics of entitlement. The distinction can be shown with a simple example:

Suppose we have a society in which everyone has what he deserves, however that is correctly calculated. In this society, two people decide to bet a dollar on a flip of a coin. The loser pays the winner.

If justice is a matter of desert, the society is now unjust. The winner did not deserve to win-which way the coin fell was a matter utterly unrelated to his moral worthiness. Since it was unrelated to moral worth, it cannot have increased what the winner deserved by a dollar and decreased what the loser deserved by a dollar. Yet most of us would say that it is just for the loser to pay off his voluntarily incurred debt.

Nozick deals with this problem by the idea of entitlement-a moral category different from desert. I am entitled to something if I have acquired it in a morally legitimate way from someone who legitimately owned it. Mutual assent, as in the case of the bet, is a morally legitimate form of transfer, and the starting situation was, by assumption, just, so the winner is entitled to his dollar.

This simple example brings up an important tension in our moral intuitions. On the one hand, we feel as though reward and punishment ought to be deserved. On the other hand, we feel as though certain acts create obligations or entitlements, not because of what they tell us about the moral worthiness of those who take them but because of their consequences.

I offer the conjecture that these two different approaches to moral desert are ultimately grounded in two different ways of thinking about the moral problem. One approach considers the problem from the viewpoint of God judging mankind. Actual consequences are irrelevant-God can cancel them, if he wishes, with a wave of his hand. Desert is entirely a question of how good or bad a person is, and that is a matter that God is competent to judge.

The other approach assumes moral judgements are to be made within a society of equals. My opinion about how good or bad a person you are has no special status-there is no reason to believe that it is more accurate than anyone else's opinion, including yours. The consequences of your acts, on the other hand, are there to be observed by everyone.[95] Thus a moral system that makes punishment and reward depend on outcomes seems more appropriate to such a society than one in which they depend on someone's opinion of moral merit.

Furthermore a society of equals, unlike a society ruled by divine providence, faces a budget constraint. If my careless driving results in an accident that damages your car, somebody is going to have to pay for fixing it.[96] Bad outcomes that occur without any wicked intention still result in costs that must be paid by someone; wicked intentions alone, without bad consequences, do not. So it again makes sense for the system of moral obligations to be based on outcomes, not merely intentions.[97]

We are left with two different sorts of rules. One sort allocates punishments and rewards according to moral merit-a sort of divine report card. The other bases them on something more like a system of accounts. Certain acts under certain circumstances result in some people having obligations to others-obligations that may be entirely independent of moral merit.

While I have presented the former approach as theist and the latter as humanist, that is a description of the pattern of the rules, not the beliefs of those who hold them. Rules suitable to be applied by humans may seem appropriate to a theist considering human institutions. That is the position of both Smith and Beccaria. They reject punishment by moral desert not because it is inappropriate in itself but because it is inappropriate to human courts and should therefore be left to divine justice. Similarly, one may believe in reward and punishment based on moral merit even if one does not believe in the existence of a God with the knowledge and power necessary fully to carry out such a program.

Seen from this standpoint, the dissent's claim that whether or not someone is executed should depend only on his blameworthiness seems problematic. One factor relevant to punishment is how bad a person the criminal has revealed himself to be, but another may be how much damage he has done.[98]

[1] I would like to thank Gary Becker, David Emmanuel, Wendy Gordon, William Landes, James Lindgren, Larry Lessig and Richard Posner for useful comments and suggestions.

[2] Expected punishment is probability of punishment times amount of punishment. If an offender faces a .1 probability of having to pay a $1000 fine, his expected punishment is .1x$1000 = $100. If there are several different possible punishments for the same offense, then the expected punishment is probability times punishment summed over all the punishments. Thus if the offender faces a .1 probability of a $1000 fine and a .2 probability of a $100 fine, his expected punishment is .1x$1000 + .2x$100 = $120.
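
As a minimal sketch of the arithmetic in this note, the following Python function sums probability times punishment over the possible punishments. The function name `expected_punishment` and the representation of the punishments as (probability, amount) pairs are assumptions made for illustration; they do not appear in the article.

```python
def expected_punishment(outcomes):
    """Return the sum of probability * amount over (probability, amount) pairs.

    `outcomes` is a hypothetical representation: each pair gives the
    probability of a punishment and its size in dollars.
    """
    total_probability = sum(p for p, _ in outcomes)
    if any(p < 0 for p, _ in outcomes) or total_probability > 1 + 1e-9:
        raise ValueError("probabilities must be non-negative and sum to at most 1")
    return sum(p * amount for p, amount in outcomes)


# The note's first case: a .1 probability of a $1000 fine.
single_fine = expected_punishment([(0.1, 1000)])

# The note's second case: a .1 probability of a $1000 fine
# and a .2 probability of a $100 fine.
two_fines = expected_punishment([(0.1, 1000), (0.2, 100)])

print(f"single fine: ${single_fine:.2f}")
print(f"two fines:   ${two_fines:.2f}")
```

With the note's own figures the two calls correspond to the $100 and $120 expected punishments computed above. The remaining probability in each case is the chance of no punishment at all, which contributes nothing to the sum.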

[3] 111 S.Ct. 2597, 115 L.Ed.2d 720, 59 U.S.L.W. 4814, reversing Booth v. Maryland, 482 U.S. 496, 107 S.Ct. 2529, 96 L.Ed.2d 440 (1987), and South Carolina v. Gathers, 490 U.S. 805, 109 S.Ct. 2207, 104 L.Ed.2d 876 (1989)

[4] This is the same objective that Richard Posner describes as "wealth maximization." (Richard Posner, Economic Analysis of Law, 1992, pp. 12-16, "The Problems of Jurisprudence," 1990, pp. 356-357.) For a more detailed discussion of what it means and why it might be a desirable objective, see Chapter 15 of Price Theory: An Intermediate Text (2nd edn.), David Friedman, 1990. The assumption that efficiency is the only purpose of the legal system will be dropped in the final section of the article, where I discuss some philosophical issues relevant to the decision in Payne.

[5] One reason the television set is worth less to me may be that its value is net of the cost to me of stealing it-burglar's tools, time and effort spent breaking into your house, and the like. Economic analysis of the market for theft implies that the marginal thief gets no net benefit; the cost to him of being a thief equals the value to him of what he steals, so the cost to his victim is a net loss with no benefit to balance it. The analysis is worked out in Friedman (1990) Chapter 20, pp. 565-569.

[6] This particular example is an implausible one. If your television set is worth more to me, there is no need for me to steal it; I can buy it instead. My gain from stealing it is only the money I save by not buying it from you. But that is equal to your loss, so after including the associated costs the theft is inefficient. It follows that if a crime is simply an involuntary substitute for a voluntary transaction, we would never expect it to be efficient. See Friedman (1990) pp. 569-573, Posner (1992), pp. 14-16, 206-211, 220-222.

There are, however, involuntary transactions that have no voluntary substitute. Many are things we usually classify as torts, but some are crimes. If, for instance, I drive home after having two glasses of beer, I save myself a taxi fare but impose a cost, in possible death or injury, on every driver, rider, and pedestrian along my route. Even if the savings to me is larger than the cost to them, there are severe transactional problems with trying to get all of them to agree to allow me to drive. Or consider an efficient assault. One can imagine a situation where one person is so angry at another that he is willing to attack him, even though he knows he will be fined an amount equal to the full damage done. A more exotic example would be efficient theft-by someone who enjoyed the excitement enough to more than make up for the associated costs. See Posner (1992) p. 218.

[7] See, for example, the discussion in Chapter 7 of Posner (1992). Posner suggests that expected punishment should be slightly above damage done to deter inefficient crimes and force criminals whose crimes would be efficient to substitute still more efficient market transactions, while permitting efficient crimes for which no good market substitute exists.

[8] Or a damages award for a tort. The analysis applies to civil damages as well as to criminal penalties, as I discuss in "An Economic Explanation of Punitive Damages," Alabama Law Review 40 (spring 1989) 1125-114. The same analysis could also be applied to administrative penalties and to sanctions used by a firm to control the behavior of its employees. In this article I will be concentrating on the application of the analysis to punishing crimes, since that is the particular issue raised by Payne.

[9] Throughout my analysis, I assume that costs and benefits to criminals count, in social welfare calculations, just like costs and benefits to anyone else. This assumption has been questioned by a few scholars in the law and economics field, most notably George Stigler and Gordon Tullock, who suggest that benefits to criminals ought to be given no weight in such calculations. My reasons for rejecting this position are discussed at some length in David Friedman, "An Economic Explanation of Punitive Damages," Alabama Law Review 40, 3 (spring 1989).

[10] This point, and to some extent this article, were suggested to me by discussions with Stephen Schulhofer, themselves arising out of a correspondence between Stephen Schulhofer and John Lott.

[11] I am ignoring in this essay two other problems with the argument. The probability of apprehension, and hence the expected punishment, is different for different criminals, so even if we wanted to prevent inefficient and only inefficient crimes, there is no pattern of enforcement that would do so. And some punishments, such as imprisonment or execution, not only provide an incentive not to commit a crime but also make it more difficult to commit further crimes. I am considering only deterrence, not incapacitation.

[12] This argument appears in David Friedman, "Reflections on Optimal Punishment or Should the Rich Pay Higher Fines?," Research in Law and Economics, 1981. A more recent discussion is in Friedman (1989). See also A. Mitchell Polinsky and Daniel L. Rubinfeld, "The Welfare Implications of Costly Litigation for the Level of Liability," XVII JLS 1, (1988).

[13] The benefit considered here is the direct benefit of the punishment-a fine received by the state, tort damages received by the victim in a civil case, the cost of running a prison (a negative benefit) or the like. It does not include the deterrent effect of the punishment, which is considered separately in the analysis. It does include benefits or costs that the victim, or others, receive from knowing that the punishment has been imposed. The death penalty might be a very inefficient punishment if many people in the society were made unhappy by the knowledge that a criminal had been executed.

Deterrence depends on expected punishment, but punishment cost per unit of punishment typically depends on the actual punishment employed. This is probably true of costs such as public disapproval as well as costs such as maintaining a prison. While economic theory focusses on the appropriate expected punishment for a given crime, the public, which observes punishment but not probability, may well judge the system by the relation of the actual punishment to the crime. If so, then one effect of the victim impact statements discussed below may be to raise the perceived wickedness of the crime in the eyes of the jury, and thus raise the ceiling on the maximum punishment the jury is willing to impose. The effect is the same as if the jury were adjusting expected punishment-probability times actual punishment-in response to an increase in its perception of the damage done by the offense, since in either case the probability of imposing the death penalty rises, but the reason for the effect is different. This point was suggested to me by James Lindgren.

[14] When I say that one punishment is equivalent to another, I mean that they have the same deterrent effect. From the standpoint of utility theory, this is equivalent to saying that both punishments have the same disutility for the offender.

[15] A more rigorous form of the argument appears in Friedman (1981).

[16] While my discussion will focus on direct costs of enforcement and punishment, one should also include costs such as the possibility that more severe enforcement will result in more innocent parties being convicted, that increases in governmental powers designed to catch more criminals may be used in other and less desirable ways, etc.

[17] An even better solution would be to punish only the inefficient offenses. We would thus avoid both the cost of punishing the efficient offenses and the inefficiency of preventing them. But in order to know which offenses are efficient, we must somehow find out whether the criminal's gain is more or less than the cost imposed on the victim. The way we find out, just as on ordinary markets, is by charging a price-an expected punishment in the case of offenses-and seeing whether the criminal is willing to pay it. In order to do that we must impose the punishment on inefficient as well as efficient offenses.

Where there is some other way of identifying the efficient offenses, we can save the expense of punishing them. One example is the excuse of necessity. The hunter who, lost in the woods and starving, breaks into a locked cabin in order to telephone for help will not be treated like an ordinary trespasser.

[18] The earliest mention of the effect of the elasticity of the supply function for offenses on optimal punishment that I have come across is in Gary Becker, "Crime and Punishment: An Economic Approach," 76 JPE 169 (1968).

[19] I am making no assumption as to whether or not each offense is committed by a different offender. Since I am considering only deterrence and not incapacitation, the analysis is the same for the case where all offenses are committed by the same criminal, the case where each offense is by a different criminal, or anything in between.

[20] It is worth noting that the relationship between optimal expected punishment and damage done depends on how enforcement cost changes with expected punishment at the optimum. In general, the elasticity of the supply of offenses will be different at different values of P.

[21] Readers who cannot imagine how a murder could be efficient may find the following hypothetical of interest. A wealthy and bored big game hunter decides that the only animal dangerous enough to be really worth hunting is man. He accordingly makes the following offer to a group of adventurers:

"I will pay ten of you a million dollars each. In exchange, each agrees that I may choose one of the ten at random and attempt to kill him."

Ten adventurers accept the offer. Ex ante, the contract is a Pareto improvement-everyone concerned is better off, since the adventurers are each willing to accept a ten percent chance of being picked in exchange for a million dollars (assume nobody else knows about the contract). Yet many people would still believe that such a contract should not be enforced-that murder ought to be illegal even between consenting adults.

[22] Regina v Dudley and Stephens 1884, 14 QBD 273 is one example. See the discussion in Posner (1992), pp. 241-242.

[23] This raises an interesting historical question. Was the differential punishment for murder less in societies that believed in either an afterlife or reincarnation? In England in the Middle Ages, all felonies, not only murder, were capital offenses-and punishment for all felonies, including murder, might sometimes be converted into a fine.

A further complication is that in such societies death may be a weaker sanction than in ours, at least if the offender expects an attractive afterlife-which may explain why heretics were often not merely executed, but executed in strikingly painful ways. See the discussions in Paul Brest, "The Misconceived Quest for the Original Understanding," in Boston University Law Review vol. 60, pp. 204-238 at 221 and in Posner (1992), pp. 229-230.

[24] If we believe that efficiency is desirable, we might also use the analysis to make qualitative recommendations-to argue that certain offenses ought to be punished more severely than others.

[25] Lott, John, "Should the Wealthy be Able to `Buy Justice'," JPE 95, December 1987, pp. 1307-1316.

[26] We would also expect, from applying the simple model to differences in the money equivalent of the damage done rather than the money equivalent of the punishment, that fines for assaulting rich people would be higher than for assaulting poor people. Here again, one may believe that the result is unjust but also that it is correct-that in this regard our legal system does resemble an economically efficient one, whether or not it should. Such distinctions were an explicit element of the Anglo-Saxon law out of which our law developed. One argument against the Court's decision in Payne v. Tennessee, discussed below, is that one of its effects may be to make the punishment for murdering rich people higher than for murdering poor people.

[27] "But what is the same punishment? Is the same fine, for example, productive of the same effect on rich and poor? Or does the same number of years in prison have the same effect on different individuals regardless of their diverse temperaments or physique?" Morris Raphael Cohen, "Moral Aspects of the Criminal Law," Yale Law Journal, Vol. 49, pp. 987-1026 (1940).

[28]

"Some crimes are attempts against the person, others against property. The penalties for the first should always be corporal punishments. ... The great and rich should not have it in their power to set a price upon attempts made against the weak and the poor; otherwise riches, which are, under the laws, the reward of industry, become the nourishment of tyranny. ... I shall limit myself to considering only the punishments to be assigned to noblemen, asserting that they should be the same for the first as for the least citizen." (Cesare Beccaria, On Crimes and Punishments, 1764, p.40. Henry Paolucci translator, Bobbs Merrill, Indianapolis 1963. pp. 69-70.)

In this passage, Beccaria is arguing, in effect, for equal jail sentences rather than either equal fines or unequal jail sentences; he does not consider the possibility of unequal fines. He goes on to deal with the claim that the punishment really imposes a larger cost on a noble, because of his greater education and greater vulnerability to social stigma ("the disgrace that is spread over a noble family") by arguing that the proper measure of punishments is the public injury done and that greater damage is done by a crime "when committed by a person of rank"-presumably because of the bad example. His argument is in part consistent and in part in conflict with mine.

[29] This argument is worked out in considerably more detail in Friedman (1981).

[30] See Friedman (1981) for an explanation and formal analysis.

[31] A similar argument might apply to the enforcement of rules regulating the behavior of firms. Suppose that any judgement above $10 million will push a particular firm into bankruptcy. Further suppose that bankruptcy is a bad thing-the real value of the firm is greater as a going concern. In that case a punishment of $11 million (of which only $10 million will be paid) is much more costly than a punishment of $9 million. This would apply to criminal punishments, civil punishments, and administrative penalties. In each case, punishment cost becomes large when the punishment reaches a level that creates a significant probability of bankruptcy and becomes infinite when it exceeds the liquidation value of the firm. It follows that it may be efficient to impose larger punishments on wealthier firms, even if the offense is the same.
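
The bankruptcy example can be illustrated by a rough sketch, under assumptions that go beyond the text. The $10 million threshold is taken from the note; the going-concern value of $15 million is purely hypothetical, and the net cost of the punishment is modeled simply as the value destroyed by liquidation (zero when the judgement is paid in full, going-concern value minus liquidation value otherwise). The function name `settle_judgment` is likewise an illustrative choice.

```python
def settle_judgment(judgment, liquidation_value, going_concern_value):
    """Model the outcome of a money judgement against a firm.

    Assumption (not from the article): a judgement up to the liquidation
    value is paid in full as a pure transfer, so its net cost is zero;
    a larger judgement forces liquidation, the payee collects only the
    liquidation value, and the net cost is the going-concern value lost.
    """
    if going_concern_value < liquidation_value:
        raise ValueError("going-concern value is assumed to be at least the liquidation value")
    if judgment <= liquidation_value:
        return {"paid": judgment, "bankrupt": False, "net_cost": 0}
    return {
        "paid": liquidation_value,
        "bankrupt": True,
        "net_cost": going_concern_value - liquidation_value,
    }


LIQUIDATION_VALUE = 10_000_000    # threshold from the note
GOING_CONCERN_VALUE = 15_000_000  # hypothetical figure for illustration

small = settle_judgment(9_000_000, LIQUIDATION_VALUE, GOING_CONCERN_VALUE)
large = settle_judgment(11_000_000, LIQUIDATION_VALUE, GOING_CONCERN_VALUE)

print("$9 million judgement: ", small)
print("$11 million judgement:", large)
```

Under these assumptions the $9 million judgement is a costless transfer, while the $11 million judgement collects only $10 million and destroys the difference between the firm's value as a going concern and its liquidation value. Any judgement beyond the threshold adds cost without adding to what is actually paid.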

Another application of the analysis would be to a firm attempting to control the behavior of its employees. Some employees can be punished for malfeasance by firing, denial of promotion or other internal sanctions. Others can only be punished by expensive legal procedures. The optimal sanctions for employee malfeasance and the appropriate level of precautions will vary accordingly.

[32] I, like most economists, assume that increasing the penalty for a crime will tend to decrease its occurrence; I realize that some people disagree, and that there are other grounds on which the case for and against capital punishment can be and is argued.

[33] Throughout my discussion, I assume that the sentence is set by the jury, as was the case in Payne. Essentially the same arguments would apply if it were set by the judge instead.

In order to avoid misunderstanding, I should make it clear that I am not claiming that jurors (or judges) are necessarily trying to produce the economically efficient result, still less that they always succeed in doing so. My claim is only that the relation between the value of the victim and the value of the defendant is relevant both to the efficient decision and to the actual behavior of the jury.

[34] For a general overview of the history of capital murder and the attempt to determine which murderers ought to be executed, see the discussion in the commentary on the Model Penal Code § 210.6 (American Law Institute 1985).

[35] Isaac Ehrlich, in a widely discussed and widely criticized study of the effect of capital punishment ("The Deterrent Effect of Capital Punishment: A Matter of Life and Death," 65 Am. Econ. Rev. 397 (1975)), concluded that each execution deterred several murders.

[36] He was not aware that one of the children would survive, to be the subject of the prosecutor's oratory. The dissent did not, however, try to argue that, having done his best to kill all three victims, he should not be held morally responsible for the emotional pain to the one who survived.

[37] "Where, as is ordinarily the case, the defendant was unaware of the personal circumstances of his victim, admitting evidence of the victim's character and the impact of the murder upon the victim's family predicates the sentencing determination on "factors ... wholly unrelated to the blameworthiness of [the] particular defendant." Booth v. Maryland, supra, 482 U.S., at 504, 107 S.Ct., at 2534; South Carolina v. Gathers, supra, 490 U.S., at 810, 109 S.Ct., at 2210." (Justice Marshall, dissenting in Payne v. Tennessee) The dissent also argued that, even if the defendant was aware of the relevant circumstances, presenting them to the jury would tend to produce a decision based on emotion rather than reason. The experience of reading the case, surely less moving than the experience of sitting through it, provides both evidence for this claim and a powerful argument in favor of the jury's decision.

[38] If the criminal has some information about the value of the victim's life-knows, for example, that she is of an age at which she is likely to be a mother with small children-then selective punishment produces some selective deterrence, although less than if the criminal were perfectly informed about the victim. This point is discussed at greater length below.

[39] The point is demonstrated, unintentionally, by one of the briefs opposing the result eventually reached by the court:

"In the case at bar, therefore, the prosecution could have argued to the jury that the perpetrator likely knew that, if by chance a child survived the attack, he or she would long for his or her mother or sibling.

The fact that the prosecution could have made this argument does not justify its formal presentation of Ms. Zvolanek's testimony in blatant violation of Booth. Her live emotional testimony that Nicholas did in fact cry for his mother, that he repeatedly asked for "my Lacie", and that he asked his grandmother if she "also missed Lacie" is markedly different from the prosecutor's merely drawing a general inference during an argument." (Petitioner's brief, Payne.)

The obvious response is that "drawing a general inference during an argument" presents a less, not more, accurate picture of the damage done by a murder than the sort of dramatic and emotional testimony that was actually introduced. This point is made in the amicus brief of the State of California: "Booth has relegated the victim of a capital crime to a faceless, undifferentiated mass ... ." and again in the amicus brief for The National Organization For Victim Assistance:

"Victims speaking of harm done, of the effect the crime has had on their lives, do not claim that one life is more valued than another, but rather bring into sharp focus for the judge, the jury, and society, the realities of what the aftermath of violent crime exacts on each of these essential parts of life in a free society. To muzzle all victims at capital sentencing hearings for fear that some may be more persuasive or express more eloquently the horrors of crime, is the truly arbitrary and capricious decision." (Judith Rowland)

"To require, as we have, that all mitigating factors which render capital punishment a harsh penalty in the particular case be placed before the sentencing authority, while simultaneously requiring, as we do today, that evidence of much of the human suffering the defendant has inflicted be suppressed is in effect to prescribe a debate on the appropriateness of the capital penalty with one side muted. If that penalty is constitutional, as we have repeatedly said it is, it seems to me not remotely unconstitutional to permit both the pros and the cons in the particular case to be heard." Booth at 520 (Scalia, J. dissenting).

"What Booth and Gathers ... are suggesting is a generic victim, an abstract victim, an invisible victim at the sentencing." (Burson, page 44 of the transcript provided by Alderson Reporting Company, Inc., hereafter referred to as "the transcript").

[40] "Payne echoes the concern voiced in Booth's case that the admission of victim impact evidence permits a jury to find that defendants whose victims were assets to their community are more deserving of punishment than those whose victims are perceived to be less worthy. Booth, supra, 482 U.S., at 506, n. 8, 107 S.Ct., at 2534 n. 8. As a general matter, however, victim impact evidence is not offered to encourage comparative judgments of this kind-for instance, that the killer of a hardworking, devoted parent deserves the death penalty, but that the murderer of a reprobate does not. It is designed to show instead each victim's "uniqueness as an individual human being," whatever the jury might think the loss to the community resulting from his death might be. The facts of Gathers are an excellent illustration of this: the evidence showed that the victim was an out of work, mentally handicapped individual, perhaps not, in the eyes of most, a significant contributor to society, but nonetheless a murdered human being." (Rehnquist, C.J., for the Court in Payne v. Tennessee)

The amicus brief by the State of California, on the other hand, argued for comparative judgements among victims:

"Contrary to the assumption in Booth, the harm to society may be greater depending upon the characteristics of the victim. The murder of a police officer, parent or child harms society more than the murder of a drug dealing child molester."

[41] "The fact that each of us is unique is a proposition so obvious that it surely requires no evidentiary support. What is not obvious, however, is the way in which the character or reputation in one case may differ from that of other possible victims. Evidence offered to prove such differences can only be intended to identify some victims as more worthy of protection than others." (Stevens, J., dissenting in Payne v. Tennessee).

[42] Another possible interpretation of what is actually happening in cases like Payne is that the prosecution is establishing, not the value of the victim's life, but the innocence of the victim. Jurors may be more likely to identify with victims who are entirely innocent than with those who were, in some sense, partly the cause of their own deaths. One example would be a drug dealer killed by a rival; a less clear one would be the victim in a marital quarrel. This point was suggested to me by Wendy Gordon. Such considerations were not raised by either the Court or the dissent.

[43] Burson distinguished during the hearing between "worth and sanctity of a human life," which is the same for all lives, and societal harm, which might vary from one victim to another (p. 38 of the transcript). And Thornburgh responded to a question by Justice Scalia with "It's not the characteristics themselves but what has resulted from the death of that individual in a loss to the victim, the family, and the community" (p. 54 of the transcript).

[44] The civil law, in case of wrongful death, has traditionally carried this argument even farther, basing damages on the injury to everyone except the victim. The concept of "hedonic damages" represents a recent attempt to include in the calculation the value of the victim's life to himself.

[45] In this discussion, I am taking the criminal's knowledge as given. One effect of a legal system that made the severity of punishment depend, in part, on the characteristics of the victim would be to give potential criminals an incentive to learn more about potential victims before deciding whether to kill them.

[46] At this point I am adopting the view the Court rejected: that punishment should vary with the value of the victim's life. Readers who are uncomfortable with the idea that some victims are more valuable than others may wish to think of high-value victims as mothers with small children and low-value victims as ninety-year-old men with incurable cancer. Those who are still uncomfortable with the idea may wish to transfer the analysis to some less serious crime than murder, and consider it as applicable to the question of whether imperfectly foreseen harm should be considered in setting the sentence for that crime.

[47] The probability is a description of what the potential murderer knows when he decides whether to commit the murder. It is his knowledge that is relevant to his decision, and it is his decision that we are trying to affect by imposing a punishment in order to deter a crime.

[48] In a more elaborate analysis, one would want to let the probability of apprehension depend on the value of the victim; the police could, probably do, and in an efficient system probably would, try harder to apprehend murderers of victims whom they consider more valuable. A further possibility not considered here is that in some cases the difficulty of catching the offender may depend on the outcome of the offense. Consider, for example, attempted and actual murder. A successful murderer does not have to worry about being identified by his victim's testimony-an unsuccessful murderer does. For a situation where the offender who has done more damage is easier to apprehend, consider violations of safety regulations. It is harder to conceal a violation if it has killed someone.

[49] This assumption implies that criminals are risk neutral, since an uncertain but otherwise costless punishment imposed on a risk averse criminal would generate a cost of risk bearing.

[50] More precisely, actual expected punishment for killing a victim of a given value. It is still an expected punishment because it is an actual punishment if convicted times the probability of being convicted.
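
As a hedged illustration of the arithmetic in this note, the sketch below computes an expected punishment as the punishment if convicted times the probability of conviction, and inverts that relation to find the actual punishment whose expected value equals the damage done (the simple rule described in the introduction). The function names and the numbers are illustrative assumptions, not taken from the article.

```python
def expected_punishment(punishment_if_convicted, p_conviction):
    """Expected punishment = actual punishment if convicted x probability of conviction."""
    return punishment_if_convicted * p_conviction


def punishment_matching_damage(damage, p_conviction):
    """Actual punishment whose expected value equals the damage done (the simple rule)."""
    if not 0 < p_conviction <= 1:
        raise ValueError("p_conviction must be in (0, 1]")
    return damage / p_conviction


# Hypothetical numbers: damage of 100 units, one offender in four convicted.
actual = punishment_matching_damage(100.0, 0.25)
print(actual, expected_punishment(actual, 0.25))
```

The lower the probability of conviction, the larger the actual punishment needed to hold the expected punishment at the level of the damage.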

[51] This appears to be the policy advocated by most of the opponents of the Court's decision in Payne, insofar as they are willing to accept the idea that the consequences of some murders are predictably more heinous than the consequences of others. See the passage from the petitioner's brief in Payne quoted in fn 39 above.

[52] Throughout this analysis I assume that any attempt at selective deterrence by the court must be based on the criminal's beliefs about the victim. One could imagine a system where a court was better informed than the criminal, ex ante, about the costs imposed by a particular offense, and conveyed that information to the potential criminal by announcements about its penalty schedule. For example, the court might (and some legislatures, in effect, do) announce that "the lives of policemen are especially valuable, and we will therefore execute you if you kill one." Such an announcement affects the incentives of a potential murderer, even if his subjective probability that the policeman he is contemplating killing is a particularly valuable person is very low, since what matters is what he thinks the court thinks.

In my analysis, I am concerned with how the court uses the criminal's knowledge about the victim to provide selective deterrence-either by basing punishment on what the court thinks the criminal knew, or by basing punishment on the actual outcome, and relying on the effect of that policy working through the criminal's probabilities for the alternative outcomes.

This issue is discussed at greater length in David Friedman, "Deterring Imperfectly Informed Tortfeasors: Optimal Rules for Penalty and Liability" (1992) (manuscript available from the author).

[53] This form of the result of the argument is worked out explicitly as Equation 2 in part I above and in Friedman (1981).

[54] Here, as in most (but not all) of the law and economics literature, the social cost of the punishment includes both the cost to the criminal (his life, in the case of capital punishment) and the cost to others, including moral revulsion, the cost of running prisons, the hangman's fee, etc.

[55] Such a situation is particularly likely if one of the alternatives has a very low probability but a very high cost. Consider a crime, such as replacing the medicine in a bottle with aspirin or putting a sub-lethal dose of poison on Chilean produce in a U.S. grocery store as a protest against the policies of the Chilean government, which usually does no significant damage but has a small probability of killing someone. The equivalent of selective deterrence would be a policy of punishing the perpetrator according to the damage done-a small fine most of the time, and execution if someone dies. It may be less costly and more effective to instead impose a moderately severe punishment based on the expected damage. A version of this example was suggested to me by David Emmanuel.
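
A small numerical sketch may make the comparison concrete. It assumes a risk-neutral offender (see fn 49), certain conviction, and invented figures for the probability of a death and for the two damage levels; none of these numbers come from the article. Under those assumptions a flat punishment equal to the expected damage yields the same expected punishment as punishing by outcome, while the most severe penalty ever imposed is far smaller.

```python
import math


def expected_punishment(schedule, probabilities):
    """Probability-weighted punishment; both dicts are keyed by outcome."""
    if not math.isclose(sum(probabilities.values()), 1.0):
        raise ValueError("probabilities must sum to 1")
    return sum(p * schedule[outcome] for outcome, p in probabilities.items())


# Hypothetical figures, not from the article.
probabilities = {"no_death": 0.999, "death": 0.001}
damage = {"no_death": 10.0, "death": 1_000_000.0}

outcome_rule = dict(damage)  # punish according to the damage actually done
expected_damage = expected_punishment(damage, probabilities)
flat_rule = {outcome: expected_damage for outcome in damage}  # same penalty every time

for name, rule in (("by outcome", outcome_rule), ("flat", flat_rule)):
    # Columns: rule, expected punishment, harshest punishment the rule ever imposes.
    print(name, round(expected_punishment(rule, probabilities), 2), max(rule.values()))
```

Which rule is cheaper overall depends on how the cost of imposing punishment rises with its severity, which this sketch does not model.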

[56] This result must be stated in such an imprecise form because I have not specified the actual form of the relevant supply curve (of offenses as a function of expected punishment) and cost curves (for punishment, apprehension, and conviction). If the additional term in the optimal punishment calculation added by the existence of these costs varies only slightly over the relevant range of punishments, then the sophisticated model gives almost the same result as the simple model. In that case, either a very well informed court or a very badly informed set of criminals would be necessary to make selective punishment by victim less efficient than selective punishment by court's estimate of criminal's knowledge.

[57] This issue is discussed at greater length in Friedman (1992) (manuscript available from the author). For a somewhat different view, see A. Mitchell Polinsky, "Optimal Liability When the Injurer's Information about the Victim's Loss is Imperfect," IRLE 7 (1987), pp. 139-47.

[58] Sindell v. Abbott Laboratories, 26 Cal. 3d 588, 607 P.2d 924, 163 Cal. Rptr. 132 (1980); Murphy v. E.R. Squibb & Sons, Inc., 40 Cal. 3d 672, 710 P.2d 247, 221 Cal. Rptr. 447 (1985).

[59] "Isn't the real problem with getting into the-or at least with the prosecution's taking the affirmative in getting into the character of the victim, that it implies that society is valuing victims differently?" "Isn't the real problem one, almost one, a kind of maybe a second-tier equality before the law argument, that society is placing different values on their victims-on victims?" (Remarks by Justices on pp. 26 and 27 of the transcript.)

[60] 481 U.S. 279, 107 S.Ct. 1756

[61] McCleskey v. Kemp, Majority opinion by Justice Powell.

[62] Black murderers presumably differ in other statistically relevant ways from white murderers. I have not worked with the original data, so do not know how much of the discrepancy between the treatment of black murderers and white murderers is due to the difference in the race of their victims, but it seems likely that it is the major factor.

[63]

"Most black victims are killed by black murderers, and a disproportionate number of murder victims is black. Wherefore the discrimination in favor of murderers of black victims more than offsets, numerically, any remaining discrimination against other black murderers." Ernest Van den Haag, "The Death Penalty Once More," U.C. Davis Law Review, Vol. 18 p. 961 (1985).

"Those who demonstrated the pattern seem to have been under the impression that they had shown discrimination against black murderers. They were wrong. However, the discrimination against black victims is invidious and should be corrected." Van den Haag (1985), p. 961 fn 23.

Gary Kleck, in "Racial Discrimination in Criminal Sentencing: A Critical Evaluation of the Evidence with Additional Evidence on the Death Penalty," 46 Am. Soc. Rev. 783, 797-98 found that the risk of a death sentence was higher for a white defendant than a black defendant throughout the period 1967-1978, presumably because of discrimination by race of victim.

[64] This issue is raised in Norval Morris, "Race and Crime," Judicature vol. 72 p. 111. For a survey of the various studies, see Samuel R. Gross, "Evaluating Evidence of Discrimination," U.C. Davis Law Review, Vol. 18 pp. 1275-1325 (1985). The author concludes that "The scientific implications of these studies are simple. The evidence indicates, unmistakably, that there has been substantial discrimination in capital sentencing by race of victim, at least in those states that have been extensively studied."

[65] "All [Baldus's models] showed race-of-victim disparities, virtually all of which were highly statistically significant. Many showed race-of-defendant disparities as well." McCleskey v. Kemp, Statement of the Case: Petitioner's Record Evidence, in Landmark Briefs and Arguments of the Supreme Court of the United States, Philip B. Kurland and Gerhard Casper, Editors, vol. 171 pp. 468-9. "In sum, most of Baldus' many measures revealed strong, statistically significant disparities in capital sentencing in Georgia homicide cases, based upon the race of the victim. (T. 726-28). Race of defendant disparities also regularly appeared, although not with the invariable consistency of the victim statistics."

[66] U.S. Constitution, Amendment XIV, Section 1.

[67] The one exception is a passage in Justice Blackmun's dissent (beginning "Moreover, the legislative history of the Fourteenth Amendment reminds us that discriminatory enforcement of States' criminal laws was a matter of great concern for the drafters," and including a footnote on discriminatory law enforcement during the post-Civil War period) which raises the issue of unequal protection of potential victims, but does not apply it to the case under discussion. The issue was also raised in the brief for the petitioner:

"The history of the Equal Protection Clause establishes that race-of-victim discrimination was a major concern for its Framers, just as Professor Baldus has now found that it is a major feature of Georgia's administration of the death penalty. Following the Civil War and immediately preceding the enactment of the Fourteenth Amendment, Southern authorities not only enacted statutes that treated crimes against black victims more leniently, but frequently declined even to prosecute persons who committed criminal acts against blacks. ... The congressional hearings and debates that led to enactment of the Fourteenth Amendment are replete with references to this pervasive race-of-victim discrimination; the Amendment and the enforcing legislation were intended, in substantial part, to stop it. As the Court recently concluded in Briscoe v. LaHue, 460 U.S. 325, 338 (1983), "[i]t is clear from the legislative debates that, in the view of the ... sponsors, the victims of Klan outrages were deprived of `equal protection of the laws' if the perpetrators systematically went unpunished." Landmark Briefs and Arguments of the Supreme Court of the United States, Philip B. Kurland and Gerhard Casper, Editors, vol. 171. pp. 647-9. "Similarly, if the death penalty is meant to deter capital crime, it ought to deter such crime equally whether inflicted against black or against white citizens." fn 13, pp. 651-652.

The Court, of course, mentioned in its opinion the evidence of a race-of-victim effect. But it did not discuss the implication that Georgia's law might be unconstitutional because it failed to protect black potential victims. Thus the Court wrote:

"Similarly, since McCleskey's claim relates to the race of his victim, other claims could apply with equally logical force to statistical disparities that correlate with the race or sex of other actors in the criminal justice system, such as defense attorneys." (McCleskey v. Kemp, Majority opinion by Justice Powell.)

Such claims would not apply "with equally logical force" if McCleskey's claim was seen as based on equal protection for victims from crime via deterrence. There is a very large and obvious difference between failing to protect someone from being murdered and failing to protect someone from not being hired as a defense attorney.

It is possible that the Court ignored the issue on grounds of lack of standing; McCleskey was a murderer, not a victim. But, as the recent case of Powers v. Ohio, 111 S.Ct. 1364, shows, a convicted criminal can sometimes succeed in raising a ius tertii defense-a defense based on the violation of someone else's rights.

In oral argument, counsel for the petitioner dealt with the issue of standing by putting the argument in terms of unfairness to a defendant who had killed a white, not in terms of unfairness to black potential victims:

Question: But I am not sure how that supports a claim of discrimination against the defendant.

Mr. Boger: Well, if the question is one, if you would, of standing, a defendant - if I have two defendants at my right hand, and two at my left, and the two at my left have murdered blacks, surely my defendants on the right hand would have standing if Georgia had a statute that made killing a white person a more serious crime. They'd say that's unconstitutional. That's an invidious discrimination.

(Landmark Briefs and Arguments of the Supreme Court of the United States, Philip B. Kurland and Gerhard Casper, Editors, vol. 171. p. 970.)

Alternatively, the Court may have ignored the issue because it was not demonstrated that the discriminatory outcome was a result of discriminatory intent; see Village of Arlington Heights v. Metropolitan Hous. Dev. Corp., 429 U.S. 252, 264-66 (1977); Washington v. Davis, 426 U.S. 229 (1976); cf. Oyler v. Boles, 368 U.S. 448, 456 (1961) (selective enforcement of habitual criminal statute does not violate equal protection clause absent discriminatory intent), all cited in Gross (1985) at p. 1284 fn 43. This seems to have been the ground for rejecting a Fourteenth Amendment claim in Spinkellink v. Wainwright, 578 F.2d 582 (5th Cir. 1978), cert. denied, 440 U.S. 976 (1979).

[68] Higher civil damages for the wrongful death of more valuable victims might be justified as fair compensation, avoiding the issue of unequal protection. If the objective of criminal punishment is deterrence, then using selective punishment to produce selective deterrence implies that the law is deliberately choosing to protect some potential victims more than others.

One possible response would be to argue, along lines suggested earlier, that although selective deterrence aimed at better protection for richer or better educated victims was unconstitutional on equal protection grounds, selective deterrence in favor of victims whose deaths would impose severe costs on other people was not. From this standpoint, imposing a more severe penalty on the murderer of a mother with three children than on the murderer of a bachelor is analogous to imposing a more severe penalty on someone who kills one person and severely injures three others than on someone who simply kills one person.

[69]

"Prosecutors and juries would also be authorized to find that the lives of homeless people, prostitutes, the politically unpopular, or others who are different are not "worth" as much as other members of society. See, e.g., Belkin, Texas Judge Eases Sentence for Killer of 2 Homosexuals, N.Y. Times, Dec. 17, 1988, section 1, page 8, col. 5 (thirty-year sentence for murders of two homosexuals explained by: `I put prostitutes and gays at about the same level. And I'd be hard put to give somebody life for killing a prostitute.')" from the amicus brief of the SCLC in Payne.

Stevens: "Should the defendant be allowed to bring out evidence that the victim was an unworthy person?"

Burson (Attorney General of Tennessee): "No. That would invite `open season' on victims." (59 LW 3762, 5-14-91.) His comment is given at greater length in the transcript as: "For instance, a state may well conclude that to allow a defendant to put on a negative social impact evidence without the state opening it up, that that, in essence, would invite open season on victims." (p. 36 of the transcript).

The implication is that the defense could counter prosecution evidence about the characteristics of the victim, but could not introduce such evidence unless the prosecution did. There are similar comments by Richard Thornburgh on pages 47 and 49 of the transcript.

[70] From the standpoint of economics, if not of justice, this is a problem of jury error. There is nothing inefficient about a system where the punishment for ending a life with little value-say the life of someone who is dying from cancer and has only a few days left-is relatively low. The problem is that the jury may be measuring, not how much the victim's life is worth, but how much it is worth to the jury-which is a very different thing. "Justice Souter ... asked ... isn't the real problem of getting into victims' character that society is involving itself in evaluating the comparative value of victims' lives?" (59 LW 3762, 5-14-91.)

[71] A similar issue came up in a later Supreme Court case, involving offenders rather than victims. In Dawson v. Delaware the Court, in an 8-1 vote, ruled that certain negative information about the offender (his membership in a prison gang called Aryan Brotherhood) could not be introduced in the sentencing stage of the trial, even though all positive evidence could be. The Court based its decision on First Amendment grounds, but Justice Thomas, the lone dissenter, argued that the case created a double standard allowing defense lawyers to point out good associations but forbidding prosecutors from pointing out bad ones.

[72] This might not be the case if the prosecution, as well as the jury, held the victim in low regard and therefore wanted the defendant to get off as easily as possible. But in such a situation it seems unlikely that any prosecution would occur or, if it occurred, would result in a conviction.

[73] Arguably, such a rule would be inconsistent with due process. Richard Thornburgh, Attorney General of the U.S., argued during the case that it would be constitutional to allow the prosecution to introduce a victim impact statement and not to allow the defense to do so (p. 47 of the transcript) and that, in the absence of any specific state law on the subject, the defendant should not be allowed to raise the issue of victim characteristics (p. 49).

[74] I say "about X/2" because a jury that is trying to impose the efficient level of punishment will be making a calculation more complicated than simply averaging victim rankings; the argument could be made more precise but at considerable cost in clarity.

[75] Since the jury knows it has not been given evidence on the characteristics of the victim, only victims for whom the prosecutor would not have presented evidence on victim characteristics are possible victims, so only they are averaged over.
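
The conditional averaging described here can be sketched numerically. The sketch rests on assumptions that are mine rather than the article's: victim rankings are spread evenly between 0 and 1, and prosecutors present victim evidence only for victims ranked at or above a hypothetical threshold `x`. A jury that hears no such evidence then averages only over victims ranked below `x`, which gives roughly the "about X/2" of fn 74.

```python
def mean_rank_without_evidence(ranks, threshold_x):
    """Average rank of victims for whom no victim evidence would be presented."""
    silent = [r for r in ranks if r < threshold_x]
    if not silent:
        raise ValueError("no victims fall below the threshold")
    return sum(silent) / len(silent)


# Hypothetical population: rankings evenly spaced on [0, 1).
ranks = [i / 1000 for i in range(1000)]
x = 0.6  # illustrative threshold, not a value from the article

average = mean_rank_without_evidence(ranks, x)
print(average, x / 2)  # the two figures should be close
```

As fn 74 notes, a jury aiming at the efficient punishment would be doing something more complicated than this simple average.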

[76] For instance, jurors may be more or less willing to impose a given average level of punishment if they believe that it will go selectively to those who have killed particularly valuable victims. One could imagine a juror who considered it very important to deter killers of young mothers, but only moderately important to deter killers of old men. With no information on victim characteristics he would favor the death penalty for all murderers, in order to get a sufficiently high level of deterrence against those who killed young mothers.

[77] This assumes that the jury knows the behavior of prosecutors in general but not of each individual prosecutor-the X relevant to the jury's decision is an average of the values for the different prosecutors. Without that assumption, the situation is equivalent to having a single prosecutor for all cases.

[78] This conclusion is strengthened by the likelihood that a jury, during the course of a trial, will acquire information about the victim even if it is not introduced in the context of determining punishment.

[79] The argument can be generalized to any case in which juries are expected to do a bad job of evaluating the value of the lives of victims. If one believes, as the minority in Payne v. Tennessee and the majority in Gathers and Booth perhaps did, that this is the normal case, one will naturally be suspicious of both selective deterrence and the Payne result.

[80] Readers whose initial response is that we should never make such tradeoffs on any terms may wish to ask themselves whether they would favor spending an additional hundred billion dollars a year on the court system if the result was to eliminate one false, and thus unjust, conviction for illegal parking.

[81]

"if evidence of the full range of harm caused by a defendant is truly irrelevant because it does not inform the sentencer of the defendant's mental state, then it should be equally irrelevant in all criminal cases. While the severity of the penalty in capital cases requires greater procedural safeguards, the qualitative difference in penalty cannot justify any difference in the substantive determination of whether a particular class of evidence is relevant." (Charles W. Burson, Attorney General of Tennessee, for respondent in Payne.)

This issue is discussed in "The Significance of Victim Harm: Booth v. Maryland and the Philosophy of Punishment in the Supreme Court," Richard S. Murphey, 51 Univ. Chi. L.Rev. 1303 (1988). The author argues that, on a retribution theory of punishment, the harm the criminal actually caused is irrelevant to the punishment he deserves, and "Hence, the Supreme Court's decision in Booth, by holding that victim impact statements are per se irrelevant to the capital sentencing decision, is completely consistent with and in fact required by the retributivist model of punishment." He goes on to argue that "The weakness in the Booth Court's reasoning is that it fails to recognize that the criminal law categorizes punishments according to the actual results. Thus, to reject the degree of harm inflicted as irrelevant, when divorced from the defendant's intentions, is to reject a principle that pervades the criminal justice system." He concludes that "as a matter of constitutional interpretation the Court is misguided" in its rejection of utilitarian theories in favor of retributive theories of punishment.

As one example of how strong the intuition of "punishment by moral desert rather than by consequences" seems to some, consider the following quote from an English legal philosopher:

"The penalties for attempts used to be lower than those for successful crimes, and although this is no longer so in England, courts are still apt to take a more lenient view of them, illogical as this is. As for harms which are knowingly risked--for example by motorists who drive `recklessly,'--sentencers usually take a more lenient view of them if they do not actually happen (again the logic is questionable)." Nigel Walker, Why Punish, p. 96, Oxford University Press, Oxford 1991.

In an earlier and more moralistic statement of the case for punishment as a response to wickedness, Sir James Fitzjames Stephen wrote:

"Everything which is regarded as enhancing the moral guilt of a particular offense is recognized as a reason for increasing the severity of the punishment awarded to it... The criminal law thus proceeds upon the principle that it is morally right to hate criminals, and it confirms and justifies that sentiment by inflicting upon criminals punishments which express it.

I think that whatever effect the administration of criminal justice has in preventing the commission of crimes is due as much to this circumstance as to any definite fear entertained by offenders of undergoing specific punishment." ("Of Crimes in General and of Punishments," from History of the Criminal Law of England, Vol. II, pp. 75-93, included in Crime, Law and Society, readings selected by Abraham S. Goldstein and Joseph Goldstein, The Free Press, London, 1971, pp. 22-23.)

The second paragraph of the quote provides an old and important prudential argument for what is elsewhere presented as a moral principle. By punishing (and hating) the wicked we teach people to be less wicked.

[82] This point is discussed by H.L.A. Hart:

"The almost universal tendency in punishing to discriminate between attempts and completed crimes rests, I think, on a version of the retributive theory which has permeated certain branches of English law, and yet has on occasion been stigmatized even by English judges as illogical. This is the simple theory that it is a perfectly legitimate ground to grade punishments according to the amount of harm actually done, whether this was intended or not; `if he has done the harm he must pay for it, but if he has not done it he should pay less.' To many people such a theory of punishment seems to confuse punishment with compensation ... . Why should the accidental fact that an intended harmful outcome has not occurred be a ground for punishing less a criminal who may be equally dangerous and equally wicked? I may be wrong in thinking that there is so little to be said for this form of retributive theory. It is certainly popular ... ." (H.L.A. Hart, Punishment and Responsibility 130-131 (1968).)

He comments further on the issue of making punishment depend on ex post outcome rather than ex ante expectations, in the context of liability for negligence, on pp. 134-5, again without finding any justification for the existing law.

[83]

"Obviously this apportionment of punishment [for attempt] can be explained only by an assumption that to some extent it is designed for retribution. If the law's purpose were merely preventive, it would apply to the act done the same consequence, regardless of whether the act were successful or unsuccessful, since its objective would be the prevention of acts likely to result in harm. The fact that the punishment for success is twice as severe as the punishment for an unsuccessful attempt must mean that the additional suffering consequent upon success is a matter of expiation of retribution because of that success." J. Waite, The Prevention of Repeated Crime 8-9 (1943)

Waite's claim that a system of punishments designed only for deterrence must impose the same punishment for an unsuccessful attempt as for a completed crime is wrong. To see why, apply the analysis of optimal punishment for the killing of high-value and low-value victims given above to the case of murder (high injury--corresponding to killing a high-value victim) and attempted murder (low injury--corresponding to killing a low-value victim). The analysis is the same, so the conclusion, that it may be optimal to base punishment on outcome ex post instead of expected outcome ex ante, remains. Waite's point is valid, however, if we take it as demonstrating that differential punishment for attempts, if based on desert rather than deterrence, implies that desert is affected by consequences.
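
One way to see how punishing by outcome can serve deterrence alone is to look at the expected punishment facing offenders who differ in how likely they think they are to succeed. The sketch below is illustrative only: it assumes risk-neutral offenders and certain conviction, and the sentences (the completed crime punished twice as severely as the attempt, as in Waite's description) and probabilities are invented.

```python
def expected_punishment(p_success, punishment_completed, punishment_attempt):
    """Expected punishment when the sentence depends on the outcome ex post."""
    if not 0.0 <= p_success <= 1.0:
        raise ValueError("p_success must be a probability")
    return p_success * punishment_completed + (1.0 - p_success) * punishment_attempt


# Hypothetical sentences, in years.
COMPLETED, ATTEMPT = 20.0, 10.0

for p in (0.1, 0.5, 0.9):
    # p is the offender's own estimate that the attempt will succeed.
    print(p, round(expected_punishment(p, COMPLETED, ATTEMPT), 2))
```

Under these assumptions the offender who expects to do more harm faces a higher expected punishment without the court having to establish what he believed, which is the sense in which outcome-based punishment works through the criminal's own probabilities.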

The Model Penal Code provides, in section 5.05(1), that "Except as otherwise provided in this Section, attempt, solicitation and conspiracy are crimes of the same grade and degree as the most serious offense which is attempted or solicited or is an object of the conspiracy. An attempt, solicitation or conspiracy to commit a [capital crime or a] felony of the first degree is a felony of the second degree." Putting aside the exception for first degree felonies, this is consistent with the idea that punishment should depend only on intent, not outcome. Similarly, in discussing aggravating circumstances that may justify the death penalty, the Code does not seem to include the outcome of the crime, except insofar as it is foreseeable. SS 210.6(3) h gives, as an aggravating circumstance, that "the murder was exceptionally heinous, atrocious or cruel, manifesting exceptional depravity." The final requirement would seem to exclude a murder that was especially heinous for reasons of which the murderer was unaware when he committed the crime. (Model Penal Code Official Draft and Explanatory Notes, American Law Institute 1985).

[84] Stephen Schulhofer has raised this issue with regard to torts as well as crimes:

"Theoretically, it would be more appropriate for everyone to pay into an insurance fund a premium based on the risks he creates in the course of his activities. Those who suffer injury would then seek compensation from the fund rather than attempting to impose the entire loss on the negligent defendants who happened to cause their particular injuries. ... In the absence of such a framework, however, the law of torts can properly treat those who cause harm differently from those who do not, in order to allocate fairly the loss which has befallen the victim. This allocation of the loss, fortuitous as between risk-creators, is preferable to an allocation of the loss which would be fortuitous as between faultless victims."(Stephen Schulhofer "Harm and Punishment: A Critique of Emphasis on the Results of Conduct in the Criminal Law," U. of PA Law Review, 122 p. 5964, fn 64.)

The final conclusion seems problematic. If a driver is only liable for risk and not for result, then imposing costs on him beyond the amount of the risk is no more just than imposing them on another driver, or on the victim. The fact that this seems sharply contrary to our moral intuition strikes me as evidence against the thesis that, in determining the obligations of those who have done damage, justice requires that we ignore consequences insofar as they are due to events beyond the actor's control.

[85] See Bernard Williams, "Moral Luck," in Proceedings of the Aristotelian Society, supplementary Vol. L (1976) pp. 115-35 and, in a slightly revised version, Chapter 2 of Bernard Williams, Moral Luck, Cambridge University Press, Cambridge 1981.

[86] Foundations of the Metaphysics of Morals, first section third paragraph.

[87] Adam Smith, The Theory of Moral Sentiments, Part 2 Section III Introduction.

[88] Smith, op.cit., Part 2 Section III Chapters 1 and 2.

[89] Smith, op. cit., Part 2 Section III Chapter 3.

[90] One striking difference between Smith's discussion of these issues and more modern discussions is that Smith is concerned not with the possibility that punishment by desert will provide arguments against punishing with special severity those who (happen to have) committed crimes with particularly heinous consequences but with the possibility that it will provide arguments for punishing those who have not committed crimes but might, under other circumstances, have done so. He writes that, if we resented intentions as strongly as we resent actions, "Sentiments, thoughts, intentions, would become the objects of punishment; and if the indignation of mankind run as high against them as against actions; if the baseness of the thought which had given birth to no action, seemed in the eyes of the world as much to call aloud for vengeance as the baseness of the action, every court of judicature would become a real inquisition. There would be no safety for the most innocent and circumspect conduct. Bad wishes, bad views, bad designs, might still be suspected ... ." (Smith, op. cit., Part 2 Section III Chapter 3).

[91] "Finally, some have thought that the gravity of sinfulness ought to enter into the measure of crimes. The fallacy of this opinion will at once appear to the eye of an impartial examiner of the true relations between men and men, and between men and God. The first are relations of equality. ... The second are relations of dependence on a perfect Being and Creator, who has reserved to himself alone the right to be legislator and judge at the same time, ... . If he has established eternal punishments for anyone who disobeys his omnipotence, what insect is it that shall dare to take the place of divine justice, ... . The weight of sin depends on the inscrutable malice of the heart, which can be known by finite beings only if it is revealed. How then can a norm for punishing crimes be drawn from this? Men might in such a case punish where God forgives, and forgive where God punishes." (Cesare Beccaria, On Crimes and Punishments, 1764, pp. 65-66. Henry Paolucci translator, Bobbs Merrill, Indianapolis 1963.)

[92] But not for Holmes, who wrote: "On the one side is the notion that there is a mystic bond between wrong and punishment; on the other, that the infliction of pain is only a means to an end..." (Oliver Wendell Holmes, Jr., The Common Law, pp. 41-51) and, arguing from the fact that the reasonable man standard makes a less than reasonable defendant liable even though his action is not blameworthy, "If the foregoing arguments are sound, it is already manifest that liability to punishment cannot be finally and absolutely determined by considering the personal unworthiness of the criminal alone." (p. 32).

[93] In Chapter 3 ("Moral Luck") of Thomas Nagel, Mortal Questions, Cambridge University Press, Cambridge 1979.

[94] Anarchy, State and Utopia Chapter 7, especially pp. 155-164, where Nozick puts the distinction in terms of patterned principles (of which "to each according to his moral merit" is one example) vs entitlement principles.

[95] I am ignoring here difficult questions of evidence and causality which are important to the workings of real legal systems but not, I think, to the point made here.

[96] Or else somebody must bear the cost of having a car that has been damaged and not fixed.

[97] This need not imply a system based entirely on outcomes-we still need some way of deciding who bears the costs, and intention may be one way to decide. My point is only that in a society facing a budget constraint outcomes become morally relevant, if only because they constrain the range of possible allocations. Whatever we may all deserve, once my car has been destroyed either I do not have a car or someone pays to buy me another.

[98] A different approach to the problem of justifying different punishments for offenders who may be equally wicked is implied by Norval Morris' position that desert sets upper and lower bounds to appropriate criminal punishment, within which other considerations may determine actual punishment. If murderers deserve to die in Morris' sense-if, in other words, capital punishment is not unjust even though not morally required-then the court is free to decide on other grounds which murderers are to be executed. If the court's information about what murderers knew when they committed their crimes is imperfect, then the arguments given here in favor of selective deterrence provide a reason for executing those murderers who have done the most damage, even if we believe that some of them did not know, ex ante, how much damage they were doing.

"By a limiting principle of punishment I mean a principle that, though it would rarely tell us the exact sanction to be imposed, as deterrence might, would nevertheless give us the outer limits of leniency and severity which should not be exceeded. Desert, I will submit, is such a limiting principle." (Norval Morris, "Punishment, Desert & Rehabilitation" in Equal Justice Under Law: U.S. Dept of Justice Bicentennial Lecture Series, 1976. pp. 5-6 (pp. 141-2 in the collected papers version of the lectures.))

There is also an extensive literature that tries to base appropriate punishments on a retributive principle, with a variety of different justifications and consequences. A recent example is Margaret Falls, "Retribution, Reciprocity, and Respect for Persons," in Law and Philosophy 6 (1987), pp. 25-51; the author writes that "Criminal deeds differ in the degree to which they involve morally relevant factors like harm to others, violation of rights, and perhaps wickedness of intent." (p. 45). She offers a theory of retributivism, under which punishment depends on the seriousness of the crime, resting on the argument that "One of the most fundamental duties of treating people as autonomous moral decisionmakers is to hold them responsible for their acts."
