Chapter VIII: Watermarks and Barbed Wire

Authors expect to be paid for their work. So do programmers, musicians, film directors, and lots of other people. If they cannot be paid for their work, we are likely to have fewer books, movies, songs, programs. This creates a problem if what is produced can be inexpensively reproduced. Once it is out there, anyone who has a copy can make a copy, driving the price of copies down to the cost of reproducing them. Copyright law is an attempt to solve that problem by giving the creator of a work the legal right to control the making of copies. How well it works depends on how easily that right can be enforced.

To enforce his legal rights, the owner of a copyright has to be able to discover illegal copying and take legal action against those responsible. How easy that is depends in large part on the technology of copying.

Consider the old-fashioned printing press, circa 1910. It was large and expensive; printing a book required first setting hundreds of pages of type by hand. That made it much less expensive to print 10,000 copies of a book on one press than 100 copies each on a hundred different presses. Since nobody wanted 10,000 copies of a book for himself, a producer had to find customers – lots of customers. Advertising the book, or offering it for sale in bookstores, brought it to the attention of the copyright owner. If he had not authorized the copying, he could locate the pirate and sue.

Enforcement becomes much harder if copying is practical on a scale of one or a few copies – the current situation for digital works such as computer programs, digitized music, or films on DVD. Individuals making a copy for themselves or a few copies for friends are much harder to locate than mass-market copiers. Even if you can locate them, it is harder to sue 10,000 defendants than one. Hence, as a practical matter, firms mostly limit the enforcement of their copyright to legal action against large-scale infringers.^¹

The situation is not entirely hopeless from the standpoint of the copyright holder. If the product is a piece of software widely used in business – Microsoft Word, for example – there will be organizations that use, not one copy, but thousands. If they choose to buy one and produce the rest themselves, someone may notice – and sue.

Even if copying can be done on a small scale, there remains the problem of distribution. If I get programs or songs by illegally copying them from my friends I am limited to what my friends have, which may not include what I want. I may prefer to buy from distributors providing a wide range of alternatives – and they, being potential targets for infringement suits, have an incentive to buy what they sell legally rather than produce it themselves illegally. So even in a world where many expensive works in digital form – Word, for example – can easily be copied, the producers of such works can still use copyright law to get paid for some of what they produce.

Or perhaps not. As Napster and then its peer-to-peer successors have demonstrated, distribution over the internet makes it possible to combine individual copying with mass-market distribution, using specially designed search tools to find the individual who happens to have the particular song you want and is willing to let you copy it. A centralized distribution system is vulnerable to legal attack, as Napster discovered. But shutting down a decentralized system such as Gnutella or Freenet, which allows individuals on the net to make their music collections available for download in exchange for the ability to download songs from other people’s collections, is a more difficult problem. If each user is copying one of your songs once but there are 100,000 of them, can you sue them all?

Perhaps you can – if you take proper advantage of the technology. A decentralized system must provide some way of finding someone who has the song you want and is willing to share it. Copyright owners might use the same software to locate individuals who make their works available for copying and sue all of them, perhaps in a suit that joins many defendants. Since copyright law sets a $500 statutory minimum for damages, suing 10,000 individuals, each of whom has made one copy of your copyrighted work, could in principle bring in more money than suing one individual who had made 10,000 copies.

Recent attempts along these general lines by the Recording Industry Association of America (RIAA) have gotten a good deal of publicity, at least some of it negative. They also face some practical problems. For one thing, under current law, it is not entirely clear when noncommercial file exchanges are illegal – although that situation could be changed by Congress and probably will be changed by the courts. Also, it is hard to force multiple defendants into a single suit, so suing very large numbers of defendants can be expensive. On the other hand, if the defendants expect to lose, the plaintiff may not have to go very far with the suit before getting an out-of-court settlement. And one could imagine modifications in the relevant legal rules, perhaps applicable only to copyright suits, that would make the mechanics easier.

Although this approach may work for a while, its long-run problems should be clear from the earlier discussion of strong privacy. A well-designed decentralized system would locate someone willing to let you copy a song but would not let you identify the person from whom you were copying it. You do not need name, face, or social security number in order to copy the file encoding the song you want, merely some way of getting messages to and from him.^² This raises the possibility that the desire of people to download music without either paying for it or getting sued may be the key incentive that pushes us toward the strong privacy world of widespread encryption. As one webbed essay puts it, “to a first approximation, every PC owner under the age of 35 is now a felon.” It also raises the possibility that attempts to regulate strong encryption may ultimately be fought out, not between the government and individuals with unpopular views, but between the RIAA and people downloading music.

An alternative legal approach is to sue the provider of the file-sharing software for contributory infringement, an approach that finally succeeded, after extensive litigation, in MGM v. Grokster. But doing that requires a provider that still exists, is under the court’s jurisdiction, and has significant assets; none of those conditions can be guaranteed in future cases. A decentralized peer-to-peer system can continue to function long after the organization that created it has vanished.

There remains, for some forms of intellectual property, the possibility of collecting royalties from business customers – corporations that use Word, movie theaters performing movies. In the longer run, even that option may shrink or vanish. A world where strong privacy is sufficiently universal would permit virtual firms – groups of individuals linked via the net but geographically dispersed and mutually anonymous. Even if all of them use pirated copies of Word – or whatever the equivalent is at that point – no whistleblower can report them because nobody, inside or outside the firm, knows who they are or whether they have paid for their software.

Consider the problem in a different context – images on the world wide web. Each image originated somewhere and may well belong to someone. But once webbed, anyone can copy it. Not only is it hard for the copyright owner to prevent illegal copying, it may be hard for even the copier to avoid illegal copying, since he may not know to whom the image belongs or whether it has been put in the public domain.

One way of dealing with these problems is digital watermarking.^³ Using special software, the creator of the image embeds in it concealed information identifying him and claiming copyright. In a well-designed system, the information has no noticeable effect on how the image looks to the human eye and is robust against transformation – meaning that it is still there after a user has converted the image from one format to another, cropped it, edited it, perhaps, if some claims are to be believed, even printed it out and scanned it back in.^⁴
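As a rough illustration of the embedding step, here is a minimal sketch that hides a copyright notice in the least significant bit of each pixel value. The function names and the sample data are hypothetical, and the technique is deliberately the simplest one available: it is invisible to the eye but, unlike the well-designed systems described above, it would not survive cropping or format conversion.

```python
def embed_watermark(pixels: list[int], message: bytes) -> list[int]:
    """Hide the message bits in the least significant bit of each pixel value."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("image too small for this message")
    marked = list(pixels)
    for i, bit in enumerate(bits):
        # Clear the low bit, then set it to the message bit.
        marked[i] = (marked[i] & ~1) | bit
    return marked


def read_watermark(pixels: list[int], length: int) -> bytes:
    """Recover `length` bytes from the low bits of the first 8 * length pixels."""
    out = bytearray()
    for b in range(length):
        byte = 0
        for bit_index in range(8):
            byte = (byte << 1) | (pixels[b * 8 + bit_index] & 1)
        out.append(byte)
    return bytes(out)


# Hypothetical stand-in for the grayscale values (0-255) of a small image.
image = [(i * 37) % 256 for i in range(256)]
owner = b"(c) A. Artist"
marked = embed_watermark(image, owner)
```

No pixel changes by more than one brightness level out of 256, which is why the mark does not show. It is also why anyone who knows the scheme can erase it by overwriting the low bits.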

Digital watermarking can be used in a number of different ways. The simplest is by embedding information in an image and making the software that reads the information widely available. That lowers the cost to users of avoiding infringement, by making it easy for them to discover that an image is under copyright and who the copyright owner is. It raises the cost of committing infringement, at least on the web, since search engines can search the web for copyrighted images and report back to the copyright owner – who checks to see if the use was licensed and if not takes legal action. The existence of the watermark will help prove both to whom the image belongs and that the user knew or should have known and so is liable for not only infringement but deliberate infringement.

A deliberate infringer might try to remove the watermark while preserving the image. A well-designed system can make this more difficult. But as long as the watermark is observable, the infringer can try different ways of removing it until he finds one that works. And making software for reading the watermark publicly available makes it harder to keep secret the details of how it works, hence easier to design software to defeat it. So this form of watermark provides protection against inadvertent infringement, raises the cost of deliberate infringement – the infringer must go to some trouble to remove the watermark – but cannot prevent or reliably detect deliberate infringement.

The obvious solution is an invisible watermark designed to be read only by special software not publicly available. That is of no use for preventing inadvertent infringement but substantially raises the risks of deliberate infringement, since the infringer can never be sure he has successfully removed the watermark. By imprinting an image with both a visible and an invisible watermark, the copyright holder could get the best of both worlds – provide information for those who do not want to infringe and a risk of detection for those who do.

There is another way in which watermarking could be used to enforce copyright, in a somewhat different context. Suppose we are considering, not digital images, but computer programs. Further suppose that enforcing copyright law against the sellers of pirated software is not an option – they are located outside of the jurisdiction of our court system, doing business anonymously, or both.

Even if the sellers of pirated copies of our software are anonymous, the people who originally bought the software from us are not. When we sell the program, each copy has embedded in it a unique watermark – a concealed serial number, sometimes referred to as a digital fingerprint. We keep a record of who got each copy and make it clear to our customers that permitting their copy of the program to be copied is a violation of copyright law for which we will hold them liable. If copies of our software appear on pirate archives we buy one, check the fingerprint, and sue the customer from whose copy it was made.^⁵
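The bookkeeping behind such a fingerprint can be sketched in a few lines. This is only an illustration under simplifying assumptions: the class and method names are hypothetical, and the serial number is appended to the end of the program where anyone could find it, whereas a real scheme would conceal it inside the code.

```python
import secrets


class FingerprintRegistry:
    """Records which customer received which uniquely marked copy."""

    FINGERPRINT_BYTES = 16

    def __init__(self):
        self._buyers = {}  # fingerprint -> customer name

    def sell_copy(self, program: bytes, customer: str) -> bytes:
        """Return a copy of the program carrying a fresh, unique serial number."""
        fingerprint = secrets.token_bytes(self.FINGERPRINT_BYTES)
        self._buyers[fingerprint] = customer
        # Assumption for brevity: the mark is simply appended, not hidden.
        return program + fingerprint

    def trace(self, pirated_copy: bytes):
        """Return the customer whose copy this was made from, or None."""
        return self._buyers.get(pirated_copy[-self.FINGERPRINT_BYTES:])


registry = FingerprintRegistry()
alice_copy = registry.sell_copy(b"PROGRAM", "Alice")
bob_copy = registry.sell_copy(b"PROGRAM", "Bob")
```

A copy found on a pirate archive is then passed to `trace`, which names the original purchaser without any need to know who ran the archive.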

Digital watermarking is one example of a class of technologies that can be used to get back at least some of what other technologies took away. The ease of copying digital media made enforcement of copyright harder – at first glance, impossibly hard – by enabling piracy at the individual level. But the ability of digital technologies to embed invisible, and potentially undetectable, information in digital images, combined with the ability of a search engine to check a billion web pages looking for the one that contains an unlicensed copy of a watermarked image, provides the possibility of enforcing copyright law against individual pirates. And the same technology, by embedding the purchaser’s fingerprint in the purchased software, provides a potential way of enforcing copyright law even in a world of strong privacy – not against anonymous pirates or their anonymous customers but against the known purchaser from whom they got the original to copy.

While these are possible solutions, there is no guarantee that they will always work. Invisible watermarking is vulnerable to anyone sufficiently ingenious – or with sufficient inside information – to crack the code, to figure out how to read the watermark and remove it. The file representing the image or program is in the pirate’s hands. He can do what he wants with it – provided he can figure out what needs to be done.

An individual who wants to pirate images or software is unlikely to have the expertise to figure out how to remove even visible watermarks, let alone invisible ones. To do so he needs the assistance of someone else who does have that expertise – most readily provided in the form of software designed to remove visible watermarks and identify and remove invisible ones. That raises the possibility of backstopping the technological solution of digital watermarks with legal prohibitions on the production and distribution of software intended to defeat it. That is the approach used by the Digital Millennium Copyright Act of 1998.^⁶ It bans software whose purpose is to defeat copyright management schemes such as digital watermarking. How enforceable that ban will be, in a world of networks and widely available encryption, remains to be seen.

Each of the approaches to enforcing copyright that I have been discussing has serious limitations. The use of digital fingerprints to identify the source of pirated copies only works if the original sale is sufficiently individualized so that the seller knows the identity of the buyer – and while it would be possible to sell all software that way, it would be a nuisance. Perhaps more important, the approach works very poorly for software that is expensive and widely used. One legitimate copy of Word could be the basis for ten million illegitimate copies, giving rise to a claim for a billion dollars or so in damages – and if Microsoft limits its sales to customers both capable of satisfying such a claim and willing to put that much money at risk, it will not sell very many copies of Word. The use of digital watermarks to identify pirated copies only works if the copies are publicly displayed – for digital images on the web but not for a pirated copy of Word on my hard drive. These limitations suggest that producers of intellectual property have good reason to look for other ways of protecting it.

One way of solving these problems would be to convert cyberspace, at least the parts of it residing on hardware under the jurisdiction of U.S. courts, into a transparent society. My computer is both a location in cyberspace and a physical object in realspace; in the latter form it can be regulated by a realspace government, however good my encryption is. One can imagine, in a world run by copyright owners, a legal regime that required all computers to be networked and all networked computers to be open to authorized search engines, designed to go through their hard drives looking for pirated software, songs, movies, or digital images.

I do not think such a legal regime will be a politically viable option in the United States anytime in the near future, although the situation might be different elsewhere. There are, however, private versions that might be more viable, technologies permitting the creator of intellectual property to make it impossible to use it save on computers that meet certain conditions – one of which could be transparency to authorized agents of the copyright holder.

For a much simpler version of the same approach, consider possible copyright enforcement strategies if each computer’s central processing unit has a built-in serial number unique to that particular computer. A software company customizes each copy of its product to run on a single computer, identified by the serial number of its central processing unit (CPU). The user can freely make backups. The user can give copies to friends. But the copies will only run on the computer the original was bought for. Unless, of course, someone figures out a way to either modify the part of the program that checks the serial number or modify other software, perhaps part of the computer’s operating system, to lie to the program about what its serial number is.^⁷
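One way such a check might be built, sketched here under stated assumptions, is for the vendor to issue a license tag computed from the serial number with a secret key, and for the program to recompute the tag when it starts. The key, the serial numbers, and the function names are all hypothetical. The serial number is read through a function passed in by the caller, which makes the weakness just described easy to see: whoever controls that function controls the answer.

```python
import hashlib
import hmac

VENDOR_KEY = b"hypothetical-vendor-secret"  # assumption: known only to the vendor's code


def issue_license(cpu_serial: str) -> str:
    """Vendor side: produce a license tag valid for exactly one CPU serial number."""
    return hmac.new(VENDOR_KEY, cpu_serial.encode(), hashlib.sha256).hexdigest()


def may_run(license_tag: str, read_cpu_serial) -> bool:
    """Program side: run only if the tag matches the serial the machine reports."""
    expected = issue_license(read_cpu_serial())
    return hmac.compare_digest(license_tag, expected)


tag = issue_license("CPU-0001")

runs_on_original = may_run(tag, lambda: "CPU-0001")   # the machine it was sold for
runs_on_friend = may_run(tag, lambda: "CPU-0002")     # an honest second machine
# A second machine whose operating system has been modified to lie:
runs_when_spoofed = may_run(tag, lambda: "CPU-0001")
```

The copy given to a friend fails the check, while the same copy on a machine that misreports its serial number passes it.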

Most readers would regard the idea of enforcing the terms of a software license by allowing a human being to randomly search their hard drive as outrageous, but might react very differently to the idea of allowing a program on their computer to check their CPU to see what its serial number is. Some may be worried about the problems that will arise if they get a new computer and want to transfer their old software to it. But nobody is likely to see such a system as an intolerable violation of privacy.

The two approaches appear very different – but consider something in between. Your hard drive must be open to searches – but the searches may be done only by computer programs. The only information the programs are capable of reporting to a human being is the fact that they found copyrighted software on your hard drive that you are not entitled to – at which point the copyright holder can go to court to ask for legal authority to look at your hard drive.
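To make the in-between case concrete, here is a minimal sketch of a search program whose only possible output is a single yes or no. The function name, the idea of matching files by SHA-256 hash, and the sample data are assumptions made for illustration, not a description of any deployed system.

```python
import hashlib


def scan_reports_hit(files: dict[str, bytes],
                     protected_hashes: set[str],
                     licensed_hashes: set[str]) -> bool:
    """Return True if any file is a protected work the owner holds no license for.

    Nothing else leaves the function: no file names, no contents, no counts.
    """
    for content in files.values():
        digest = hashlib.sha256(content).hexdigest()
        if digest in protected_hashes and digest not in licensed_hashes:
            return True
    return False


# Hypothetical hard drive contents.
word = b"bytes of a word processor"
diary = b"private diary entry"
drive = {"apps/word.bin": word, "docs/diary.txt": diary}
protected = {hashlib.sha256(word).hexdigest()}
```

With an empty set of licenses the scan returns true. With the word processor's hash in the licensed set it returns false, and in neither case does the copyright holder learn anything about the diary.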

The issue raised by these examples – to what degree does being spied on by a machine violate your privacy – is one we will return to in a later chapter, where we consider the implications of using computers instead of human beings to listen to phone taps.

If using technology to enforce copyright law in a world of easy copying is not always workable, perhaps we should instead use technology to replace copyright law. If using the law to keep trespassers and stray cattle off my land doesn’t work, perhaps I should build a fence.

You have produced a collection of songs and wish to sell them online. To do so, you digitize the songs and insert them in a cryptographically protected container – what Intertrust, one of the pioneering firms in the industry, called a digibox and IBM a cryptolope. The container is a piece of software that protects the contents from unauthorized access while at the same time providing, and charging for, authorized access. Once the songs are safely inside the box you give away the package by making it available for download on your web site.

I download the package to my computer; when I run it I get a menu of choices. If I want to listen to a song once, I can do so for free. Thereafter, each play costs five cents. If I really like the song, fifty cents unlocks it forever, letting me listen to it as many times as I want. Payment is online by ecash, credit card, or an arrangement with a cooperating bank.

The digibox is a file on my hard disk so I can copy it for a friend. That’s fine with you. If he wants to listen to one of your songs more than once, he too will have to pay for it.

It may have occurred to you that there is a flaw in the business plan I have just described. The container provides one free play of each song. In order to listen for free, all the customer has to do is make lots of copies of the container and use each once. Alternatively, if I want to make copies for friends, I can pay fifty cents once to unlock the file and make copies – unlocked copies – for them. It might be prudent for the digibox to have some way of making sure that the computer it is running on is the same as the computer it was unlocked on.
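The pricing rule, and the flaw in it, can both be seen in a short sketch. The class below is a hypothetical stand-in for the container, with the prices taken from the example above. It keeps its play count inside itself, so a fresh copy starts with a fresh count and an unlocked copy stays unlocked.

```python
import copy


class SongContainer:
    """Toy model of the container: first play free, then 5 cents, or 50 to unlock."""

    PLAY_PRICE_CENTS = 5
    UNLOCK_PRICE_CENTS = 50

    def __init__(self):
        self.plays = 0
        self.unlocked = False
        self.paid_cents = 0

    def play(self) -> int:
        """Play the song once and return the charge in cents."""
        charge = 0 if (self.unlocked or self.plays == 0) else self.PLAY_PRICE_CENTS
        self.plays += 1
        self.paid_cents += charge
        return charge

    def unlock(self) -> int:
        """Pay once for unlimited plays; return the charge in cents."""
        if self.unlocked:
            return 0
        self.unlocked = True
        self.paid_cents += self.UNLOCK_PRICE_CENTS
        return self.UNLOCK_PRICE_CENTS


download = SongContainer()

# Honest use: free, then five cents a play.
honest = copy.deepcopy(download)
honest_charges = [honest.play() for _ in range(3)]

# The flaw: three untouched copies, each played once.
cheat_charges = [copy.deepcopy(download).play() for _ in range(3)]
```

Three honest plays cost ten cents. Three plays from three fresh copies cost nothing, which is the reason for tying the container to a particular computer.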

Making a new copy every time you play a song is a lot of trouble to go to in order to save five cents. Intertrust does not have to make it impossible to defeat its protection, whether in that simple way or in more complicated ways, in order for it and the owners of the intellectual property it protects to make money. It only has to make defeating the protection more trouble than it is worth.

Between the time when I wrote the first draft of this chapter and the final revision, Intertrust went out of business, its particular approach to technological protection having failed to take off. The current incarnation of their approach is called “Digital Rights Management,” usually shortened to DRM. The underlying idea is still the same. Files, typically audio or video, are distributed in a form that is only accessible with a suitable key. Information about the key is provided only to manufacturers who agree to build into their equipment – a CD player, say – restrictions on what can be done with the file. Thus, in theory, the file can only be used on equipment designed to prevent copying or in other ways restrict its use.

One problem with this approach is that files can be played not only on dedicated equipment but on computers. The firm providing DRM will, of course, refuse to tell other people how to write software that unlocks their files. But the files themselves are there to be examined, as are the devices authorized to play them, which makes it hard to prevent a sufficiently ingenious programmer from reverse engineering the protection in order to build a suitable key into software without providing the restrictions on use that the owner of the intellectual property wants.

As in the case of digital watermarking, how easy it is to defeat the protection depends very largely on who is doing it. The individual customer is unlikely to be expert in programming or encryption, hence unlikely to be able to defeat even simple forms of technological protection. The risk comes from the person who is an expert and makes his expertise available, cheaply or for free, in the form of software designed to crack the protection.

One approach to dealing with that problem is by making it illegal to create, distribute, or possess such software – the strategy put into law by the Digital Millennium Copyright Act. That law currently faces legal challenges by plaintiffs who argue that publishing information, including information about how to defeat other people’s software, is free speech, hence protected. Even if the court declines to protect that particular sort of speech, the arguments of an earlier chapter suggest that in the online world free speech may itself be technologically protected – by the wide availability of encryption and computer networks – making the relevant parts of the act in the long run unenforceable.

If law cannot provide protection, either against piracy or against computerized safecracking tools designed to defeat technological protection, the obvious alternative is technological – safes that cannot be cracked. Is that possible?

For some forms of intellectual property – songs, for example – it is not. The problem, sometimes referred to as the “analog hole,” is that, however strong the protection, at some point in the process the customer gets to play the song or watch the movie – that, after all, is what he is paying for. But if a customer is playing a song on his own computer in his own home, he can also be playing it into his own tape recorder, giving him a copy of the song outside the box. If he prefers an MP3 to a cassette he can play the song back to the computer, digitize it, and compress it. To avoid distortion due to speakers and microphone he can short-circuit the process, feeding the electrical signals that normally go to the speakers back into the computer instead to be redigitized – outside the box. A similar approach could be used to hijack a book, video, or any other work that is presented to the customer in full each time it is used. Technological protection may make the process of getting the work out of the digibox and into some usable form a considerable nuisance – but once one person has done it, in a world where copyright law is difficult or impossible to enforce, the work is available to all. Short of making everybody’s hard disk searchable, the only way of protecting works of this kind is to limit their consumption to a controlled environment – showing the video in a movie theater with video cameras banned, for instance.

For other sorts of works, secure protection may be a more practical option. Consider, for example, an (imaginary) database compiled by Consumer Reports, designed to advise a user on what car to buy. A query describes price range, preferences, and a variety of other relevant information. The response is a report tailored to that particular customer. Having paid for and received the report, the customer can give a copy to a neighbor. But the neighbor is unlikely to want it, since he is unlikely to have all the same tastes, circumstances, and constraints. What the neighbor wants is his own customized report – which requires another payment.

With enough time, energy, and money, a pirate could ask a million questions and use the answers to reverse engineer the protected data – but why should he? The pirate can use the stolen information, can give it away, but has only a very limited ability to sell it. As long as the protection raises the cost of reconstructing the database high enough, it should be reasonably safe.^⁸ For a real-world example of almost precisely that strategy, consider LexisNexis and Westlaw, the legal databases on which lawyers and legal academics rely. There is nothing to keep me from downloading a law case from Lexis and then passing it on to a colleague who has not paid for the privilege – but the odds that my colleague is looking for the same case I am are low.

For a different approach to the problem of protecting intellectual property, consider a program that does something very useful – high-quality speech recognition, say. I divide it into two parts. One, which contains most of the code and does most of the work, I give away to anyone who wants it. The rest, including the key elements that make my program special, resides on my server. In order for the first part to work, it must continually exchange messages with the second part – access to which I charge for by the minute.

One elegant feature of this solution is that the disease is also the cure. Part of what makes copyright unenforceable is the ready availability of high-speed computer networks, enabling the easy distribution of pirated software. But high-speed computer networks are precisely what you need for the form of protection I have just described, since they allow me to make software on my server almost as accessible to you as software on your hard disk – and charge for it.

Some years after I wrote the initial draft of this section, I realized that I myself was a customer of a very successful and innovative piece of intellectual property protected in just this way. World of Warcraft, a massively multiplayer online role-playing game with something over ten million customers, sells the client software that goes on the customer’s computer to anyone who wants to buy it. But the server software that makes it possible for thousands of individuals to coordinate their activities, to interact in a common world, is sitting on Blizzard’s own servers. One can think of World of Warcraft and its competitors as the new technology’s equivalent of the movie – one that, unlike movies, can be technologically protected. Any player who wants can record his adventures and show them to his friends. But what the friends want is to have their own adventures, and to do that they will have to pay Blizzard’s monthly fee.

Another example of the same approach is provided by firms such as Pandora and Last.fm. Instead of asking for a particular song, the customer rates songs as he hears them; the service uses the ratings to decide what to play next. Think of it as your own customized DJ.

Putting together everything in this chapter, we have a picture of intellectual property protection in a near future world of widely available high-speed networks, encryption, and easy copying. Intellectual property used publicly, such as images on the web, can be legally protected provided it is not valuable enough to make it worth going to the trouble of removing hidden watermarks and provided also that it is being used somewhere that copyright law can reach. That second proviso means that if we move all the way to a world of strong privacy such protection vanishes, since copyright law is useless if you cannot identify the infringer. But even in that world, some intellectual property can be protected by fingerprinting each original and holding the purchaser liable for any copies made from it.

Where intellectual property cannot be protected by law, it may still be possible to protect it by technology. That approach is of limited usefulness for works that must be entirely revealed every time they are accessed, such as a song. It may work better for more complicated works, such as a database or a computer program. For both sorts of works, protection will be easier if it is practical to use the law to suppress software designed to defeat it – but it probably won’t be.

Does this mean that, in the near future, songs will stop being sung and novels stop being written? That is not likely. What it does mean is that those who produce that sort of intellectual property will have to find ways of getting paid that do not depend on control over copying. For songs, one obvious possibility is to give away the digitized version and charge for concerts. Filmmakers can give away the film and make money on the toys – which, being physical objects sold in realspace, are still subject to intellectual property law.^⁹ Another possibility is to rely on the generosity of fans – in a world where it will be easy to email a ten-cent appreciation to the creator of the song you have just enjoyed. A third is to give away the song along with a digitally signed thank you to the firm that paid you to write it and hopes to profit from your fans’ goodwill, the modern equivalent of the old system of literary patronage.

Similar options are available for authors. The usual royalty payment for a book is between 5% and 10% of its face value. Many readers may be willing to voluntarily pay the author that much in a world where the physical distribution of books is essentially costless. Other books will get written in the same way that articles in academic journals are written now – to spread the author’s ideas or to build up a reputation that can be used to get a job, consulting contracts, or speaking opportunities.

Back in Chapter 4 I raised the possibility of treating transactional information as private property, with ownership allocated by agreement at the time of the transaction. Such information is a form of intellectual property and can be protected by the same technologies we have just discussed.

Suppose, for example, that you are happy to receive catalogs in the mail (or email) but do not want strangers to be able to compile enough information about you to enable identity theft, spot you as a target for extortion, or in some other way use your personal information against you. You achieve both objectives by making personal information generated by your transactions – purchases, employment, car rental, and the like – available only in a very special sort of database. The database allows users to create address lists of people who are likely customers for what they are selling but does not allow them to get individualized data about those people. It will be distributed inside a suitably designed and cryptographically protected container or on a protected server, designed to answer queries but not to reveal the underlying data. If the catalogs are going out by email, the database is combined with a forwarding service. One copy of the catalog goes to the service, along with suitable payment, and a thousand copies from there to a thousand email addresses – none of which need be revealed to the catalog company.
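The interface of such a database can be sketched as follows. The class, its method names, and the sample records are hypothetical, and a Python object of course protects nothing by itself. The point is only the shape of what the catalog company is allowed to ask: it hands over a catalog and gets back a number, never an address.

```python
class MailingDatabase:
    """Forwards mail to likely customers without revealing who they are."""

    def __init__(self, records, deliver):
        self._records = records   # [{"address": ..., "interests": {...}}, ...]
        self._deliver = deliver   # callable(address, catalog), run by the service

    def count_matches(self, interest: str) -> int:
        """How many people would receive a catalog on this subject."""
        return sum(1 for record in self._records if interest in record["interests"])

    def forward_catalog(self, interest: str, catalog: str) -> int:
        """Send the catalog to every matching address; return only the count."""
        sent = 0
        for record in self._records:
            if interest in record["interests"]:
                self._deliver(record["address"], catalog)
                sent += 1
        return sent


outbox = []  # stands in for the forwarding service's mail system
database = MailingDatabase(
    [
        {"address": "a@example.org", "interests": {"gardening", "cars"}},
        {"address": "b@example.org", "interests": {"gardening"}},
        {"address": "c@example.org", "interests": {"sailing"}},
    ],
    deliver=lambda address, catalog: outbox.append((address, catalog)),
)
copies_sent = database.forward_catalog("gardening", "Spring seed catalog")
```

Here `copies_sent` is all the seller learns. In a real system the guarantee would have to come from the protected container or server, not from the programming language.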

In the full-blown version of such a system, the company running the database doesn’t know who you are either, since the information goes out by email through a chain of remailers; your email address is buried under layers of encryption, with one layer removed by each remailer. In a simpler version, or one designed to forward physical products as well as messages, you are relying on a trusted intermediary, a firm in the business of keeping its customers’ secrets – a specialty that used to be associated with Swiss bankers.
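The layering can be sketched as follows. Each remailer is assumed to share a secret key with you, and the cipher is a toy built from SHA-256 purely so that the example is self-contained; it should not be mistaken for real encryption. All names, keys, and addresses are hypothetical.

```python
import hashlib
import json


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR with a SHA-256 keystream. Illustration only; the same call decrypts."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def build_address_block(address: str, route):
    """Bury the address under one layer per remailer.

    `route` lists (remailer_name, key) pairs in the order mail will travel.
    Returns the first remailer's name and the fully wrapped block.
    """
    next_hop, block = address, b""
    for name, key in reversed(route):          # innermost layer is built first
        layer = json.dumps({"next": next_hop, "inner": block.hex()}).encode()
        block = toy_cipher(key, layer)
        next_hop = name
    return next_hop, block


def peel(block: bytes, key: bytes):
    """One remailer removes its layer, learning only where to send the rest."""
    layer = json.loads(toy_cipher(key, block).decode())
    return layer["next"], bytes.fromhex(layer["inner"])


route = [("remailer-a", b"key-a"), ("remailer-b", b"key-b"), ("remailer-c", b"key-c")]
first_hop, block = build_address_block("you@example.org", route)

hops = [first_hop]
for _, key in route:
    next_hop, block = peel(block, key)
    hops.append(next_hop)
```

The database company holds only the name of the first remailer and an opaque block. Each remailer learns the next stop and nothing more, and only the last one sees the address.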

The information in the database was created by your transactions. In the highest tech version, you conduct all of them anonymously, so nobody but you has the information to start with, and you can control who gets it thereafter. In a lower tech version, both you and the seller start with the information – what you bought and when – but the seller is contractually obliged to erase the record once the transaction is complete. In either version, you arrange for the information to be available only within the sort of protected database I have just described – and, if access to such a database is sufficiently valuable, get paid for doing so.

1 The RIAA has engaged in mass suits against people they suspect of downloading pirated music but is itself being sued for malicious prosecution on the grounds that it had inadequate reason to believe that the people it sued were guilty.

2 Two current projects along these lines are WinW and BadBlue.

4 The final point is important, since if it works it provides a way of blocking the analog hole discussed later in the chapter.

5 Note that this is a higher tech equivalent of one way in which trade secret law, with licensing, is currently enforced.

6 I like to argue that since a K, a binary thousand, is actually 1024, the Digital Millennium Copyright Act doesn't go into effect until 2048. I doubt I could persuade a court.

7 Recently Apple, which had been selling songs on iTunes in a protected format, announced that it was now going to sell unprotected songs at a slightly higher price.

8 A possible approach to overcoming that form of protection would be open source piracy, with many users pooling the information they individually obtained.

9 To date, the five-film “Star Wars” epic has taken in some $12.4 billion in movie tickets and merchandise sales worldwide, delivering a heavenly sum to distributor Twentieth Century Fox and Lucasfilm, the films’ production company. Of that total, $3.4 billion has come from the worldwide box office and $9 billion from sales of “Battlefront” video games, Clone Trooper costumes, Obi-Wan Kenobi toy action figures, and other sundry gizmos, according to Lucasfilm.