Yearly Archives: 2019
Jul 26, 2019 Ted Sichelman
- Paul R. Gugliuzza, The Supreme Court at the Bar of Patents, 95 Notre Dame L. Rev. __ (forthcoming, 2020), available at SSRN.
- Paul R. Gugliuzza, Elite Patent Law, 104 Iowa L. Rev. __ (forthcoming, 2019), available at SSRN.
Christopher Langdell’s “case” method of teaching the law has dominated the law school classroom for over a century. In this pedagogical approach, students typically read appellate opinions, and professors tease “rules” from the opinions—often in concert with the so-called Socratic method, which enlists students to aid in this abstractive process. This approach is said to make students “think like lawyers,” but what’s typically ignored in the process is the role lawyers actually play in the very cases under consideration. Instead, the working assumption is that judges are presented with arguments and facts up high from anonymous sets of ideal lawyers, who never miss a key argument or forget a relevant fact.
Of course, the actual world of lawyering is much messier, and lawyers range from the glorious and gifted to the struggling and essentially incompetent. But exactly how does this variation in attorney quality affect case outcomes? This all-too-important question has scarcely been addressed, much less answered, by systematic academic study. In an outstanding duo of articles, Paul Gugliuzza shines newfound light on the issue by examining the role of “elite” advocates in the certiorari process at the U.S. Supreme Court.
Unlike actual case outcomes, which are often a poor test for attorney quality because of endogeneity concerns (the best attorneys often take the hardest cases), selection effects, and the lack of any “natural experiment” comparing a before-and-after “treatment,” certiorari in patent cases is in my view quite a worthy domain in which to suss out the effects of attorney quality.
As Gugliuzza recounts in exhaustive and well-researched detail, there has been a major shift in the participation of “elite” attorneys in patent appeals, particularly at the Supreme Court. (By “elites,” Gugliuzza refers to those attorneys who presented oral argument in at least five cases in that term and the ten previous terms combined.) Barring other explanations, which Gugliuzza does a thorough job of eliminating, this sets up enough of a natural experiment to assess the causal role of elite attorneys in the fate of patent appeals, especially the grant (or denial) of cert petitions.
Notably, Gugliuzza finds that “the Supreme Court is 3.3 times more likely to grant cert. when a petition in a Federal Circuit patent case is filed by an elite advocate as compared to a non-elite.” (Supr. Ct., P. 34.) Specifically, while non-elite petitions are granted at a 4.7% rate, elite petitions are granted at a high 15.6% rate. Exactly how and why this occurs is complex. Part of the reason is the fact that in cases handled by elites, large numbers of amicus briefs are filed at the cert stage, and the presence of those briefs is even more strongly correlated with cert grant than the presence of elites.
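The 3.3 figure can be checked against the two grant rates quoted above. The short sketch below uses only those two reported percentages; the variable names are illustrative and not drawn from Gugliuzza's articles.

```python
# Cert grant rates reported in the review (percent of petitions granted).
non_elite_grant_rate = 4.7
elite_grant_rate = 15.6

# Relative likelihood of a grant for elite-filed versus non-elite petitions.
relative_likelihood = elite_grant_rate / non_elite_grant_rate

print(f"Elite petitions are granted about {relative_likelihood:.1f} times as often")
```

The ratio describes a correlation in the petition data; whether it reflects a causal effect of elite counsel is the question taken up in the paragraphs that follow.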
Of course, it could be the fact that elites tend to work on more important cases, and it is precisely those cases that garner more amicus briefs. But as Gugliuzza explains—and which aligns with my own experience—it is the network and know-how of elites that drive the amicus filings, creating a causal link between elites and cert grants. Also, many elites are known to the justices and clerks. And elites know how to craft briefs to increase the odds of a cert grant. Thus, even more so than Gugliuzza, I think it’s fairly clear that elites are a substantial causal factor in the Supreme Court’s renewed interest in patent law issues.
What’s more incredible about Gugliuzza’s findings is that, in my view, they substantially understate the role “elites” are playing in patent cases at the Supreme Court. Gugliuzza’s definition excludes attorneys who regularly draft briefs in (but do not argue) Supreme Court cases, and it also excludes well-known academics (since none has argued five cases), who have increasingly played a role at the certiorari stage in patent cases over the past 10 years.
Gugliuzza plans to tease out some of these additional influences in a follow-on study, which I have no doubt will strongly support the causal link between elites and cert grants in patent cases. But where does all this leave us?
First and foremost, Gugliuzza’s study reminds us as law professors that attorneys really do matter and that we need to teach students as much, including the nitty-gritty of why, not just in “skills” and “clinical” courses, but in “doctrinal” courses, too. It also opens the door for further empirical study on the role of attorney quality in outcomes (outside of mere win rates, which, as I noted above, are a difficult way to measure the effects of attorney quality) in many other areas of law.
Second, it raises important normative issues regarding the development of the law. As Gugliuzza rightly notes, elite advocates tend to have little training in science and technology, and instead are typically generalists. When both the advocates and judges are generalists in patent cases, this can lead to a “blind leading the blind” problem. As Justice Scalia aptly recognized in his Myriad opinion, he could not join certain portions of the majority opinion, stating “I am unable to affirm those details on my own knowledge or even my own belief.” Personally, I find it hard to believe that any justice in the majority had any scientific knowledge substantially greater than Justice Scalia’s. Indeed, Gugliuzza documents cause for concern because most of the Supreme Court decisions have been in areas that are basic enough for the justices to understand, like procedure or statutory interpretation, rather than core substantive issues of patent law. Even the substantive cases, like KSR, Myriad, Mayo, Alice, Global-Tech, and the like, present relatively simple sets of facts, which in essence means the Court has eschewed many doctrinal areas in need of resolution, such as enablement, written description, and complex obviousness doctrines.
At the same time, the elites arguably have stronger skills when it comes to law and policy than the usual patent litigator. Elites may help to correct for the occasional tunnel vision of patent litigators and, more importantly, of “specialized” Federal Circuit judges. This may help avoid court capture and pro-patent biases, which tend to serve the economic aims of the patent bar.
As Gugliuzza perceptively notes, perhaps it’s too early to answer the normative question. There are decent arguments on both sides of the fence. My own instincts are that generalist elites, in concert with the elites that make up the Supreme Court, are mucking up patent doctrine to the point that the system isn’t working as it should. Most problematic are generalist opinions, which often don’t provide sufficient guidance to innovators and potential infringers alike to order their business affairs. More generally, the Supreme Court has produced many opinions that have weakened patents (e.g., KSR, Alice, Mayo, eBay, Global-Tech, and TC Heartland), which, although not always intentional, is in my view the wrong policy choice.
In sum, I thoroughly enjoyed Gugliuzza’s insights on these important questions, and the more general question of the role of lawyers on the law, and I believe Gugliuzza’s articles and follow-on studies will surely play a critical role in resolving these thorny debates as the empirics continue to unfold.
Ted Sichelman, How Elite Lawyers Shape the Law, JOTWELL (July 26, 2019) (reviewing Paul R. Gugliuzza, The Supreme Court at the Bar of Patents, 95 Notre Dame L. Rev. __ (forthcoming, 2020), available at SSRN; Paul R. Gugliuzza, Elite Patent Law, 104 Iowa L. Rev. __ (forthcoming, 2019), available at SSRN), https://ip.jotwell.com/?p=1270.
Jun 28, 2019 Michael W. Carroll
What is the relationship between copyright law and artificial intelligence or machine learning systems that produce outputs biased by race, gender, national origin, and related aspects of being human? That is the question that Amanda Levendowski investigates and addresses in her refreshingly well-written, to-the-point article How Copyright Law Can Fix Artificial Intelligence’s Implicit Bias Problem. In a nutshell, she argues that: (1) these systems need large quantities of training data to be effective; (2) those building these systems rely on biased data in part because of their own biases but also because of potential risks of copyright infringement; and (3) more copyrighted works can legally be included as training data under the fair use doctrine and should be so used to selectively diversify the inputs to these systems to de-bias their outputs.
Levendowski starts with the problem in the form of Google’s natural language processing system word2vec. It is a form of neural word embedding that analyzes the context in which words appear in the source texts to produce “vectors,” which indicate word associations such as “Beijing” is to “China” as “Warsaw” is to “Poland.” Trained by analyzing the published news sources incorporated into Google News to which Google has obtained a copyright license, word2vec ingests the biases in those sources and spits out results like “man” is to “computer programmer” as “woman” is to “homemaker.” Levendowski acknowledges that those in the machine learning research community agree that this is a problem and are in search of a solution (including Google’s own researchers), but she responds that it should not be left only to developers at large technology companies with access to the training data to de-bias their own systems.
Levendowski further asserts that copyright law stands as a potential barrier, or at least a perceived barrier, to outside researchers’ ability to investigate and report on bias in these systems. Copyright reinforces incumbents’ advantages in three ways. First, while reverse engineering of the algorithms is protected by fair use, accessing those algorithms, if they are subject to technological protection measures under 17 U.S.C. §1201, is limited to the narrower § 1201(f) exception or the right to circumvent that the First Amendment may provide. Second, if a biased system’s underlying training data is copyrighted, journalists and other investigators who seek to expose the sources of algorithmic bias are likely to be chilled by the prospect of an infringement suit. Finally, the leading artificial intelligence developers have significant resource advantages that allow them to acquire enormous training datasets by building them (Facebook) or buying them (IBM).
This competitive advantage leads newcomers to rely on what Levendowski terms “biased, low-friction data” or BLFD; that is, data that are accessible and that carry little legal risk. (P. 589.) Here, her example is the 1.6 million emails among Enron employees made accessible by the Federal Energy Regulatory Commission in 2003. This is one of the only publicly accessible large datasets of interlinked emails. Although these emails are technically works of authorship protected by copyright, the legal risk that any of these authors would sue an AI researcher for using them is close to nil. But this is hardly a representative sample of people to study if one were to train a system to extract generalizable rules about how human beings communicate by email. Other examples of BLFD that have other forms of bias include public domain works published prior to 1923, which do not reflect modern language usage, and Wikipedia, which is legally low-risk because of its Creative Commons license but is a biased source of facts about the world because of the large gender imbalance among contributors. Levendowski argues that this imbalance biases the language used to describe women in many Wikipedia entries, and that the substance of these entries reflects male bias in terms of the subject matter covered and the subject matter omitted, such as key facts about women in biographical entries.
The article then argues that enlarging any of these datasets, specifically with diverse, copyrighted sources that are likely to mitigate or erase bias, is desirable and is legal as a fair use. Recognizing that access to these sources remains a challenge, Levendowski argues that at least the use of these sources should be cleared by fair use.
Here, I should disclose my bias. I have a forthcoming article that makes a related argument that copyright law permits the use of large sets of copyrighted works for text and data mining, so I am sympathetic to this article’s argument. Nonetheless, I think most readers will find that although the fair use analysis in this article is brief, perhaps too brief, it is supported by the case law and copyright policy.
The analysis argues that using copyrighted works as training data is a transformative use, and there is now substantial case law and scholarship that support this assertion. The use is for a different purpose than for which the works were published and the use adds something new through the system’s operation. The article then argues the second factor also favors the use because even creative works are being used for their “factual” nature; i.e., as examples of creative works by humans. Under the third factor, using the entirety of these works is necessary and appropriate for this purpose and has been approved in a number of cases involving computational processing of copyrighted works. Finally, under the fourth factor, even if some of the training data has been licensed in by current developers, the transformative purpose under the first factor overrides any negative impact that fair use may have on this market.
While this analysis is generally persuasive, I found this part of the article a little thin. I agree that a court would almost certainly characterize this use as transformative for the reasons stated. But the second factor has traditionally been focused on how much expressive material is in the work being borrowed from rather than the borrower’s purpose. This move felt like giving the transformative purpose a second bite at the apple. While the second fair use factor does little work on its own, I think it is appropriate to consider as part of the balance how much original expression is at stake.
I will note that I wanted more discussion of the third and fourth factors. While it is easy to agree that use of entire works is likely to be permissible, the harder question is how much of that training data can be made publicly available under fair use by those seeking algorithmic accountability. I would have liked to know more about how and where Levendowski would draw this line. Similarly, the evidence of some licensing for this use needs more elaborate discussion. I agree that the transformative purpose is likely to insulate this use, and that this licensing market is really one for access to, rather than use of, the training data, which diminishes the impact under the fourth factor.
With that said, I want to acknowledge the creativity of Levendowski’s thesis, and show appreciation for her clear, succinct presentation of the three stages of her analysis. This piece is a welcome contribution by an early-career researcher, and I look forward to reading her future work.
May 30, 2019 Dotan Oliar
Most people assume, if implicitly, that there is a substantial element of uniformity in our IP system. At first blush, our copyright and patent laws extend a (presumably) uniform set of rights to (presumably) uniform authors and inventors, who can then sue (presumably) uniform unauthorized users. Scholarship for some time now has already noted that the bundle of rights is not actually uniform, and has theorized on the optimal tailoring of rights to particular industries and subject-matters. More recently the literature has started to unpack the implicit assumption of creator uniformity using data on the demographics of authors and inventors. Statistically speaking, the data has shown that creators of different races, genders and ages diverge in the rate and direction of their creative efforts. In this new and exciting article, Libson and Parchomovsky begin to unpack the assumption of user uniformity using user demographics.
Legal enforcement of copyrights entails benefits and costs. On the benefit side, it provides authors with an incentive to create, by securing to them the exclusive exploitation of their works. On the cost side, it reduces access to creative works, by endowing the author with a monopoly-like power. Optimally, copyrights would only be enforced against high value consumers (thus achieving the incentive rationale), but not against those with valuations lower than the market price (thus achieving the access rationale). In theory, allowing free access to those who cannot afford the market price would be efficient, as it would allow them access without sacrificing the author’s incentive. In practice, however, this cannot be done because many who are willing and able to pay would masquerade as ones who are not, and authors have no crystal ball to reveal consumer valuation. Copyright enforcement thus makes sure that those who can pay would, realizing that the access cost is borne as a necessary evil.
Not necessarily so anymore, say Libson & Parchomovsky. Using data on the demographics of consumers of audio and video content, they show that certain cross-sections of users never enter the market. With regard to these users, it does not make a lot of sense to harshly enforce copyright law. Rather, treating infringement by these users leniently would have the benefit of increasing access to content without sacrificing incentives to the author, namely without the risk that otherwise paying users would masquerade as low-value ones.
To illustrate how this can be done, Libson and Parchomovsky use two data sets. First, they use data from the Consumer Expenditure Survey of the Bureau of Labor Statistics that give a general view of household consumption patterns. For example, they note that average household spending on online audio and video consumption varies considerably with household demographics, including income, age, race, education, marital status and geographical location. Second, they use panel data on online purchases of music and video of over 80,000 households. Various household demographics correlate with purchase decisions, including, most prominently, race and age. They report that about 1500 of the 80,000 households did not buy music and about 4500 did not buy video online.
Together, these datasets give a sense of certain user cross-sections that are highly unlikely to ever purchase copyrighted content. For example, none of the 176 households that are southern, without a college degree, aged 24 years-old or younger, with income less than $100,000, and are not African American purchased copyrighted audio content online in 2016. Also, none of the 72 households that are southern, without a college degree, aged 29 years-old or younger, with income less than $100,000, and who are not African American purchased copyrighted video content online in 2016. Accordingly, under certain assumptions and caveats, the authors maintain that it would make sense to reduce the copyright liability of such households, and even exempt them from liability, because doing so would not disincentivize authors but would increase household—and so social—welfare.
Libson and Parchomovsky present their data as a proof of concept and suggest that much more could be done to optimize copyright policy if and when better data became available. But even with their data, the authors spell out three policy implications: the use of personalized consumption data can reduce the deadweight loss associated with copyright protection, copyright enforcement should be limited with regard to consumer demographics that are unlikely to purchase content, and sanctions can be varied based upon user characteristics. This paper thus makes a novel contribution on its own, and opens up the way for further empirical investigation of users in IP.
Apr 16, 2019 Christopher J. Buccafusco
- Abhishek Nagaraj & Imke Reimers, Digitization and the Demand for Physical Works: Evidence from the Google Books Project (2019), available at SSRN.
From 2004 until 2009, the Google Books Project (GBP) digitized thousands of books from the collection of Harvard University’s library and made them available online. According to Google and proponents of the GBP, digitization would introduce readers to books that they otherwise couldn’t find or obtain, increasing access to and interest in the digitized works. But according to some authors and publishers, the creation of free digital copies would usurp the demand for print copies, undermining an important industry. This dispute was at the heart of a decade of litigation over GBP’s legality. After all of that, who was right?
According to a recent empirical study by economists Abhishek Nagaraj and Imke Reimers, the answer is: both of them. The paper, Digitization and the Demand for Physical Works: Evidence from the Google Books Project, combines data from several sources to reveal some key features about the effects of digitization on dead-tree versions of books. The story they tell suggests that neither of the simple narratives is entirely correct.
Google worked with Harvard to scan books from 2004 to 2009, proceeding in a largely random fashion. The only limitation was that Google only scanned books that had been published prior to 1923, because these works were in the public domain and, thus, could be freely copied. Works published in 1923 or later might still be covered by copyright, so Google chose not to scan those initially. Nagaraj and Reimers obtained from Harvard the approximate dates on which the pre-1923 books were scanned.
Harvard also provided them with the number of times between 2003 and 2011 that a book was checked out of the library. Library loans are one of the ways in which consumer demand for books is met, so these data enabled the researchers to test whether digitization affected demand for printed versions of works. The researchers also obtained sales data for a sample of approximately 9,000 books that Google digitized, as well as data on the number of new editions of each of these books. With these data, Nagaraj and Reimers use a difference-in-differences method to compare loans and sales of digitized books to those of non-digitized books, before and after the year in which books were digitized.
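For readers unfamiliar with the difference-in-differences comparison described above, the following is a minimal sketch of the basic two-group, two-period version. The loan counts are invented for illustration and are not data from the study, and the function name `did_estimate` is hypothetical; the paper's actual estimation is presumably regression-based and more elaborate.

```python
from statistics import mean

# Hypothetical average annual loans per book; NOT data from the study.
loans = {
    ("digitized", "before"): [4.0, 5.0, 6.0],
    ("digitized", "after"): [2.0, 3.0, 4.0],
    ("not_digitized", "before"): [4.0, 6.0, 5.0],
    ("not_digitized", "after"): [4.0, 5.0, 4.5],
}

def did_estimate(outcomes):
    """Change in the digitized group minus change in the comparison group."""
    treated_change = (mean(outcomes[("digitized", "after")])
                      - mean(outcomes[("digitized", "before")]))
    control_change = (mean(outcomes[("not_digitized", "after")])
                      - mean(outcomes[("not_digitized", "before")]))
    return treated_change - control_change

effect = did_estimate(loans)
print(f"Difference-in-differences estimate: {effect:+.2f} loans per book")
```

Subtracting the comparison group's change removes trends that affect all books alike (for example, a general decline in library borrowing), so the remainder can be attributed to digitization only if the two groups would otherwise have moved in parallel.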
If the GBP’s opponents are correct, then digitization should lead to a decrease in loans and sales, as cheaper and more easily accessed digital versions substitute for physical copies of books, especially for consumers who prefer digital to physical copies. According to the substitution theory, consumers basically know which books they want, and if they can get them for free, they will. If GBP’s proponents are correct, by contrast, consumers do not always know which books they want or need, and finding those books can entail substantial search costs. Digitization reduces the costs of discovering books and will lead some consumers to demand physical copies of those books.
Nagaraj and Reimers find that digitization reduces the probability that a book will be borrowed from the library by 6.3%, reducing total library loans for digitized books by about 36%. Thus, some consumers who can get free and easy digital access choose it over physical access. The figures for market-wide book sales are, however, reversed. Digitization increases market-wide sales by about 35% and the probability of a book making at least one sale by 7.8%. Accordingly, some consumers are finding books they otherwise wouldn’t have and are purchasing physical copies of them.
To further explore these effects, Nagaraj and Reimers disaggregate the data into popular and less popular books, and here the effects are starker. For little known works, digitization drastically decreases the costs of discovering new titles, and consumers purchase them at a 40% higher rate than non-digitized books. Discovery benefits trump substitution costs. But for popular works, where digitization does little to increase discovery of new works, sales drop by about 10%, suggesting substantial cannibalization.
What do these findings mean for copyright law and policy? One implication is that substitution effects may not be that great for many works even when the whole work is available. Thus, the substitutionary effect of Google’s “snippet view,” which shows only about 20% of a work, should be much smaller still. Also, it’s important to realize that these data help prove that otherwise forgotten or “orphan” works still have substantial value, if only people can find them. Consumers were willing to pay for less popular works, once they discovered their existence.
Ultimately, however, because the data do not tell a simple story, they may not be able to move the legal debate much. The study confirms both publishers’ fears about the works they care the most about (popular works) and GBP’s proponents’ hopes about the works they care the most about (orphan works). One possibility, however, is that we may see a more sophisticated approach to licensing works for digitization. Publishers may be more willing to allow Google or others to digitize unpopular works cheaply or for free, while choosing to release popular titles only in full-price editions. This could provide the access that many people want to see while enabling publishers to stay in business.
Mar 13, 2019 Lisa Larrimore Ouellette
- Michael D. Frakes & Melissa F. Wasserman, Irrational Ignorance at the Patent Office, 72 Vand. L. Rev. __ (forthcoming 2019), available at SSRN.
How much time should the U.S. Patent & Trademark Office (USPTO) spend evaluating a patent application? Patent examination is a massive business: the USPTO employs about 8,000 utility patent examiners who receive around 600,000 patent applications and approve around 300,000 patents each year. Examiners spend on average only 19 total hours throughout the prosecution of each application, including reading voluminous materials submitted by the applicant, searching for relevant prior art, writing rejections, and responding to multiple rounds of arguments from the applicant. Why not give examiners enough time for a more careful review with less likelihood of making a mistake?
In a highly-cited 2001 article, Rational Ignorance at the Patent Office, Mark Lemley argued that it doesn’t make sense to invest more resources in examination: since only a minority of patents are licensed or litigated, thorough scrutiny should be saved for only those patents that turn out to be valuable. Lemley identified the key tradeoffs, but had only rough guesses for some of the relevant parameters. A fascinating new article suggests that some of those approximations were wrong. In Irrational Ignorance at the Patent Office, Michael Frakes and Melissa Wasserman draw on their extensive empirical research with application-level USPTO data to conclude that giving examiners more time likely would be cost-justified. To allow comparison with Lemley, they focused on doubling examination time. They estimated that this extra effort would cost $660 million per year (paid for by user fees), but would save over $900 million just from reduced patent prosecution and litigation costs.
Litigation savings depend on Frakes and Wasserman’s prior finding that time-crunched patent examiners make mistakes, and that they are more likely to erroneously allow an invalid patent than to reject a valid one. When examiners are promoted up a step on the USPTO pay scale, they suddenly receive less time per application. Frakes and Wasserman found that they manage the increased workload by spending less time searching prior art and granting more patents. Based on both subsequent U.S. challenges and comparisons with parallel applications at foreign patent offices, these extra patents seem to involve more mistakes. Patents rejected by time-crunched examiners, on the other hand, are no more likely to be appealed within the USPTO. Extrapolating from these results, Frakes and Wasserman estimate that doubling examination times would lead to roughly 80,000 fewer patents granted and 2,400 fewer patent/lawsuit pairs each year, translating to litigation savings above $490 million. Similar calculations suggest about 270 fewer instituted PTAB challenges, for an annual savings above $110 million.
These savings alone might not quite justify the $660 million pricetag. But Frakes and Wasserman also suggest that giving examiners more time may lead to decreased prosecution costs for applicants. In a different earlier paper, they found that examiners often make rushed, low-quality rejections under time pressure near deadlines, which increases the number of rounds of review and the time the application is pending at the USPTO. Here, they predict that doubling examination time would be associated with 0.56 fewer office actions per application, translating to around $300 million per year in additional savings. (If this is right, should applicants be allowed to pay the USPTO for a more thorough initial examination?)
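The cost-benefit tally running through the last few paragraphs can be laid out explicitly. This sketch uses only the annual figures quoted in this review, treating the "above" and "around" amounts as point values for simplicity; the variable names are illustrative.

```python
# Annual figures reported in the review, in millions of US dollars.
# Several are stated as lower bounds or approximations in the text.
cost_of_doubling_time = 660
savings = {
    "litigation": 490,
    "ptab_challenges": 110,
    "prosecution": 300,
}

# Litigation and PTAB savings alone fall short of the cost.
savings_without_prosecution = savings["litigation"] + savings["ptab_challenges"]

# Adding the predicted prosecution savings pushes the total past the cost.
total_savings = sum(savings.values())
net_benefit = total_savings - cost_of_doubling_time

print(f"Litigation + PTAB savings: ${savings_without_prosecution}M "
      f"(cost: ${cost_of_doubling_time}M)")
print(f"Total savings: ${total_savings}M, net benefit: ${net_benefit}M")
```

Laid out this way, the conclusion that doubling examination time pays for itself rests on the prosecution-cost estimate, which is why the office-action prediction deserves scrutiny.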
As Frakes and Wasserman note, increasing examination time is even more likely to be justified under a correct application of cost-benefit analysis that accounts for the broader social costs of erroneously issued patents. Through the supracompetitive pricing they enable, patents impose costs on both end users and follow-on innovators. Patents that do not satisfy the legal standards of patent validity are less likely to have innovation incentive benefits that outweigh these costs. These costs are difficult to quantify (and are the subject of active study) but that does not mean the USPTO should ignore them.
To be clear, this doesn’t mean the USPTO should immediately double its workforce. There are a lot of assumptions built into Frakes and Wasserman’s estimates, including that the effects they observed from examiners before and after promotion are generalizable. Could the agency hire additional examiners of similar quality? How will recent changes in patent law and litigation practice affect the benefits of increasing examination time? Is it really true that increasing examination time leads to fewer office actions? On the cost side, the $660 million pricetag for doubling examination time seems plausible based on examiner salaries and overhead expenses, but is significantly less than the nearly $3 billion the USPTO currently budgets for patent programs. Could greater efficiency be achieved without raising user fees, or is $660 million too low? Empiricists will surely quibble with many details of their methodological choices.
But an immediate doubling of the examiner corps isn’t Frakes and Wasserman’s goal. Despite remaining empirical uncertainties, they have produced the most evidence-based estimates to date of the tradeoffs between ex ante administrative screening and ex post review during litigation. The USPTO should take notice. Examination effort can be increased gradually: Frakes and Wasserman argue that increasing examination time is even more likely to be cost-justified if one focuses just on a marginal dollar for more examination. And there are open questions on the best way to spend this marginal dollar. Which examiners should get more time? Does investing more time up front on “compact prosecution” help? Could errors be reduced more through internal peer review? Peer review from outside experts? Technical experts within the agency to help with difficult cases?
Most importantly, any of these interventions should be implemented in a way that aids robust empirical evaluation. The USPTO has shown an encouraging willingness to experiment with pilot programs that might improve examination, but has not implemented them in ways that make it easy to evaluate their effectiveness, such as by randomizing over applicants who want to opt in to the programs. Rigorous pilot programs may be both financially and politically costly, but how much effort to spend on examination is a core question of patent policy with tremendous financial implications. And I’m sure the USPTO could easily find free help from academics—perhaps including Frakes and Wasserman—excited to help design and evaluate these initiatives.
Cite as: Lisa Larrimore Ouellette, Should Patent Examiners Get More Time?, JOTWELL (March 13, 2019) (reviewing Michael D. Frakes & Melissa F. Wasserman, Irrational Ignorance at the Patent Office, 72 Vand. L. Rev. __ (forthcoming 2019), available at SSRN), https://ip.jotwell.com/should-patent-examiners-get-more-time/.