The proliferation of Internet access and mobile devices has led to an explosion of content on the Web, creating a vast repository of "publicly available" information. This includes not only news, business, and financial information, but also personal data, movie and restaurant reviews, concert ticket sales, flight information, and a virtually endless array of other categories. This same technological growth, however, has made it far easier for third parties to extract this data for commercial sale and use, and to do so for free and without authorization. This data extraction, commonly referred to as "scraping," "crawling," or "spidering" (collectively "scraping"),1 raises legal concerns on both sides: for those who want to scrape, and for those who want to protect their websites against scraping.
Copyright Infringement
Scraping inherently involves copying, and therefore one of the most obvious claims against scrapers is copyright infringement. However, such claims are often open to attack on several grounds. First, in order to have standing to bring a claim for copyright infringement, the owner (or exclusive licensee) of the website being scraped must also be the owner of the copyrightable content that is the subject of the claim.2 This can pose a barrier to bringing a lawsuit if, for example, the content at issue is user-generated (such as videos or reviews), and the rights in the content have not been transferred to the website owner.
Second, copyright law does not protect facts or ideas, but only original expression.3 Thus, the scraping of purely factual data does not give rise to a viable claim for copyright infringement. For example, in Ticketmaster v. Tickets.com, the court rejected an infringement claim because the material being extracted (factual information regarding concerts, and URLs) was not copyrightable.4
Third, even if the information copied by the scraper is protectable under copyright law, the defendant may be able to rely upon the "fair use" defense. Under the Copyright Act, courts are to consider the following factors to determine if a use is a fair use: (1) the purpose and character of the use; (2) the nature of the copyrighted work; (3) the amount and substantiality of the portion used in relation to the work as a whole; and (4) the effect of the use upon the potential market for or value of the copyrighted work.5 For example, in Kelly v. Arriba Soft, the court held that the use of scraping software by a search engine to reproduce images in thumbnail form was not a sustainable basis for a claim of copyright infringement, because the thumbnail images created from the full-size scraped images were "transformative" and qualified as a fair use of the images.6
Trespass to Chattels
A trespass to chattels is defined as intentionally dispossessing another of a chattel or using or intermeddling with a chattel in the possession of another.7 This legal theory applies to the Internet inasmuch as a website proprietor has a "fundamental property right to exclude others from its computer system[.]"8 Moreover, even if a website is publicly accessible, its servers are private property, and the proprietor may therefore grant conditional access to users, including prohibitions against scraping.9
For example, in Bidder's Edge, the court held that excessive scraping can support a claim for trespass to chattels, and an injunction, if it taxes the plaintiff's computer system in a way that would substantially impair it.10 Specifically, the court found a viable trespass cause of action based on the scraping of eBay's website at a rate of 80,000 to 100,000 times per day.11
Similarly, in Register.com v. Verio, the Court of Appeals for the Second Circuit held that Verio's use of search robots consumed a significant portion of the capacity of Register's computer system, and that Verio was therefore engaged in a trespass.12 The court reasoned that if it were to allow these queries, then it was "highly probable" that other companies would begin to do the same, which would likely result in Register's system being "overtaxed and [it] would crash."13 However, in Ticketmaster, the court held that the use of scrapers to extract data was not a trespass to chattels, because there was no evidence that the scraping caused any tangible interference with the operation of Ticketmaster's system.14
Breach of Contract
A clickwrap agreement is an online agreement that requires the user to consent to terms and conditions by affirmatively clicking a dialogue box agreeing to the terms before the user can proceed to use a website.16 Clickwrap agreements are generally enforceable, due to the user's clear manifestation of assent, so long as the terms do not violate other basic contract principles (e.g., unconscionability).17
Browsewrap agreements, on the other hand, involve the posting of a link to terms and conditions on a website for users to read, but do not require users to affirmatively manifest assent to the terms and conditions—instead, user consent is implied by continued use of the website.20
For example, in Hines, the court held that the browsewrap agreement was not enforceable because the plaintiff had no actual or constructive notice of the terms and conditions of use.23 However, in Southwest Airlines v. BoardFirst, where there was evidence that the defendant had actual knowledge of Southwest's terms and conditions but nevertheless continued to use Southwest's website in violation of those terms, the court held that the browsewrap agreement was an enforceable contract.24
In sum, while statements of assent such as "I agree," which are often elicited through clickwrap agreements, are preferable and unequivocally reflect a manifestation of assent, the user need not necessarily state the magic words "I agree" (or some similar formulation).27 However, "the website user must have had actual or constructive knowledge of the site's terms and conditions, and have manifested assent to them" in some manner, implicit or explicit.28
Violation of the CFAA
For example, in Southwest Airlines v. Farechase, defendants scraped fare, route, and scheduling information from Southwest.com.31 The court denied a motion to dismiss the CFAA claim because Southwest alleged (i) damages of at least $5,000, and (ii) that it had put defendant on actual notice that scraping was prohibited.32
Anthony J. Dreyer is a partner, and Jamie Stockton is an associate, with Skadden, Arps, Slate, Meagher & Flom. Brittany Bettman, a summer associate, assisted in the preparation of this article.
1. See EF Cultural Travel BV v. Zefer, 318 F.3d 58, 60 (1st Cir. 2003) ("A scraper, also called a 'robot' or 'bot,' is nothing more than a computer program that accesses information contained in a succession of webpages stored on the accessed computer"); eBay v. Bidder's Edge, 100 F. Supp. 2d 1058, 1060 (N.D. Cal. 2000). While it is possible to post instructions on a website that inform scraping software whether scraping is permitted (in a file called "robots.txt"), compliance with such instructions is voluntary. See Bidder's Edge, 100 F. Supp. 2d at 1061.
2. See, e.g., Nautical Solutions Mktg. v. Boats.com, No. 8:02-CV-760, 2004 WL 783121, at *2-3 (M.D. Fla. April 1, 2004) (denying post-trial motion for declaration of copyright infringement, because, inter alia, the website that was being scraped did not own the copyright to the data and images that were being copied).
3. See Feist Publ'ns v. Rural Tel. Serv., 499 U.S. 340 (1991).
4. See Ticketmaster v. Tickets.com, No. 99-CV-7654, 2003 WL 21406289, at *4-6 (C.D. Cal. March 7, 2003); see also Nautical Solutions, 2004 WL 783121, at *2-3 (reaching similar result for scraping of information regarding the sale of yachts).
5. See 17 U.S.C. §107.
6. Kelly v. Arriba Soft, 336 F.3d 811, 819 (9th Cir. 2003). An in-depth discussion of the nuances of the fair use doctrine is outside the scope of this article. For a discussion of fair use, see Melville B. Nimmer, 4 Nimmer on Copyright §13.05 (Lexis 2013).
7. See Restatement (Second) of Torts §218 (Westlaw 2012); see also Bidder's Edge, 100 F. Supp. 2d at 1069.
8. Bidder's Edge, 100 F. Supp. 2d at 1067.
9. Id. at 1070.
10. Id. at 1071-72.
11. Id. at 1071.
12. Register.com v. Verio, 356 F.3d 393, 404-05 (2d Cir. 2004).
13. Id. at 404.
14. Ticketmaster, 2003 WL 21406289, at *3.
15. See, e.g., Bidder's Edge, 100 F. Supp. 2d at 1067; Zefer, 318 F.3d at 62.
16. See Specht v. Netscape Commc'ns, 306 F.3d 17, 22 n.4 (2d Cir. 2002); Hines v. Overstock.com, 668 F. Supp. 2d 362, 366-67 (E.D.N.Y. 2009).
17. See Specht, 306 F.3d at 22 n.4.
18. Bidder's Edge, 100 F. Supp. 2d at 1060.
19. Id. at 1067.
20. See Specht, 306 F.3d at 25.
21. See, e.g., Specht, 306 F.3d at 35; Hines, 668 F. Supp. 2d at 367.
22. See Specht, 306 F.3d at 35 (finding a browsewrap agreement unenforceable).
23. See Hines, 668 F. Supp. 2d at 367.
24. Sw. Airlines v. BoardFirst, No. 3:06-CV-0891, 2007 WL 4823761, at *7 (N.D. Tex. Sept. 12, 2007).
25. Register.com, 356 F.3d at 401-04.
27. See id. at 402-03.
28. Cvent v. Eventbrite, 739 F. Supp. 2d 927, 937 (E.D. Va. 2010); see also Hines, 668 F. Supp. 2d at 367.
29. 18 U.S.C. §1030(a)(4); see also 18 U.S.C. §1030(g) (providing for civil liability and a private right of action).
30. See 18 U.S.C. §1030(a)(4).
31. Sw. Airlines v. Farechase, 318 F. Supp. 2d 435, 440 (N.D. Tex. 2004).
32. Id. at 439-40; see also Zefer, 318 F.3d at 62-63 (upholding a preliminary injunction issued under the CFAA where defendant had knowledge that scraping was unauthorized).
33. Cvent, 739 F. Supp. 2d at 932-34.