
PART II. Scraping Public and Private Data: Does the CFAA Apply?

Note: This post is a follow-up to an earlier post. At the time that post was written, only the Northern District of California had reviewed the case, ruling in favor of hiQ. Now, the Ninth Circuit has affirmed the district court’s grant of a preliminary injunction in favor of hiQ.

In early September 2019, the Ninth Circuit issued a monumental holding in hiQ Labs, Inc. v. LinkedIn Corp.,1 potentially opening the floodgates for scraping of public data. This post will briefly recap what data scraping is and the role of the CFAA. It will then discuss both the district court and appellate court decisions in hiQ. The overviews will be followed by an analysis of the potential impacts of the Ninth Circuit’s ruling.

First and foremost, data scraping is a method of gathering and automatically downloading targeted data from a source. Scraping is most commonly employed on third-party websites using specialized software (“bots”) without the owner’s permission. In essence, Entity A visits Company B’s website and downloads data without B’s permission. 

Why scrape? Usually, it’s to gain a competitive advantage. After scraping data from someone else’s website, companies can, in turn, aggregate and then repackage that same data to generate their own products and services. The data is presented as an entirely new product/service even though it was initially generated by (and then taken from) another party. As data scraping becomes increasingly common, lawmakers, lawyers, and judges have struggled to conceptualize these issues within the existing legal landscape.   

The recent hiQ case, first reviewed in the Northern District of California,2 speaks to the legality of public data scraping. The plaintiff, hiQ Labs, is a data science company that develops tools to help corporate HR departments by “providing information to businesses about their workforces based on statistical analysis of publicly available data.”3 HiQ’s entire business model involved scraping public data from LinkedIn’s site. After hiQ had engaged in this practice for years, LinkedIn tried to stop it by threatening action under the Computer Fraud and Abuse Act (“CFAA”),4 a federal cybersecurity and anti-hacking statute. In basic terms, the CFAA says that a person may be held liable if he or she accesses a computer without authorization or exceeds authorized access.5 While courts have generally held that certain uses of data-scraping software to gain access to private, password-protected data are actionable under the CFAA,6 the question of CFAA liability for scraping public data, such as the data publicly available on LinkedIn, was previously unsettled.

Two weeks after receiving a cease and desist letter, hiQ Labs filed suit in the Northern District of California, requesting a declaratory judgment that scraping LinkedIn’s data was lawful.7 HiQ further contended that LinkedIn’s actions “constitute[d] unfair business practices” and “contend[ed] that LinkedIn’s actions constitute[d] a violation of free speech under the California Constitution.”8

The district court ultimately found that the balance of hardships tipped heavily in hiQ’s favor and that the CFAA did not apply in this instance because the data at issue was public (i.e., not password-protected). Following the ruling, many practitioners were hesitant to interpret the holding as a green light for data scraping, particularly without review from the Ninth Circuit. That hesitation held, of course, until the Ninth Circuit affirmed the district court’s ruling.9

On September 9, 2019, the Ninth Circuit found that hiQ had, in fact, demonstrated a likelihood of irreparable harm absent a preliminary injunction.10 It further held that the district court’s finding that the balance of hardships tipped in hiQ’s favor was not “illogical, implausible, or without support on the record.”11 With regard to CFAA applicability, the Ninth Circuit found that “the legislative history . . . of the CFAA . . . support the district court’s distinction between ‘private’ computer networks and websites, protected by a password authentication system and ‘not visible to the public,’ and websites that are accessible to the general public.”12 Finally, the Ninth Circuit considered the public interest in granting or denying the preliminary injunction. Although each side asserted that “its own position would benefit the public interest by maximizing the free flow of information on the Internet,”13 the Ninth Circuit held that the district court had “properly determined that, on balance, the public interest favor[ed] hiQ’s position.”14

The ultimate consequence of this holding seems to be that scrapers have been given a green light to scrape publicly available data, reformat that data, and then resell the information for their own purposes. While scraping of public data may not be actionable under the CFAA, it should be noted that scraping can still be challenged under other causes of action, including state-law claims, copyright infringement, and unjust enrichment. However, for now, this holding appears to be a sweeping ruling in favor of an open internet. In its broadest interpretation, any “public” online data (i.e., any data not password-protected) can be freely captured and used by third parties. The long-term impacts of this holding, particularly on internet business competition, remain largely unknown.

Notre Dame Journal on Emerging Technologies ©2020  
