Is Scraping Google SERPs Legal?

Digital marketing professionals have long used data to make better-informed decisions. In a world where search engines are a massive source of data, incorporating data obtained from search engine results pages (SERPs) has become a mainstream business practice.

A common way for businesses to extract data from Google, Bing, Yahoo, and other search engines is scraping. This process gives businesses significant commercial benefits, but it can also raise legal concerns. In this article, we’ll break down some key aspects of scraping Google SERPs from a legal viewpoint.

TL;DR

  1. There are no precedents of Google suing businesses over scraping its results pages.
  2. Scraping Google SERPs isn’t a violation of the DMCA or the CFAA.
  3. However, sending automated queries to Google is a violation of its ToS.
  4. A violation of Google’s ToS is not necessarily a violation of the law.
  5. It’s highly unlikely that Google will find a technical way to block DataForSEO from collecting the necessary data in the near future.
  6. Using DataForSEO services is legal and doesn’t violate the law.
  7. Our customers don’t violate Google’s ToS or Webmaster Guidelines by using any DataForSEO services.

What is web scraping?

In a broad sense, web scraping is a form of data scraping used for extracting data from websites. Here’s a brief definition of data scraping from Wikipedia:

“Data scraping is a technique in which a computer program extracts data from human-readable output coming from another program.”

Scrapers can access website data either by “reading” the data that users see on a screen or by parsing the underlying HTML code. Since websites are designed to be easily accessible to users, it is technically simple to pull data from them.
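To make the “underlying HTML code” route more concrete, here is a minimal sketch in Python that uses only the standard library. The markup in `SAMPLE_HTML`, the `LinkExtractor` class, and the `extract_links` helper are all invented for this illustration; the sketch parses a local string and doesn’t send requests to any website, and it isn’t a description of how DataForSEO collects data.

```python
from html.parser import HTMLParser

# Hypothetical page markup, made up for this illustration.
SAMPLE_HTML = """
<html><body>
  <h1>Example listing</h1>
  <a href="https://example.com/a">First result</a>
  <a href="https://example.com/b">Second result</a>
</body></html>
"""


class LinkExtractor(HTMLParser):
    """Collects (href, link text) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the anchor currently being read, if any
        self._text = []     # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an anchor with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return every (href, text) pair found in the given HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


for href, text in extract_links(SAMPLE_HTML):
    print(href, text)
```

The same idea scales from this toy string to real pages: the scraper never needs the rendered screen, only the markup that the browser would have rendered.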

Scraping has existed for decades and is often cited as a key concept underpinning the Internet. It is widely used in different branches of the web economy. To give you an example, search engines use scraping to index content available on the web, media outlets fetch data from blogs and other news websites, and digital marketers rely on data extracted by scraping to gauge different website performance metrics.

Here at DataForSEO, we use web scraping to provide SEO software developers, digital marketers, and researchers with up-to-date and accurate SEO data.

What’s wrong with web scraping?

The reputation of web scraping has deteriorated significantly over the last several years.

  • First of all, scraped data is used by many businesses to gain an advantage over their competitors. Instead of building something from scratch and spending lots of money in the process, why not just scrape data, add value to it, and sell something better to your customers? For example, in 2011 Bing was caught red-handed copying Google’s search results.
  • Secondly, companies that use web scraping often ignore the copyright on scraped data and the Terms of Service (ToS) of the sites they scrape it from. For instance, in April 2016 Getty Images filed a competition law complaint, accusing Google of scraping copyrighted content and using it in Google Images without prompting users to visit the original source website.
  • Thirdly, web scrapers often send far more requests per second than a human would, creating a huge load on scraped sites. Scraping can potentially harm critical website infrastructure (which is sometimes called “electronic trespass”) and breach its security measures. In 2000, eBay obtained a preliminary injunction against Bidder’s Edge, preventing the latter from scraping data from its pages. Bidder’s Edge was accessing eBay listings about 100,000 times a day, which amounted to about 1.53% of eBay’s total daily requests. Although that may seem like a relatively small number, it was big enough to support a claim of electronic trespass.
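To put the Bidder’s Edge figures in perspective, here is a quick back-of-the-envelope calculation in Python. It uses only the two numbers quoted above (about 100,000 requests a day and about 1.53% of total traffic); the function and variable names are ours, and the results are rough estimates rather than figures from the court record.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def implied_total_requests(scraper_requests, share_percent):
    """Total daily requests implied by a scraper's request count and its traffic share."""
    return scraper_requests / (share_percent / 100)


scraper_daily = 100_000   # requests per day, as quoted above
share_percent = 1.53      # share of total daily requests, as quoted above

total_daily = implied_total_requests(scraper_daily, share_percent)
scraper_per_second = scraper_daily / SECONDS_PER_DAY

print(f"Implied total requests per day: {total_daily:,.0f}")
print(f"Average scraper requests per second: {scraper_per_second:.2f}")
```

In other words, the quoted figures imply roughly 6.5 million total requests a day, and the scraper averaged a little over one request per second.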

Thousands of companies and individuals leverage web scraping. Search engines, for instance, rely on it to index content on the web, which generally benefits the owners of scraped websites. That, however, doesn’t mean that this technique is never abused or that it can’t create legal issues for users of scrapers. In the following paragraphs, we’ll discuss the most common legal issues of scraping and try to figure out whether scraping search engine results pages (Google’s, in particular) is legal.

Is web scraping legal?

Using web scraping or developing a scraper isn’t illegal in itself. For example, you may want to scrape content on your own website. However, problems are likely to arise if you try scraping the content of somebody else’s website without permission. It all boils down to breaking the source’s ToS, violating copyright legislation (namely the Digital Millennium Copyright Act, or DMCA), and violating the Computer Fraud and Abuse Act (CFAA).

1. Web scraping and CFAA

The Computer Fraud and Abuse Act (CFAA) was introduced in 1986 as an anti-hacking measure that forbids unauthorized access to computers. It imposes criminal penalties on “a party who intentionally accesses a computer without authorization or exceeds authorized access, and thereby obtains information from any protected computer.” CFAA-based claims are popular among data hosts because they open the door to criminal charges against scrapers.

In 2017, LinkedIn sent a cease-and-desist letter to hiQ Labs, a startup that relied on data scraped from the popular social network to provide analysis of publicly available user profiles. The letter warned that unauthorized access to the LinkedIn website was a violation of the CFAA. Instead of ceasing its operations, hiQ took LinkedIn’s accusations directly to court, claiming that automated access to publicly available data isn’t a violation of the CFAA, while also asking the court to prohibit LinkedIn from “preventing hiQ’s access, copying, or use of public profiles on LinkedIn’s website (i.e., information that LinkedIn members have designated public).”

The court decided in favor of hiQ, allowing the company to scrape LinkedIn’s public, non-password-protected data. We won’t dig deeply into the ruling in this article; its full text is publicly available. It’s sufficient to note that although scraping in many cases breaks the ToS of the scraped website, it’s not necessarily a violation of the Computer Fraud and Abuse Act.

Since information on Google SERPs is also publicly available and not password-protected, DataForSEO doesn’t break the CFAA by scraping it.

2. Web Scraping and DMCA

Copyright claims are often brought by data hosts against scrapers. In the United States, copyrighted work is protected by the Digital Millennium Copyright Act (DMCA).

Nonetheless, it’s widely known that facts alone can’t be copyrighted, so the DMCA and similar legislation won’t protect data hosts against scrapers unless they have full control over the copyright of the stored content. The point is that a transfer of copyright ownership generally requires a written agreement signed by the copyright owner. It is true that, as cendi.gov puts it, “the creative selection, coordination and arrangement of information and materials forming a database or compilation may be protected by copyright.” However, this protection doesn’t extend to the facts stored in the database. Put simply, copyright is meant to protect originality and creativity, not facts.

Let’s take a Google results page as an example. Its design and layout may be regarded as creative work and hence can be copyrighted. At the same time, the facts contained in a results page, including headlines and snippets, are not owned by Google; in fact, they are pulled from other people’s websites without any transfer of copyright ownership.

Given that results on Google SERPs are not protected by copyright, it’s only logical that DataForSEO doesn’t violate the DMCA by scraping them.

3. Web Scraping and ToS

Nowadays almost every well-established website has a Terms of Service (ToS) page, which in most cases prohibits scraping and crawling its content. Google isn’t an exception, with its ToS explicitly prohibiting “the sending of automated queries of any sort to our system without express permission in advance from Google.”

While compliance with the Terms of Service is widely perceived as a must, the ban on scraping appears to be an exception. Thousands of companies make use of scraping, including search engines, news aggregators, and SEO software companies. So, does it mean that all these companies are doing it illegally? And if so, are they getting sued over it?

Well, this is a rather grey area. On the one hand, by violating a website’s ToS scrapers may also break the CFAA, which, as we explained in the previous paragraphs, can result in criminal charges against scrapers. On the other hand, there have been cases in which a court dismissed CFAA violation claims and ruled that people are authorized to access publicly available information (even if they might be scraping it).

For example, in the Craigslist v. 3Taps litigation, the court held 3Taps liable under the CFAA, but also acknowledged that scrapers are authorized to access publicly available data as long as their access isn’t restricted by the companies that host the data. Such restrictions can include different measures, such as cease-and-desist letters, IP blocking, and captchas. However, all three are seldom considered legitimate access restrictions whose circumvention amounts to a CFAA violation.

  1. Cease-and-desist letters are often cited as use restrictions, as opposed to access restrictions.
  2. IP blocking might be a good way to block a scraper from accessing data, but masking your IP address isn’t a crime. So, it’s only logical that switching IPs when scraping websites isn’t hacking and therefore can’t be deemed a CFAA violation.
  3. Captchas are not the same as passwords, so bypassing them isn’t a CFAA violation either.

The outcome of the hiQ v. LinkedIn case indicates that the mere violation of a website’s ToS might be a breach of contract, but doesn’t constitute a crime. What’s more, if we take a look at Google’s attitude towards violations of its Terms of Service, we can clearly see that the search engine has never taken any legal action against scrapers.

Nevertheless, there are other measures Google can resort to, including revoking access to its APIs.

  • In 2012, the Google AdWords team warned Raven Tools that it could revoke the SEO software’s AdWords API token if the company didn’t comply with Google’s anti-scraping policy. Raven Tools was eventually forced to remove its SERP Reporting feature, which used data scraped from Google’s results pages to provide ranking and keyword data to its customers.
  • Likewise, also in 2012, Moz lost access to the AdWords API. The company eventually came up with a workaround, substituting clickstream data for AdWords data.

At the same time, however, what started as Google’s crackdown on SEO software developers’ access to the AdWords API ended up being an inconsistent series of ToS enforcement actions. What’s more, to date, the search engine continues to avoid full-scale enforcement of its anti-scraping policies.

Here at DataForSEO, we’ve taken all the necessary measures to ensure that our clients keep receiving the data they need, regardless of whether Google continues to enforce its Terms of Service.

The bottom line

Although scraping is legal in itself, data hosts can take legal action against scrapers, including CFAA and DMCA violation claims.

On the other hand, the outcomes of recent lawsuits filed against scrapers show that there are a lot of grey areas in current legislation on this matter, and courts may come down in favor of open access to publicly available information (see the Sandvig v. Sessions ruling). And even though data hosts may prevail against scrapers in court, it’s often against their interest to sue. For example, if it weren’t for crawling public websites and scraping data from them, Google probably wouldn’t even exist.

After all, we don’t violate the CFAA or the DMCA when scraping publicly available information from Google SERPs. You can rest assured that using DataForSEO services is legal and doesn’t violate the law. By the same token, you can’t break Google’s ToS by simply getting data through our APIs.

Disclaimer: This article does not constitute legal advice. You should seek the counsel of an attorney on your specific matter to comply with the laws in your jurisdiction.

George Svash

George is the Director of Content Marketing at DataForSEO, an API suite designed to help SEO software companies and agencies gather the SEO data they need for their projects. George is a tech and marketing geek with a deep passion for Big Data and SEO. Having a broad experience in content marketing and a degree in engineering, he is particularly good at explaining complex concepts.
