Is Web Scraping Legal?

Lena Fisher

Content Manager, Octo Browser

Web scraping, also known as web parsing, is the automated collection of online data. It is widely used for marketing, price analysis, brand monitoring, and many other tasks. As the amount of information on the Internet grows every year, website parsing has become a powerful tool for working with large amounts of digital content. Is web scraping legal? Yes, but there are some details you need to consider. In this article, we look at which legal rules in the USA, the European Union, the UK, and Russia can affect web scraping.

Is Web Scraping Legal?

A simple example: when you search online for a product and compare prices on different websites, you are basically doing manual scraping. Automated web scraping does the same task faster. It helps collect large amounts of data according to specific criteria and organize it into files for analysis. Using this method, you can scrape prices, delivery terms, store assortments, contacts, and much more.

Is it legal? Yes, if we are talking about collecting publicly available information, similar to manually checking prices on different platforms. Legal issues arise when scraping involves:

  • copyrighted materials;

  • personal data (phone numbers, email addresses);

  • information hidden from unregistered or unauthorized users.

Bypassing a website’s technical protection measures — CAPTCHAs, logins, bot blocks — can also be illegal.

How Privacy Laws Affect Web Scraping

Most countries do not have direct regulations concerning web scraping. However, many rules apply indirectly if scraping involves copyrighted materials or hidden content. It is also risky to break a website’s terms of use or security rules, or to collect personal data.

Any information that can identify a specific person is considered personal data. Different countries define their own categories, but most include:

  • full name;

  • address, phone number, email;

  • ID numbers;

  • IP address and cookies;

  • location data;

  • financial information.

Some countries also have a category of sensitive data. Usually, this includes information about a person’s ethnicity, religion or political views, sexual life and orientation, as well as biometric and medical data.

Note: In this article, we look at the potential risks of web scraping from the perspective of the laws in different countries. Before starting to scrape, we recommend carefully studying the laws of the region you are working in and assessing possible risks. It is important to remember that even if you perform actions from one country, they can affect users or resources in other regions and fall under the laws of multiple countries. For example, if a user from Europe collects data from American websites, both EU and US rules may apply at the same time.

What Are the Laws Related to Web Scraping in Different Countries?

USA

  • CFAA (Computer Fraud and Abuse Act) — protection against unauthorized access and bypassing technical protection measures.

  • DMCA (Digital Millennium Copyright Act) — protection of copyrights in the digital environment.

  • FTC Act (Federal Trade Commission Act, Section 5) — prohibition of unfair business practices.

  • State Data Breach Laws — state laws on personal data.

  • First Amendment and Fair Use Doctrines — principles of fair use of materials.

  • ToS (Terms of Service) — website terms of use.

European Union (EU)

  • GDPR (General Data Protection Regulation) — protection of personal data.

  • Database Directive 96/9/EC — protection of databases.

  • Copyright Directive — unified copyright standards.

  • ePrivacy Directive — privacy protection and rules for using cookies.

  • DSA (Digital Services Act) — rules for safety and content control on platforms.

  • P2B Regulation (Platform-to-Business Regulation) — transparent conditions for business users.

United Kingdom

  • UK GDPR (United Kingdom General Data Protection Regulation) — protection of personal data.

  • DPA 2018 (Data Protection Act 2018) — also protects personal data.

  • CDPA (Copyright, Designs and Patents Act 1988) — copyright protection for original content.

  • Database Right — protection of databases.

  • CMA (Computer Misuse Act 1990) — prohibition of unauthorized access to systems.

Russia

  • Federal Law on Personal Data No. 152‑FZ — protection of personal data.

  • Civil Code of the Russian Federation, Part IV — copyrights and databases.

  • Federal Law on Information, IT and Information Protection No. 149‑FZ — access to information and protection of IT systems.

  • Federal Law on Protection of Competition No. 135‑FZ — unfair competition.

  • Federal Law on Protection of Consumer Rights — regulates commercial services.

  • Federal Law on Communications — protection of infrastructure and networks.

How Web Scraping Is Regulated in the USA

Web scraping is legal if you follow the rules on data access, copyrights, fair competition, privacy, and website terms of use. Risks arise if a scraper bypasses technical restrictions or violates the rights of third parties.

Data Access and System Protection

Regulations: CFAA, ToS

Allowed:

  • Scrape public pages.

  • Make requests without bypassing logins, CAPTCHAs, paid subscriptions, or IP blocks.

Not allowed:

  • Bypass technical protection measures.

  • Hack databases.

  • Use someone else’s passwords, accounts, or cookies.

  • Break a website’s rules or use its vulnerabilities.

Personal Data and Privacy

Regulations: CCPA, CPRA, state laws

Allowed:

  • Collect anonymized data, public information, and reviews.

Not allowed:

  • Secretly sell information.

  • Scrape email addresses, phone numbers, behavioral profiles, or location data without informing the user and without giving them a way to opt out.

Note: The law requires notifying users about data breaches. Users must also have the option to opt out of data collection and processing.

Copyright and Content Use

Regulations: DMCA, Fair Use

Allowed:

  • Extract facts, prices, catalogs, statistical data, product descriptions, and analytical results.

  • Transform information into a new format, for example, charts or infographics.

  • Quote collected information in a limited way.

Not allowed:

  • Publish texts, photos, or reviews from other websites without permission.

  • Bypass technical protection of digital content.

Fair Business Practices

Regulations: Section 5 of the FTC Act

Allowed:

  • Use public data for analytics, product ratings, or reviews.

Not allowed:

  • Distort information.

  • Present automated access as real user activity.

Note: The FTC can take action if a company secretly processes or sells personal data while claiming otherwise. Companies are also required to clearly state what information they collect, for what purpose, and with whom it is shared.

How Web Scraping Is Regulated in the European Union

Web scraping is allowed in the European Union. Risks arise when bypassing technical restrictions on platforms, accessing closed sections, or faking cookies, tokens, or sessions. It is also important to respect request rate limits and website terms of use. These matters are governed by the GDPR, the Database Directive, the Copyright Directive, the ePrivacy Directive, the DSA, and the P2B Regulation.

Personal Data and Privacy

Regulations: GDPR, ePrivacy Directive, DSA, P2B Regulation

Allowed:

  • Collect non-personal data: prices, product specifications, ratings, number of reviews.

  • Process public personal data if a legitimate interest is proven.

Not allowed:

  • Manipulate cookies or bypass cookie restrictions.

  • Access data stored on a user’s device without their consent.

  • Collect personal data (email addresses, names, photos, social media profiles, or other private information) without a legal basis.

  • Extract information from private profiles or premium‑only areas.

  • Ignore platform prohibitions on automated data collection.

Note: Legitimate interest is one of the valid legal bases for working with personal data. If you work with personal data, it is important to follow the main principles of the GDPR: minimize data collection, ensure transparency, have a specific purpose, notify the user, and delete data upon request.

Copyright and Content Use

Regulations: Copyright Directive

Allowed:

  • Extract facts and general information without creative content: opening hours, prices, number of reviews, product specifications.

  • Use small content fragments for analysis.

Not allowed:

  • Copy and publish texts and images.

  • Upload content from other websites or post articles without significant modification.

Databases

Regulations: Database Directive 96/9/EC

Allowed:

  • Collect small parts or individual elements of databases.

Not allowed:

  • Copy a substantial part of a database, whether in volume or in significance.

  • Extract content in bulk.

  • Republish content.

  • Create a product that is entirely based on someone else’s database.

Technical Access Restrictions

Regulations: Directive 2013/40/EU, Directive 2001/29/EC

Allowed:

  • Visit public pages via HTTP requests.

  • Use the official API.

  • Follow request limits.

  • Scrape data according to the rules stated in the robots.txt file.

Not allowed:

  • Bypass a platform’s technical protection.

  • Spoof cookies, tokens, sessions, or the User-Agent.

  • Emulate a device.

  • Bypass authentication.

  • Access premium‑only data or restricted areas.

  • Overload a website with too many requests.

Platform Rules and Market Relations

Regulations: DSA, P2B Regulation, ToS

Allowed:

  • Collect public data through official APIs.

  • Scrape data while respecting rate limits and the platform’s technical requirements for bots.

Not allowed:

  • Overload the service.

  • Ignore platform rules against bots.

  • Bypass the site’s protection.

  • Imitate real user behavior.


How Web Scraping Is Regulated in the United Kingdom

There are no laws in the UK that directly regulate web scraping. However, its legality depends on whether it involves personal data, databases, or copyrighted materials. It is also important to follow website rules and not bypass a platform’s technical protections.

UK GDPR is the UK version of the European GDPR, adapted after Brexit.

Personal Data

Regulations: UK GDPR, Data Protection Act 2018

Allowed:

  • Scrape non-personal and anonymized public data: prices, product specifications, event schedules.

Not allowed:

  • Collect email addresses, names, photos, social media profiles, and other personal data without consent.

  • Scrape public accounts for marketing, user profiling, or facial recognition.

Note: In the UK, scraping and processing personal information must have a legal basis, for example, the person’s consent. Automated web scraping of personal data can lead to criminal liability.

Copyright

Regulations: CDPA 1988

Allowed:

  • Collect facts: prices, ratings, product specifications and assortments, event dates, or numerical data.

Not allowed:

  • Copy protected materials in their original form: texts, photos, infographics, or code.

  • Republish third-party materials.

  • Aggregate articles on your own platforms.

  • Create catalogs that are entirely based on third-party content.

Databases

Regulations: Database Right

Allowed:

  • Collect fragments for personal use, statistics, analysis, and research.

  • Use data for non-commercial purposes.

  • Collect non-substantial parts of a database.

Not allowed:

  • Copy a substantial part of a database.

  • Create a competing database based on third‑party data.

  • Bypass a database’s technical protection measures.

Note: The law sets no fixed percentage for a non-substantial part. Whether a part is substantial is judged by both volume and significance, so treat the sometimes-cited figure of 30–50% as a rough guide at most, and do not extract key catalog elements.

Technical Protection Measures and Access

Regulations: Computer Misuse Act 1990

Allowed:

  • Scrape data from publicly accessible pages.

Not allowed:

  • Bypass technical protection measures.

  • Spoof cookies.

  • Bypass authentication and IP blocks.

  • Break CAPTCHAs.

  • Mask a bot as a real user to access restricted systems.


How Web Scraping Is Regulated in Russia

There is no law in Russia that directly regulates web scraping. However, several legal acts affect the scraping of personal data, databases, commercial information, as well as information systems or copyrighted materials.

Personal Data

Regulations: Federal Law on Personal Data No. 152‑FZ

Allowed:

  • Collect public non-personal data: prices, product specifications, event schedules, news, statistics.

  • Scrape anonymized social media data: numbers of likes and reposts, anonymous usernames, links to pages without identifying information.

Not allowed:

  • Collect personal information: names, phone numbers, geolocation, email and home addresses, photos, social media profile IDs.

  • Build contact databases, for example, of Avito users.

  • Collect and aggregate personal data from multiple profiles without consent.

  • Share personal data with third parties.

Note: Any collection of personal data must have a legal basis, such as the user’s consent, or meet another purpose provided by law. For example, to save a person in an emergency, you may share their medical information without consent.

Copyright and Databases

Regulations: Civil Code of the Russian Federation, Part IV

Allowed:

  • Scrape factual information: prices, technical specifications.

  • Use factual information for analysis, statistics, and research.

Not allowed:

  • Mass-copy and publish someone else’s materials: original texts, descriptions, photos, articles, images, and software code.

  • Fully scrape databases.

  • Extract substantial parts of databases, even if individual pieces of information inside are not protected.

Note: Databases are protected as independent objects.

Technical Protection Measures

Regulations: Federal Law on Information, IT and Information Protection No. 149‑FZ

Allowed:

  • Scrape publicly accessible pages and collect webinar schedules or product specifications.

Not allowed:

  • Bypass technical protection measures.

  • Automate access to restricted systems or protected databases.

  • Spoof cookies.

  • Use other people’s tokens or passwords.

  • Bypass authentication and CAPTCHAs.

  • Overload a website, similar to a DDoS attack.

Unfair Competition and Consumer Protection

Regulations: Federal Law on Protection of Competition No. 135‑FZ, Federal Law on Protection of Consumer Rights

Allowed:

  • Work with competitors’ public data for market monitoring.

Not allowed:

  • Create clones of services.

  • Pass off someone else’s content as your own.

  • Show old or incorrect data, for example, on aggregator sites.

Infrastructure and Telecommunications

Regulations: Federal Law on Communications

Allowed:

  • Collect public data.

Not allowed:

  • Send large volumes of requests similar to a DDoS attack.


Best Practices for Safe and Ethical Web Scraping

Use APIs When Available

APIs are an official and safe way to access data from a website without violating its protections or rules. With an API, the site owner determines what information can be collected, how often, and in what format, which minimizes the risk of violations. Many social networks and online services provide APIs for accessing posts, comments, ratings, or statistics. You can usually find them in sections like API, Developers, Documentation, or Integrations, or by searching for “Site name + API.”

Follow the Website’s Rules

Before scraping, review the website’s Terms of Service (ToS). They usually explain whether automated data collection is allowed and under what conditions. Also check the robots.txt file — you can access it at https://domain/robots.txt. It shows which parts of the site can be visited by scraper bots.
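The robots.txt check can be automated. Below is a minimal sketch that uses Python’s standard urllib.robotparser module. The robots.txt content, the bot name "example-bot", and the example.com URLs are hypothetical and serve only as an illustration; a real script would load the file from the site with set_url() and read() instead of parsing an inline string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content used for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /account/
Allow: /catalog/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a reusable rule checker."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
print(is_allowed(parser, "example-bot", "https://example.com/catalog/laptops"))
print(is_allowed(parser, "example-bot", "https://example.com/account/orders"))
```

Keep in mind that robots.txt only states the site owner’s wishes for bots. Passing this check does not replace reading the ToS.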

Be respectful of the platform’s resources and scrape responsibly. Limit your request rate — for example, make one request per second. Add random delays between requests and pay attention to server response codes like 429 or 503. If you see them, reduce your request frequency. This helps avoid technical violations and lowers the risk of being blocked.
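The pacing advice above can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: polite_fetch and its parameters are hypothetical names, and fetch stands for any callable that returns a (status_code, body) pair, for example a thin wrapper around an HTTP client. The demo uses a fake fetch and records the pauses instead of actually sleeping.

```python
import random
import time
from typing import Callable, Tuple

RETRY_STATUSES = {429, 503}  # the server is asking clients to slow down


def polite_fetch(
    fetch: Callable[[str], Tuple[int, str]],
    url: str,
    base_delay: float = 1.0,
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Fetch a URL with a randomized pause, backing off on 429/503.

    `fetch` is a hypothetical callable returning (status_code, body).
    """
    delay = base_delay
    for _attempt in range(max_retries + 1):
        # Pause before every request: the current delay plus random jitter.
        sleep(delay + random.uniform(0, 0.5))
        status, body = fetch(url)
        if status not in RETRY_STATUSES:
            return status, body
        # The server signalled overload: double the delay before retrying.
        delay *= 2
    # Retries exhausted: hand the last response back to the caller.
    return status, body


# Demo with a fake server that rejects the first two requests.
responses = iter([(429, ""), (503, ""), (200, "<html>ok</html>")])
waits = []  # collects the pauses instead of really sleeping
result = polite_fetch(
    lambda url: next(responses),
    "https://example.com/catalog",
    sleep=waits.append,
)
print(result[0], [round(w) for w in waits])
```

Doubling the delay after each 429 or 503 is one common choice. If the server sends a Retry-After header, honoring that value is usually the better option.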

Minimize Data Collection

Collect only the data that is truly necessary for your task. This reduces risks, simplifies storage, and shows respect for the website owners and users.

Before scraping, define your goal and make a list of required fields. Do not collect anything that does not help meet it. For example, when analyzing news, it is enough to collect the headline, date, and category. The author’s name or links to their social media are not necessary.

Also, avoid collecting personal data such as names, email addresses, geolocation, photos, or reviews with personal information.
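One simple way to enforce this is a whitelist of fields that is applied before anything is stored. The sketch below follows the news example above; the record contents and the names REQUIRED_FIELDS and minimize are hypothetical.

```python
from typing import Any, Dict, Iterable

# Fields needed for the news-analysis goal; everything else is dropped.
REQUIRED_FIELDS = ("headline", "date", "category")


def minimize(record: Dict[str, Any],
             allowed: Iterable[str] = REQUIRED_FIELDS) -> Dict[str, Any]:
    """Keep only the whitelisted fields of a scraped record."""
    return {key: record[key] for key in allowed if key in record}


raw = {  # hypothetical scraped record
    "headline": "New laptop released",
    "date": "2025-01-15",
    "category": "Technology",
    "author_name": "Jane Doe",
    "author_social": "https://social.example/jane",
}
clean = minimize(raw)
print(clean)
```

Because the filter is a whitelist rather than a blacklist, a new personal field that appears on the page later is dropped by default.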

Document the Data You Collect

Record the sources of your data and how you process it. This helps maintain transparency and, if necessary, demonstrate the legality of your work. If you have collected more data than needed, delete the excess data.
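A lightweight way to keep such records is an append-only log with one JSON entry per collection run. The sketch below is only an illustration: the entry layout, the function names, and the example.com source are assumptions, and an in-memory buffer stands in for a real log file.

```python
import io
import json
from datetime import datetime, timezone
from typing import Iterable, Optional, TextIO


def provenance_entry(source_url: str, fields: Iterable[str], purpose: str,
                     collected_at: Optional[datetime] = None) -> dict:
    """Describe what was collected, from where, when, and why."""
    when = collected_at or datetime.now(timezone.utc)
    return {
        "source_url": source_url,
        "fields": sorted(fields),
        "purpose": purpose,
        "collected_at": when.isoformat(),
    }


def write_entry(entry: dict, log: TextIO) -> None:
    """Append the entry to the log as one JSON line."""
    log.write(json.dumps(entry) + "\n")


log = io.StringIO()  # stands in for a file opened in append mode
entry = provenance_entry(
    "https://example.com/news",
    ["headline", "date", "category"],
    "news trend analysis",
    collected_at=datetime(2025, 1, 15, tzinfo=timezone.utc),
)
write_entry(entry, log)
print(log.getvalue())
```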

Transform Data to Avoid Copyright Issues

Use the collected data to create a new result — such as analysis, statistics, visualizations, or your own content. For example, if a bot collects MacBook Air prices from different stores, it is fine to use this information to create a price trend graph. However, publishing other people’s product descriptions without modification is not recommended. It may violate copyright.
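As a sketch of such a transformation, the snippet below turns raw price observations into an average price per date, which could then feed a trend graph. The store names and prices are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical scraped observations: (date, store, price in USD).
observations = [
    ("2025-01-01", "store-a", 1099.0),
    ("2025-01-01", "store-b", 1149.0),
    ("2025-01-08", "store-a", 1079.0),
    ("2025-01-08", "store-b", 1099.0),
]


def price_trend(rows):
    """Return (date, average price) pairs sorted by date."""
    by_date = defaultdict(list)
    for date, _store, price in rows:
        by_date[date].append(price)
    return [(date, round(mean(prices), 2))
            for date, prices in sorted(by_date.items())]


trend = price_trend(observations)
print(trend)
```

The output contains only derived figures, not the stores’ own texts or images.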

Risks and Consequences of Not Following the Scraping Rules

Criminal or Regulatory Sanctions (GDPR, CCPA)

The GDPR (EU) provides for fines of up to €20 million or 4% of a company’s global annual turnover, whichever is higher. The CCPA (USA) allows civil penalties of up to $7,500 for each intentional violation. Risks can arise even when working with public data if it can be used to identify individuals or is processed unlawfully.
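To see how the GDPR ceiling scales with company size, here is a small worked calculation. It assumes the upper tier of fines, where the cap is the higher of €20 million and 4% of global annual turnover; the turnover figures are hypothetical.

```python
GDPR_FIXED_CAP_EUR = 20_000_000
GDPR_TURNOVER_PERCENT = 4


def gdpr_max_fine(annual_turnover_eur: int) -> float:
    """Upper bound of a fine: the higher of EUR 20M and 4% of turnover."""
    turnover_based = annual_turnover_eur * GDPR_TURNOVER_PERCENT / 100
    return max(GDPR_FIXED_CAP_EUR, turnover_based)


# Hypothetical companies with EUR 100 million and EUR 2 billion turnover.
print(gdpr_max_fine(100_000_000))
print(gdpr_max_fine(2_000_000_000))
```

For the smaller company the fixed €20 million cap applies, while for the larger one the turnover-based cap of €80 million is higher.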

Regulators actively enforce these measures. In 2024, total GDPR fines exceeded €1.2 billion. Some of the most notable recent sanctions include:

  • Meta — around €1.2 billion for illegal transfer of data from the EU to the USA.

  • Amazon — €746 million for breaching GDPR principles.

  • LinkedIn — €310 million for processing data without a sufficient legal basis.

  • TikTok — €530 million for transferring data to China and insufficient privacy policy transparency.

These fines show that violating data processing and transfer rules is a potentially costly risk for scraping specialists and businesses.

Operational and Business Risks

Beyond fines, proven violations in web scraping can lead to serious business threats. Companies may face consequences such as:

  • IP access blocks and restrictions on data use;

  • lawsuits from competitors or users demanding compensation for unlawful use of personal data, content, or databases;

  • loss of partnerships and reputation if it is revealed that data was obtained or used improperly.

Breaking the rules also leads to operational costs. Businesses may need to:

  • review their architecture;

  • change data storage and processing workflows;

  • delete unlawfully collected datasets;

  • implement compliance processes;

  • maintain logs and manage user consents.

In some cases, companies have completely shut down a product after discovering violations in the collection of a key data source.

Sometimes companies and specialists working with automated data collection use additional solutions — for example, anti-detect browsers, such as Octo Browser. They help manage network parameters more selectively, e.g., use different IP addresses and change the device’s digital fingerprint. These tools also make it possible to control the request rate during web scraping to distribute the load across sessions. All of this enables more responsible scraping. This reduces the risk of automatic platform blocks and additional checks, like CAPTCHAs. However, from a legal perspective, using these solutions does not exempt you from liability if scraping violates the website’s rules or the laws of the country.

Court Cases Related to Web Scraping

LinkedIn vs. hiQ Labs (USA, 2019–2022)

This case is a key precedent in the United States. It established that collecting publicly available data does not violate the CFAA. hiQ analyzed public LinkedIn profiles, while the social network attempted to block the scraping, claiming it constituted unauthorized access. The Ninth Circuit Court of Appeals ruled that if the data is public and does not require authorization, collecting it is legal.

This decision set a standard: scraping public pages with public access (as a user without logging in) is not considered a violation. However, the court noted that attempting to access the private areas of the site qualifies as unauthorized access.

Craigslist vs. 3Taps (USA, 2013)

The U.S. District Court for the Northern District of California ruled that web scraping violated the CFAA because it involved bypassing technical restrictions. 3Taps collected listings from Craigslist and reposted them on its own platform. Even after receiving a cease-and-desist letter and being blocked by IP address, the company continued scraping pages through proxies.

The court held that any subsequent access after a clear prohibition and blocking is considered unauthorized. This case demonstrated that scraping itself is not always illegal, but bypassing technical protection measures constitutes a serious violation.

Facebook vs. Power Ventures (USA, 2009)

Power Ventures scraped data about users’ friends and activities on Facebook without the social network’s consent, including bypassing authentication. Moreover, Power Ventures ignored warning notices from Facebook.

The court ruled that this violated the CFAA as well as California’s computer crime statute. Even with users’ consent to access their Facebook data, a third party cannot bypass a platform’s technical protection for mass data collection. The decision became a key precedent for assessing the legality of scraping private systems and complying with platform rules.

Ryanair vs. Booking.com (USA, 2025)

Ryanair accused Booking.com of unauthorized scraping of flight and pricing data, despite explicit prohibitions and technical restrictions. Initially, a jury found the access to be unauthorized. However, in 2025, the judge reviewed the case and noted that Ryanair had not demonstrated actual harm. Therefore, the CFAA could not be applied in this case.

Eventually, the parties reached an agreement. Booking.com can legally resell Ryanair tickets as long as it complies with access rules and maintains price transparency. The case showed that bypassing restrictions during scraping is risky, and that proving actual harm and negotiating settlements can often be decisive.

Conclusion

Web scraping in itself is not considered illegal. When used ethically, it is a powerful tool for collecting and analyzing data, as well as improving business processes. However, safe scraping requires a careful approach. To make the process less risky:

  • use official APIs of platforms whenever available;

  • follow rate limits and request frequency rules;

  • collect only the data you actually need;

  • do not bypass a platform’s technical protection measures;

  • avoid scraping personal data;

  • respect copyright and intellectual property.

Before starting web scraping, always review the applicable laws and regulations, the website’s ToS, and potential risks.

FAQ

Is Web Scraping Illegal?

No, web scraping itself is not prohibited. However, its legality depends on what data is being collected and how. Collecting publicly available factual information is allowed. Problems can arise if the scraper violates a website’s rules, processes personal data without a legal basis, or accesses copyrighted or restricted materials. It is also important to use transparent scraping methods without bypassing technical protection measures.

Is Web Scraping Legal in the USA?

The legality of web scraping in the USA depends on whether access to the site violates the CFAA. Public pages can be analyzed, but bypassing logins, paid subscriptions, IP blocks, or other barriers can be considered a violation. A well-known example is the LinkedIn vs. hiQ Labs case. The court allowed collecting data from public profiles but emphasized that any attempt to access private website areas turns scraping into illegal activity.

Can Web Scraping Be Used for Commercial or Research Purposes?

Yes, these are among the most common web scraping purposes. However, several conditions need to be met. Commercial projects must respect copyright, follow platform rules, and avoid collecting personal data. For research purposes, it is important to work with public or anonymized information, avoid accessing protected website areas, and transform the data during analysis before publishing it. The key requirement in both cases is not to bypass technical restrictions or extract data for which there is no legal right or authorization.

Is Web Scraping Legal?

A simple example: when you search online for a product and compare prices on different websites, you are basically doing manual scraping. Automated web scraping does the same task faster. It helps collect large amounts of data according to specific criteria and organize it into files for analysis. Using this method, you can scrape prices, delivery terms, store assortments, contacts, and much more.

Is it legal? Yes, if we are talking about collecting publicly available information, similar to manually checking prices on different platforms. Legal issues arise when scraping involves:

  • copyrighted materials;

  • personal data (phone numbers, email addresses);

  • Information hidden from unregistered or unauthorized users.

Bypassing a website’s technical protection measures — CAPTCHAs, logins, bot blocks — can also be illegal.

How Privacy Laws Affect Web Scraping

Most countries do not have direct regulations concerning web scraping. However, many rules apply indirectly if scraping involves copyrighted materials or hidden content. It is also risky to break a website’s terms of use, security rules, or to collect personal data.

Any information that can identify a specific person is considered personal data. Different countries define their own categories, but most include:

  • full name;

  • address, phone number, email;

  • ID numbers;

  • IP address and cookies;

  • location data;

  • financial information.

Some countries also have a category of sensitive data. Usually, this includes information about a person’s ethnicity, religion or political views, sexual life and orientation, as well as biometric and medical data.

Note: In this article, we look at the potential risks of web scraping from the perspective of the laws in different countries. Before starting to scrape, we recommend carefully studying the laws of the region you are working in and assessing possible risks. It is important to remember that even if you perform actions from one country, they can affect users or resources in other regions and fall under the laws of multiple countries. For example, if a user from Europe collects data from American websites, both EU and US rules may apply at the same time.

What Are the Laws Related to Web Scraping in Different Countries?

USA

  • CFAA (Computer Fraud and Abuse Act) — protection against unauthorized access and bypassing technical protection measures.

  • DMCA (Digital Millennium Copyright Act) — protection of copyrights in the digital environment.

  • FTC Act (Federal Trade Commission Act, Section 5) — prohibition of unfair business practices.

  • State Data Breach Laws — state laws on personal data.

  • First Amendment and Fair Use Doctrines — principles of fair use of materials.

  • ToS (Terms of Service) — website terms of use.

European Union (EU)

  • GDPR (General Data Protection Regulation) — protection of personal data.

  • Database Directive 96/9/EC — protection of databases.

  • Copyright Directive — unified copyright standards.

  • ePrivacy Directive — privacy protection and rules for using cookies.

  • DSA (Digital Services Act) — rules for safety and content control on platforms.

  • P2B Regulation (Platform-to-Business Regulation) — transparent conditions for business users.

United Kingdom

  • UK GDPR (United Kingdom General Data Protection Regulation) — protection of personal data.

  • DPA 2018 (Data Protection Act 2018) — also protects personal data.

  • CDPA (Copyright, Designs and Patents Act 1988) — copyright protection for original content.

  • Database Right — protection of databases.

  • CMA (Computer Misuse Act 1990) — prohibition of unauthorized access to systems.

Russia

  • Federal Law on Personal Data No. 152‑FZ — protection of personal data.

  • Civil Code of the Russian Federation, Part IV — copyrights and databases.

  • Federal Law on Information, IT and Information Protection No. 149‑FZ — access to information and protection of IT systems.

  • Federal Law on Protection of Competition No. 135‑FZ — unfair competition.

  • Federal Law on Protection of Consumer Rights — regulates commercial services.

  • Federal Law on Communications — protection of infrastructure and networks.

How Web Scraping Is Regulated in the USA

Web scraping is legal if you follow the rules on data access, copyrights, fair competition, privacy, and website terms of use. Risks arise if a scraper bypasses technical restrictions or violates the rights of third parties.

Area

Regulations

Allowed

Not Allowed

Note

Data Access and System Protection

CFAA, ToS

  • Scrape public pages.

  • Make requests without bypassing logins, CAPTCHAs, paid subscriptions, or IP blocks.

  • Bypass technical protection measures.

  • Hack databases.

  • Use someone else’s passwords, accounts, or cookies.

  • Break a website’s rules or use its vulnerabilities.


Personal Data and Privacy

CCPA, CPRA, State Laws

  • Collect anonymized data, public information, and reviews.

  • Secretly sell information.

  • Scrape email addresses, phone numbers, behavioral profiles, or location data without informing the user and without giving them a way to opt out

The law requires notifying users about data breaches. Users must also have the option to opt out of data collection and processing.

Copyright and Content Use

DMCA, Fair Use

  • Extract facts, prices, catalogs, statistical data, product descriptions, and analytical results.

  • Transform information into a new format — for example, charts or infographics.

  • Quote collected information in a limited way.

  • Publish texts, photos, or reviews from other websites without permission.

  • Bypass technical protection of digital content.


Fair Business Practices

Section 5 of the FTC Act

  • Use public data for analytics, product ratings, or reviews.

  • Distort information.

  • Present automated access as real user activity.

The FTC can take action if a company secretly processes or sells personal data while claiming otherwise. Companies are also required to clearly state what information they collect, for what purpose, and with whom it is shared.

How Web Scraping Is Regulated in the European Union

Web scraping is allowed in the European Union. Risks arise when bypassing technical restrictions on platforms, accessing closed sections, or faking cookies, tokens, or sessions. It is also important to follow the request frequency and website terms of use. These rules are controlled by the GDPR, Database Directive, Copyright Directive, ePrivacy Directive, DSA, and P2B Regulation.

Area

Regulations

Allowed

Not Allowed

Note

Personal Data and Privacy

Regulations: GDPR, ePrivacy Directive, DSA, P2B Regulation

Allowed:

  • Collect non-personal data: prices, product specifications, ratings, number of reviews.

  • Process public personal data if a legitimate interest is proven.

Not allowed:

  • Manipulate cookies or bypass cookie restrictions.

  • Access data stored on a user’s device without their consent.

  • Collect personal data: email addresses, names, photos, social media profiles, or other private information.

  • Extract information from private profiles or premium-only areas.

  • Ignore platform prohibitions on automated data collection.

Note: Legitimate interest is one valid legal basis for processing personal data. If you work with personal data, it is important to follow the main principles of the GDPR: minimize data collection, ensure transparency, have a specific purpose, notify the user, and delete data upon request.

Copyright and Content Use

Regulations: Copyright Directive

Allowed:

  • Extract facts and general information without creative content: opening hours, prices, number of reviews, product specifications.

  • Use small content fragments for analysis.

Not allowed:

  • Copy and publish texts and images.

  • Upload content from other websites or post articles without significant modification.


Databases

Regulations: Database Directive 96/9/EC

Allowed:

  • Collect small parts or individual elements of databases.

Not allowed:

  • Copy a substantial part of a database, whether in volume or in significance.

  • Extract content in bulk.

  • Republish content.

  • Create a product that is entirely based on someone else’s database.


Technical Access Restrictions

Regulations: Directive 2013/40/EU, Directive 2001/29/EC

Allowed:

  • Visit public pages via HTTP requests.

  • Use the official API.

  • Follow request limits.

  • Scrape data according to the rules stated in the robots.txt file.

Not allowed:

  • Bypass a platform’s technical protection.

  • Spoof cookies, tokens, sessions, or the User-Agent.

  • Emulate a device.

  • Bypass authentication.

  • Access premium-only data or restricted areas.

  • Overload a website with too many requests.


Platform Rules and Market Relations

Regulations: DSA, P2B Regulation, ToS

Allowed:

  • Collect public data through official APIs.

  • Scrape data while respecting rate limits and the platform’s technical requirements for bots.

Not allowed:

  • Overload the service.

  • Ignore platform rules against bots.

  • Bypass the site’s protection.

  • Imitate real user behavior.


How Web Scraping Is Regulated in the United Kingdom

There are no laws in the UK that directly regulate web scraping. However, its legality depends on whether it involves personal data, databases, or copyrighted materials. It is also important to follow website rules and not bypass a platform’s technical protections.

UK GDPR is the UK version of the European GDPR, adapted after Brexit.

Area

Regulations

Allowed

Not Allowed

Note

Personal Data

Regulations: UK GDPR, Data Protection Act 2018

Allowed:

  • Scrape non-personal and anonymized public data: prices, product specifications, event schedules.

Not allowed:

  • Collect email addresses, names, photos, social media profiles, and other personal data without consent.

  • Scrape public accounts for marketing, user profiling, or facial recognition.

Note: In the UK, scraping and processing personal information must have a legal basis, for example the person’s consent. Automated web scraping of personal data can lead to criminal liability.

Copyright

Regulations: CDPA 1988

Allowed:

  • Collect facts: prices, ratings, product specifications and assortments, event dates, or numerical data.

Not allowed:

  • Copy protected materials in their original form: texts, photos, infographics, or code.

  • Republish third-party materials.

  • Aggregate articles on your own platforms.

  • Create catalogs that are entirely based on third-party content.


Databases

Regulations: Database Right

Allowed:

  • Collect fragments for personal use, statistics, analysis, and research.

  • Use data for non-commercial purposes.

  • Collect non-substantial parts of a database.

Not allowed:

  • Copy a substantial part of a database.

  • Create a competing database based on third-party data.

  • Bypass a database’s technical protection measures.

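Note: The law sets no fixed percentage threshold; whether a part is substantial is judged by quantity, quality, or both. As a rough guide, a non-substantial part is no more than 30–50% of a database and does not include key catalog elements.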

Technical Protection Measures and Access

Regulations: Computer Misuse Act 1990

Allowed:

  • Scrape data from publicly accessible pages.

Not allowed:

  • Bypass technical protection measures.

  • Spoof cookies.

  • Bypass authentication and IP blocks.

  • Break CAPTCHAs.

  • Mask a bot as a real user to access restricted systems.


How Web Scraping Is Regulated in Russia

There is no law in Russia that directly regulates web scraping. However, several legal acts affect the scraping of personal data, databases, commercial information, as well as information systems or copyrighted materials.

Area

Regulations

Allowed

Not Allowed

Note

Personal Data

Regulations: Federal Law on Personal Data No. 152-FZ

Allowed:

  • Collect public non-personal data: prices, product specifications, event schedules, news, statistics.

  • Scrape anonymized social media data: numbers of likes and reposts, anonymous usernames, links to pages without identifying information.

Not allowed:

  • Collect personal information: names, phone numbers, geolocation, email and home addresses, photos, social media profile IDs.

  • Build contact databases, for example of Avito users.

  • Collect and aggregate personal data from multiple profiles without consent.

  • Share personal data with third parties.

Note: Any collection of personal data must have a legal basis, such as the user’s consent, or must serve another purpose provided by law. For example, to save a person in an emergency, you may share their medical information without consent.

Copyright and Databases

Regulations: Civil Code of the Russian Federation, Part IV

Allowed:

  • Scrape factual information: prices, technical specifications.

  • Use factual information for analysis, statistics, and research.

Not allowed:

  • Mass-copy and publish someone else’s materials: original texts, descriptions, photos, articles, images, and software code.

  • Fully scrape databases.

  • Extract substantial parts of databases, even if individual pieces of information inside are not protected.

Note: Databases are protected as independent objects.

Technical Protection Measures

Regulations: Federal Law on Information, IT and Information Protection No. 149-FZ

Allowed:

  • Scrape publicly accessible pages and collect webinar schedules or product specifications.

Not allowed:

  • Bypass technical protection measures.

  • Automate access to restricted systems or protected databases.

  • Spoof cookies.

  • Use other people’s tokens or passwords.

  • Bypass authentication and CAPTCHAs.

  • Overload a website, similar to a DDoS attack.


Unfair Competition and Consumer Protection

Regulations: Federal Law on Protection of Competition No. 135-FZ, Federal Law on Protection of Consumer Rights

Allowed:

  • Work with competitors’ public data for market monitoring.

Not allowed:

  • Create clones of services.

  • Pass off someone else’s content as your own.

  • Show old or incorrect data, for example on aggregator sites.


Infrastructure and Telecommunications

Regulations: Federal Law on Communications

Allowed:

  • Collect public data.

Not allowed:

  • Send large volumes of requests similar to a DDoS attack.


Best Practices for Safe and Ethical Web Scraping

Use APIs When Available

APIs are an official and safe way to access data from a website without violating its protections or rules. With an API, the site owner determines what information can be collected, how often, and in what format, which minimizes the risk of violations. Many social networks and online services provide APIs for accessing posts, comments, ratings, or statistics. You can usually find them in sections like API, Developers, Documentation, Integrations, or by searching for “Site name + API.”

Follow the Website’s Rules

Before scraping, review the website’s Terms of Service (ToS). They usually explain whether automated data collection is allowed and under what conditions. Also check the robots.txt file — you can access it at https://domain/robots.txt. It shows which parts of the site can be visited by scraper bots.
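The robots.txt check can be automated. Below is a minimal sketch using Python's standard `urllib.robotparser`; the robots.txt content, the `my-scraper` user agent, and the example.com URLs are made up for illustration. A real scraper would load the live file with `set_url(...)` and `read()` instead of parsing a string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used here so the example runs offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a parser object."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if robots.txt permits this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
print(is_allowed(parser, "my-scraper", "https://example.com/catalog/"))
print(is_allowed(parser, "my-scraper", "https://example.com/private/data"))
print(parser.crawl_delay("my-scraper"))  # seconds between requests, if declared
```

Keep in mind that robots.txt only covers crawler access rules; the Terms of Service still need to be read separately.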

Be respectful of the platform’s resources and scrape responsibly. Limit your request rate — for example, make one request per second. Add random delays between requests and pay attention to server response codes like 429 or 503. If you see them, reduce your request frequency. This helps avoid technical violations and lowers the risk of being blocked.
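One way to apply these rules is to pause before every request, add a random delay, and double the pause whenever the server answers 429 or 503. The sketch below assumes a hypothetical `fetch` callable that returns a `(status_code, body)` pair; the function names, the one-second base delay, and the retry limit are illustrative choices, not requirements of any platform.

```python
import random
import time
from typing import Callable, Tuple

RETRY_STATUSES = {429, 503}  # responses that signal "slow down"


def next_delay(base_delay: float, attempt: int,
               jitter: float = 0.5, max_delay: float = 60.0) -> float:
    """Base delay doubled for each throttled attempt, plus random jitter."""
    delay = min(base_delay * (2 ** attempt), max_delay)
    return delay + random.uniform(0, jitter)


def polite_fetch(url: str,
                 fetch: Callable[[str], Tuple[int, str]],
                 base_delay: float = 1.0,
                 max_attempts: int = 5,
                 sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Fetch a URL, waiting before each try and backing off on 429/503.

    `fetch` is a hypothetical callable supplied by the caller; it must
    return (status_code, body).
    """
    status, body = 0, ""
    for attempt in range(max_attempts):
        sleep(next_delay(base_delay, attempt))
        status, body = fetch(url)
        if status not in RETRY_STATUSES:
            break
    return status, body
```

If the server sends a `Retry-After` header with a 429 or 503 response, honoring that value is usually a better choice than a computed delay.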

Minimize Data Collection

Collect only the data that is truly necessary for your task. This reduces risks, simplifies storage, and shows respect for the website owners and users.

Before scraping, define your goal and make a list of required fields. Do not collect anything that does not help meet it. For example, when analyzing news, it is enough to collect the headline, date, and category. The author’s name or links to their social media are not necessary.

Also, avoid collecting personal data such as names, email addresses, geolocation, photos, or reviews with personal information.
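A simple way to enforce this is an allowlist of fields applied before anything is stored. The sketch below follows the news example above; the field names and the sample record are hypothetical.

```python
from typing import Any, Dict, Iterable

# Fields required for the news-analysis task; everything else is dropped.
REQUIRED_FIELDS = ("headline", "date", "category")


def minimize(record: Dict[str, Any],
             required: Iterable[str] = REQUIRED_FIELDS) -> Dict[str, Any]:
    """Keep only the allowlisted fields of a scraped record."""
    return {key: record[key] for key in required if key in record}


# Hypothetical scraped record that contains more than the task needs.
raw = {
    "headline": "New laptop released",
    "date": "2025-01-15",
    "category": "tech",
    "author": "Jane Doe",
    "author_profile": "https://example.com/users/jane",
}

print(minimize(raw))
```

Filtering at collection time, rather than after storage, means personal fields such as the author's name never reach your dataset.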

Document the Data You Collect

Record the sources of your data and how you process it. This helps maintain transparency and, if necessary, demonstrate the legality of your work. If you have collected more data than needed, delete the excess data.
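Such records can be as simple as one JSON line per scraping run. The following is a minimal sketch; the record layout (`source_url`, `collected_at`, `fields`, `purpose`) is an assumption chosen for illustration, not a format required by any law.

```python
import json
from datetime import datetime, timezone
from typing import Dict, Iterable, Optional


def provenance_entry(source_url: str,
                     fields: Iterable[str],
                     purpose: str,
                     collected_at: Optional[datetime] = None) -> Dict[str, object]:
    """Describe where data came from, when, which fields, and why."""
    collected_at = collected_at or datetime.now(timezone.utc)
    return {
        "source_url": source_url,
        "collected_at": collected_at.isoformat(),
        "fields": sorted(fields),
        "purpose": purpose,
    }


def to_json_line(entry: Dict[str, object]) -> str:
    """Serialize an entry as one line, ready to append to a log file."""
    return json.dumps(entry, ensure_ascii=False)


entry = provenance_entry(
    "https://example.com/news",  # hypothetical source
    ["headline", "date", "category"],
    "news topic analysis",
)
print(to_json_line(entry))
```

Appending each line to a dated log file gives you an audit trail that can be shown on request.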

Transform Data to Avoid Copyright Issues

Use the collected data to create a new result — such as analysis, statistics, visualizations, or your own content. For example, if a bot collects MacBook Air prices from different stores, it is fine to use this information to create a price trend graph. However, publishing other people’s product descriptions without modification is not recommended. It may violate copyright.
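As an illustration of the price trend idea, the sketch below turns raw per-store price observations into a daily average series that could then be plotted. The stores, dates, and prices are invented sample data.

```python
from collections import defaultdict
from statistics import mean
from typing import Iterable, List, Tuple

Observation = Tuple[str, str, float]  # (ISO date, store, price)


def daily_average(observations: Iterable[Observation]) -> List[Tuple[str, float]]:
    """Aggregate per-store prices into one average price per day."""
    by_date = defaultdict(list)
    for date, _store, price in observations:
        by_date[date].append(price)
    return [(date, round(mean(prices), 2))
            for date, prices in sorted(by_date.items())]


# Invented sample data for two stores.
observations = [
    ("2025-01-01", "Store A", 1000.0),
    ("2025-01-01", "Store B", 1100.0),
    ("2025-01-02", "Store A", 990.0),
]

print(daily_average(observations))
```

The output contains only derived figures, not the stores' original texts or images, which is the kind of transformation described above.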

Risks and Consequences of Not Following the Scraping Rules

Criminal or Regulatory Sanctions (GDPR, CCPA)

The GDPR (EU) provides for fines of up to €20 million or 4% of a company’s global annual turnover, whichever is higher. The CCPA (USA) allows civil penalties of up to $7,500 for each intentional violation. Risks can arise even when working with public data if it can be used to identify individuals or is processed unlawfully.
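To get a feel for the scale of these ceilings, the arithmetic can be sketched as follows. It assumes the GDPR cap is the higher of €20 million and 4% of global annual turnover, and that the CCPA figure applies per violation; the function names and turnover values are illustrative, and actual fines are set case by case by regulators and courts.

```python
def gdpr_max_fine(annual_turnover_eur: float) -> float:
    """Upper bound of a GDPR fine: the higher of EUR 20M or 4% of turnover."""
    return max(20_000_000.0, annual_turnover_eur * 4 / 100)


def ccpa_max_penalty(violations: int, per_violation_usd: float = 7_500.0) -> float:
    """Upper bound of CCPA penalties, assuming the per-violation maximum."""
    return violations * per_violation_usd


print(gdpr_max_fine(100_000_000))    # turnover of EUR 100 million
print(gdpr_max_fine(1_000_000_000))  # turnover of EUR 1 billion
print(ccpa_max_penalty(1_000))       # 1,000 affected records
```

For smaller companies the €20 million floor dominates, while for large ones the 4% rule sets the ceiling.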

Regulators actively enforce these measures. In 2024, total GDPR fines exceeded €1.2 billion. Some of the most notable recent sanctions include:

  • Meta — around €1.2 billion for illegal transfer of data from the EU to the USA.

  • Amazon — €746 million for breaching GDPR principles.

  • LinkedIn — €310 million for processing data without a sufficient legal basis.

  • TikTok — €530 million for transferring data to China and insufficient privacy policy transparency.

These fines show that violating data processing and transfer rules is a potentially costly risk for scraping specialists and businesses.

Operational and Business Risks

Beyond fines, proven violations in web scraping can lead to serious business threats. Companies may face consequences such as:

  • IP access blocks and restrictions on data use;

  • lawsuits from competitors or users demanding compensation for unlawful use of personal data, content, or databases;

  • loss of partnerships and reputation if it is revealed that data was obtained or used improperly.

Breaking the rules also leads to operational costs. Businesses may need to:

  • review their architecture;

  • change data storage and processing workflows;

  • delete unlawfully collected datasets;

  • implement compliance processes;

  • maintain logs and manage user consents.

In some cases, companies have shut down a product entirely after discovering violations in how a key data source was collected.

Sometimes companies and specialists working with automated data collection use additional solutions — for example, anti-detect browsers, such as Octo Browser. They help manage network parameters more selectively, e.g., use different IP addresses and change the device’s digital fingerprint. These tools also make it possible to control the request rate during web scraping to distribute the load across sessions. All of this enables more responsible scraping. This reduces the risk of automatic platform blocks and additional checks, like CAPTCHAs. However, from a legal perspective, using these solutions does not exempt you from liability if scraping violates the website’s rules or the laws of the country.

Court Cases Related to Web Scraping

LinkedIn vs. hiQ Labs (USA, 2019–2022)

This case is a key precedent in the United States. It established that collecting publicly available data does not violate the CFAA. hiQ analyzed public LinkedIn profiles, while the social network attempted to block the scraping, claiming it constituted unauthorized access. The Ninth Circuit Court of Appeals ruled that if the data is public and does not require authorization, collecting it is legal.

This decision set a standard: scraping pages that are publicly accessible without logging in is not considered a violation. However, the court noted that attempting to access the private areas of the site qualifies as unauthorized access.

Craigslist vs. 3Taps (USA, 2013)

The U.S. District Court for the Northern District of California ruled that web scraping violated the CFAA because it bypassed technical restrictions. 3Taps collected listings from Craigslist and reposted them on its own platform. Even after a formal cease-and-desist letter and IP blocks, the company continued scraping pages through proxies.

The court held that any subsequent access after a clear prohibition and blocking is considered unauthorized. This case demonstrated that scraping itself is not always illegal, but bypassing technical protection measures constitutes a serious violation.

Facebook vs. Power Ventures (USA, 2009)

Power Ventures scraped data about users’ friends and activities on Facebook without the social network’s consent, including bypassing authentication. Moreover, Power Ventures ignored warning notices from Facebook.

The court ruled that this violated the CFAA as well as California’s computer crime law. Even when a user consents to a third party accessing the data they have provided to Facebook, that third party cannot bypass the platform’s technical protection for mass data collection. The decision became a key precedent for assessing the legality of scraping private systems and complying with platform rules.

Ryanair vs. Booking.com (USA, 2025)

Ryanair accused Booking.com of unauthorized scraping of flight and pricing data, despite explicit prohibitions and technical restrictions. Initially, a jury found the access to be unauthorized. However, in 2025, the judge reviewed the case and noted that Ryanair had not demonstrated actual harm. Therefore, the CFAA could not be applied in this case.

Eventually, the parties reached an agreement. Booking.com can legally resell Ryanair tickets as long as it complies with access rules and maintains price transparency. The case showed that bypassing restrictions during scraping is risky, and that proving actual harm and negotiating settlements can often be decisive.

Conclusion

Web scraping in itself is not considered illegal. When used ethically, it is a powerful tool for collecting and analyzing data, as well as improving business processes. However, safe scraping requires a careful approach. To make the process less risky:

  • use official APIs of platforms whenever available;

  • follow rate limits and request frequency rules;

  • collect only the data you actually need;

  • do not bypass a platform’s technical protection measures;

  • avoid scraping personal data;

  • respect copyright and intellectual property.

Before starting web scraping, always review the applicable laws and regulations, the website’s ToS, and potential risks.

FAQ

Is Web Scraping Illegal?

No, web scraping itself is not prohibited. However, its legality depends on what data is being collected and how. Collecting publicly available factual information is allowed. Problems can arise if the scraper violates a website’s rules, processes personal data without a legal basis, or accesses copyrighted or restricted materials. It is also important to use transparent scraping methods without bypassing technical protection measures.

Is Web Scraping Legal in the USA?

The legality of web scraping in the USA depends on whether access to the site violates the CFAA. Public pages can be analyzed, but bypassing logins, paid subscriptions, IP blocks, or other barriers can be considered a violation. A well-known example is the LinkedIn vs. hiQ Labs case. The court allowed collecting data from public profiles but emphasized that any attempt to access private website areas turns scraping into illegal activity.

Can Web Scraping Be Used for Commercial or Research Purposes?

Yes, these are among the most common web scraping purposes. However, several conditions need to be met. Commercial projects must respect copyright, follow platform rules, and avoid collecting personal data. For research purposes, it is important to work with public or anonymized information, avoid accessing protected website areas, and transform the data during analysis before publishing it. The key requirement in both cases is not to bypass technical restrictions or extract data for which there is no legal right or authorization.


© 2026 Octo Browser
