Is Web Scraping Legal in Japan? Every Law You Need to Know

Five Japanese statutes govern web scraping. None of them actually use the phrase "web scraping."

If you've ever tried to figure out whether your scraping project is legal in Japan, you've probably hit a wall of vague forum posts, AI-training-focused articles, and conflicting advice. I spent weeks digging through official Japanese statutes, government guidance, enforcement data, and legal commentary to put together the clearest English-language guide I could.

Whether you're monitoring competitor prices on Rakuten, pulling property data for market analysis, or building a B2B lead list, this article walks through every law that matters — with practical tables, real-world scenarios, and a 10-step compliance checklist you can use before you start extracting data.

What Does "Is Web Scraping Legal in Japan" Actually Mean?

Web scraping — using software to automatically pull data from websites — isn't addressed by any single Japanese law. No statute says "scraping is legal" or "scraping is illegal." Whether your project is lawful depends on three things: what you scrape, how you access it, and what you do with the data afterward.

Five statutes form the legal stack:

Statute	What It Covers for Scrapers
Copyright Act (Act No. 48 of 1970)	Protects creative works, images, text, and database structures. Article 30-4 provides a broad exception for data analysis.
APPI (Act on the Protection of Personal Information, Act No. 57 of 2003)	Governs collection, use, sharing, and cross-border transfer of personal data about living individuals.
UCAL (Act on Prohibition of Unauthorized Computer Access, Act No. 128 of 1999)	Criminalizes bypassing authentication and access controls — Japan's anti-hacking law.
UCPA (Unfair Competition Prevention Act, Act No. 47 of 1993)	Protects trade secrets and "shared data with limited access" from wrongful acquisition.
Penal Code (Act No. 45 of 1907)	Articles 233, 234, and 234-2 can apply when scraping disrupts a website's operations.

The rest of this article breaks down each law with practical examples and risk assessments. Want to skip straight to the action items? Jump to the 10-step compliance checklist.

Japan's Copyright Act and Article 30-4: The Information Analysis Exception

Japan's Copyright Act protects creative works: articles, photos, product descriptions, database structures with creative arrangement. When a scraper downloads a web page, it technically "reproduces" that content under Article 21 — the author's exclusive reproduction right.

But here's where Japan stands out.

In 2018, Japan enacted a broad amendment (effective January 1, 2019) that added Article 30-4 — a flexible copyright exception that makes most analytical web scraping legal. The Agency for Cultural Affairs calls it one of the world's most permissive frameworks for data analysis and AI development.

Most English-language articles frame Article 30-4 as only relevant to AI training. That's too narrow. The statute expressly covers "information analysis" — extraction, comparison, classification, and other statistical analysis of data. In other words, exactly what business scrapers do every day.

What Article 30-4 Actually Says (in Plain English)

Article 30-4 permits use of a copyrighted work "where the purpose is not to personally enjoy, or cause another person to enjoy, the thoughts or sentiments expressed in the work." In practice, two conditions must hold:

The "enjoyment" test. If you're extracting factual data — prices, dates, square footage, stock levels — rather than consuming or republishing creative content, you're on the right side. The ACA's 2024 AI and copyright guidance reinforces that non-enjoyment uses include data analysis, classification, and indexing.
The "unjust harm" test. Your scraping shouldn't substitute for the original work or undercut the copyright holder's market. Scraping a paid analysis-ready dataset to avoid buying it, for example, could fail this test even if your purpose is analytical.

ig_0a3cda0b72101bd40169f1b3ed9fd08191a17c22b803fb48ab_compressed.webp

Real-World Scraping Scenarios Under Article 30-4

This is where the rubber meets the road. The statute applies far beyond AI training:

Use Case	Article 30-4 Applies?	Why
Scraping property listings for market price analysis	✅ Yes	Asking price, area, and building age are factual inputs for information analysis, not enjoyment of expression
Scraping stock data from exchange sites	✅ Yes	Statistical analysis purpose
Scraping product images for a competing ecommerce site	❌ No	Exploiting the expressive content itself
Scraping news articles to republish	❌ No	Substitutes for the original work
Scraping product descriptions for price monitoring	✅ Likely yes	Extracting factual data, not enjoying expression
Building a RAG system over scraped documents	⚠️ Mixed	Vectorization may be non-enjoyment, but outputting protected passages needs further analysis

One more wrinkle: Article 47-5 offers a narrower protection for "minor exploitation" incidental to computerized information processing — think small snippets or thumbnails in search results. It's not the main scraping safe harbor, but it can support preparatory copying needed for search or analysis services. The ACA's 2019 commentary judges "minor" by proportion, quantity, and display accuracy.

The bottom line: if you're extracting facts for analysis rather than republishing creative content, Japan's copyright framework is on your side.

Japan's Unauthorized Computer Access Law (UCAL): When Scraping Crosses the Line

Almost no English-language scraping article explains this statute. It's arguably the most important bright line in Japanese law.

The Unauthorized Computer Access Law (不正アクセス禁止法, Act No. 128 of 1999) is Japan's functional equivalent of the US CFAA. It criminalizes unauthorized access to computers protected by authentication measures. Penalties under Article 11 can reach imprisonment for up to 3 years or a fine of up to ¥1,000,000.

UCAL does not prohibit scraping public web pages. The law only kicks in when you bypass or circumvent authentication — login walls, passwords, access tokens, or similar controls. That distinction is everything.

UCAL Risk Levels for Common Scraping Scenarios

Scenario	UCAL Risk Level	Explanation
Scraping public product listings	✅ Low	No authentication bypass involved
Scraping behind a login with your own credentials	⚠️ Medium — depends on ToS	UCAL may not apply if credentials are yours, but ToS and contract risk remain
Bypassing authentication or CAPTCHA to access data	❌ High — likely violation	Article 2(4)(ii) covers evasion of access restrictions
Accessing restricted APIs without authorization	❌ High — likely violation	Authenticated or partner-only APIs are squarely in UCAL territory
Using another person's credentials or session tokens	❌ High — likely violation	Article 2(4)(i) directly addresses use of another person's identification code

Japan's National Police Agency reported 563 cleared UCAL violation cases in 2024, up 8.1% from the prior year. Of those, 511 cases (90.8%) involved unauthorized use of another person's identification code. The enforcement focus is overwhelmingly on credential misuse, not ordinary public scraping.

How UCAL Differs from the US CFAA

UCAL is narrower than the CFAA in a meaningful way. It focuses specifically on authentication bypass, while the CFAA's "exceeds authorized access" language has been debated in US courts for decades. After the US Supreme Court's Van Buren decision, violating a website's ToS alone is less likely to trigger CFAA criminal liability. Japan reaches a similar practical result: ToS violations are a contract matter, not a UCAL criminal matter, unless there's an independent access-control element.

APPI 2022 Amendments: What Scrapers Need to Know About Personal Data

Japan's Act on the Protection of Personal Information (APPI) is the country's primary data protection law — and the 2022 amendments made the rules significantly stricter. If you're scraping names, emails, phone numbers, or any data that identifies a living individual from Japanese websites, APPI applies.

The practical question: when does scraping trigger APPI compliance?

What Counts as "Personal Information" Under APPI

APPI Article 2 defines personal information as data that can identify a specific living individual — including by easy collation with other information. The PPC's Q&A guidance confirms that a work email like firstname.lastname@company.jp can be personal information when it identifies a specific person, and that cookie IDs become personal information when combined with other data that enables identification.

The 2022 amendments introduced a new category: "individual-related information" — data that doesn't directly identify someone but could when combined with other data (cookie IDs, browsing history, purchase history). Why this matters for scraping: data that looks anonymous to the scraper may become identifiable when merged with CRM or adtech data at the receiving end.

Cross-Border Transfer Restrictions

If you're scraping Japanese websites from outside Japan and collecting personal data, APPI Article 28 requires analysis before transferring that data abroad. The PPC's foreign transfer guideline describes three common paths: the recipient is in a PPC-designated equivalent country, the recipient has established equivalent protective measures, or an Article 27(1) exception applies.

If a US, EU, or Singaporean company scrapes personal data from Japanese sites and stores it outside Japan, APPI foreign-transfer analysis is needed. This catches a lot of international teams off guard.

The Opt-Out Third-Party Provision (Article 27)

The forum question I see most often: "What happens if I share or sell scraped data from Japanese sites?"

APPI Article 27 generally requires prior consent to provide personal data to third parties. There's a formal opt-out mechanism — but it requires filing with the Personal Information Protection Commission, notifying individuals, and giving them a way to stop third-party provision. The 2022 amendments narrowed this further: opt-out provision cannot be used for personal data acquired by wrongful means or received from another business through opt-out provision.

The PPC's FY2024 annual report shows 405 total opt-out filings accepted since October 2021, including 93 in FY2024. The system exists but is formal, not casual.

When Scraping Does Not Trigger APPI

APPI does not apply to data that cannot identify a living individual. Lower-APPI-risk fields include:

Product prices, SKUs, stock levels, and shipping fees
Store opening hours and generic company contact info (info@company.jp)
Property listing price, square footage, building age, and station distance — when not linked to named owners or agents
Aggregated market statistics where individual correspondence is eliminated

A practical design choice worth noting: Thunderbit's AI Suggest Fields feature lets users define exactly which data columns to extract. You can deliberately exclude personal data fields and focus only on the business facts you need — reducing APPI exposure by design rather than by accident.

Unfair Competition Prevention Act (UCPA): Scraping Competitor Data

ig_0a3cda0b72101bd40169f1b4462be08191a1ab2d0796a7d30e_compressed.webp

The Unfair Competition Prevention Act enters the picture when scraping moves from public facts into confidential business information or gated datasets.

UCPA defines a trade secret as information that is (1) managed as secret, (2) useful for business, and (3) not publicly known. METI summarizes these as the three requirements for trade-secret protection.

Public website facts — product prices, store locations, job postings, product catalogs — are generally not trade secrets because they are not secret and are publicly known. Scraping them typically does not violate UCPA.

When UCPA Could Apply to Scraping

Scenario	UCPA Risk	Why
Scraping a competitor's public product catalog for price monitoring	Usually low	Public catalog facts are generally not secret
Scraping internal pricing data by exploiting an API vulnerability	High	Nonpublic useful business information acquired through wrongful means
Scraping a paid partner-only database or licensed API outside scope	High	The 2018 UCPA amendments protect "shared data with limited access"
Using scraped data to create a competing product that free-rides on a costly database	Gray area	Courts may evaluate access restrictions, investment, and substitution

The 2018 UCPA amendment added protection for "shared data with limited access" — technical or business information accumulated to a significant extent, managed electronically, and provided regularly to specific persons. But UCPA Article 19 excludes data that is substantially the same as information made publicly available without compensation. So a free public product listing is different from a member-only commercial dataset.

Server Overload and Japan's Penal Code: Don't Crash the Website

The data itself might be perfectly legal to collect. But how you scrape can create criminal risk. Japan's Penal Code includes business-obstruction provisions that kick in when automated access disrupts a website or business system.

Penal Code Article	Conduct	Penalty
Article 233	Obstruction of business by fraudulent means	Up to 3 years or ¥500,000
Article 234	Forcible obstruction of business	Same as Article 233
Article 234-2	Obstruction by damaging/interfering with a computer	Up to 5 years or ¥1,000,000

Every Japanese scraping discussion eventually lands on the Okazaki City Central Library incident (~2010). A software engineer created a crawler to collect new-book information from the library website, generating roughly 33,000 automated accesses over two weeks. The library's server became difficult to use, and police arrested the user on suspicion of business obstruction. The case ended without a merits judgment, but it remains a powerful reminder that server impact matters — even when the data itself is public.

Some context on why website operators escalate: Thales/Imperva reported automated bots were 51% of web traffic in 2024, with bad bots at 37%. Akamai found bots were 42% of overall web traffic, with ecommerce especially affected.

How to Avoid Server Overload Issues

Respect robots.txt (even though it's not a statute, it's evidence of the operator's intent)
Add delays between requests and limit concurrency
Avoid peak hours for the target site
Stop or reduce traffic when you see errors, blocks, or rate-limit responses
Cache previously fetched pages instead of repeatedly hitting the same URLs

Thunderbit's cloud scraping feature distributes requests across multiple servers, which naturally spreads load and reduces the risk of overwhelming any single target server. It's not a legal shield, but it's a practical design choice that aligns with responsible scraping.

Terms of Service Violations: Contract Risk, Not Criminal Risk

Plenty of Japanese websites include Terms of Service that prohibit scraping or automated data collection. Under Japanese law, violating ToS is a contract issue — not a criminal offense.

METI's Interpretative Guidelines on Electronic Commerce explain that website terms are binding when properly incorporated into the transaction contract. Click-wrap agreements (where you must click "Agree") are the strongest. Terms buried in hard-to-notice footer links are weaker.

ToS Design	Enforceability Signal
Clear click-wrap with required "Agree" button	Strongest
Terms linked near transaction but no agree click	More uncertain
Terms hidden in footer or difficult location	Weaker
No contractual relationship with the operator	Contract claim may be weak

No reliable authority was found showing that a ToS breach alone, without more, is elevated into a Japanese criminal charge. The practical position: ToS breach can create civil contract risk (damages, injunctions), but criminal exposure usually requires an independent element — access-control evasion under UCAL, business obstruction under the Penal Code, or copyright infringement.

My advice: read the ToS before scraping any Japanese website. If it explicitly prohibits scraping, look for alternatives — an API, data partnership, or another source for the same information.

Japan vs. US vs. EU: How Web Scraping Laws Compare

If you're coming from a US or EU legal background, this table will help you calibrate. Japan's framework is more permissive in some areas and more restrictive in others.

Legal Dimension	Japan	United States	EU
Core scraping statute	No single statute; patchwork of Copyright Act, APPI, UCPA, UCAL, Penal Code	CFAA, state laws	GDPR, Database Directive, DSM Directive
Copyright exception for data analysis	Article 30-4 (broad)	Fair use (case-by-case)	TDM exception (Articles 3-4, DSM Directive) — with opt-out for commercial TDM
Personal data scraping	APPI — opt-out third-party provision (Art. 27)	Varies by state (CCPA etc.)	GDPR — strict consent/legitimate interest
Bypassing access controls	UCAL — criminal offense	CFAA — criminal + civil	Varies by member state
ToS breach = illegal?	Contract law only; no criminal liability found	CFAA post-Van Buren: likely no	Varies; GDPR may still apply
Server overload risk	Penal Code Art. 233, 234-2 (business obstruction)	CFAA + tortious interference	Varies

Key Takeaways from the Comparison

Japan's Article 30-4 is broader than US fair use or EU TDM exceptions — making Japan one of the most permissive countries for analytical scraping from a copyright perspective. UCAL is narrower than the CFAA because it focuses purely on authentication bypass. APPI's cross-border transfer rules are stricter than fragmented US privacy frameworks but less prescriptive than GDPR in some operational details.

For international teams: you may have more freedom to scrape public Japanese data for analysis than you realize. Personal data handling is where the complexity lives — especially cross-border transfers and third-party sharing.

Your 10-Step Compliance Checklist for Scraping Japanese Websites

Before you start scraping any Japanese website, run through these ten yes/no questions. Each maps to one of the five statutes above.

Is the data publicly accessible? (No login, no paywall, no access-control evasion) → If yes, UCAL risk is low.
Does the website's ToS prohibit scraping? → If yes, assess contract risk; consider alternative data sources.
Are you collecting personal information as defined by APPI? (Names, emails, phone numbers, IDs) → If yes, ensure APPI compliance.
Will you transfer scraped personal data outside Japan? → If yes, comply with APPI Article 28's cross-border transfer rules.
Do you plan to share or sell scraped data to third parties? → If yes, follow APPI Article 27 opt-out procedures or obtain consent.
Is the data protected by copyright? → If scraping for information analysis (not republishing creative content), Article 30-4 likely applies.
Will your scraping activity substitute for the original work? → If yes, Article 30-4 protection likely does not apply.
Are you bypassing any authentication, CAPTCHA, or access controls? → If yes, high UCAL risk — do not proceed without legal advice.
Will your scraping volume risk overloading the server? → If yes, throttle requests, add delays, use distributed scraping.
Is the target data managed as a trade secret by the company? → If non-public proprietary data, UCPA may apply.

If every answer points to public, factual, non-personal, rate-limited, non-republishing analysis — you're in good shape. Any red flag should trigger legal review before you start.

ig_0a3cda0b72101bd40169f1b4db54888191a61af73340d78e18_compressed.webp

How Thunderbit Helps You Scrape Japanese Websites Compliantly

I want to be upfront: Thunderbit is a tool, not legal advice. But it's designed in ways that align with the compliance principles I've outlined.

AI Suggest Fields: Thunderbit's AI reads the page and suggests exactly which data columns to extract. This helps you deliberately define only the non-personal data fields you need — reducing unnecessary personal data collection by design rather than by accident.
Cloud Scraping: Distributes requests across multiple servers, naturally spreading load and reducing the risk of overwhelming a single Japanese server. (Think of it as built-in rate-limit friendliness.)
Free Email and Phone Extractors: When you do need to collect contact information from Japanese websites, Thunderbit's email extractor and phone extractor provide one-click extraction. But pair this with the APPI guidance above — collecting personal data requires understanding your compliance obligations.
Export to Excel, Google Sheets, Airtable, or Notion: Scraped data can be structured and exported for analysis immediately, supporting the "information analysis" purpose that Article 30-4 protects.
No Maintenance Required: Thunderbit's AI reads the site fresh each time, adapting to layout changes. This means no broken scrapers repeatedly hammering a server with failed requests — a practical way to avoid the kind of server-load issues that triggered the Okazaki Library incident.

For a walkthrough of how to use Thunderbit in practice, check out our YouTube channel or the quick start guide. You can try it free via the Chrome extension.

Try Thunderbit for Japanese Web Scraping

Practical Use Case Examples

Use Case	Recommended Fields to Extract	Legal Rationale
Japanese ecommerce price monitoring	Product name, listed price, availability, seller, SKU, URL, timestamp	Factual business data; Article 30-4 information analysis; avoid copying product images or reviews for republication
Japanese real estate market analysis	Asking price, location area, floor area, building age, property type, nearest station, URL, timestamp	Supports aggregate market analysis; exclude agent names, phone numbers, and owner names unless APPI compliance is in place
B2B operations monitoring	Company name, branch address, generic company email, opening hours, service category	Lower APPI risk if no living individual is identified; review ToS and rate limits

Key Takeaways on Web Scraping Legality in Japan

Web scraping is legal in Japan in most cases — especially when you're scraping publicly available, non-personal, factual data for analysis purposes. But "most cases" is not "all cases."

Copyright Act (Article 30-4): Analytical scraping of public data is allowed; republishing creative content is not.
UCAL: Do not bypass authentication or access controls.
APPI: Handle personal data carefully, especially for cross-border transfers and third-party sharing.
UCPA: Public data is generally not a trade secret; gated or paid data is higher risk.
Penal Code: Do not crash the server.

Use the 10-step checklist before starting any scraping project. When in doubt, consult legal counsel — especially for projects involving personal data or access-restricted content.

If you're ready to start scraping Japanese websites compliantly, Thunderbit is built to make the process straightforward for non-technical users. Define your fields, extract the data, export to your preferred tool, and focus on the analysis.

Try AI Web Scraper for Japanese Websites Get Started Free

FAQs

Is it legal to scrape public websites in Japan?

Generally yes. Scraping publicly available data for information analysis is usually legal under Japan's Copyright Act Article 30-4, provided you don't overload the server, bypass access controls, collect personal data without APPI compliance, or republish copyrighted expression. The distinguishing factor is purpose: analysis, not republication.

Can I scrape personal data (emails, phone numbers) from Japanese websites?

You can, but APPI applies. You need a lawful purpose, must disclose how you'll use the data, and face restrictions on cross-border transfers and third-party sharing. The 2022 amendments tightened these rules significantly — especially for data leaving Japan or being shared with other companies.

What happens if a Japanese website's Terms of Service prohibit scraping?

Violating ToS is a contract issue (potential civil liability for damages or injunctions), not a criminal offense. However, it can support broader legal claims and escalate enforcement. Always read the ToS before scraping, and consider whether the data is available through alternative means.

Is scraping behind a login wall legal in Japan?

Using your own credentials is a gray area — UCAL may not directly apply, but ToS violations and contract risk remain. Bypassing authentication, using another person's credentials, or evading access controls is likely a criminal violation of the Unauthorized Computer Access Law, with penalties up to 3 years imprisonment or ¥1,000,000.

Can I sell data I scraped from Japanese websites?

If the data contains personal information, you must follow APPI Article 27's opt-out third-party provision system — which requires formal PPC filing, individual notification, and opt-out mechanisms. Selling personal data without proper procedures is a compliance violation. For non-personal factual aggregates, APPI risk is lower, but copyright, UCPA, ToS, and web scraping legal implications still apply.

Learn More