Five Japanese statutes govern web scraping. None of them actually use the phrase "web scraping."
If you've ever tried to figure out whether your scraping project is legal in Japan, you've probably hit a wall of vague forum posts, AI-training-focused articles, and conflicting advice. I spent weeks digging through official Japanese statutes, government guidance, enforcement data, and legal commentary to put together the clearest English-language guide I could.
Whether you're monitoring competitor prices on Rakuten, pulling property data for market analysis, or building a B2B lead list, this article walks through every law that matters — with practical tables, real-world scenarios, and a 10-step compliance checklist you can use before you start extracting data.
What Does "Is Web Scraping Legal in Japan" Actually Mean?
Web scraping — using software to automatically pull data from websites — isn't addressed by any single Japanese law. No statute says "scraping is legal" or "scraping is illegal." Whether your project is lawful depends on three things: what you scrape, how you access it, and what you do with the data afterward.
Five statutes form the legal stack:
| Statute | What It Covers for Scrapers |
|---|---|
| Copyright Act (Act No. 48 of 1970) | Protects creative works, images, text, and database structures. Article 30-4 provides a broad exception for data analysis. |
| APPI (Act on the Protection of Personal Information, Act No. 57 of 2003) | Governs collection, use, sharing, and cross-border transfer of personal data about living individuals. |
| UCAL (Act on Prohibition of Unauthorized Computer Access, Act No. 128 of 1999) | Criminalizes bypassing authentication and access controls — Japan's anti-hacking law. |
| UCPA (Unfair Competition Prevention Act, Act No. 47 of 1993) | Protects trade secrets and "shared data with limited access" from wrongful acquisition. |
| Penal Code (Act No. 45 of 1907) | Articles 233, 234, and 234-2 can apply when scraping disrupts a website's operations. |
The rest of this article breaks down each law with practical examples and risk assessments. Want to skip straight to the action items? Jump to the .
Japan's Copyright Act and Article 30-4: The Information Analysis Exception
Japan's Copyright Act protects creative works: articles, photos, product descriptions, database structures with creative arrangement. When a scraper downloads a web page, it technically "reproduces" that content under — the author's exclusive reproduction right.
But here's where Japan stands out.
In 2018, Japan enacted a broad amendment (effective January 1, 2019) that added — a flexible copyright exception that makes most analytical web scraping legal. The calls it one of the world's most permissive frameworks for data analysis and AI development.
Most English-language articles frame Article 30-4 as only relevant to AI training. That's too narrow. The statute expressly covers "information analysis" — extraction, comparison, classification, and other statistical analysis of data. In other words, exactly what business scrapers do every day.
What Article 30-4 Actually Says (in Plain English)
Article 30-4 permits use of a copyrighted work "where the purpose is not to personally enjoy, or cause another person to enjoy, the thoughts or sentiments expressed in the work." In practice, two conditions must hold:
-
The "enjoyment" test. If you're extracting factual data — prices, dates, square footage, stock levels — rather than consuming or republishing creative content, you're on the right side. The reinforces that non-enjoyment uses include data analysis, classification, and indexing.
-
The "unjust harm" test. Your scraping shouldn't substitute for the original work or undercut the copyright holder's market. Scraping a paid analysis-ready dataset to avoid buying it, for example, could fail this test even if your purpose is analytical.

Real-World Scraping Scenarios Under Article 30-4
This is where the rubber meets the road. The statute applies far beyond AI training:
| Use Case | Article 30-4 Applies? | Why |
|---|---|---|
| Scraping property listings for market price analysis | ✅ Yes | Asking price, area, and building age are factual inputs for information analysis, not enjoyment of expression |
| Scraping stock data from exchange sites | ✅ Yes | Statistical analysis purpose |
| Scraping product images for a competing ecommerce site | ❌ No | Exploiting the expressive content itself |
| Scraping news articles to republish | ❌ No | Substitutes for the original work |
| Scraping product descriptions for price monitoring | ✅ Likely yes | Extracting factual data, not enjoying expression |
| Building a RAG system over scraped documents | ⚠️ Mixed | Vectorization may be non-enjoyment, but outputting protected passages needs further analysis |
One more wrinkle: Article 47-5 offers a narrower protection for "minor exploitation" incidental to computerized information processing — think small snippets or thumbnails in search results. It's not the main scraping safe harbor, but it can support preparatory copying needed for search or analysis services. The judges "minor" by proportion, quantity, and display accuracy.
The bottom line: if you're extracting facts for analysis rather than republishing creative content, Japan's copyright framework is on your side.
Japan's Unauthorized Computer Access Law (UCAL): When Scraping Crosses the Line
Almost no English-language scraping article explains this statute. It's arguably the most important bright line in Japanese law.
The (不正アクセス禁止法, Act No. 128 of 1999) is Japan's functional equivalent of the US CFAA. It criminalizes unauthorized access to computers protected by authentication measures. Penalties under can reach imprisonment for up to 3 years or a fine of up to ¥1,000,000.
UCAL does not prohibit scraping public web pages. The law only kicks in when you bypass or circumvent authentication — login walls, passwords, access tokens, or similar controls. That distinction is everything.
UCAL Risk Levels for Common Scraping Scenarios
| Scenario | UCAL Risk Level | Explanation |
|---|---|---|
| Scraping public product listings | ✅ Low | No authentication bypass involved |
| Scraping behind a login with your own credentials | ⚠️ Medium — depends on ToS | UCAL may not apply if credentials are yours, but ToS and contract risk remain |
| Bypassing authentication or CAPTCHA to access data | ❌ High — likely violation | Article 2(4)(ii) covers evasion of access restrictions |
| Accessing restricted APIs without authorization | ❌ High — likely violation | Authenticated or partner-only APIs are squarely in UCAL territory |
| Using another person's credentials or session tokens | ❌ High — likely violation | Article 2(4)(i) directly addresses use of another person's identification code |
Japan's National Police Agency , up 8.1% from the prior year. Of those, 511 cases (90.8%) involved unauthorized use of another person's identification code. The enforcement focus is overwhelmingly on credential misuse, not ordinary public scraping.
How UCAL Differs from the US CFAA
UCAL is narrower than the CFAA in a meaningful way. It focuses specifically on authentication bypass, while the CFAA's "exceeds authorized access" language has been debated in US courts for decades. After the US Supreme Court's , violating a website's ToS alone is less likely to trigger CFAA criminal liability. Japan reaches a similar practical result: ToS violations are a contract matter, not a UCAL criminal matter, unless there's an independent access-control element.
APPI 2022 Amendments: What Scrapers Need to Know About Personal Data
Japan's (APPI) is the country's primary data protection law — and the made the rules significantly stricter. If you're scraping names, emails, phone numbers, or any data that identifies a living individual from Japanese websites, APPI applies.
The practical question: when does scraping trigger APPI compliance?
What Counts as "Personal Information" Under APPI
APPI defines personal information as data that can identify a specific living individual — including by easy collation with other information. The confirms that a work email like firstname.lastname@company.jp can be personal information when it identifies a specific person, and that cookie IDs become personal information when combined with other data that enables identification.
The 2022 amendments introduced a new category: "individual-related information" — data that doesn't directly identify someone but could when combined with other data (cookie IDs, browsing history, purchase history). Why this matters for scraping: data that looks anonymous to the scraper may become identifiable when merged with CRM or adtech data at the receiving end.
Cross-Border Transfer Restrictions
If you're scraping Japanese websites from outside Japan and collecting personal data, APPI requires analysis before transferring that data abroad. The describes three common paths: the recipient is in a PPC-designated equivalent country, the recipient has established equivalent protective measures, or an Article 27(1) exception applies.
If a US, EU, or Singaporean company scrapes personal data from Japanese sites and stores it outside Japan, APPI foreign-transfer analysis is needed. This catches a lot of international teams off guard.
The Opt-Out Third-Party Provision (Article 27)
The forum question I see most often: "What happens if I share or sell scraped data from Japanese sites?"
APPI generally requires prior consent to provide personal data to third parties. There's a formal opt-out mechanism — but it requires filing with the , notifying individuals, and giving them a way to stop third-party provision. The 2022 amendments narrowed this further: opt-out provision cannot be used for personal data acquired by wrongful means or received from another business through opt-out provision.
The shows 405 total opt-out filings accepted since October 2021, including 93 in FY2024. The system exists but is formal, not casual.
When Scraping Does Not Trigger APPI
APPI does not apply to data that cannot identify a living individual. Lower-APPI-risk fields include:
- Product prices, SKUs, stock levels, and shipping fees
- Store opening hours and generic company contact info (info@company.jp)
- Property listing price, square footage, building age, and station distance — when not linked to named owners or agents
- Aggregated market statistics where individual correspondence is eliminated
A practical design choice worth noting: 's AI Suggest Fields feature lets users define exactly which data columns to extract. You can deliberately exclude personal data fields and focus only on the business facts you need — reducing APPI exposure by design rather than by accident.
Unfair Competition Prevention Act (UCPA): Scraping Competitor Data

The enters the picture when scraping moves from public facts into confidential business information or gated datasets.
UCPA defines a trade secret as information that is (1) managed as secret, (2) useful for business, and (3) not publicly known. these as the three requirements for trade-secret protection.
Public website facts — product prices, store locations, job postings, product catalogs — are generally not trade secrets because they are not secret and are publicly known. Scraping them typically does not violate UCPA.
When UCPA Could Apply to Scraping
| Scenario | UCPA Risk | Why |
|---|---|---|
| Scraping a competitor's public product catalog for price monitoring | Usually low | Public catalog facts are generally not secret |
| Scraping internal pricing data by exploiting an API vulnerability | High | Nonpublic useful business information acquired through wrongful means |
| Scraping a paid partner-only database or licensed API outside scope | High | The 2018 UCPA amendments protect "shared data with limited access" |
| Using scraped data to create a competing product that free-rides on a costly database | Gray area | Courts may evaluate access restrictions, investment, and substitution |
The 2018 UCPA amendment added protection for "shared data with limited access" — technical or business information accumulated to a significant extent, managed electronically, and provided regularly to specific persons. But UCPA excludes data that is substantially the same as information made publicly available without compensation. So a free public product listing is different from a member-only commercial dataset.
Server Overload and Japan's Penal Code: Don't Crash the Website
The data itself might be perfectly legal to collect. But how you scrape can create criminal risk. Japan's includes business-obstruction provisions that kick in when automated access disrupts a website or business system.
| Penal Code Article | Conduct | Penalty |
|---|---|---|
| Article 233 | Obstruction of business by fraudulent means | Up to 3 years or ¥500,000 |
| Article 234 | Forcible obstruction of business | Same as Article 233 |
| Article 234-2 | Obstruction by damaging/interfering with a computer | Up to 5 years or ¥1,000,000 |
Every Japanese scraping discussion eventually lands on the Okazaki City Central Library incident (~2010). A software engineer from the library website, generating roughly 33,000 automated accesses over two weeks. The library's server became difficult to use, and police arrested the user on suspicion of business obstruction. The case ended without a merits judgment, but it remains a powerful reminder that server impact matters — even when the data itself is public.
Some context on why website operators escalate: automated bots were 51% of web traffic in 2024, with bad bots at 37%. bots were 42% of overall web traffic, with ecommerce especially affected.
How to Avoid Server Overload Issues
- Respect robots.txt (even though it's not a statute, it's evidence of the operator's intent)
- Add delays between requests and limit concurrency
- Avoid peak hours for the target site
- Stop or reduce traffic when you see errors, blocks, or rate-limit responses
- Cache previously fetched pages instead of repeatedly hitting the same URLs
Thunderbit's cloud scraping feature distributes requests across multiple servers, which naturally spreads load and reduces the risk of overwhelming any single target server. It's not a legal shield, but it's a practical design choice that aligns with responsible scraping.
Terms of Service Violations: Contract Risk, Not Criminal Risk
Plenty of Japanese websites include Terms of Service that prohibit scraping or automated data collection. Under Japanese law, violating ToS is a contract issue — not a criminal offense.
explain that website terms are binding when properly incorporated into the transaction contract. Click-wrap agreements (where you must click "Agree") are the strongest. Terms buried in hard-to-notice footer links are weaker.
| ToS Design | Enforceability Signal |
|---|---|
| Clear click-wrap with required "Agree" button | Strongest |
| Terms linked near transaction but no agree click | More uncertain |
| Terms hidden in footer or difficult location | Weaker |
| No contractual relationship with the operator | Contract claim may be weak |
No reliable authority was found showing that a ToS breach alone, without more, is elevated into a Japanese criminal charge. The practical position: ToS breach can create civil contract risk (damages, injunctions), but criminal exposure usually requires an independent element — access-control evasion under UCAL, business obstruction under the Penal Code, or copyright infringement.
My advice: read the ToS before scraping any Japanese website. If it explicitly prohibits scraping, look for alternatives — an API, data partnership, or another source for the same information.
Japan vs. US vs. EU: How Web Scraping Laws Compare
If you're coming from a US or EU legal background, this table will help you calibrate. Japan's framework is more permissive in some areas and more restrictive in others.
| Legal Dimension | Japan | United States | EU |
|---|---|---|---|
| Core scraping statute | No single statute; patchwork of Copyright Act, APPI, UCPA, UCAL, Penal Code | CFAA, state laws | GDPR, Database Directive, DSM Directive |
| Copyright exception for data analysis | Article 30-4 (broad) | Fair use (case-by-case) | TDM exception (Articles 3-4, DSM Directive) — with opt-out for commercial TDM |
| Personal data scraping | APPI — opt-out third-party provision (Art. 27) | Varies by state (CCPA etc.) | GDPR — strict consent/legitimate interest |
| Bypassing access controls | UCAL — criminal offense | CFAA — criminal + civil | Varies by member state |
| ToS breach = illegal? | Contract law only; no criminal liability found | CFAA post-Van Buren: likely no | Varies; GDPR may still apply |
| Server overload risk | Penal Code Art. 233, 234-2 (business obstruction) | CFAA + tortious interference | Varies |
Key Takeaways from the Comparison
Japan's Article 30-4 is broader than US fair use or EU TDM exceptions — making Japan one of the most permissive countries for analytical scraping from a copyright perspective. UCAL is narrower than the CFAA because it focuses purely on authentication bypass. APPI's cross-border transfer rules are stricter than fragmented US privacy frameworks but less prescriptive than GDPR in some operational details.
For international teams: you may have more freedom to scrape public Japanese data for analysis than you realize. Personal data handling is where the complexity lives — especially cross-border transfers and third-party sharing.
Your 10-Step Compliance Checklist for Scraping Japanese Websites
Before you start scraping any Japanese website, run through these ten yes/no questions. Each maps to one of the five statutes above.
- Is the data publicly accessible? (No login, no paywall, no access-control evasion) → If yes, UCAL risk is low.
- Does the website's ToS prohibit scraping? → If yes, assess contract risk; consider alternative data sources.
- Are you collecting personal information as defined by APPI? (Names, emails, phone numbers, IDs) → If yes, ensure APPI compliance.
- Will you transfer scraped personal data outside Japan? → If yes, comply with APPI Article 28's cross-border transfer rules.
- Do you plan to share or sell scraped data to third parties? → If yes, follow APPI Article 27 opt-out procedures or obtain consent.
- Is the data protected by copyright? → If scraping for information analysis (not republishing creative content), Article 30-4 likely applies.
- Will your scraping activity substitute for the original work? → If yes, Article 30-4 protection likely does not apply.
- Are you bypassing any authentication, CAPTCHA, or access controls? → If yes, high UCAL risk — do not proceed without legal advice.
- Will your scraping volume risk overloading the server? → If yes, throttle requests, add delays, use distributed scraping.
- Is the target data managed as a trade secret by the company? → If non-public proprietary data, UCPA may apply.
If every answer points to public, factual, non-personal, rate-limited, non-republishing analysis — you're in good shape. Any red flag should trigger legal review before you start.

How Thunderbit Helps You Scrape Japanese Websites Compliantly
I want to be upfront: Thunderbit is a tool, not legal advice. But it's designed in ways that align with the compliance principles I've outlined.
- AI Suggest Fields: Thunderbit's AI reads the page and suggests exactly which data columns to extract. This helps you deliberately define only the non-personal data fields you need — reducing unnecessary personal data collection by design rather than by accident.
- Cloud Scraping: Distributes requests across multiple servers, naturally spreading load and reducing the risk of overwhelming a single Japanese server. (Think of it as built-in rate-limit friendliness.)
- Free Email and Phone Extractors: When you do need to collect contact information from Japanese websites, and provide one-click extraction. But pair this with the APPI guidance above — collecting personal data requires understanding your compliance obligations.
- Export to Excel, Google Sheets, Airtable, or Notion: Scraped data can be structured and exported for analysis immediately, supporting the "information analysis" purpose that Article 30-4 protects.
- No Maintenance Required: Thunderbit's AI reads the site fresh each time, adapting to layout changes. This means no broken scrapers repeatedly hammering a server with failed requests — a practical way to avoid the kind of server-load issues that triggered the Okazaki Library incident.
For a walkthrough of how to use Thunderbit in practice, check out our or the . You can try it free via the .
Practical Use Case Examples
| Use Case | Recommended Fields to Extract | Legal Rationale |
|---|---|---|
| Japanese ecommerce price monitoring | Product name, listed price, availability, seller, SKU, URL, timestamp | Factual business data; Article 30-4 information analysis; avoid copying product images or reviews for republication |
| Japanese real estate market analysis | Asking price, location area, floor area, building age, property type, nearest station, URL, timestamp | Supports aggregate market analysis; exclude agent names, phone numbers, and owner names unless APPI compliance is in place |
| B2B operations monitoring | Company name, branch address, generic company email, opening hours, service category | Lower APPI risk if no living individual is identified; review ToS and rate limits |
Key Takeaways on Web Scraping Legality in Japan
Web scraping is legal in Japan in most cases — especially when you're scraping publicly available, non-personal, factual data for analysis purposes. But "most cases" is not "all cases."
- Copyright Act (Article 30-4): Analytical scraping of public data is allowed; republishing creative content is not.
- UCAL: Do not bypass authentication or access controls.
- APPI: Handle personal data carefully, especially for cross-border transfers and third-party sharing.
- UCPA: Public data is generally not a trade secret; gated or paid data is higher risk.
- Penal Code: Do not crash the server.
Use the 10-step checklist before starting any scraping project. When in doubt, consult legal counsel — especially for projects involving personal data or access-restricted content.
If you're ready to start scraping Japanese websites compliantly, is built to make the process straightforward for non-technical users. Define your fields, extract the data, export to your preferred tool, and focus on the analysis.
FAQs
Is it legal to scrape public websites in Japan?
Generally yes. Scraping publicly available data for information analysis is usually legal under Japan's Copyright Act Article 30-4, provided you don't overload the server, bypass access controls, collect personal data without APPI compliance, or republish copyrighted expression. The distinguishing factor is purpose: analysis, not republication.
Can I scrape personal data (emails, phone numbers) from Japanese websites?
You can, but APPI applies. You need a lawful purpose, must disclose how you'll use the data, and face restrictions on cross-border transfers and third-party sharing. The 2022 amendments tightened these rules significantly — especially for data leaving Japan or being shared with other companies.
What happens if a Japanese website's Terms of Service prohibit scraping?
Violating ToS is a contract issue (potential civil liability for damages or injunctions), not a criminal offense. However, it can support broader legal claims and escalate enforcement. Always read the ToS before scraping, and consider whether the data is available through alternative means.
Is scraping behind a login wall legal in Japan?
Using your own credentials is a gray area — UCAL may not directly apply, but ToS violations and contract risk remain. Bypassing authentication, using another person's credentials, or evading access controls is likely a criminal violation of the Unauthorized Computer Access Law, with penalties up to 3 years imprisonment or ¥1,000,000.
Can I sell data I scraped from Japanese websites?
If the data contains personal information, you must follow APPI Article 27's opt-out third-party provision system — which requires formal PPC filing, individual notification, and opt-out mechanisms. Selling personal data without proper procedures is a compliance violation. For non-personal factual aggregates, APPI risk is lower, but copyright, UCPA, ToS, and still apply.
Learn More
