AI Discovery · Audit
Audited
Atomic Habits
by James Clear
Partial discovery
AI answers vary run to run, so treat this as an indicative band, not a precise mark — the verdict and the fixes below are what matter.
Some engines surface your book reliably. Others miss it. The chapters below show which, and where to focus.
Dual score · SKU vs Topic
Your book (SKU)
74 / 100
Your subject
74 / 100
AI engines know your book and its subject well — focus shifts to defending position.
SKU score = book mentioned by name (strict). Topic score = response acknowledged your subject (looser — counts author mentions and topic-word overlap). Headline score uses the SKU number.
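The difference between the strict and loose checks can be sketched as below. This is an illustration only, not the audit's actual scoring code: the function names `sku_hit` and `topic_hit`, the `min_overlap` threshold, and the sample topic words are all hypothetical.

```python
import re

def sku_hit(response: str, title: str) -> bool:
    """Strict check: the book is mentioned by name (case-insensitive)."""
    return title.lower() in response.lower()

def topic_hit(response: str, title: str, author: str,
              topic_words: set[str], min_overlap: int = 2) -> bool:
    """Looser check: title, author mention, or enough topic-word overlap.

    min_overlap is an assumed threshold, chosen here for illustration.
    """
    text = response.lower()
    if sku_hit(response, title) or author.lower() in text:
        return True
    words = set(re.findall(r"[a-z']+", text))
    return len(words & topic_words) >= min_overlap

# Hypothetical topic vocabulary for this book's subject.
TOPIC_WORDS = {"habits", "routines", "behaviour"}

reply = "You should read Deep Work."
print(sku_hit(reply, "Atomic Habits"))
print(topic_hit(reply, "Atomic Habits", "James Clear", TOPIC_WORDS))
```

Under these assumptions a reply can count toward the Topic score without counting toward the SKU score, but never the reverse.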
Profile
The score breaks into four things AI can do (or can't) with your book.
Known-title recall
20 /25
AI recalls the book when named.
Recommendation fit
14 /25
Surfaces when readers describe the need.
Market authority
19 /25
In the genre-listicle / canon pool.
List position
21 /25
Where it lands in ranked lists.
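The four axis scores above add up to the 74 / 100 headline. The short check below assumes the headline is a plain sum of the axes, which the numbers support but the report does not state outright; the variable names are illustrative.

```python
# Axis scores as printed in the Profile panel (each out of 25).
profile = {
    "Known-title recall": 20,
    "Recommendation fit": 14,
    "Market authority": 19,
    "List position": 21,
}
MAX_PER_AXIS = 25

# Assumption: the headline score is the unweighted sum of the four axes.
headline = sum(profile.values())
weakest = min(profile, key=profile.get)

print(f"Headline: {headline} / {MAX_PER_AXIS * len(profile)}")
print(f"Weakest axis: {weakest} ({profile[weakest]} / {MAX_PER_AXIS})")
```

Recommendation fit is the lowest axis, so it is the one with the most room to move.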
What this looks like
“Best books about self-help productivity published in the last 12 months — give me 10.”
ChatGPT replies
“You should read Deep Work.”
— not Atomic Habits.
What this would cost you otherwise
Surfio does it across 5 engines incl. Amazon Rufus in ~4 minutes — £29.99.
What AI says
A sample of the questions, and what AI answered.
“Best books about self-help productivity published in the last 12 months — give me 10.”
Amazon Rufus mentioned "Atomic Habits".
“Best books about self-help productivity published in the last 12 months — give me 10.”
Gemini answered without mentioning your book.
“Best books about self-help productivity published in the last 12 months — give me 10.”
ChatGPT cited kirkusreviews.com instead.
Where to get featured
AI recommends books it sees cited across these sources. This is your outreach shortlist, ranked by how likely each is to say yes — get featured here and you get recommended.
Book blogs, niche listicles, indie reviews — most likely to land
Mid-tier press, podcasts, trade — possible with a real angle
Broadsheets, network news — only with a major news hook
Was your book used to train AI?
We cross-referenced your book against datasets used to train AI models. Confirmed matches (Books3, Common Crawl) are definitive; others are a probability assessment. Informational only — not legal advice; consult a qualified lawyer before acting.
LibGen
Used by: Meta (LLaMA 1/2/3), Anthropic (Claude 1/2), alleged: OpenAI
Book is too new or lacks an Amazon listing, and the Anna's Archive lookup was unreachable.
active class action — Kadrey v. Meta
Active class action against Meta for downloading LibGen + Z-Library. If your book is in LibGen you are AUTO-ENROLLED. Action: send a copyright assertion letter (template below) and join the Authors Guild Meta list.
Read & join →
Z-Library
Used by: Meta (LLaMA 1/2/3 — confirmed in court filings)
Z-Library is ISBN-keyed and the Anna's Archive lookup was unreachable, so coverage can't be confirmed.
active class action — Kadrey v. Meta
Z-Library use is part of the same Meta lawsuit. Zuckerberg personally signed off on the Z-Library downloads per unsealed internal emails.
Read & join →
Books3
Used by: EleutherAI (Pile), Meta (LLaMA-1), Bloomberg GPT
CONFIRMED in Books3. Filename "Atomic Habits_ Tiny Changes, Remarkable Results 2018 - James Clear" matches your title + author in the Books3 dataset (197,000 files).
PiLiMi
Used by: Anthropic (confirmed in court filings)
PiLiMi is a LibGen mirror. If your book is on LibGen it's probably on PiLiMi too.
settled class action — Bartz v. Anthropic
Anthropic SETTLED for $1.5 billion in Sept 2025 — ~$3,000 per qualifying title. Claims deadline was 30 March 2026 (past). New lawsuits expected.
Read & join →
Anthropic-class
Used by: Anthropic (Claude 1, 2)
The Anthropic settlement class list contains ~500k titles. We can't auto-check this yet; visit the Authors Guild search tool to verify your eligibility for the $1.5BN distribution. Claims deadline was 30 March 2026 — check now in case appeals reopen the window.
settled class action — Bartz v. Anthropic (settled $1.5BN)
Search the published class list directly to check eligibility. Even if claims deadline passed, document your inclusion for future related lawsuits.
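The settlement figures quoted above are consistent with one another, as a quick calculation shows. This is arithmetic on the report's own rounded numbers ($1.5 billion, roughly $3,000 per title, roughly 500k titles), not an estimate of what any individual claimant would receive.

```python
# Rounded figures as quoted in this report.
SETTLEMENT_USD = 1_500_000_000   # $1.5 billion settlement
PER_TITLE_USD = 3_000            # approximate payout per qualifying title

# Number of titles implied by dividing the fund evenly at that rate.
implied_titles = SETTLEMENT_USD // PER_TITLE_USD
print(f"Implied qualifying titles: {implied_titles:,}")
```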
Read & join →
BookCorpus
Used by: Google (BERT), OpenAI (GPT-1, GPT-2), Meta (RoBERTa)
Your book post-dates BookCorpus's 2015 scrape window. Lower likelihood — but Smashwords-published books from this era are well-documented in BookCorpus.
HathiTrust
Used by: Anna's Archive (scraped by NVIDIA, others), Indirect AI training via Anna's Archive
HathiTrust scans tend to be older library titles — your book may not be covered.
The Pile (full)
Used by: EleutherAI, Meta (LLaMA-1), Bloomberg GPT, many open-source LLMs
The Pile was assembled 2020-2021, so newer books are less likely to appear. It also includes ArXiv, PubMed, and OpenSubtitles material, so check by content type.
Common Crawl
Used by: OpenAI (all GPT models), Google (T5, Gemini, Bard), Anthropic (Claude), Meta (LLaMA), BigScience (BLOOM)
Common Crawl crawls a large sample of the public web roughly every 1-2 months. Your Amazon listing (including "Look Inside" excerpts), Goodreads page, author blog posts, free chapter samples, and Wikipedia entries about your book are almost certainly in Common Crawl. OpenAI, Google, and Anthropic all use Common Crawl as a primary training source.
Hugging Face datasets
Used by: Any model fine-tuned on HF community datasets, Researchers, open-source LLMs
Books not yet listed on Amazon are less likely to be in third-party scraped collections.
AO3 fanfic scrape
Used by: Unknown — community-uploaded; downstream model training unverified but likely
In April 2025, user "nyuuzyou" scraped 12.6 million fanfics from Archive of Our Own and uploaded them to Hugging Face. If you write fanfiction, post on AO3, or have crossposted work there, you're affected. OTW is actively pursuing takedowns.
Wattpad / Fanfiction.net
Used by: Naver HyperCLOVA-X (Wattpad direct), Various research scrapes
Wattpad (90M users, owned by Naver) and Fanfiction.net are widely scraped by both researchers and AI companies. Wattpad cooperates with Naver on its HyperCLOVA-X model. If you've published original or fan work there, it may be in training corpora.
Internet Archive scans
Used by: Internet Archive (now ceased), Unknown downstream AI training
IA's CDL program ran until late 2024 but newer or non-ISBN books are less likely to have been scanned.
Pre-filled letters — copy & send
To: Meta Platforms, Inc. — Copyright Agent (copyright@meta.com)
2 June 2026

Re: Unauthorized use of copyrighted work in AI training datasets: "Atomic Habits" by James Clear

To whom it may concern,

I am the rights holder of the work titled "Atomic Habits" (the "Work"). I have reasonable grounds to believe that the Work was downloaded by Meta from LibGen, Z-Library, and/or PiLiMi, repositories that Meta has admitted (per unsealed court filings in Kadrey v. Meta) using to train its LLaMA family of large language models without permission.

I hereby:
1. Assert my exclusive copyright in the Work.
2. Demand that Meta cease using the Work, or any derivative thereof, in training, fine-tuning, evaluation, or operation of any AI system.
3. Demand confirmation, in writing within 30 days, of whether the Work was downloaded by or on behalf of Meta and whether it was used in training any Meta-owned or operated model.
4. Reserve all rights including but not limited to damages, attorneys' fees, and injunctive relief.

I do not grant any license, permission, or authorisation to use the Work for any purpose related to AI training, fine-tuning, or evaluation.

Please direct all responses to the address above.

Sincerely,
James Clear
Rights holder, "Atomic Habits"

NOT LEGAL ADVICE. This template is provided as a starting point for authors who suspect their work was used without permission. For legal counsel specific to your circumstances, consult an intellectual-property attorney. The Authors Guild offers a Legal Services Department to members.
To: Z-Library administrators (use the takedown form at z-lib.io / dmca@z-lib.io)
2 June 2026

DMCA Takedown Notice

Pursuant to 17 U.S.C. § 512(c), I formally notify Z-Library and its operators of copyright infringement:
- Copyrighted work: "Atomic Habits" by James Clear
- Infringing material location: any Z-Library URL hosting the above work
- Statement of good faith belief: I have a good faith belief that the use of the material in the manner complained of is not authorised by the copyright owner, its agent, or the law.
- Accuracy under penalty of perjury: The information in this notice is accurate, and under penalty of perjury, I am the owner, or authorised to act on behalf of the owner, of an exclusive right that is allegedly infringed.

Please remove the infringing material immediately and confirm removal in writing.

James Clear

NOT LEGAL ADVICE. Standard DMCA template. Z-Library operates outside US jurisdiction; compliance is voluntary but documenting the notice creates an evidentiary record useful in related class actions.
To: Authors Guild member services (info@authorsguild.org)
Subject: Joining the Meta / LibGen class action

I am writing to confirm my interest in being included as a class member in Kadrey v. Meta and any related class actions arising from Meta's use of LibGen, Z-Library, and PiLiMi to train large language models. I am a rights holder whose work appears to have been downloaded into Meta's training corpus.

Please advise on:
1. Membership / qualifying steps for the active class
2. Required documentation to substantiate my claim
3. Timeline for any settlement distribution

I would also like information about Authors Guild membership and your Legal Services Department.

Thank you,
[YOUR NAME]
[EMAIL]
[POSTAL ADDRESS]

INFO ONLY, NOT LEGAL ADVICE. The Authors Guild auto-includes US-resident authors whose work was demonstrably scraped; joining the Guild also gives you access to a copyright legal services line at member rates.
Related: audiobook voice cloning
If your book has an audiobook on Audible, Spotify, or YouTube, the narrator’s voice may be cloned by AI voice models. This is a separate rights concern from text scraping — the words are yours, the voice belongs to your narrator. Action: document the narrator’s original consent, register the audiobook with Voice-Cloning Watch (in development), and monitor for unauthorised AI-narrated versions of your book.
HarperCollins ↔ Microsoft (Nov 2024): $5,000 per book licensed for AI training. Author opt-in required; 50/50 author/publisher split. If you’re with HarperCollins, check with your editor; you may have money owed.
Taylor & Francis, Wiley: academic licensing deals. Check via your royalty statement.
Big Four still holding out (Hachette, Penguin Random House, Simon & Schuster, Macmillan): no public AI licensing deals as of May 2026. Books from these publishers in AI training datasets were NOT licensed; you have stronger claims.
Action plan
Sorted by impact. Tick them off as you go.
Paste the description, 7 backend keywords, 5 bullets, and bio into KDP.
30–45 min · 24–48h to re-index
Open the outreach-tiers panel above. Email the green-tagged book blogs / niche listicles that already cite your competitors — they're who AI is reading.
1–3 hours per outlet · Weeks; depends on editor
Listing rewrite · Amazon
Description, bullets, keywords, A+ blocks, bio — generated against your source material. This chapter is Amazon-specific (KDP / Author Central / A+ Content). If your book isn't on Amazon, skip to the next chapter.
Backend keywords
Paste into KDP → Keywords (7 slots)
Bullets
Paste into KDP → Bullet points
Description
Paste into KDP → Description
Atomic Habits is for anyone who feels stuck in cycles of failed resolutions and wants a practical, repeatable system, not motivational rhetoric, for making good behaviours stick and bad ones fade. It is especially useful for readers in their twenties through forties who are managing competing demands on their time and need change strategies that work without relying on bursts of willpower.

The book is built around a four-stage model of habit formation (cue, craving, response, and reward) and shows readers exactly where to intervene at each stage. Rather than urging you to want things more, it teaches you to make good habits obvious, attractive, easy, and satisfying, while making unwanted habits invisible, unattractive, hard, and unsatisfying. Each principle is paired with concrete techniques you can apply the same day you read the chapter.

A central argument running through the book is that identity shapes behaviour more reliably than outcome-focused goals. Instead of saying "I want to run a marathon", the approach asks you to become the kind of person who runs, and to cast small votes for that identity every day. This reframe shifts the psychological foundation of change from external pressure to internal consistency.

James Clear draws on findings from psychology, neuroscience, and biology to ground each recommendation, and supplements the research with case studies drawn from elite sport, business, and everyday life. The writing is direct and chapter lengths are short, making the material accessible even in fragmented reading sessions.

Readers who already engage deeply with academic behaviour-change literature, or who are specifically looking for extended treatment of workplace deep-focus strategies or the neuroscience of organisational habit loops, may find this book covers familiar ground more briefly than they would prefer.

Compared with The Power of Habit, which focuses substantially on the science of habit loops and corporate case studies, Atomic Habits prioritises individual-level, immediately actionable instruction. Compared with Deep Work, which addresses concentrated professional performance, Atomic Habits operates at the level of daily behavioural architecture across all life domains.
A+ Content
Requires Brand Registry / Vendor Central
Author bio (short)
Paste into Author Central → Bio
Author bio (long)
Use on your own About page
Competing titles
These are who's on the lists you want to be on. Your book is highlighted where it ranks (or at the bottom if it didn't rank).
Run integrity
Every audit on this product runs the same set of components. This panel records which actually ran for your report so you can confirm nothing was silently skipped.
Now
Then re-audit in 4–6 weeks.
Every re-audit auto-shows the diff vs your last. Score moves, axes shift, listicle outreach lands — you’ll see it.
publishing.co.uk · AI Discovery Audit