Thank you for sharing this information. I've started to see Instagram posts showing up in Google search results. I appreciate your tip about being more keyword conscious when posting on Instagram. I've also noticed visitors are beginning to come to my website via ChatGPT. Your article has got me thinking that if a writer's website/blog can be seen as an authority on a topic by AI apps in addition to Google, it might also be a good way to increase discoverability in this emerging landscape. For example, I maintain a blog on my website while I'm working on a novel. After I email my newsletter out each month, I add a portion of it to the blog on my website. (I'm writing a folklore-inspired novel, and my newsletter/blog explores folklore topics.) Since I can see that visitors to my blog are coming from ChatGPT, it follows that ChatGPT has trained on my blog. At first I was bothered by this, as I can't seem to block it from reading my website like you can do on Substack. But maybe I should just lean into it, and accept it as a new search engine.
Gosh, I am a bit scared thinking of all of the AI inserting itself into our lives. Makes me want to stick to my notebook and pen and hide all my good stuff away in my book desk, which looks like an old-fashioned pulpit.
Okay, I've been wondering about this! A reader emailed me a few months ago and told me she discovered me through ChatGPT. That was a real new one for me. Apparently she was using it to get resources for adult ADHD and my book was the first one it recommended! I had no idea why or how, though, or if this was something authors had any influence on. I believe this book was *not* in the pirated dataset these LLMs were trained on, so it's not that -- it's something more organic. Like word of mouth, but AI. Really interesting.
Think of it as Google. That is what it is turning into.
More like Wikipedia if you’re using ChatGPT.
Here’s a rundown of the top three user-facing AI apps—ChatGPT, Claude, and Gemini—with a focus on where they likely source their information (in order of preference or reliability, based on known practices):
⸻
🧠 1. ChatGPT (OpenAI)
Models: GPT-4, GPT-4o
Interface: chat.openai.com, API, desktop & mobile apps
Primary Information Sources:
• ✅ Licensed datasets (e.g. books, Wikipedia, technical documentation)
• ✅ Public web data (up to training cutoff, e.g. Common Crawl, public forums)
• ✅ Third-party partnerships (e.g. integration with Microsoft Bing for real-time search)
• ⚠️ No access to proprietary subscription content unless explicitly licensed
Notes:
• GPT-4o has multimodal capabilities (text, image, voice, code, video input).
• ChatGPT integrates with tools (e.g. Python, web browsing) in pro versions.
⸻
🤖 2. Claude (Anthropic)
Models: Claude 2, Claude 3 family
Interface: claude.ai, Slack plugin, API
Primary Information Sources:
• ✅ Public domain content (e.g. Wikipedia, Project Gutenberg)
• ✅ Publicly available web data (carefully filtered for safety and fairness)
• ✅ Anthropic’s own curated datasets (aligned with constitutional AI values)
• ❌ No direct web access or external tool integration
Notes:
• Claude is known for its strong alignment and polite tone.
• Claude 3 Opus matches or outperforms GPT-4 in some reasoning and language tasks.
⸻
🌐 3. Gemini (Google DeepMind)
Models: Gemini 1.5 family
Interface: gemini.google.com, Google Workspace (Docs, Gmail, etc.), API
Primary Information Sources:
• ✅ Google Search index & Knowledge Graph
• ✅ YouTube transcripts, Google Books, Google Scholar (selective data use)
• ✅ Public web data (curated and filtered)
• ✅ Internal Google content (selectively, depending on use case and privacy settings)
Notes:
• Deep integration with Google products gives Gemini real-time info and app-level knowledge.
• Gemini can retrieve and reason over live data via search better than Claude or base ChatGPT.
⸻
⚖️ Summary Comparison – Data Sources Preference Order
Rank | ChatGPT | Claude | Gemini
1️⃣ | Licensed datasets | Public domain | Google Search & internal products
2️⃣ | Public web data | Curated web data | Public web data
3️⃣ | Bing search (pro version) | In-house curation | YouTube, Scholar, etc.
Hmm. I hadn’t thought I was seeing much response from my Instagram postings. I’ll have to reconsider that.
Just signed up for the class. Looking forward to learning from you!
I appreciate the info about Google search, but was this written partly by AI? I ask in part because I find many of the AI enthusiasts here on Substack (logically) have no qualms about using it to generate newsletter content. Also, I'm confused about some of the writing: what is "how to start a conversation about this"? Who are you talking to?
I prefer not to pay for AI-generated content, as I disagree with the ethics behind it, so I'd appreciate the disclosure either way. Thanks!
If you have read this newsletter for any amount of time, you'd know I do not use AI to write it. I am a bit insulted that you're asking me that.
What do you think "How to start a conversation" means? It's literally how you'd begin a conversation talking about all of this.
Please don't pay for this newsletter. Learning about AI and being an AI enthusiast are two very different things.
Thanks Kathleen, I appreciate the answer, though I'm puzzled that the question offended you. You've mentioned in the past that you subscribe to multiple AI services and use them regularly for work... I assume this newsletter is work? You also say you use Claude for writing. Given this, why would it surprise you that some people wonder if you use it to generate newsletter content, or would want clarification on that?
Thanks Kathleen, this is so important. I feel entirely daunted by the captioning, but always glad to know what new nonsense I need to learn. Appreciate your very basic explanation; we need it.
A terrific summary for those of us who prefer not to wade through the weeds, thank you! Wondering if you have a quick example of 'proactive' publicity vs. 'reactive' publicity?
Proactive publicity is when you start promotional activity months before your book is published. Reactive is when the pub date is close, and you start thinking about publicity. The latter occurs more often than you'd think.
I'm not on Instagram. I'm rethinking that now. Thank you for doing the research and filling us in. Very helpful.
You're welcome. What I would say is do whatever makes you comfortable.
Fascinating stuff! Thanks for sharing your thoughts, which are practical and rational as always.
Massive food for thought, thank you for sharing.