Fascinating approach to the AI search citation problem! I'm curious about the technical implementation - how do you ensure that AI search engines actually prioritize your structured company profiles over other sources? Are you using specific schema.org markup, or is there a more sophisticated method to influence their source selection?
Also, regarding your custom 32B MoE model - how do you handle the potential for conflicting information between user-submitted profiles and existing web sources? It seems like there could be interesting challenges around fact verification and maintaining source authority while still giving businesses control over their narrative.
Great questions!
On getting AI engines to prioritize CoThou profiles:
It's a combination of signals, not a single trick:
- Structured data: yes, schema.org markup (Organization, Person, and Article schemas), serialized as JSON-LD. AI parsers handle machine-readable structure well.
- Subdomain structure: company.cothou.com and john.cothou.com create clear attribution.
- Verification badges: I'm also working on badges (domain ownership, ORCID for researchers) to build trust.
- Semantic clarity: I enforce consistent entity resolution (company names, people, topics).
When an AI engine searches for "Acme Corp," it finds one authoritative, structured source instead of scattered mentions. It's quite complex, but it works. For example, try "Search for Aiobis" to see how a verified company appears.
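To make the structured-data signal concrete, here is a minimal sketch of schema.org Organization markup emitted as JSON-LD. It is an illustration only, not CoThou's actual output: the `build_org_jsonld` and `as_script_tag` helpers, the `acme.cothou.com` subdomain, and the `example` URLs are all assumptions.

```python
import json


def build_org_jsonld(name, profile_url, official_site, same_as=()):
    """Return a schema.org Organization description as a JSON-LD string."""
    doc = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": official_site,             # the company's own website
        "mainEntityOfPage": profile_url,  # the profile page describing it
        "sameAs": list(same_as),          # other pages about the same entity
    }
    return json.dumps(doc, indent=2)


def as_script_tag(jsonld):
    """Wrap JSON-LD in the script element that crawlers look for in HTML."""
    return '<script type="application/ld+json">\n' + jsonld + "\n</script>"


# Hypothetical values, for illustration only.
markup = build_org_jsonld(
    name="Acme Corp",
    profile_url="https://acme.cothou.com",
    official_site="https://acme.example",
    same_as=["https://directory.example/acme-corp"],
)
print(as_script_tag(markup))
```

The `sameAs` list is where consistent entity resolution would show up in the markup: it ties the one profile to other pages that describe the same organization.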
On the MoE model and conflicting information:
You've hit the core challenge. My approach:
CoThou doesn't replace fact-checking; it's a tool for presenting your version alongside existing sources. If someone asks ChatGPT about your company, ideally it will say: "According to their official CoThou profile (with a link), they claim X. Other sources say Y."
We're not trying to suppress conflicting info. We're giving businesses a canonical source so AI engines have something authoritative to cite in addition to Wikipedia, news, etc.
For researchers: academia already has this solved through peer review, citations, and ORCID. We're just making that structured data accessible to AI parsers.
The harder problem is bad actors: someone could create a profile with false claims. I'm working on it.
Right now, we rely on:
- requiring citations, and
- domain verification for businesses.
Long-term, we're exploring reputation scoring and community flagging.
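As a rough illustration of the citation requirement, the sketch below flags claims that carry no well-formed source link. It assumes a hypothetical claim format (a dict with `text` and `sources` keys) and checks only the form of each URL, not whether the source exists or supports the claim.

```python
from urllib.parse import urlparse


def has_valid_citation(claim):
    """True if the claim has at least one well-formed http(s) source URL."""
    for url in claim.get("sources", []):
        parts = urlparse(url)
        if parts.scheme in ("http", "https") and parts.netloc:
            return True
    return False


def uncited_claims(claims):
    """Return the text of every claim that lacks a usable citation."""
    return [c["text"] for c in claims if not has_valid_citation(c)]


# Hypothetical profile claims, for illustration only.
claims = [
    {"text": "Founded in 2019.", "sources": ["https://acme.example/about"]},
    {"text": "Largest vendor in Europe.", "sources": []},
    {"text": "Won an industry award.", "sources": ["see press"]},
]
print(uncited_claims(claims))
```

A real check would also need to fetch each link and compare it against the claim, which is the harder part.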
Does that answer it, or should I dig deeper into any part?
—Marty
I'm worried some bad actors are reverse engineering this as well.
You're absolutely right to be concerned; this is something I think about constantly.
The reality is: bad actors don't need to reverse-engineer anything. AI engines already prioritize structured, citable content. Anyone can spin up a website with schema.org markup and fake citations. The barrier is low.
What makes this hard to abuse at scale:
1. Domain verification – For businesses, we require proof of domain ownership. You can't claim to be Apple unless you control apple.com or an official subdomain, or you work at Apple and have an @apple.com business email address.
2. Citation requirements – Claims need links to primary sources. AI engines cross-reference. If your "citations" point to non-existent papers or contradict other sources, you lose authority fast.
3. Reputation signals – We're building verification badges (ORCID for researchers, business registries, etc.). Over time, verified profiles will rank higher.
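The domain verification step could be sketched roughly as follows. This is an assumption about how such a check might work, not CoThou's implementation: the `cothou-verify=` token format and all function names are hypothetical, and the DNS lookup itself is left out (the TXT records are passed in as plain strings).

```python
import secrets


def normalize_domain(domain):
    """Lowercase a domain and drop surrounding whitespace and a trailing dot."""
    return domain.strip().lower().rstrip(".")


def email_matches_domain(email, claimed_domain):
    """True if the email's domain is the claimed domain or a subdomain of it."""
    local, sep, email_domain = email.rpartition("@")
    if not sep or not local:
        return False
    email_domain = normalize_domain(email_domain)
    claimed = normalize_domain(claimed_domain)
    if not email_domain or not claimed:
        return False
    # The leading dot stops "notapple.com" from matching "apple.com".
    return email_domain == claimed or email_domain.endswith("." + claimed)


def new_challenge_token():
    """Create a random token the owner must publish in a DNS TXT record."""
    return "cothou-verify=" + secrets.token_hex(16)


def txt_records_prove_control(txt_records, expected_token):
    """True if any TXT record fetched for the domain equals the token."""
    return any(record.strip() == expected_token for record in txt_records)


print(email_matches_domain("jane@apple.com", "apple.com"))
print(email_matches_domain("jane@apple.com.evil.example", "apple.com"))
```

Matching on the full domain suffix rather than a substring is the important design choice here, since lookalike domains are the obvious way to try to pass this check.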
But you've identified the fundamental tension: any system that makes it easier for legitimate businesses to be cited also makes it easier for bad actors. This is the same problem Google faced in the '90s, Wikipedia deals with daily, and AI engines are grappling with now.
Long-term solutions I'm exploring:
- Community flagging + reputation scoring
- Integration with trust registries (DUNS, ORCID, Crossref DOIs)
- Transparent edit histories (like Wikipedia)
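One way community flagging and reputation scoring might combine is sketched below. Nothing here is decided: the badge names, the point weights, and the per-flag penalty are made-up placeholders chosen only to show the shape of the calculation.

```python
# Hypothetical weights on a 0-100 scale; real values would need tuning.
BADGE_POINTS = {"domain": 40, "orcid": 30, "business_registry": 30}
FLAG_PENALTY = 10  # points removed per community flag upheld on review


def reputation_score(badges, upheld_flags):
    """Score a profile from its verification badges and upheld flags."""
    # set() ensures a badge listed twice is only counted once.
    base = sum(BADGE_POINTS.get(badge, 0) for badge in set(badges))
    score = base - FLAG_PENALTY * upheld_flags
    return max(0, min(100, score))  # clamp to the 0-100 range


print(reputation_score(["domain", "orcid"], upheld_flags=2))
```

Counting only flags that were upheld on review, rather than raw flag counts, is meant to keep flagging itself from becoming an easy way to attack a competitor's profile.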
The goal isn't to be manipulation-proof; nothing is. It's to make CoThou profiles more trustworthy than the alternatives (random blogs, SEO spam, outdated info).
What would you add? This is an evolving problem and I'd love HN's input. —Marty