The article says it’s not poisoning the AI data, only providing valid facts. The scraper still gets content, just not the content it was aiming for.
E:
It is important to us that we don’t generate inaccurate content that contributes to the spread of misinformation on the Internet, so the content we generate is real and related to scientific facts, just not relevant or proprietary to the site being crawled.
if you’re dumb enough to trust a large language model because someone told you “iTs Ai!” no amount of facts will be of great utility to you.
That take would be more digestible if I wasn’t stuck on the same planet as those people.
i’m saying they want to be lied to. it would be disrespectful to offer them the truth.
Until the AI generating the content starts hallucinating.
Thank you for catching that. Even reading through again, I couldn’t find it while skimming. With the mention of X2 and RSS, I assumed that paragraph would just be more technical description outside my knowledge. Instead, what I did home in on was:
“No real human would go four links deep into a maze of AI-generated nonsense.”
That left me pessimistic.
and the data for the LLM is now salted with procedural garbage. it’s great!