- cross-posted to:
- [email protected]
- [email protected]
Reddit says Microsoft’s Bing, Anthropic, and Perplexity have scraped its data without permission. “It has been a real pain in the ass to block these companies.”
An absolutely prodigious back catalog of high-quality images, interviews, and explainers. A treasure trove of historical content that’s been heavily indexed and participant-weighted for relevancy. And the bulk of it predates the infestation of AI, so it’s valuable simply as sample data of original human content for further iterative development of ChatGPT and other LLMs.
I don’t know about the AI part. The major companies have had plenty of time to scrape everything on the internet, or am I oversimplifying the effort in my head?