 Reddit to update web standard to block automated website scraping

Social media platform Reddit said on Tuesday it will update a web standard used by the platform to block automated data scraping from its website, following reports that AI startups were bypassing the rule to gather content for their systems.

The move comes at a time when artificial intelligence firms have been accused of plagiarizing content from publishers to create AI-generated summaries without giving credit or asking for permission.

Reddit said that it would update the Robots Exclusion Protocol, or "robots.txt," a widely accepted standard meant to determine which parts of a site are allowed to be crawled.
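To illustrate how the protocol works, the sketch below uses Python's standard urllib.robotparser to check whether a crawler would be permitted to fetch a given URL under a site's robots.txt rules. The user agent "ExampleBot" and the URL are hypothetical placeholders, not anything Reddit has published.

```python
from urllib.robotparser import RobotFileParser

# Load a site's robots.txt (Robots Exclusion Protocol rules).
rp = RobotFileParser()
rp.set_url("https://www.reddit.com/robots.txt")
rp.read()

# can_fetch() applies the Allow/Disallow directives for the given
# user agent to decide whether the URL may be crawled.
allowed = rp.can_fetch("ExampleBot", "https://www.reddit.com/r/news/")
print("ExampleBot may crawl:", allowed)
```

Compliance with robots.txt is voluntary on the crawler's side, which is why Reddit is pairing the update with server-side measures such as rate-limiting and bot blocking.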

The company also said it will maintain rate-limiting, a technique used to control the number of requests from one particular entity, and will block unknown bots and crawlers from data scraping - collecting and saving raw information - on its website. 
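Rate-limiting is commonly implemented with a sliding window or token bucket per client. The following is a minimal sliding-window sketch for illustration only, not Reddit's actual mechanism; the function name, limits, and client identifier are assumptions.

```python
import time
from collections import defaultdict, deque

# Illustrative limits: at most MAX_REQUESTS per client per WINDOW_SECONDS.
MAX_REQUESTS = 100
WINDOW_SECONDS = 60

# client id -> timestamps of its recent requests
_request_log = defaultdict(deque)

def allow_request(client_id: str) -> bool:
    """Return True if this client is under its rate limit, else False."""
    now = time.monotonic()
    timestamps = _request_log[client_id]
    # Discard timestamps that have fallen outside the sliding window.
    while timestamps and now - timestamps[0] > WINDOW_SECONDS:
        timestamps.popleft()
    if len(timestamps) >= MAX_REQUESTS:
        return False  # over the limit: block or throttle this client
    timestamps.append(now)
    return True
```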
Last week, a letter from the content licensing startup TollBit highlighted that several AI firms were circumventing the web standard to scrape publisher sites.