 The unwritten rules of respecting robots.txt and not scooping up everything have already crumbled due to AI scraping.

https://www.businessinsider.com/meta-web-crawler-bots-robots-txt-ai-2024-8

The trend seems to be that if something can be accessed, it will be used. Encrypting and denying access might become the new reality, unfortunately. 
 I normally work on a different principle. Firstly, I make things I want to use myself, and preferably things I can recommend to my friends, many of whom are non-techie. So, for example, I can recommend bitcoin because it's fair, innovative, and as level a playing field as possible, imho. Could I recommend a service where the norm is to take their (possibly exclusive) content and post it elsewhere? No, that's not OK.

Nostr was built as a small corner of the web that aspired to be a bit better. So just because, say, OpenAI is doing bad things (thanks for pointing it out, btw), doesn't mean it should translate to nostr. A reasonable user expectation is that they publish to the places they want, and if others republish that, they need consent, or the user can be unhappy. Nostr itself says nothing about storage, replay attacks, or republishing; it just transmits notes, and other stuff, from one user to another.

The basic cypherpunk principle is not to harm the user, imho. Whether or not republishing does is something that can be debated. 
 That was the way the Internet worked for its first decades. 

It's seldom worth the effort to chase after your rights when somebody misuses your copyright.

This is the article I meant to share (and it's not just OpenAI):

https://www.oreilly.com/radar/how-to-fix-ais-original-sin/ 
 I think there is a conflation between enforcement ("How can I stop this?") and ethics ("Should we be doing this?"). You can't stop racism, but you can call it out. There are definitely people on nostr who will want to post exclusive content on relays THEY choose, perhaps a community relay, like a facebook-group relay. I'm saying that data ownership should be respected, and people own their own data. You might not be able to stop misuse, but you can call it out. We already have a web of reputation and trust. Things like "antizaps" (negative zaps) are going to be quite important for calling out bad actors. 
 Indeed. Robots.txt has worked for many years, and it was just a moral choice, not a technical barrier.
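For reference, robots.txt is just a plain-text request, which is exactly why it only works as long as crawlers choose to honour it. A file asking AI crawlers to stay away while welcoming everyone else might look something like this (GPTBot is OpenAI's published crawler user-agent; meta-externalagent is the Meta crawler discussed in the article above):

```
User-agent: GPTBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

User-agent: *
Allow: /
```

Nothing technically prevents a crawler from ignoring these lines entirely, which is the whole point being made here.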

Transparency is a good way to air out bad behaviour, allowing others to see it and stop cooperating with a bad actor.

There are ways to have strong “signals” and guardrails for the expected behaviour in the NIPs.

Choice is a strong force. If you can choose well-behaving relays, services, etc., it goes a long way.
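As a minimal sketch of what such a NIP-level "signal" could look like: a note could carry a machine-readable republishing preference as an event tag. The tag name `content-policy` and its values below are purely hypothetical, invented for illustration; no existing NIP defines them.

```python
import json

# Hypothetical: a nostr text note (kind 1) carrying a machine-readable
# republishing preference as a tag. The "content-policy" tag name and its
# values are invented for illustration; no current NIP defines them.
event = {
    "kind": 1,
    "content": "Exclusive post, intended for this community relay only.",
    "tags": [
        ["content-policy", "no-republish"],    # ask relays/clients not to mirror
        ["content-policy", "no-ai-training"],  # ask scrapers not to train on it
    ],
}

# A cooperating client or relay could read the signal like this:
policies = {t[1] for t in event["tags"] if t[0] == "content-policy"}
print(json.dumps(sorted(policies)))  # ["no-ai-training", "no-republish"]
```

Like robots.txt, this would be a request rather than a barrier, but it gives honest actors something concrete to honour and gives the web of trust something concrete to call out when it is ignored.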