 @brugeman any chance you’d open source the codebase that does the data collection & storage?

nostr:note1hw75f7dq792c2u2r0p9kntslxrsu3mh9usrh9h3te4d9q7q75feq52qzjy 
 +1 This would be super interesting to see.  
 My guess is you’re doing some stuff manually, which is fine! So don’t say no if it’s only because of that, hah! 
 Aside from a dozen very annoying bots that I blocked, nothing manual there, I don't have time for that. 
 Awesome!  
 We are seeing another surge of events at one of the nos.social servers. Just in case, I checked stats.nostr.band and it matches: https://stats.nostr.band/#daily_profile_events . I wonder if we are under some bot swarm attack, like a few weeks ago. 
 Most probably so. It's not huge enough to make me go figure it out, honestly. 
 Not at this point, it's all very tied up with other parts of the backend, search etc, and I never intended to open source all that. 
 He means he wants to see the cleaned dataset without having to deal with the collection process 😅

ˢᵒ ʷᵒᵘˡᵈ ⁱ 
 I could export a 500 GB compressed dump of events if somebody wanted to play with it. Or you could just connect to our relay and subscribe to "everything"; you wouldn't have to deal with the collection process. 
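 For anyone who wants to try the relay route: under NIP-01, subscribing to "everything" comes down to sending a REQ message with an empty filter. Here is a minimal sketch in Python, standard library only. The subscription id and the helper names `build_req` and `handle_message` are made up for illustration, and nothing here opens a connection; you would send the resulting string over a WebSocket to whichever relay you use.

```python
import json


def build_req(sub_id, limit=None):
    """Build a NIP-01 REQ message as a JSON string.

    An empty filter object matches every event; `limit` optionally caps
    how many stored events the relay sends back initially.
    """
    flt = {}
    if limit is not None:
        flt["limit"] = limit
    return json.dumps(["REQ", sub_id, flt])


def handle_message(raw, sub_id):
    """Return the event dict for EVENT messages on our subscription.

    Anything else (EOSE, NOTICE, events for other subscriptions)
    yields None.
    """
    msg = json.loads(raw)
    if len(msg) >= 3 and msg[0] == "EVENT" and msg[1] == sub_id:
        return msg[2]
    return None
```

 Each incoming frame from the relay can be passed to `handle_message`, and the returned events appended to a file. That would give roughly the same data as a dump, just collected live instead of downloaded in one piece.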
 I'm not too versed in web technologies to download something like that, so my vote would be yes. A data dump would be kinda cool, just to have that reproducible snapshot.

To get more specific, if possible: smaller chunks of 50 GB or so seem enough to play around with first. 🫂 