I'm not autistic, I just don't like working for other people.
nostr:nevent1qvzqqqqqqypzqnyqqft6tz9g9pyaqjvp0s4a4tvcfvj6gkke7mddvmj86w68uwe0qythwumn8ghj7un9d3shjtnswf5k6ctv9ehx2ap0qy2hwumn8ghj7un9d3shjtnyv9kh2uewd9hj7qpqz52k6k6yz898c2rypund0kv2h6l02ut5nt7ftsyatvlwt7x4nnrq6sg22e
I agree. I wish they could schedule... let's say, differently. I guess there is an unmet market for Bitcoin and nostr talk with live engagement in my timezone.
It would be great if an archive could generate revenue, but honestly, my focus is on building relationships and trust with relay operators to get the data without disrupting their primary function. Once the data is coming in, then it will be down to working with people like yourself to figure out how to serve the data subset you need in the time frame you need it with the resources I have available. And if the resources are lacking, how to get them.
#asknostr what upcoming meetup opportunities are there for people developing FOSS companies in the nostr space? I want to learn which holes not to step in.
Thanks!
#introductions Hi nostr, I am Bob. I am starting a project called Nostrchive. I have the crazy idea to collect and archive as much nostr data as possible, with the goal of pre-processing it (collect, collate, organize, and tokenize) for (re)training FOSS nostr-aware LLMs for nostr search. If any strfry relay operators would consider whitelisting my archive strfry relay for negentropy connections, please reach out. I would like to identify optimal batch sizes and connection windows in UTC. Thanks for your consideration.
Hi @mleku
Just thinking here... Building search is really hard, and I am sure I am not the man for that job; however, I like to organize, analyze, and automate. I also have a large symmetric connection of which I can only really use 25% for IRL, so I thought to myself: what useful service can I create for nostr with my excess capacity? I am not sure I will be able to host a proper relay archive available to the public, as managing a single large relay database would be unwieldy. It might be possible to host particular curations of the data as separate public relays, though.
My main focus is segmenting and archiving the data. This seems achievable, manageable, and open to automation. I believe this will serve as a useful foundation for projects needing large nostr datasets for LLMs. I expect I can make this segmented data available as periodic updates to the public. Early stages, building out the garage data center off of local surplus in the Bay Area.
Similar to bitcoin, nostr is an open book for all to see, which I am okay with. There are easy enough ways to do pseudonymity in these open systems if you need/want it, but, as you point out, the asymmetry of the three-letter data hoard is a problem we can begin to address. If folks require absolute privacy, I am not sure nostr is the right protocol for them.
I think we should at least have open datasets where it is possible. AI frameworks are open enough, but the up-to-date data is not. The tech giants are really all data hoarders, and they monetize those hoards by selling to advertisers, NGOs, governments, and private firms. Not really much access for plebs, which will reduce the impact of pleb-driven AI tech.
Notes by Bob_stores_nostr