 Just wrote an article about Blossom Drive, a new nostr app I'm building for file sharing and censorship-resistant file hosting.

If you want to give it a try, it's hosted at https://blossom.hzrd149.com/

nostr:naddr1qqxkymr0wdek7mfdv3exjan9qgszv6q4uryjzr06xfxxew34wwc5hmjfmfpqn229d72gfegsdn2q3fgrqsqqqa28e4v8zy 
 👀
nostr:nevent1qqs9pgpsdel6ejqrh4nrlj6p8j0d09w0frlhhuyajfy3wn8ugzzfjfgpp4mhxue69uhkummn9ekx7mqzyqnxs90qeyssm73jf3kt5dtnk997ujw6ggy6j3t0jjzw2yrv6sy22qcyqqqqqqgat57hx 
 🏴‍☠️ 
 I can't log in on mobile.  
 The login process has a few bugs, although I'm not sure it's even ready for mobile yet. The UI is only built for desktop at the moment.
 I'm using Firefox+Alby on Android. Works! 
 Reporting login not working with Kiwi + nos2x.
 I'm excited to discuss this with you further tonight at the meetup! I am curious if you think this may be a viable alternative method of hosting and distributing V4V music in the form of an RSS Feed (XML file), some .mp3s for the tracks and some .jpgs for the album art? I would love to get a v4v use case published using your Blossom solution over the next month or so! 
 I wanna see that picture more clearly too! Both for music and film. 
 I'm really excited about this. Blossom is deceptively simple, which is why it is powerful and why it has a chance of working.

What #nostr got right (and Bitcoin, for that matter) is that duplication is a feature, not a bug. "It takes advantage of the nature of information being easy to spread but hard to stifle," to quote Satoshi. De-duplication is a fool's errand, as it assumes a God's-eye view: a global state is required to properly understand what to delete and what to keep. The second problem is of course indexing and discovery, which is indeed a hard problem if a global state is to be avoided. It's hard, but solvable. Especially if you already know what you're looking for, and especially especially if you have a common and purple-coloured discoverability layer.

Blossom basically copies what nostr did for notes and applies it to arbitrary files. Instead of relays handling events, there are simple HTTP servers handling files. Like relays, servers are interchangeable as they share the same interface, encouraging duplication and redundancy. Instead of uploading something to a single server, you might upload it to five different servers. Popular and/or important files will be on many servers, which is how the online world already works today. Files that you need often might even be served by a #blossom server that is geographically close to you, just like we now have local cache relays packaged with some clients (or that you can self-host on your home server).

In the best case, Blossom will organically mirror what YouTube et al.'s content delivery networks already do well today, which is to provide file hosting that is high in availability and proximity. The neat thing about it all is that you can provide monetary incentives as it is nostr-native, and you get web-of-trust characteristics for free, as you can use only your own servers, or those trusted by your friends, etc. And in the future, we'll probably have paid servers that whitelist npubs, just like we have paid relays now.

So why is all of that awesome? Well, here's the thing: as the user, you actually don't care where a file is hosted; you just care about the file itself. The current iteration of image (and other) hosts is incredibly stupid. Images are uploaded, downloaded, and re-uploaded without end, often with massive loss of quality as the same image is compressed and re-compressed a hundred times. It's always the same image, or at least it *should* be. With Blossom, it actually is.

Gone are the days of finding a thing and uploading it again. You just need the hash, and the thing will appear. You could even insert images directly in notes with something like blossom:ef1c26172f55017c9d9d6afa7cf22605b237b0fe92425e81e3b5e24d46c95448 and each client can choose how (HTTP, torrent, I2P, etc.) and where (public servers, private servers, etc.) to retrieve it from.
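To illustrate, a client that sees such an identifier could derive candidate download locations from the user's published server list. This is only a sketch: it assumes the common `https://<server>/<sha256>` retrieval path, and the server names are placeholders, not real Blossom servers.

```python
def candidate_urls(blossom_uri: str, servers: list[str]) -> list[str]:
    """Turn a blossom:<sha256> identifier into one URL per known server.

    Assumes blobs are retrievable at https://<server>/<sha256>.
    """
    sha256 = blossom_uri.removeprefix("blossom:")
    return [f"{server.rstrip('/')}/{sha256}" for server in servers]

# Hypothetical server list; a real client would read it from the
# user's published relay/server metadata.
urls = candidate_urls(
    "blossom:ef1c26172f55017c9d9d6afa7cf22605b237b0fe92425e81e3b5e24d46c95448",
    ["https://blossom.example.com", "https://files.example.org/"],
)
```

The client can then race or fall back across those URLs however it likes, since every server exposes the same interface.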

But wait, there's more. Remember the monetary incentives we talked about? They allow for the emergence of a proverbial "assassination market" for files: you provide the hash, along with a bounty of 21k sats to anyone who can provide the file most readily. Servers could provide cryptographic proof that they have the file, and you could escrow the money until delivery is done and you verify the file on your side. The building blocks are already there; we just need to put them together in the right way.

Blossom is one of the most exciting projects that came out of the first @Sovereign Engineering cohort, aka #SEC-01. I'll have more to say about all the other amazing #SovEng projects that came out of the discussions and collaborations we had, but now I'll have to go and upload some files.

https://blossom.us-ord-1.linodeobjects.com/6b72c905176cb02fd35528e33e03a2496596270d6c153690cf9d3f232d38ee52.jpeg

nostr:nevent1qqs9pgpsdel6ejqrh4nrlj6p8j0d09w0frlhhuyajfy3wn8ugzzfjfgpzpmhxue69uhkummnw3ezuamfdejsygpxdq27pjfppharynrvhg6h8v2taeya5ssf49zkl9yyu5gxe4qg55psgqqqqqqst9390z 
 This is 🤯 
 Yeah, wow. Amazing! 
 interesting, but unless I’m missing a crucial detail this doesn’t sound new. 

is this an http server with a well known api for uploading content? that uses the content hash for later retrieval and deduplication? 

I love the idea but have seen this exact solution proposed many times. I even implemented a proposed spec for this almost a year ago. is there something new or is this just novel because it came out of sec-01? 
 nostr:note1xpzr5jnjw7d46ntcnjtk8wexsxlefvuyq2kf6x2wahvf739wy52qytwz7j 
 I was not aware of this spec, but it looks similar to what I've written for Blossom in that it uses the SHA-256 as the ID of the file.

However, I think it has some issues:
1. It's only focused on images and videos. If we want to solve the storage problem, it needs to be generic.
2. It relies on the "/.well-known/nostr.json" document for servers to describe themselves. This makes servers not interchangeable when it comes to retrieving files.
3. It only focuses on uploading and retrieving the files. I think it's also necessary for users to be able to list and delete the files they have on servers.
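As a sketch of what that fuller surface could look like, here is a tiny client that only *builds* the HTTP requests for upload, list, and delete. The endpoint shapes (PUT /upload, GET /list/<pubkey>, DELETE /<sha256>) follow the Blossom spec linked later in the thread; the signed nostr authorization event a real server would require is deliberately elided.

```python
from urllib.request import Request

class BlossomClient:
    """Builds (but does not send) requests for generic blob management.

    Signing the nostr authorization event for upload/delete is out of
    scope for this sketch.
    """

    def __init__(self, server: str):
        self.server = server.rstrip("/")

    def upload(self, data: bytes) -> Request:
        # PUT /upload with the raw bytes as the body
        return Request(f"{self.server}/upload", data=data, method="PUT")

    def list_blobs(self, pubkey: str) -> Request:
        # GET /list/<pubkey> returns the blobs a user has on this server
        return Request(f"{self.server}/list/{pubkey}", method="GET")

    def delete(self, sha256: str) -> Request:
        # DELETE /<sha256> removes a blob the user owns
        return Request(f"{self.server}/{sha256}", method="DELETE")
```

Because every server shares this interface, the same client works against any of them, which is exactly the interchangeability point above.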
 Oh the beauty of a concise, knowledgeable answer ❤️ 
 is your spec/protocol documented somewhere I can read it?

I think the issues you highlighted from the old spec are trivial… but that’s neither here nor there. 
 nostr:naddr1qqxkymr0wdek7mfdv3exjan9qgszv6q4uryjzr06xfxxew34wwc5hmjfmfpqn229d72gfegsdn2q3fgrqsqqqa28e4v8zy 
 nothing happens when I click that link in damus 🤷 
 spec is documented here https://github.com/hzrd149/blossom 
 If replication and distributed retrieval can be solved I'll be excited. Same problem nostr has in general. I think it's doable, but it hasn't been done. 
 does blossom do replication or distributed retrieval? I clicked through the link but couldn’t find any kind of spec. 
 That can be done out of band; it doesn't need to be in the spec, much like negentropy is useful but doesn't need to be done via nostr nor spec'd within nostr.

@hzrd149 talked about doing something similar to NIP-65 for blossom as well as the idea of creating a DVM market for “I want to find file with hash x and willing to pay y sats” which are two very useful building blocks of this system 
 so this is just another http media server? 
 Who remembers NIP-95? Blobs all the way! :)  
 I remember and still think blobs should be stored *not* in JSON-encoded events. Still think @nothenry's original proposal from a year ago is fine. Curious how blossom is different.
 Blobs are not stored in json under blossom 
 Oh, sorry, you were replying to Vitor 😂
 Now that some clients are putting base64 images directly inside the .content of kind 1s to go around the Chinese firewall, NIP-95 is easy.
 Which client? 
 I don't know. I just see the event with the massive base64 in it. It's becoming quite common. 
 I was just testing this yesterday, hoping to find that clients would render a data URL embedded in kind1 content. None of the clients I tested did, though.

Storing file hashes and content in events makes sense to me. If there are clients that support it I’d love to see what they’ve implemented. I was thinking of experimenting with it this week(end).  
 👀 
 Some emoji images? 
 You have a typo, but yes, all men are into them big time! 
 Not me, I’m a newbie. What was in NIP-95? 
#nostrdev 
 @PABLOF7z how does it compare/contrast to this:

https://github.com/michaelhall923/nostr-media-spec?tab=readme-ov-file#uploading-media 
 Seems similar. I really like some of the stuff in that one, particularly how the filename can be anything. 
 One of my favorite things about hodlbod... he is enthusiastic, but honest, and doesn't ride the fanboy train.
 There is no deduplication.  
 This isn't an answer either.
 We could use this to store git objects.
 Hmm... I actually do care where my files are stored. Especially images. I'm more picky about that, than about notes.

Could this just end up being another gigantic Datenkraken?
I don't really understand the wider implications. 
 Couldn’t read your earlier article. 
 Fascinating 
 Could this mirror the functionality of IPFS in some way, where the content would be in multiple servers/nodes, and would get provided from the one with the best connection based on a content ID?  Could even use sats to pay for file storage like FIL pays for storage on the IPFS network. 
 Yes and no. Blossom uses a SHA-256 content hash, so content has a unique ID across the network.

Users can publish which servers they use, so alternatively you know where to look for a user's content.

It does not magically solve storing the content "somewhere on the internet", though. A user must actively publish their content to multiple servers to get censorship resistance, and might need to pay for it.

That being said, there are links between servers, so if a piece of content is missing it could be automatically fetched from an upstream server.
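The retrieval side of that can be sketched as trying each of the user's published servers in turn and verifying the bytes against the hash, so a misbehaving server can't substitute a different file. The `fetch` callable here is an assumption standing in for an HTTP GET:

```python
import hashlib

def fetch_blob(sha256: str, servers: list[str], fetch) -> bytes:
    """Try each published server until one returns bytes matching the hash."""
    for server in servers:
        try:
            data = fetch(f"{server.rstrip('/')}/{sha256}")
        except OSError:
            continue  # server down or blob missing; try the next one
        if hashlib.sha256(data).hexdigest() == sha256:
            return data  # verified: the ID *is* the content
    raise FileNotFoundError(f"no server had blob {sha256}")
```

Because the ID is the hash, verification is free, and any server (or torrent, or cache) is an equally trustworthy source.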
 Doesn't IPFS hash the content too to get a unique ID? I'm struggling to understand the difference between Blossom and IPFS besides, I'm assuming, the lack of a DHT for Blossom 
 Yes, IPFS computes a hash too, though it's a bit involved, so the hash won't be the same as Blossom's or what sha256sum tells you.

Lack of a DHT is a big one. IPFS's DHT is what they claim makes it scale ("interplanetary" lol), but it's poorly designed and doesn't actually scale well. You'll find lots of timeouts and "content not found" even though the content is out there somewhere.
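The ID difference is easy to see concretely: a Blossom blob's ID is the plain SHA-256 hex digest of its bytes, exactly what `sha256sum` would print for the same file, while an IPFS CID wraps the hash in extra multihash and encoding layers.

```python
import hashlib

data = b"hello"
blossom_id = hashlib.sha256(data).hexdigest()
# Identical to what `sha256sum` prints for a file containing b"hello":
print(blossom_id)
# 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
```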
 @simplex also has a similar thing called XFTP 
 Can we run blossom on our own metal, using its storage? Or do we need to use S3?
 Local storage is an option in the yaml 
 That's nuts!🔥🔥🔥 @hzrd149 can confirm? 
 The blossom-server I've written has an option to serve files from the local filesystem.
But you could also always write your own implementation 😉  
 That's good man!🔥🔥🔥 
 Workshop of the day ⚡
#1day1app
nostr:nevent1qqs9pgpsdel6ejqrh4nrlj6p8j0d09w0frlhhuyajfy3wn8ugzzfjfgppamhxue69uhkummnw3ezumt0d5pzqfngzhsvjggdlgeycm96x4emzjlwf8dyyzdfg4hefp89zpkdgz99qvzqqqqqqymzemmp 
 My biggest concern is how we will solve an active synchronization of blobs. Currently it is user centric, i.e. a user or an app on behalf of a user can upload and duplicate blobs on multiple servers.

1. With the current API, blobs can be listed on a per-user basis. Compared to a nostr relay, where anyone can get a feed of events and publish them to any relay, a blossom server doesn't have a feed of new blobs. It would be hard to duplicate content without a user (npub) focus. (Still wondering if we need that.)

2. I think we might need a way to check if a blob is on a server without downloading it and without doing a lookup on upstream servers. I'm thinking it may be useful to add HEAD /<sha256> to the spec for that.

3. We might need to solve a circular dependency problem in how the "download" from upstream servers works. If we have blossom servers pointing to each other in a circular way, pulling blobs from an upstream server might end up in an endless loop.
 1. The focus on pubkeys is key here, I think. Without it, servers are left with a pile of miscellaneous blobs which would be difficult to synchronize; with the pubkey, a server can categorize the blobs under pubkeys and synchronize per pubkey instead of the whole server. Either way, I'm not sure how much servers should be synchronizing, if at all.

2. Just added the requirement for the HEAD /<sha256> endpoint, easy win: https://github.com/hzrd149/blossom/blob/master/Server.md#head-sha256---has-blob

3. The HEAD /<sha256> endpoint could probably be a good solution to this too. It would allow a client to check if the blob exists before requesting it. Either way, the concept of "upstream servers" is only implemented in my blossom-server and I don't expect every implementation to have it.
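Point 3 can be sketched as a walk up the upstream chain that refuses to revisit a server, which is what breaks a circular reference. Both `has_blob` (standing in for a HEAD /<sha256> probe) and `upstream_of` (each server's configured upstream) are hypothetical names for this sketch:

```python
def resolve_upstream(sha256: str, server, upstream_of: dict, has_blob):
    """Walk the upstream chain looking for a server that has the blob.

    The `seen` set guarantees termination even if servers point at each
    other in a cycle (the circular-dependency problem above).
    """
    seen = set()
    while server is not None and server not in seen:
        seen.add(server)
        if has_blob(server, sha256):  # e.g. a HEAD /<sha256> probe
            return server
        server = upstream_of.get(server)
    return None  # blob not found anywhere in the chain
```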
 👏 
 Oh neat, I'm already using cdn.satellite.earth and files are already there when I log in. 
 Good shit ! Keep up the good work 
 Btw, I want to integrate blossom into Relayed as a library.
Do I use https://github.com/hzrd149/blossom-server-sdk for that?
 Yes, the library provides a few classes that should help with building a blossom server.

The "BlossomSQLite" class gives you a simple interface for keeping track of blobs and who owns them, using an SQLite database.

The "LocalStorage" and "S3Storage" classes provide a nice interface for storing the actual blobs on either the local filesystem or a remote S3 bucket.
 Have you heard of SeaweedFS? Pretty different project, but there might be some good ideas in there for scaling when dealing with lots of small files and designing a distributed hash-based lookup.

https://github.com/seaweedfs/seaweedfs?tab=readme-ov-file#master-server-and-volume-server