 Just wrote an article about Blossom Drive, a new nostr app I'm building for file sharing and censorship-resistant file hosting.

If you want to give it a try, it's hosted at https://blossom.hzrd149.com/

nostr:naddr1qqxkymr0wdek7mfdv3exjan9qgszv6q4uryjzr06xfxxew34wwc5hmjfmfpqn229d72gfegsdn2q3fgrqsqqqa28e4v8zy 
 👀
nostr:nevent1qqs9pgpsdel6ejqrh4nrlj6p8j0d09w0frlhhuyajfy3wn8ugzzfjfgpp4mhxue69uhkummn9ekx7mqzyqnxs90qeyssm73jf3kt5dtnk997ujw6ggy6j3t0jjzw2yrv6sy22qcyqqqqqqgat57hx 
 🏴‍☠️ 
 I can't log in on mobile.  
 The login process has a few bugs, although I'm not sure it's even ready for mobile yet. The UI is only built for desktop at the moment.
 I'm using Firefox+Alby on Android. Works! 
 Reporting login not working with Kiwi + nos2x.
 I'm excited to discuss this with you further tonight at the meetup! I am curious if you think this may be a viable alternative method of hosting and distributing V4V music in the form of an RSS Feed (XML file), some .mp3s for the tracks and some .jpgs for the album art? I would love to get a v4v use case published using your Blossom solution over the next month or so! 
 I wanna see that picture more clearly too! Both for music and film. 
 I'm really excited about this. Blossom is deceptively simple, which is why it is powerful and why it has a chance of working.

What #nostr got right (and Bitcoin, for that matter) is that duplication is a feature, not a bug. "It takes advantage of the nature of information being easy to spread but hard to stifle," to quote Satoshi. De-duplication is a fool's errand, as it assumes a God's-eye view: a global state is required to properly understand what to delete and what to keep. The second problem is of course indexing and discovery, which is indeed a hard problem if a global state is to be avoided. It's hard, but solvable. Especially if you already know what you're looking for, and especially especially if you have a common and purple-coloured discoverability layer.

Blossom basically copies what nostr did for notes and applies it to arbitrary files. Instead of relays handling events, there are simple HTTP servers handling files. Like relays, servers are interchangeable as they share the same interface, encouraging duplication and redundancy. Instead of uploading something to a single server, you might upload it to five different servers. Popular and/or important files will be on many servers, which is how the online world already works today. Files that you need often might even be served by a #blossom server that is geographically close to you, just like we now have local cache relays packaged with some clients (or that you can self-host on your home server).

In the best case, Blossom will organically mirror what YouTube et al.'s content delivery networks already do well today, which is to provide file hosting that is high in availability and proximity. The neat thing about it all is that you can provide monetary incentives as it is nostr-native, and you get web-of-trust characteristics for free, as you can use only your own servers, or those trusted by your friends, etc. And in the future, we'll probably have paid servers that whitelist npubs, just like we have paid relays now.

So why is all of that awesome? Well, here's the thing: as the user, you actually don't care where a file is hosted; you just care about the file itself. The current iteration of image (and other) hosts is incredibly stupid. Images are uploaded, downloaded, and re-uploaded without end, often with massive loss of quality as the same image is compressed and re-compressed a hundred times. It's always the same image, or at least it *should* be. With Blossom, it actually is.

Gone are the days of finding a thing and uploading it again. You just need the hash, and the thing will appear. You could even insert images directly in notes with something like blossom:ef1c26172f55017c9d9d6afa7cf22605b237b0fe92425e81e3b5e24d46c95448 and each client can choose how (HTTP, torrent, I2P, etc.) and where (public servers, private servers, etc.) to retrieve it from.
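To illustrate, a client that sees such an identifier could derive candidate download locations from the user's published server list. This is only a sketch: it assumes the common `https://<server>/<sha256>` retrieval path, and the server names are placeholders, not real Blossom servers.

```python
def candidate_urls(blossom_uri: str, servers: list[str]) -> list[str]:
    """Turn a blossom:<sha256> identifier into one URL per known server.

    Assumes blobs are retrievable at https://<server>/<sha256>.
    """
    sha256 = blossom_uri.removeprefix("blossom:")
    return [f"{server.rstrip('/')}/{sha256}" for server in servers]

# Hypothetical server list; a real client would read it from the
# user's published relay/server metadata.
urls = candidate_urls(
    "blossom:ef1c26172f55017c9d9d6afa7cf22605b237b0fe92425e81e3b5e24d46c95448",
    ["https://blossom.example.com", "https://files.example.org/"],
)
```

The client can then race or fall back across those URLs however it likes, since every server exposes the same interface.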

But wait, there's more. Remember the monetary incentives we talked about? They allow for the emergence of a proverbial "assassination market" for files: you provide the hash, along with a bounty of 21k sats to anyone who can provide the file most readily. Servers could provide cryptographic proof that they have the file, and you could escrow the money until delivery is done and you verify the file on your side. The building blocks are already there; we just need to put them together in the right way.

Blossom is one of the most exciting projects that came out of the first @Sovereign Engineering cohort, aka #SEC-01. I'll have more to say about all the other amazing #SovEng projects that came out of the discussions and collaborations we had, but now I'll have to go and upload some files.

https://blossom.us-ord-1.linodeobjects.com/6b72c905176cb02fd35528e33e03a2496596270d6c153690cf9d3f232d38ee52.jpeg

nostr:nevent1qqs9pgpsdel6ejqrh4nrlj6p8j0d09w0frlhhuyajfy3wn8ugzzfjfgpzpmhxue69uhkummnw3ezuamfdejsygpxdq27pjfppharynrvhg6h8v2taeya5ssf49zkl9yyu5gxe4qg55psgqqqqqqst9390z 
 This is 🤯 
 Yeah, wow. Amazing! 
 interesting, but unless I’m missing a crucial detail this doesn’t sound new. 

is this an http server with a well known api for uploading content? that uses the content hash for later retrieval and deduplication? 

I love the idea but have seen this exact solution proposed many times. I even implemented a proposed spec for this almost a year ago. is there something new or is this just novel because it came out of sec-01? 
 nostr:note1xpzr5jnjw7d46ntcnjtk8wexsxlefvuyq2kf6x2wahvf739wy52qytwz7j 
 I was not aware of this spec, but it looks similar to what I've written for Blossom in that it uses the SHA-256 as the ID of the file.

However, I think it has some issues:
1. It's only focused on images and videos. If we want to solve the storage problem, it needs to be generic.
2. It relies on the "/.well-known/nostr.json" document for servers to describe themselves. This makes servers not interchangeable when it comes to retrieving files.
3. It only focuses on uploading and retrieving the files. I think it's also necessary for users to be able to list and delete the files they have on servers.
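As a sketch of what that fuller surface could look like, here is a tiny client that only *builds* the HTTP requests for upload, list, and delete. The endpoint shapes (PUT /upload, GET /list/<pubkey>, DELETE /<sha256>) follow the Blossom spec linked later in the thread; the signed nostr authorization event a real server would require is deliberately elided.

```python
from urllib.request import Request

class BlossomClient:
    """Builds (but does not send) requests for generic blob management.

    Signing the nostr authorization event for upload/delete is out of
    scope for this sketch.
    """

    def __init__(self, server: str):
        self.server = server.rstrip("/")

    def upload(self, data: bytes) -> Request:
        # PUT /upload with the raw bytes as the body
        return Request(f"{self.server}/upload", data=data, method="PUT")

    def list_blobs(self, pubkey: str) -> Request:
        # GET /list/<pubkey> returns the blobs a user has on this server
        return Request(f"{self.server}/list/{pubkey}", method="GET")

    def delete(self, sha256: str) -> Request:
        # DELETE /<sha256> removes a blob the user owns
        return Request(f"{self.server}/{sha256}", method="DELETE")
```

Because every server shares this interface, the same client works against any of them, which is exactly the interchangeability point above.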
 Oh the beauty of a concise, knowledgeable answer ❤️ 
 is your spec/protocol documented somewhere I can read it?

I think the issues you highlighted from the old spec are trivial… but that’s neither here nor there. 
 nostr:naddr1qqxkymr0wdek7mfdv3exjan9qgszv6q4uryjzr06xfxxew34wwc5hmjfmfpqn229d72gfegsdn2q3fgrqsqqqa28e4v8zy 
 nothing happens when I click that link in damus 🤷 
 spec is documented here https://github.com/hzrd149/blossom 
 If replication and distributed retrieval can be solved I'll be excited. Same problem nostr has in general. I think it's doable, but it hasn't been done. 
 does blossom do replication or distributed retrieval? I clicked through the link but couldn’t find any kind of spec. 
 That can be done out of band; it doesn't need to be in the spec, much like negentropy is useful but doesn't need to be done via nostr nor spec'd within nostr.

@hzrd149 talked about doing something similar to NIP-65 for blossom as well as the idea of creating a DVM market for “I want to find file with hash x and willing to pay y sats” which are two very useful building blocks of this system 
 so this is just another http media server? 
 Who remembers NIP-95? Blobs all the way! :)  
 I remember and still think blobs should be stored *not* in JSON-encoded events. Still think @nothenry's original proposal from a year ago is fine. Curious how blossom is different.
 Blobs are not stored in json under blossom 
 Oh, sorry, you were replying to Vitor 😂
 Now that some clients are putting base64 images directly inside the .content of kind 1s to go around the Chinese firewall, NIP-95 is easy.
 Which client? 
 I don't know. I just see the event with the massive base64 in it. It's becoming quite common. 
 I was just testing this yesterday, hoping to find that clients would render a data URL embedded in kind1 content. None of the clients I tested did, though.

Storing file hashes and content in events makes sense to me. If there are clients that support it I’d love to see what they’ve implemented. I was thinking of experimenting with it this week(end).  
 👀 
 Some emoji images? 
 You have a typo, but yes, all men are into them big time! 
 Not me, I’m a newbie. What was in NIP-95? 
#nostrdev 
 @PABLOF7z how does it compare/contrast to this:

https://github.com/michaelhall923/nostr-media-spec?tab=readme-ov-file#uploading-media 
 Seems similar. I really like some of the stuff in that one, particularly how the filename can be anything. 
 One of my favorite things about hodlbod... he is enthusiastic, but honest, and doesn't ride the fanboy train.
 There is no deduplication.  
 This isn't an answer either.
 We could use this to store git objects.
 Hmm... I actually do care where my files are stored. Especially images. I'm more picky about that, than about notes.

Could this just end up being another gigantic Datenkraken?
I don't really understand the wider implications. 
 Couldn’t read your earlier article. 
 Fascinating 
 Could this mirror the functionality of IPFS in some way, where the content would be in multiple servers/nodes, and would get provided from the one with the best connection based on a content ID?  Could even use sats to pay for file storage like FIL pays for storage on the IPFS network. 
 Yes and no. Blossom uses a SHA-256 content hash, so content has a unique ID across the network.

Users can publish which servers they use, so alternatively you know where to look for a user's content.

It does not magically solve storing the content "somewhere on the internet", though. A user must actively publish their content to multiple servers to get censorship resistance, and might need to pay for it.

That being said, there are links between servers, so if a piece of content is missing it could be automatically fetched from an upstream server.
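The retrieval side of that can be sketched as trying each of the user's published servers in turn and verifying the bytes against the hash, so a misbehaving server can't substitute a different file. The `fetch` callable here is an assumption standing in for an HTTP GET:

```python
import hashlib

def fetch_blob(sha256: str, servers: list[str], fetch) -> bytes:
    """Try each published server until one returns bytes matching the hash."""
    for server in servers:
        try:
            data = fetch(f"{server.rstrip('/')}/{sha256}")
        except OSError:
            continue  # server down or blob missing; try the next one
        if hashlib.sha256(data).hexdigest() == sha256:
            return data  # verified: the ID *is* the content
    raise FileNotFoundError(f"no server had blob {sha256}")
```

Because the ID is the hash, verification is free, and any server (or torrent, or cache) is an equally trustworthy source.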
 Doesn't IPFS hash the content too to get a unique ID? I'm struggling to understand the difference between Blossom and IPFS besides, I'm assuming, the lack of a DHT for Blossom 
 Yes, IPFS computes a hash too, though it's a bit involved, so the hash won't be the same as Blossom's or what sha256sum tells you.

Lack of a DHT is a big one. IPFS's DHT is what they claim makes it scale ("interplanetary" lol), but it's poorly designed and doesn't actually scale well. You'll find lots of timeouts and "content not found" even though the content is out there somewhere.
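The ID difference is easy to see concretely: a Blossom blob's ID is the plain SHA-256 hex digest of its bytes, exactly what `sha256sum` would print for the same file, while an IPFS CID wraps the hash in extra multihash and encoding layers.

```python
import hashlib

data = b"hello"
blossom_id = hashlib.sha256(data).hexdigest()
# Identical to what `sha256sum` prints for a file containing b"hello":
print(blossom_id)
# 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
```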
 @simplex also has a similar thing called XFTP 
 Can we run blossom on our own metal, using its storage? Or do we need to use S3?
 Local storage is an option in the yaml 
 That's nuts!🔥🔥🔥 @hzrd149 can confirm? 
 The blossom-server I've written has an option to serve files from the local filesystem.
But you could also always write your own implementation 😉  
 That's good man!🔥🔥🔥 
 Workshop of the day ⚡
#1day1app
nostr:nevent1qqs9pgpsdel6ejqrh4nrlj6p8j0d09w0frlhhuyajfy3wn8ugzzfjfgppamhxue69uhkummnw3ezumt0d5pzqfngzhsvjggdlgeycm96x4emzjlwf8dyyzdfg4hefp89zpkdgz99qvzqqqqqqymzemmp 
 My biggest concern is how we will solve an active synchronization of blobs. Currently it is user centric, i.e. a user or an app on behalf of a user can upload and duplicate blobs on multiple servers.

1. With the current API, blobs can be listed on a per-user basis. Compared to a nostr relay, where anyone can get a feed of events and publish them to any relay, a blossom server doesn't have a feed of new blobs. It would be hard to duplicate content without a user (npub) focus. (Still wondering if we need that.)

2. I think we might need a way to check if a blob is on a server without downloading it and without doing a lookup on upstream servers. I'm thinking it may be useful to add HEAD /<sha256> to the spec for that.

3. We might need to solve a circular dependency problem in how the "download" from upstream servers works. If we have blossom servers pointing to each other in a circular way, pulling blobs from an upstream server might end up in an endless loop.
 1. The focus on pubkeys is key here, I think. Without it, servers are left with a pile of miscellaneous blobs which would be difficult to synchronize; with the pubkey, a server can categorize the blobs under pubkeys and synchronize per pubkey instead of the whole server. Either way, I'm not sure how much servers should be synchronizing, if at all.

2. Just added the requirement for the HEAD /<sha256> endpoint, easy win: https://github.com/hzrd149/blossom/blob/master/Server.md#head-sha256---has-blob

3. The HEAD /<sha256> endpoint could probably be a good solution to this too. It would allow a client to check if the blob exists before requesting it. Either way, the concept of "upstream servers" is only implemented in my blossom-server and I don't expect every implementation to have it.
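Point 3 can be sketched as a walk up the upstream chain that refuses to revisit a server, which is what breaks a circular reference. Both `has_blob` (standing in for a HEAD /<sha256> probe) and `upstream_of` (each server's configured upstream) are hypothetical names for this sketch:

```python
def resolve_upstream(sha256: str, server, upstream_of: dict, has_blob):
    """Walk the upstream chain looking for a server that has the blob.

    The `seen` set guarantees termination even if servers point at each
    other in a cycle (the circular-dependency problem above).
    """
    seen = set()
    while server is not None and server not in seen:
        seen.add(server)
        if has_blob(server, sha256):  # e.g. a HEAD /<sha256> probe
            return server
        server = upstream_of.get(server)
    return None  # blob not found anywhere in the chain
```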
 👏 
 Oh neat, I'm already using cdn.satellite.earth and files are already there when I log in. 
 Good shit ! Keep up the good work 
 Btw, I want to integrate blossom into Relayed as a library.
Do I use https://github.com/hzrd149/blossom-server-sdk for that?
 Yes, the library provides a few classes that should help with building a blossom server.

The "BlossomSQLite" class gives you a simple interface for keeping track of blobs and who owns them, using an SQLite database.

The "LocalStorage" and "S3Storage" classes provide a nice interface for storing the actual blobs on either the local filesystem or a remote S3 bucket.
 Have you heard of SeaweedFS? Pretty different project, but there might be some good ideas in there for scaling when dealing with lots of small files and designing a distributed hash-based lookup.

https://github.com/seaweedfs/seaweedfs?tab=readme-ov-file#master-server-and-volume-server