hah... thinking about this has almost got me wanting to just generate a fucking protobuf interface for the binary codec in #realy - or even just, fuck it, turn on max compression with zstd, store the damn raw json, and modify the request handling so that instead of decoding it, it just shunts it on

i might do that... fuck it... json ftw... inside a binary database with compression enabled lol

i know my json codec is pretty much the bomb, and it was benching so close to the binary codec that i'm really thinking to myself, fuck this, why decode it at all
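
the "just store the raw json" option would barely be any code - something like this with badger, where the directory, key layout and compression level are all just illustrative, not what realy actually does:

```go
package main

import (
	badger "github.com/dgraph-io/badger/v4"
	"github.com/dgraph-io/badger/v4/options"
)

func main() {
	// open badger with zstd block compression cranked up; the value we
	// store is the raw event JSON, untouched by any codec.
	opts := badger.DefaultOptions("/tmp/realy-events").
		WithCompression(options.ZSTD).
		WithZSTDCompressionLevel(19) // "max compression", more or less
	db, err := badger.Open(opts)
	if err != nil {
		panic(err)
	}
	defer db.Close()

	// as received off the wire, no decode step at all
	rawJSON := []byte(`{"id":"…","kind":1,"content":"hello"}`)
	key := []byte("evt:…") // e.g. keyed by the event id

	if err := db.Update(func(txn *badger.Txn) error {
		return txn.Set(key, rawJSON)
	}); err != nil {
		panic(err)
	}
}
```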

nostr:nevent1qvzqqqqqqypzqnyqqft6tz9g9pyaqjvp0s4a4tvcfvj6gkke7mddvmj86w68uwe0qyghwumn8ghj7mn0wd68ytnvv9hxgtcpzamhxue69uhk6mr9dd6jumn0wd68yvfwvdhk6tcqyqvpxmhl9yr2vw0tc5x948xegt3pqt3ft7wa6f2dzd9kcmf9u3e2yde355y 
 yeah, nah, i can't do that... the search functions need to be able to apply secondary filters and sift out replaced events and whatnot.
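
to be concrete about what "sift out replaced events" means: per NIP-01 the replaceable kinds only keep the newest event per pubkey (plus the d tag for the 30000-39999 range), so a pass like this has to run over whatever comes back from the store - the types here are a rough stand-in, not the actual realy structs:

```go
package events

import "fmt"

// rough stand-in for the runtime event, only the fields this pass needs
type Event struct {
	PubKey    []byte
	CreatedAt int64
	Kind      int32
}

// replaceable reports whether only the newest event per author should
// survive for this kind (kinds 0, 3, 10000-19999 and 30000-39999).
func replaceable(kind int32) bool {
	return kind == 0 || kind == 3 ||
		(kind >= 10000 && kind < 20000) ||
		(kind >= 30000 && kind < 40000)
}

// siftReplaced drops superseded versions of replaceable events.
func siftReplaced(evs []*Event) (out []*Event) {
	newest := map[string]*Event{}
	for _, ev := range evs {
		if !replaceable(ev.Kind) {
			out = append(out, ev)
			continue
		}
		// for 30000-39999 the key would also include the d tag value
		key := fmt.Sprintf("%d:%x", ev.Kind, ev.PubKey)
		if cur, ok := newest[key]; !ok || ev.CreatedAt > cur.CreatedAt {
			newest[key] = ev
		}
	}
	for _, ev := range newest {
		out = append(out, ev)
	}
	return out
}
```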

so, fuggit, i'm just going to make a protobuf for events. maybe someday i'll make a whole protobuf protocol to replace the existing one, but for now it's just the event encoding. the database itself, badger, uses protobuf encoding for its data as it is.

protobuf... it's been some time since i used it, but fortunately binary-encoding an event needs almost nothing beyond allocating the struct itself, which is mostly just a series of byte slice headers...

the timestamp can even stay in its native form, more or less; only the kind needs to be copied into an int32... i already took care to make EVERYTHING into byte slices, even the fields that are fixed-size, because it's obviously 2x faster to compare the actual binary data in a search - so the filters already decode the fields that should be binary (id, pubkey, sig) to compare them against the binary form, and everything else is bytes just because... well
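
if i had to sketch it, the message and the copy from the runtime form are about this much code - field names and numbers are just a guess at what protoc would generate, not what will actually land in realy:

```go
package events

// EventPB approximates what protoc would generate for a message like:
//
//	message Event {
//	  bytes  id         = 1;
//	  bytes  pubkey     = 2;
//	  int64  created_at = 3;
//	  int32  kind       = 4;
//	  repeated Tag tags = 5;
//	  bytes  content    = 6;
//	  bytes  sig        = 7;
//	}
//	message Tag { repeated bytes field = 1; }
//
// (real generated code carries protobuf internals; omitted here.)
type EventPB struct {
	Id        []byte
	Pubkey    []byte
	CreatedAt int64
	Kind      int32
	Tags      []*TagPB
	Content   []byte
	Sig       []byte
}

type TagPB struct {
	Field [][]byte
}

// Event is a stand-in for the runtime type: everything already byte
// slices, the timestamp an int64, the kind a small integer.
type Event struct {
	ID        []byte
	PubKey    []byte
	CreatedAt int64
	Kind      uint16
	Tags      [][][]byte
	Content   []byte
	Sig       []byte
}

// ToPB is just slice-header copies; only the kind needs widening to int32.
func (ev *Event) ToPB() *EventPB {
	pb := &EventPB{
		Id:        ev.ID,
		Pubkey:    ev.PubKey,
		CreatedAt: ev.CreatedAt,
		Kind:      int32(ev.Kind),
		Content:   ev.Content,
		Sig:       ev.Sig,
	}
	for _, t := range ev.Tags {
		pb.Tags = append(pb.Tags, &TagPB{Field: t})
	}
	return pb
}
```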

i reserve in this the ability to later make the easy optimizations i tried to do with my binary encoder but failed to get right, especially the tags... i think there are some glitches in the content fields as well... but i know that it goes from JSON to the runtime format that can do the comparisons, and back to JSON, 100% fine; it's just the binary encoding that breaks on some special cases, mostly e, p and a tags

anyhow, fuck it, i want this working 100% this weekend, so i will do this to the binary encoder and then run my categorizer and see if ANYTHING fails when i remove the fancy optimizations

probably none lol 
so, that was really easy, and of course it eliminated the encoding errors altogether

even if i decide to keep my binary codec, i can now add protobuf encoding to the database code anyway - it's so damn simple i did it in an hour
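
the database side really is just marshal on write, unmarshal on read - the generated package path here is hypothetical, but the badger and protobuf calls are the stock ones:

```go
package store

import (
	badger "github.com/dgraph-io/badger/v4"
	"google.golang.org/protobuf/proto"

	pb "example.com/realy/eventpb" // hypothetical path to the generated package
)

// SaveEvent marshals the generated protobuf event and writes it to badger.
func SaveEvent(db *badger.DB, key []byte, ev *pb.Event) error {
	buf, err := proto.Marshal(ev)
	if err != nil {
		return err
	}
	return db.Update(func(txn *badger.Txn) error {
		return txn.Set(key, buf)
	})
}

// LoadEvent reads the value back and unmarshals it into the same type.
func LoadEvent(db *badger.DB, key []byte) (*pb.Event, error) {
	ev := new(pb.Event)
	err := db.View(func(txn *badger.Txn) error {
		item, err := txn.Get(key)
		if err != nil {
			return err
		}
		return item.Value(func(val []byte) error {
			return proto.Unmarshal(val, ev)
		})
	})
	return ev, err
}
```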

the bespoke codec will probably be faster, but it's the biggest bug mess in my codebase at the moment, and this will be great for really ironing out the JSON encoder - it seems there are a few glitches, but they could well just be bad escaping... and if they are not bad escaping, then i'll fix those bugs i guess... i'm just going to skip fixing the encoder side for now, maybe make it a configuration option so i can enable the bespoke codec and fix it later