 nostr:nprofile1qy88wumn8ghj7mn0wvhxcmmv9uqjqamnwvaz7tmwdaehgu3wv45kuatwv3a8wctw0f5kwtnnwpskxef0qyf8wumn8ghj7mn0wd68yat99e3k7mf0qyfhwumn8ghj7mmxve3ksctfdch8qatz9uq32amnwvaz7tmjv4kxz7fwv3sk6atn9e5k7tcqyrhprfwl7sxpnf247s07g26g7q8xrry3yftz9t3hkmptkeahd38yjk8vj6z  this relay testing tool is pretty awesome... finding so many bugs i need to fix

i've filed an issue about the events with control characters though: should they be accepted or not? it's technically invalid json to have raw linebreaks in a string, isn't it? (or backspaces, form feeds or tab characters)

should i be returning an error for that case and refusing to save it or what? nip-01 seems to imply that such an event should not be accepted 
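
To make the question above concrete, here is a minimal sketch using Python's standard `json` module as a stand-in for a relay's parser (an assumption; the thread does not say which parser any relay uses). The helper name `accepts` is hypothetical. It shows that a raw control character inside a string is rejected by a strict parser, while the escaped form is fine.

```python
import json


def accepts(raw: str) -> bool:
    """Return True if a strict JSON parser accepts the text."""
    try:
        json.loads(raw)  # strict=True is the default
        return True
    except json.JSONDecodeError:
        return False


# The first literal contains an actual 0x0A byte inside the JSON string.
raw_newline = '{"content": "line one\nline two"}'
# The second contains the two characters backslash and n.
escaped_newline = '{"content": "line one\\nline two"}'

print(accepts(raw_newline))      # False
print(accepts(escaped_newline))  # True
```

Whether a relay should answer such an event with an error is a policy question this sketch does not settle; it only shows what strict parsing does.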
 Glad it is helping.  I discovered a handful of chorus bugs with it myself.

I replied on github. NO in this case confusingly means that it does not do the probably-wrong thing. 
 yeah, i think the main issue with strings and escaping is that you can't really predict anything about what's going to be inside the strings, and the primary intent of making the most common control characters use the standard C style single letter escapes is space efficiency

but \u and \/ need to be kept as is, and octal escapes, basically any number after a backslash, all need to be kept as is

the problem is that these escape codes have literally 3 representations, some have the single letter, and then there is \uXX and \uXXXX and then there is \[0-9] [0-9][0-9] octal codes, so if you decode them, you must re-encode them to the same form for deriving the ID hash
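
A small sketch of why the re-encoding point matters, using only Python's standard library. It compares two spellings of the same newline. Of the numeric forms, RFC 8259 JSON itself only defines the four-digit `\uXXXX`, so the sketch sticks to that one. The variable and function names are made up for illustration.

```python
import hashlib
import json

# Two JSON spellings of the same three-character string: a, newline, b.
short_form = '"a\\nb"'      # single letter escape
long_form = '"a\\u000ab"'   # four-digit unicode escape


def digest(text: str) -> str:
    """SHA-256 over the UTF-8 bytes of the serialized text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


same_text = json.loads(short_form) == json.loads(long_form)
same_hash = digest(short_form) == digest(long_form)

print(same_text)  # True
print(same_hash)  # False
```

So once a string has been decoded, the original spelling is gone, and the hash only matches if the re-encoder happens to pick the same form the signer used.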

the encoding MUST define the normative encoding and any other form should be correctly accepted and retained as a literal... this pushes the issue off the relay devs and onto the clients where the problem originates anyway
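
As a rough illustration of what a normative encoding could look like, here is a sketch of the escaping rule as I read the event serialization section of NIP-01: seven characters get a fixed single letter escape and everything else is emitted verbatim. The names `NIP01_ESCAPES` and `escape_nip01` are hypothetical, and this is not taken from any relay's code.

```python
# The seven characters with a fixed escape; all other characters pass through.
NIP01_ESCAPES = {
    "\n": "\\n",   # line break, 0x0A
    '"': '\\"',    # double quote, 0x22
    "\\": "\\\\",  # backslash, 0x5C
    "\r": "\\r",   # carriage return, 0x0D
    "\t": "\\t",   # tab, 0x09
    "\b": "\\b",   # backspace, 0x08
    "\f": "\\f",   # form feed, 0x0C
}


def escape_nip01(text: str) -> str:
    """Escape a decoded string into the single canonical form."""
    return "".join(NIP01_ESCAPES.get(ch, ch) for ch in text)


print(escape_nip01('say "hi"\n'))  # say \"hi\"\n
print(escape_nip01("héllo"))       # héllo
```

Note this only covers the direction from decoded text to canonical bytes. Retaining other escape forms as literals, as argued above, would need the parser to avoid decoding them in the first place.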

in actual fact, you could even leave all this stuff out in the runtime data format, and just swallow the strings whole and leave them unprocessed, but in general the single letter escapes save memory space...

similarly i have opted to take advantage of the intent of binary encoding for pubkeys, ids and signatures, as well as e and p tags and the filters, keeping them as binary in runtime form because of the space saving (and the lack of further processing needed to store them in the database)
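
The space saving is easy to see in a sketch. The hex values below are hypothetical placeholders, not real keys or signatures; the only assumption is the usual nostr wire format of lowercase hex, 64 characters for ids and pubkeys and 128 for signatures.

```python
# Placeholder values with the right lengths, not real keys.
pubkey_hex = "ab" * 32   # 64 hex characters on the wire
sig_hex = "cd" * 64      # 128 hex characters on the wire

# Decode once on ingest and keep the raw bytes in memory and in the database.
pubkey_raw = bytes.fromhex(pubkey_hex)
sig_raw = bytes.fromhex(sig_hex)

print(len(pubkey_hex), len(pubkey_raw))  # 64 32
print(len(sig_hex), len(sig_raw))        # 128 64

# Re-encoding for the wire is lossless as long as the input was lowercase hex.
print(pubkey_raw.hex() == pubkey_hex)    # True
```

Unlike string escapes, hex has one canonical lowercase spelling, so the round trip does not disturb the ID hash.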

it's this matter that leads to the need for a specification at all. if it had just been "store strings exactly as they are, b/c of crazy encoding schemes" the tests would not even be necessary, but on the other hand, the standard encodings put a lot of burden on devs writing client code that needs to make sense of these, especially for non-Latin scripts,

so if one client uses 16 bit hex escapes, and another has to be able to decode them... so, someone has to decode control characters and the JSON standard is a mess and UTF-8 is complicated af

so, yeah, anyway, my 2 cents

good work on the relay tester! 
 i have made a draft PR for nip-01 that elaborates every bit of the details of how to do escaping correctly, i'm sure i went too far with detail but maybe it is ok