 Nostrudel.ninja is only missing the automatic translation of the notes and then it would be perfect for me. 
 @hzrd149 if you need a LibreTranslate endpoint to test against, hit me up. 
 Transactions is probably one of the oldest things on my to-do list. I've tested LibreTranslate, although I couldn't figure out how to make it ignore URLs and other non-word content. 
 Interesting. I might look at how some of the other FOSS clients do this sometime this week. 
 oops, I meant "Translations" 
 I wrote some custom filtering for the translations we do at nokyctranslate.com, which I run. It filters out certain patterns like URLs, e-cash nuts, etc. I mostly did it so that what gets translated is shorter and thus cheaper.

I’d be happy to work with you to integrate it as a paid translation option for your client. I did that on Damus, which also supports another paid option, DeepL (credit card only though, no BTC).

I’m really excited about your client btw! 
 Removing the URLs is only half the problem. You have to add them back in the correct position. And since the text is translated, the length and positions have changed. 
 Yep, that’s what we do. A special counter syntax keeps track of where each piece of untranslated text goes, and we insert it back in after we get the translation. This does assume the translation service will leave the counter symbol untouched. Currently we use AWS machine translation as the translator and it’s nearly perfect at leaving the symbols in place. Haven’t tested LibreTranslate tho.
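
For anyone following along, the placeholder idea can be sketched roughly like this. To be clear, this is not the nokyctranslate.com code (that isn't public), just a minimal illustration of the technique. The `[[n]]` placeholder format, the `PROTECTED` patterns, and the `mask` / `unmask` / `translate_note` names are all assumptions made up for the example, and `translate` stands in for whatever translation backend is used.

```python
import re
from typing import Callable

# Assumed patterns for content that should not be translated: URLs and
# cashu-style tokens. Illustrative only, not an exhaustive list.
PROTECTED = re.compile(r"https?://\S+|cashu[A-Za-z0-9_\-=]+")

# Assumed placeholder format: [[0]], [[1]], ...
PLACEHOLDER = re.compile(r"\[\[(\d+)\]\]")


def mask(text: str) -> tuple[str, list[str]]:
    """Replace each protected span with a numbered placeholder."""
    saved: list[str] = []

    def stash(match: re.Match) -> str:
        saved.append(match.group(0))
        return f"[[{len(saved) - 1}]]"

    return PROTECTED.sub(stash, text), saved


def unmask(translated: str, saved: list[str]) -> str:
    """Put the original spans back wherever their placeholders ended up."""

    def restore(match: re.Match) -> str:
        index = int(match.group(1))
        # Leave unknown placeholders alone rather than raising.
        return saved[index] if index < len(saved) else match.group(0)

    return PLACEHOLDER.sub(restore, translated)


def translate_note(text: str, translate: Callable[[str], str]) -> str:
    """Mask, translate, then restore. `translate` is any str -> str backend."""
    masked, saved = mask(text)
    return unmask(translate(masked), saved)


# Stand-in translator for the demo; a real one would call a service.
def fake_translate(s: str) -> str:
    return s.replace("Check this out", "Mira esto").replace("it is great", "es genial")


print(translate_note("Check this out https://example.com/a?b=1 it is great", fake_translate))
# → Mira esto https://example.com/a?b=1 es genial
```

This only works if the translator returns the placeholders unchanged, which is the same assumption mentioned above. A note that already contains something like `[[0]]` would also collide with the placeholders, so a real version would probably want a less common marker.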

While our code isn’t open source, I’d be happy to share my logic there as I wrote it in python. 

Are you open to offering paid translation services through the client? You could offer LibreTranslate as a free option as well. That’s what Damus has. 
 *untouched and in the right position.

Traveling now but can shoot you over my code logic later tonight, in case it’s helpful. 
 I have a similar issue when using LibreTranslate to detect the language. Quality and performance can drop whenever the text contains URLs and non-word content (symbols). I have to use regex to remove those URLs and symbols 😅
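
A rough sketch of that kind of cleanup, in case it helps. The patterns and the `clean_for_detection` name are assumptions for illustration, not what any particular client uses. The result is only meant to be fed to language detection, since it throws away punctuation.

```python
import re

URL = re.compile(r"https?://\S+")
# Anything that is neither a word character nor whitespace:
# punctuation, emoji, hashtag signs and other symbols.
NON_WORD = re.compile(r"[^\w\s]+")


def clean_for_detection(text: str) -> str:
    """Strip URLs and symbols, then collapse the leftover whitespace."""
    text = URL.sub(" ", text)
    text = NON_WORD.sub(" ", text)
    return " ".join(text.split())


print(clean_for_detection("Hola! mira https://a.io/x?y=1 😅 #nostr"))
# → Hola mira nostr
```

One caveat: Python's `\w` does not match combining marks, so scripts that depend on them may need a gentler rule than this one.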

I took a look at the Mastodon source code. Interestingly, it seems they don't preprocess the text at all before requesting a translation.

https://github.com/mastodon/mastodon/pull/19218/files