you can do the translating with a model, for way more than 7 languages, for cheap. no?
nah, not that simple. the videos are tightly constructed to be just shy of 60 seconds, so translations would have to trim or accommodate shortened context depending on syllable length. needs to be proofread for accuracy and timing IMO