There are probably SaaS offerings, but if you want to do it yourself you’ll need https://github.com/ggerganov/whisper.cpp to generate the subs from the audio. Then use something else (ffmpeg, for example) to either mux the subs into the video container as an ASS track, or just hard-burn them into the video.
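
Roughly, the steps above could be scripted like this. It's only a sketch and makes a few assumptions: I'm using ffmpeg as the "something else", the whisper.cpp binary and model paths are whatever you built or downloaded (the binary has been called `main` and, in newer builds, `whisper-cli`), and the helper names are made up for illustration. The burn-in step needs an ffmpeg build with libass.

```python
import subprocess
from pathlib import Path


def extract_audio_cmd(video: str, wav: str) -> list[str]:
    # whisper.cpp's README asks for 16 kHz, 16-bit PCM WAV input.
    return ["ffmpeg", "-y", "-i", video,
            "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le", wav]


def transcribe_cmd(wav: str, out_base: str,
                   whisper_bin: str, model: str) -> list[str]:
    # -osrt writes an SRT file named <out_base>.srt.
    # whisper_bin and model are assumptions: point them at your own build.
    return [whisper_bin, "-m", model, "-f", wav, "-osrt", "-of", out_base]


def mux_ass_cmd(video: str, srt: str, out_mkv: str) -> list[str]:
    # Soft subs: copy the existing streams and store the subtitles
    # as an ASS track inside a Matroska container.
    return ["ffmpeg", "-y", "-i", video, "-i", srt,
            "-map", "0", "-map", "1",
            "-c", "copy", "-c:s", "ass", out_mkv]


def burn_in_cmd(video: str, srt: str, out_video: str) -> list[str]:
    # Hard subs: render the text into the picture (video is re-encoded).
    # Paths containing special characters need escaping inside the filter.
    return ["ffmpeg", "-y", "-i", video,
            "-vf", f"subtitles={srt}", "-c:a", "copy", out_video]


def run_pipeline(video: str, whisper_bin: str, model: str,
                 burn: bool = False) -> None:
    # Hypothetical driver: runs the steps in order, stops on first failure.
    base = str(Path(video).with_suffix(""))
    wav, srt = base + ".wav", base + ".srt"
    final = (burn_in_cmd(video, srt, base + ".burned.mp4") if burn
             else mux_ass_cmd(video, srt, base + ".subbed.mkv"))
    for cmd in (extract_audio_cmd(video, wav),
                transcribe_cmd(wav, base, whisper_bin, model),
                final):
        subprocess.run(cmd, check=True)
```

Something like `run_pipeline("input.mp4", "./main", "models/ggml-base.en.bin")` would give you a soft-subbed MKV; pass `burn=True` for the hard-burned version. Soft subs are quicker since nothing gets re-encoded, and the viewer can switch them off.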