The color, size, and position of the word the reader needs to catch quickly bounce around too much.
To see what I mean, watch it with the audio off when you don’t know what it is going to say next, and see if you fall behind when it plays through at speed. With the audio on, the text may feel visually complementary to the video, but I wasn’t in a place where I could have audio, so I struggled to keep up.
For a rapid-flicker, word-replacement style, keep the point of focus in the same spot on the screen so each new word can be taken in quickly, and keep the font, color, and scale consistent so it stays easy to read.
For a follow-me style highlight that flows along the transcript line, just don’t let everything shift around. Keep as much as possible stationary while the highlight moves across the line to each focus word.