I wrote one of these: https://github.com/mleku/vainstr. There isn't currently any acceleration or AVX implementation to speed up deriving public keys, so yes, it's one of the most compute-intensive operations you can do on a CPU, short of long division. I don't know exactly how many instructions are required; poking around with a disassembler could probably reveal this.
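
For a rough sense of where the cost comes from, here is a minimal pure-Python sketch of deriving a secp256k1 public key with plain double-and-add, plus a toy vanity loop. This is not how vainstr works (I'm not reproducing its code here); the function names `derive_pubkey`, `pubkey_hex` and `mine` are made up for illustration, and matching on a hex prefix of the x-only key is a simplifying assumption (a real miner would more likely match the bech32 `npub` form). For a random 256-bit secret the loop does roughly 256 point doublings and about 128 point additions, and in this naive affine form every one of those needs a modular inversion, which is why each candidate key is so expensive.

```python
import secrets

# secp256k1 domain parameters
P = 2**256 - 2**32 - 977  # field prime
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    x1, y1 = a
    x2, y2 = b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a + (-a) = infinity
    if a == b:
        # Doubling: tangent slope, costs one modular inversion.
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        # Addition: chord slope, also one modular inversion.
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return (x3, y3)


def derive_pubkey(secret):
    """Compute secret * G by double-and-add, scanning bits from the low end."""
    if not 1 <= secret < N:
        raise ValueError("secret key out of range")
    result = None
    addend = G
    while secret:
        if secret & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        secret >>= 1
    return result


def pubkey_hex(secret):
    """Return the 32-byte x coordinate as hex (x-only key, as Nostr uses)."""
    x, _ = derive_pubkey(secret)
    return format(x, "064x")


def mine(prefix, max_tries):
    """Try random secrets until the hex pubkey starts with prefix.

    Returns (secret, pubkey_hex, tries) or None if max_tries is exhausted.
    """
    for tries in range(1, max_tries + 1):
        secret = secrets.randbelow(N - 1) + 1  # uniform in [1, N-1]
        pub = pubkey_hex(secret)
        if pub.startswith(prefix):
            return secret, pub, tries
    return None
```

Calling `mine("00", 100000)` should find a match after a couple of hundred tries on average, and each extra hex character multiplies the expected work by 16. Real libraries avoid the per-step inversion by using Jacobian coordinates and precomputed tables, so this sketch overstates the cost per key, but the shape of the work is the same.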