One additional thing that might be worth modeling in more detail is the actual impact of such an attack across a long sequence of blocks. It looks like a Markov process with two states: either the attacker mined the previous block, so there is a delayed competition for the next block, or the next block is mined "fairly".
I'm not immediately seeing a closed-form solution, so I ran a quick simulation and found the following:
1% attacker and X = 10 minutes:
effective hashrate on the delayed block = 2%, and over a long chain unchanged at 1%
5% attacker and X = 11 minutes:
effective hashrate on the delayed block = 10.6%, and over a long chain 5.6%
10% attacker and X = 12 minutes:
effective hashrate on the delayed block = 20.4%, and over a long chain 11.3%
20% attacker and X = 14 minutes:
effective hashrate on the delayed block = 39.3%, and over a long chain 25%
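For anyone who wants to poke at the two-state process, here is a rough sketch in Python. It is not the simulation that produced the numbers above, and it rests on assumptions of my own: block discovery is memoryless with a 10-minute mean interval, the attacker mines alone for X minutes after each of its own blocks, and everyone competes at their raw share once the delay ends. All function names are made up for illustration. Under those assumptions the chain also appears to have a simple stationary share, a / (1 + a - p), where a is the attacker's raw share and p is its win probability on a delayed block. The figures it gives land close to the ones above but will not match them exactly.

```python
import math
import random


def delayed_block_share(attacker_share, delay_minutes, block_interval=10.0):
    """Attacker's chance of winning the block that follows one of its own.

    Assumption: the attacker mines alone for delay_minutes, then competes fairly.
    """
    # Chance the attacker finds a block on its own during the head start.
    p_head_start = 1.0 - math.exp(-attacker_share * delay_minutes / block_interval)
    # If it does not, mining is memoryless, so it wins with its raw share.
    return p_head_start + (1.0 - p_head_start) * attacker_share


def stationary_share(attacker_share, delay_minutes):
    """Long-run fraction of attacker blocks for the two-state chain."""
    p_delayed = delayed_block_share(attacker_share, delay_minutes)
    return attacker_share / (1.0 + attacker_share - p_delayed)


def simulate_long_run_share(attacker_share, delay_minutes, n_blocks=200_000, seed=1):
    """Monte Carlo estimate of the same long-run fraction."""
    rng = random.Random(seed)
    p_delayed = delayed_block_share(attacker_share, delay_minutes)
    attacker_mined_prev = False
    attacker_blocks = 0
    for _ in range(n_blocks):
        # State 1: attacker mined the previous block (delayed competition).
        # State 2: someone else did (fair competition).
        p_win = p_delayed if attacker_mined_prev else attacker_share
        attacker_mined_prev = rng.random() < p_win
        attacker_blocks += attacker_mined_prev
    return attacker_blocks / n_blocks


# The four (share, X) pairs from the results above.
for share, delay in [(0.01, 10), (0.05, 11), (0.10, 12), (0.20, 14)]:
    print(
        f"{share:.0%} attacker, X={delay} min: "
        f"delayed block {delayed_block_share(share, delay):.1%}, "
        f"long chain {simulate_long_run_share(share, delay):.1%} simulated, "
        f"{stationary_share(share, delay):.1%} from the formula"
    )
```

The simulation and the formula should agree up to sampling noise. If they diverge, the model assumptions, not the chain arithmetic, are the first thing to question.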
It might also be interesting to see the impact across difficulty adjustments (all other things being held constant), but I haven't looked into that.