@novalis yeah I have some of the same thoughts since my thoughts about regexes were also formed in theory of computation class some things I find weird: - "a regex defines a regular language" doesn't exactly map to regexes in practice (the '?' feels a little slippery to me, and in fact I get a little confused about ?s in my regexes) - the thing where matching regexes with backreferences is NP-hard (I never use backreferences because they confuse me, but is it bad to use backreferences?)
@novalis also I have a pretty clear model of how to "be the computer" for a python program, but for a regex it's harder because actually to match an NFA you have to maintain multiple possibilities in your head at the same time or something? Maintaining 3 parallel possibilities for where in the state machine we could be actually doesn't feel that easy. running a DFA computer in your head is pretty easy but I don't think anyone is really converting a regex to a DFA in their head
@novalis (when I say "maintaining many parallel possibilites at a time", I'm thinking of this "Implementation: Simulating the NFA" section from https://swtch.com/~rsc/regexp/regexp1.html where he simulates an NFA using an array of states)
@Julia Evans I definitely do not think of it as a NFA -- I think of it in terms of backtracking. That happens to be how Python implements it (last time I checked). I know that Rust and Golang use NFAs, but even there, it doesn't usually hurt to think about backtracking. In practice, the NP thing is rarely an issue. I think I hit exponential behavior (not NP) once in the last 25 years, and I really like regexps. Of course you have to worry about it for untrusted expressions, but that's a pretty rare case. So, when I see something like "(ab)*abraca" matching ababababraca, I think, "so, we'll match some 'ab's, and then the last one we'll have to backtrack so we can get it to part of 'abraca'. (Actually, I think "let me rewrite that as (ab)+raca", but it's just an example).
@novalis that makes sense thanks -- thinking about it in terms of backtracking seems much more realistic