the "heaviest block wins" and "block reward can be up to some maximum" rules combine with network latency and block propagation patterns so that, a lot of the time, blocks paying the most commonly accepted reward maximum, or less, have more time to be found than blocks on a fork paying more
even nodes that have forked to allow bigger rewards than the canonical schedule will still accept blocks with a lower reward
all such a block needs to be the "best block" is a smaller block hash
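a rough sketch of those two rules, just to make the asymmetry concrete. the names here (`Block`, `accepts`, `best_block`, `max_reward`) are made up for illustration and are not from any real client; it assumes the block hash is compared as a plain integer and that the reward is a single number checked against the node's own maximum:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    block_hash: int  # hash read as an integer; smaller counts as "heavier" (assumption)
    reward: int      # reward claimed by the miner


def accepts(block, max_reward):
    """A node accepts any block whose reward is at or below its own maximum."""
    return block.reward <= max_reward


def best_block(candidates, max_reward):
    """Among the blocks this node accepts, the smallest hash wins."""
    valid = [b for b in candidates if accepts(b, max_reward)]
    return min(valid, key=lambda b: b.block_hash) if valid else None


# hypothetical numbers: canonical maximum 50, forked maximum 100
low = Block(block_hash=0x0A, reward=50)
high = Block(block_hash=0x0F, reward=100)

forked_choice = best_block([low, high], max_reward=100)
canonical_choice = best_block([low, high], max_reward=50)
```

with these made-up values the forked node sees both blocks as valid and still picks `low` because its hash is smaller, while the canonical node never considers `high` at all.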
so long as the majority of nodes stick to the schedule, the miners have to stick to it too; and yes, that majority includes even the little humble node runners
because all it takes is a few miners still using the old schedule, and most of the time there is a 50/50 chance their block is going to be "heavier"; with restricted propagation there is plenty of time for such side chains to become the winner in cumulative work (smallest sum of block hashes across the chain)
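the chain-level version can be sketched the same way. this is only an illustration under some assumptions: a chain is a list of `(block_hash, reward)` pairs, competing chains have the same length, and "cumulative work" means the smallest sum of block hashes as described above. `chain_valid`, `chain_score` and `best_chain` are hypothetical names:

```python
def chain_valid(chain, max_reward):
    """A node only follows chains where every block respects its reward maximum."""
    return all(reward <= max_reward for _block_hash, reward in chain)


def chain_score(chain):
    """Cumulative work as the sum of block hashes; lower is better (assumption)."""
    return sum(block_hash for block_hash, _reward in chain)


def best_chain(chains, max_reward):
    """Pick the valid chain with the smallest hash sum, or None if none is valid."""
    valid = [c for c in chains if chain_valid(c, max_reward)]
    return min(valid, key=chain_score) if valid else None


# hypothetical chains: old schedule pays 50, the fork pays 100
old_schedule = [(5, 50), (7, 50)]   # hash sum 12
big_reward = [(6, 100), (9, 100)]   # hash sum 15
lucky_fork = [(1, 100), (2, 100)]   # hash sum 3

forked_view = best_chain([old_schedule, big_reward], max_reward=100)
canonical_view = best_chain([old_schedule, lucky_fork], max_reward=50)
```

here the forked node ends up on `old_schedule` whenever that chain has the lower hash sum, but the canonical node never moves to a bigger-reward chain no matter how low its hashes are, which is the one-way pull toward the schedule.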