I'm confused. If you want to know "does this have muted words in it?" that's one bit? If you want to support dynamically changing mute words then you probably do a pair of generation numbers (one for words added to mute list, one for words removed) and then recalc on the fly if it could be a false negative / positive.
The performance difference between a 50% full hash table and a 100% full is minimal in practice. And if you don't use the bitstuffing tricks of ccan/htable you will get a second cache hit to actually validate the miss.