Here are my 2 cents on the questions:
1. App developers often introduce bugs, and with list events it is easy to unwittingly erase the list: if the user wants to add a new pubkey and the client cannot find the previous event on the network, the new event replaces the old list. Apart from that, it's easier to build a graph from individual events (as VictorPamplona mentions in the GitHub issue).
1.5 Replaceable events or static events? - Replaceable events can be updated if the trust relation changes over time. Also, event deletion seems to be poorly implemented in relays.
2. Another model is private to the author + shared by relay/DVM when needed. I think the common sentiment is that we all see use cases for private trust events, but the implementation is more complex. We would need an interactive protocol for this (as Hodlbod mentioned), or ZK proofs, or who knows what. For now, I think we should go for public, explore the possibilities, and implement a private model once we know more about the problem and see what the main use cases are.
3. I would vote for optional scalar "score" and "confidence" values. If they are absent, the event is a binary "true" value.
4. I like the term "context". I think it could be optional, but I see a lot of value in embedding it in the event so that filters can use it as input. I don't clearly see the use case for having it derived by filters. Is it like "I trust this guy", and the algorithm tells me the context it was used in?
5. I'm probably not the one who has dug into it deeply. Not sure if this is what you mean by spec, but I'd like to know: what are the inputs/outputs, where should the filter be run (client / relay / DVM / phone), and when (when content is created or when it is queried)? Should I be able to make a relay subscription based on a content filter?
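
To make points 3 and 4 a bit more concrete, here is a rough sketch of how a client might read such an event. The tag names `score`, `confidence` and `context` are only my assumption for illustration, nothing is specified yet; `p` is the usual pubkey tag. If both scalars are missing, the event is treated as a plain binary "I trust this pubkey".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrustAssertion:
    target: str                  # pubkey being rated (from the "p" tag)
    score: Optional[float]       # None when the tag is absent
    confidence: Optional[float]  # None when the tag is absent
    context: Optional[str]       # None when the tag is absent

    @property
    def is_binary(self) -> bool:
        # No scalars given: the event is a plain "true" statement.
        return self.score is None and self.confidence is None


def parse_trust_event(event: dict) -> TrustAssertion:
    # Hypothetical tag names; if a tag repeats, the last one wins.
    tags = {t[0]: t[1] for t in event.get("tags", []) if len(t) >= 2}
    if "p" not in tags:
        raise ValueError("trust event has no target pubkey")

    def as_float(name: str) -> Optional[float]:
        return float(tags[name]) if name in tags else None

    return TrustAssertion(
        target=tags["p"],
        score=as_float("score"),
        confidence=as_float("confidence"),
        context=tags.get("context"),
    )
```

With the context embedded as a tag like this, a filter can take it directly as input instead of having to infer it.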