To your point, here’s the current summary of my approach:
1. Decide what question you want to answer.
2. Select sources of raw data that are 1) available to you and 2) relevant to the question.
3. Translate raw data into a format suitable for consumption by the grapevine algo.
4. Crunch the numbers.
Suppose the question is which nostr users are not bots, so that you can maintain a list of them. In step 2, you may decide that follows (and mutes, and zaps) are the best sources of CURRENTLY AVAILABLE data, so those are what you use today. But if tomorrow a better source of data becomes available, you can throw that in the mix to improve the quality of the end result. And you can use multiple sources of data at the same time: no need to pick and choose. What you can and probably will do is adjust the relative weights you give to each data source. So as new sources of data become more and more available, you may want to gradually decrease the “weight” you attribute to follows towards zero.
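Here is a rough Python sketch of the weighting idea, to make it concrete. It is not the actual grapevine algo: it just sums weighted signals per user. The sample signals, the `SOURCE_RATING` table, the function names, and the threshold are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical raw data (step 2), already translated (step 3) into
# (source, rater, ratee) tuples. None of this is real nostr data.
SIGNALS = [
    ("follow", "alice", "bob"),
    ("follow", "carol", "bob"),
    ("zap", "carol", "bob"),
    ("mute", "alice", "spambot"),
    ("follow", "spambot", "spambot2"),
]

# Assumed interpretation of each source: +1 endorses the ratee, -1 flags them.
SOURCE_RATING = {"follow": 1.0, "zap": 1.0, "mute": -1.0}


def score_users(signals, weights):
    """Step 4: sum each user's signals, scaled by the weight of the source."""
    totals = defaultdict(float)
    for source, _rater, ratee in signals:
        totals[ratee] += weights.get(source, 0.0) * SOURCE_RATING[source]
    return dict(totals)


def not_bots(signals, weights, threshold=0.0):
    """Return users whose combined score is above the (arbitrary) threshold."""
    scores = score_users(signals, weights)
    return sorted(user for user, score in scores.items() if score > threshold)


# Today: follows count as much as anything else.
TODAY = {"follow": 1.0, "zap": 1.0, "mute": 1.0}
# Later: better data exists, so the weight on follows has been dialed down to zero.
LATER = {"follow": 0.0, "zap": 1.0, "mute": 1.0}

print(not_bots(SIGNALS, TODAY))  # ['bob', 'spambot2']
print(not_bots(SIGNALS, LATER))  # ['bob']
```

Same raw data, different weights, different list: once follows stop counting, the account whose only endorsement was a follow from a muted account drops off. A real implementation would presumably also account for how much you trust each rater, which this sketch ignores.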
And indeed, as more sources of raw data become available, you may decide you want to alter the question from step 1. Not because you didn’t previously care about the new question, but because you simply didn’t have any relevant data to answer it. This, too, can happen gradually.