unfortunately i think that's the catch: the less metadata it leaks, the more candidates you're going to turn up, which defeats the purpose
but regardless, i do think that for building DVMs able to find any kind of data, this kind of cryptography is the way to let someone with a sufficiently high-entropy clue set match it up to a highly obfuscated data point
broaden your concept of how to generate the match set, and temper your expectations with the idea that people who publish such hashes may be putting a bullseye on their backs
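rough sketch of the tradeoff above, in case it helps. this is not any real DVM or NIP scheme, just an assumption for illustration: each record publishes a truncated SHA-256 of its clue set, and `bits` controls how much of the hash is exposed. the names `tag`, `candidates`, and `published_tags` are all made up here.

```python
import hashlib

def tag(clues, bits):
    """Hash a clue set (order-independent) and keep only the top `bits` bits."""
    joined = "\x1f".join(sorted(clues)).encode()
    digest = hashlib.sha256(joined).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)

def published_tags(records, bits):
    """What the publishers would expose: one truncated tag per record."""
    return {name: tag(clues, bits) for name, clues in records.items()}

def candidates(records, clues, bits):
    """Every record whose truncated tag matches the searcher's clue set."""
    want = tag(clues, bits)
    return [n for n, t in published_tags(records, bits).items() if t == want]

# hypothetical data: 200 records, each with its own two-clue set
records = {f"rec{i}": {f"clue-a{i}", f"clue-b{i}"} for i in range(200)}
my_clues = {"clue-a42", "clue-b42"}

loose = candidates(records, my_clues, bits=4)   # leaks little, matches many
tight = candidates(records, my_clues, bits=64)  # leaks a lot, matches one
print(len(loose), len(tight))
```

with few bits exposed the tag says almost nothing about the record, but the searcher gets a pile of false candidates. with many bits the match is unique, but the published tag becomes a stable fingerprint anyone can track.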