anyone have a good idea on how to make hashes of phone numbers rainbow-table resistant? Trying to think of a way to find friends on nostr by phone number but without leaking the number. maybe argon2(first name + number) ? We should spec this if so.
GM Will hitting the ground running this morning I see, good on you! (I haven’t the foggiest idea how to help you with that but I’m glad you and all these other fine people are thinking about it already this morning.) Thanks for making Damus I use it everyday now and it’s exciting to know there will always be something to innovate!
I still feel like name + number is rainbow-tableable. Maybe a deterministic salt beacon (like hash of bitcoin block at height (N mod 4096 == 0) to resalt the hash every 4 weeks? but then old hashes could be cracked eventually 🤔 will ponder further
Double sha256 hash? That's what Satoshi would do
Too weak, easy to compute a large table 🐶🐾🫡
I'm starting to think phone numbers don't have enough entropy. Without another source, it's hard.
What about this? nostr:nevent1qqsx0fe5xkckvtxzwa4fv2s3jszg7ctg5g5week44gnxf0yhcyzrt5gpz4mhxue69uhhyetvv9ujumt0wd68ytnsw43qzxrhwvaz7tmddahxzepwdf3r2dfwvdhk6w3cxqurqqgewaehxw309ac8junpd45kgtnxd9shg6npvchxxmmdqyg8wumn8ghj7mn0wd68ytnhd9hx232ahc5
That gets out of my depth 🤣
Argon2 is hard to generate with the modern hardware and CPU. What you could do is, to make it even harder, is generate each phone number hash with deterministic hash (0-16 prepended) and when looking up, generate all 16 hashes and then match against stored hashes. Not fool proof approach, since you literally need a rainbow table, but would take years for anyone determined to bruit force it. And storing such table would take a few TB of storage (not much but). One more way, that will make lookup a pain, is to salt it and then compute for each comparison, but then it will take ages in normal use. Short of finding a compromise (spend 10 seconds to compute n hashes that will be used for lookup), I see no other way. 🐶🐾🤔
If i’m reading this right: argon2( argon2( argon2(bob + first digit), bob + second digit) … ) ?
Argon2 per digit of the number appending hash in each iteration. Not sure how complex that will be, but all you need is to have it compute in 1-3 second on the modern phone 🐶🐾🤔
Something like that? 🐶🐾🤔 function hashPhoneNumber(phoneNumber): hashedValue = "" for each digit in phoneNumber: if hashedValue is empty: hashedValue = argon2Hash(digit) else: concatenated = hashedValue + digit hashedValue = argon2Hash(concatenated) return hashedValue
I like this. It would be crazy hard to build a table for this, maybe by setting a large iteration security parameter? I would just have an initial salt value just in case this approach gets standardized, but otherwise I’m leaning toward this. I don’t think you would even need the name to simplify it.
I think so! And we hardest iteration number you’ll make it close to impossible to table for the next 5-10 years, which is sufficient for this level of information 🐶🐾🫡
You all are so fucking smart it’s ridiculous
But as usual, we're solving problems from a technical perspective. 😞
what do you propose to solve this problem?
I propose not to stick with that old "phone number" stuff. Just don't do it, it's not necessary, we have nip05, I.e.
What if the phone number just gets associated with an independent keypair, and instead of a "search" occurring, a sort of encrypted inbox is utilized? Someone adds their phone number to nostr by having this new and unique keypair that is indexed by the phone number. When a user tries to find their friend, they receive the associated keypair. This user then deposits an encrypted payload identifying themselves, perhaps even including a phone number if it exists. This would allow many different UI flows, including contact matching. The original user who added their number in this way can check on the status of their "encrypted inbox" by requesting new events to the inbox. Only the user controlling the private key would be capable of decrypting the inbox, so it likely wouldn't need to even verify they own the inbox. Once the user sees that their friend is there (a suggested connection, or some way to vet users requesting access?), they just add them and a connection is made. There would be some things to work out, like proving you own the number, and protecting anonymity of the person registering this encrypted inbox (built in, multi-hop, encrypted relaying?), but seems like this sort of flow could function.
Problem statement: you have a friend’s phone number. You want to find your friend’s npub in such a way that the number is never revealed to anyone who doesn’t already know it. That is phone->npub, but NOT npub->phone. One approach is to treat it as a two-step process. The first step is to find candidate npubs who credibly could have the number. The second step is to challenge candidates to prove it. Step 1. Find candidates. Suppose each npub publishes a Bloom filter which has been fed their phone number AND other random garbage. By fetching these filters, you can test them to see if any at least credibly claim to have the number. Step 2. Challenge candidates. For each candidate, send a DM which has a random payload symmetrically encrypted with a key containing their phone number and a nonce. Tell them the nonce and encryption scheme in the DM. If they have the phone number, they can decrypt the random payload and return it as proof. If they don’t, they’ll have to search. But they can’t have pre-computed the rainbow table because your nonce was random, and also you control the encryption scheme. If a candidate doesn’t respond in a timely manner, this could be because they didn’t receive the DM, or they’re trying to crack the number. So you’ll have to use some judgement, perhaps starting a DM conversation first, then issuing the challenge. Note: Because the search space of phone numbers is so small, the Bloom filters must contain a bunch of random phone numbers as false positives to prevent an attacker from determining an npub’s number by brute force tests. A few tens of thousands ought to be enough. There are probably holes in this approach, I just thought it up now in response to your query.
On further thought, the entries in the bloom filter probably also have to be salted and re-hashed. That way an attacker has to salt and hash every phone number and test it against the filter even to get the 10k bogus ones. Otherwise, there’s little value in putting it in a bloom filter aside from space saving and computation. Salting and rehashing makes the attacker’s job harder.
Sending an encrypted phone number is unnecessary in Step 2. This is the interactive protocol you want: https://en.wikipedia.org/wiki/Socialist_millionaire_problem
perhaps un-adopt first name in favour of “shared secret” as a safer way to think about this?
but why phone numbers?
It’s the most common method for finding friends on pretty much every phone app that exists.
Unfortunately, I can't think of a better idea. I just hate phone numbers because it's a kyc hell.
If you ARE NOT adding a random salt, you get the rainbow table problem. If you ARE adding a random salt, you can't match the result up with others for a lookup that will be using a different random salt.
Note that this doesn't protect against impersonation. I can just compute the argon2 of "Alice" + "555-4939" and publish it in my profile. And then you'll contact me and I'll pretend I'm your friend Alice and hilarity ensues or something more sinister.
Accumulators might help here https://en.m.wikipedia.org/wiki/Accumulator_(cryptography)
I had this thought! But didn’t see anyone doing this with accumulators 🤔
i think accumulators combined with web of trust might be the magic you are looking for if you know a few people in common and there is a good chance they are linked one or two steps from this other person it acts as a big set of extra set members to get a result that approaches true/false what that would probably mean in practise is you would fill in several numbers in one search and when the keys of several hit then it narrows down the identities to a small set, and uses web of trust traversal following the memberships i'm sure you are gonna bake your noodle trying to figure out a credibly secure method to do this on an async data system like nostr but i just want to keep encouraging you to look into it i'm sure i'm not the only person with a passing curiosity about such things... cryptography applications are nowhere near a complete set of principles as you see in many other sciences, there is a heap of new combinations of tools that haven't happened yet, combinatorial possibilities in cryptography, so, LFG!
Too much risk for less profit. I would never reveal my phone to a nostr client. Phone's are registered by governments.
this is why cryptography exists, to do things like this without having to reveal your number to anyone. If you are paranoid about it you shouldn’t use a sim card. Also this doesn’t have to be a phone number, it could be any piece of data shared among friends. Phone # is just the obvious one that is in contact lists.
Yes, there is something called “salt” in hashing. You add a value (salt), to the end of the phone number and when you hash it with the salt it is completely different than the phone number without the salt. This makes it extremely resilient to rainbow lists. I hope that is helpful.
@jb55 https://pypi.org/project/proximityhash/ geohashes are one form of proximity hash scheme, there probably is others i think if you look closely at this you may find a way whereby you can get a close match that massively thins down from a set of publicly visible hashes that are partially generated from phone numbers, such that they don't facilitate reverse indexing but enable those with the number to find them there is probably a heap of papers you can find and maybe someone has already posed the question of how to make a forward matching phone number search resistant to reverse matching it's a problem that is very old, i remember in the old days when there started to be reverse phone number databases, which is the problem you are trying to solve, so probably someone has already got a scheme that is data-type-specific
I was trying to google this, I would be surprised if this hasn’t been thought of before.
i hope you find a way to do it because it would be a boon for nostr onboarding
as a side note, it might be a feasible mechanism to create giftwraps that can be searched for, btw, eliminating the key problem with sender-only-identified DMs (giftwraps)
as a side note, it might be a feasible mechanism to create giftwraps that can be searched for, btw, eliminating the key problem with sender-only-identified DMs (giftwraps)