13

Any ideas for how two people could anonymously determine if they live in the same city?

Submitted by sudo in Privacy

I was thinking that it would be nice to see if anyone here lives in the same city I do, so we could start organizing in real life. But, I thought, if they don't live in the same city, then I don't want to reveal my real location to them, as I'm sure they don't want to reveal their real location to me.

So, my original idea was for each person to take the name of their city, hash it using sha512sum, then send the hash to the other person. If the hashes match, then they both live in the same city. If they don't, then they live in different cities (or one typed their city name with a capital letter and the other didn't, or one typed the city and the province where the other typed only the city name, etc. This would have to be accounted for somehow).

This sounded like a good idea, until I realized that there are not that many city names (even if there were 10,000, that wouldn't be that many for hash cracking), so one person could brute-force attack the hash quite easily, especially if they started with the most populated cities. If the other person was in fact not a revolutionary leftist, but a double agent pretending to be a leftist, they would be able to find out what city you live in this way. Since that's always a possibility, this method won't work, because we need a method that makes absolutely sure the other person could never figure out what city you live in, if they don't live in the same city.

Does anyone have any ideas?

Comments

You must log in or register to comment.

11

zorblax wrote (edited )

Alice and Bob both hash their locations.

Alice asks Bob for the first bit of his hash. If they differ, Alice terminates the communication.

Bob then asks Alice for the second bit of her hash, and so on.

Thus they become asymptotically more sure that they have the same hash over time, and if there are n cities the chance of an attacker correctly guessing enough bits to determine the others' location is close to 1/n.

Another idea that relies on physical location rather than place-names:

Alice sends Bob a random location and asks how far away it is from him, with some bounds. Bob returns the real answer plus or minus some random amount of kilometers within bounds. If Bob is too far from the location to be near Alice, Alice terminates the communication.

Bob then asks Alice a similar question, and so on.

This continues, with intersections of disks building up, until both Alice and Bob know almost for certain where each other live. This would probably take fewer rounds and also would avoid the problem that people might type location names differently, and it can work to any accuracy (you can just learn if someone lives less than 1000 miles from you, if you so desire)

The disadvantage of this system is that it can be succeptable to a sort of replay attack: Eve could connect to Bob multiple times, get a location range, and then terminate the connection, thus being able to determine Bob's location over multiple attacks. It requires some discretion from Bob.

6

sudo wrote

I like the first idea better than the second one. The second one, I assume, requires Alice to initially choose some random location at least somewhat close to her own city. If she lives somewhere very isolated, like Honolulu, Hawaii, then she would have to pick a location either on one of the Hawaiian islands, or in the Pacific Ocean near Hawaii. This would make it obvious to Bob that she lives on one of the Hawaiian islands, since there is nothing else near there within the bounds they agreed to. This could also be the case for cities surrounded by deserts, or forests, or sparsely populated plains, etc. If Alice lives in one of those cities, she could start by setting the bounds very wide, just to see if Bob lives on the same continent. But once she starts narrowing them down to exclude other cities on the same continent or in the same country, it would be obvious to Bob where Alice lives, since the random points she picks would start converging in one area where there is only one city. So, that is one case where the second method would not work in protecting each party's anonymity.

I like the first method - it seems to be mostly safe. The only problem I can see with it would be that it could be used to rule out large numbers of cities, and it is also vulnerable to the same replay attack, albeit much slower. Let's say Eve pretends to be Bob, and tells Alice the first bit of Bob's city hash is 0 (Eve is making this part up - she doesn't actually know what Bob's hash is). Alice disconnects. This tells Eve that the first bit of Alice's city hash is not 0, which leaves only 1. Assuming that sha512 hashes are just about random, this means Eve can rule out half of the cities in the world for where Alice lives. A few days later, Eve pretends to be Carol, and tells Alice the first bit of Carol's city hash is 1. Alice replies that her first bit is also 1, but Eve already knew this. Then, Eve tells Alice that the second bit of Carol's city hash is 1. If Alice disconnects, then Eve knows the second bit of Alice's city hash is 0, and she can rule out another half of the cities in the world (down to 1/4 now). But, if Alice says the second bit of her hash is 1, Eve can keep going, and rule out cities much faster.

This isn't that big of a problem, since there are thousands of cities in the world, so even if Eve rules it down to just a dozen, she isn't much closer to finding Alice. Plus, Alice might catch on that someone new keeps messaging her every few days (or weeks, or months), and the bits of their city hash always match, for the ones that Alice has previously revealed to anyone, but they start mismatching when she hasn't revealed them yet. She may realize that she's being attacked, and stop responding. But it could be a problem if Eve starts out knowing that Alice lives in only one of a handful of cities, like if Alice was careless and revealed the state or province she lives in, or if she used some word or phrase only used in a certain area of a certain country. This way, Eve could very quickly rule out cities in her short list, and possibly determine which city Alice lives in before Alice catches on to the attack.

With my original idea, if it didn't have the problem of being brute-forceable, the only piece of information that Alice would be revealing is that she doesn't live in the same city that Bob (or Eve) does. That would only rule out 1 city at a time. With this one, for each bit she reveals, that rules out roughly half of the cities in the world. So, I guess it depends on each person's threat model. If you think the person on the other end might be an attacker, and you think they have narrowed down the cities you could possibly live in to a dozen or less, then I wouldn't use this hash bit-by-bit method. Otherwise, it's probably safe.

2

zorblax wrote (edited )

But once she starts narrowing them down to exclude other cities on the same continent or in the same country, it would be obvious to Bob where Alice lives, since the random points she picks would start converging in one area where there is only one city. So, that is one case where the second method would not work in protecting each party's anonymity.

Yes, but the point isn't to prevent the other person from knowing your location, the point is to only reveal your location if the other person probably lives there too. If you start to converge as more rounds are played, that's good -- you need it to converge so you eventually get an answer.

Additionally, with very wide bounds you don't actually get a compacted region, you get a washer-shaped zone of possibility. For the first 2 rounds it would be practically impossible to figure out.

Let's say Eve pretends to be Bob, and tells Alice the first bit of Bob's city hash is 0 (Eve is making this part up - she doesn't actually know what Bob's hash is). Alice disconnects. This tells Eve that the first bit of Alice's city hash is not 0, which leaves only 1.

I thought of that, but I don't think it's that big of a deal.

I just came up with a better one that has none of these drawbacks: Alice and Bob agree on a salt, and both hash their locations with the salt(using some intentionally slow hasing function, like 1 minute per hash) and sign that data. They then send the detatched signature to each other. If they have the same location, the signature will verify on their hashed location, if not, it will not, with no information leakage.

It could still be broken by brute force, but not as easily. You could improve it by involving Carol, who compares the two hashes and only exchanges the signatures if they are the same, but that would mean trusting Carol to not always give Eve your signature for brute-forcing.

This is a very difficult problem to solve because it's basically a two-way zero-knowledge proof, where there are a very small number of possible objects. I don't know if you'll get better than a guessing game.

2

sudo wrote

I don't think that new idea is much better, unless CryptoNight is unbearably slow (not slow as in computeHash(); sleep(10);, I mean slow as in it takes a lot of processor time to compute the hash - and if it's that slow, how could Alice and Bob compute their hashes in a timely fashion in the first place?) Even if they have to agree on a salt first, all Eve would have to do is create a table of hashes from big city names + salt, then try to verify each one with the detached signature until it clicks. Then she'd know where Alice lives. Or, if Eve already computed a table of hashes of city names with a given salt (say the salt is "StalinsMoustache"), then Eve could suggest to Alice that they use "StalinsMoustache" as the salt. Alice would agree, and Eve's work would be reduced significantly. Or, if Alice is aware of this trick, and Bob suggests a salt to her, she says, "No, Bob, I think you might really be Eve, and you're just suggesting that salt because you have a pre-computed hash table of city names using that salt, and you'll try to brute-force it using my signature once I give it to you. I'm picking the salt." To which Bob replies, "Well, you could really be Eve impersonating Alice, and you could be trying to trick me with a salt you made up beforehand and computed a hash table from. So, I can't trust you to pick the salt." If neither can trust the other to pick a salt, then they wouldn't be able to do this exchange. (Maybe they could have some third party randomly generate the salt, but that's just shifting the trust again.)

It seems to me like your first idea is still the best one that's been suggested.

1

zorblax wrote

Use Diffie-Hellman to determine the salt. Unique for any pair of people.

And yes, the hash function would have to be very very slow to make brute-force hard.

1

sudo wrote

Diffie-Hellman would only work if you've already met the other person in real life before (so you know that they actually exist). If Eve is pretending to be Bob online, but Bob doesn't actually exist in real life, doing a Diffie-Hellman exchange won't help Alice. It would only work if Alice knew Bob in real life, but thinks the person she's talking to online isn't really Bob, but Eve.

I guess your first idea (bit-by-bit revealing of the hash) is the best we can do.

1

zorblax wrote

That's a problem of the protocol, not the primitive itself. If Eve can impersonate Bob then the entire thing is basically public to her anyway, no matter what you do.

4

josefStallman wrote

I don't think it would be too hard to write a program to work out the holes. You could host it somewhere where neither user has access to the other's hash at all.

3

sudo wrote

Yeah, but then whoever hosts it would have access to the hashes. They'd just be trusting the hoster, instead of the other person. I want to eliminate the trusting part entirely.

5

josefStallman wrote

What about a tor-style layered encryption setup? So each user downloads a program, they input their location, and then that's converted to co-ordinates, maybe? Then the program hashes the co-ordinates and then hashes the hash? That's sent to the other user's program, which unhashes it without the user being able to see the result, and then unhashes that into a location?

3

sudo wrote

But that would still have a 1:1 correlation between city name and final hash, right? If so, that'll just make a little bit of extra work for someone to undo it. It's still vulnerable to the same type of brute-force attack the original was.

1

josefStallman wrote

You could salt the hash

2

sudo wrote

That's only useful for passwords. It wouldn't be useful here, because when you're authenticating a password, the salt is concatenated with the text the user typed in before it's put through the hashing function. So, in order to come up with the same hash, the other person would need to know the salt you're using, thus making it worthless for our purposes. (In real life, salts are used to protect against dictionary attacks or rainbow table attacks, assuming the hashes are leaked, but not the salts. If the attacker knows the salt, then it's useless.)

1

alqm wrote

But how will the program know the city where these coordinates point to? By using an external map service like openstreetmap? Or maybe I didn't understand it.