wu :: forums (http://www.ocf.berkeley.edu/~wwu/cgi-bin/yabb/YaBB.cgi)
riddles >> hard >> Sharing secrets
(Message started by: towr on Jan 26th, 2006, 5:57am)

Title: Sharing secrets
Post by towr on Jan 26th, 2006, 5:57am
   You're running an (non-US) airline company and have a number of passengers that for some incomprehensible reason want to fly to the United States. The US on the other hand has a no-fly list, barring numerous people from entering US airspace. Obviously it would be beneficiary to both of you to know how those two lists overlap.
   However, due to privacy concerns you don't feel like sharing all your passengers' personal details, and neither does the US feel like sharing the complete list of people on their no-fly list. It's not all angst and distrust though, the both of you managed to agree on a one way encryption method and a unique representation of your respective 'secrets'. (i.e. if you both have the same data-item, you have an identical key representing it)
   So now, how can you check your list against theirs, while at the same time giving as little as possible additional information away (like the number of people on your list)

   Also try to undermine attempts at 'cheating'. Like fake entries on the list, or suspiciously complete lists (possibly from encoding the personal information they could get from phone books of your area etc)

Bear in mind the method has to practical, it's sort of a CS question, but I think sufficiently hard to deserve a place here. What's asked for is a relatively safe shared secret detection protocol, which can compare lists for matches efficiently.

Title: Re: Sharing secrets
Post by SMQ on Jan 26th, 2006, 8:10am
I can't think of a way to avoid "suspiciously complete" lists except by repeatedly reminding both parties that it is in everyone's best interest that the comparison be as accurate as possible...

Other than that, I would propose both parties pad their lists with as many randomly-generated entries as they feel comfortable with (to avoid leaking information about list size), then compare keys as normal.  As long as the agreed one-way function is wide enough, the chance of a false-positive collision, even in lists padded to millions of names, is vanishingly small.

For SHA-512, for instance, using only half the hash, the chance of an individual collision is 1/2256, so even for lists of, oh let's be outrageous and say 240 ~ 1012 names each, the chance of collision is approximately 1 in 2256 - (40 + 40) = 2176 or about 1 in 1053.

If this comparison is to take place repeatedly rather than just once, I would propose that the random padding be changed in approximate proportion to the change in the actual list, to avoid leaking information about list size over time.  So if 5% of the names on your list changed, change somewhere around 5% of the padding as well.

That strategy would protect list size information at the cost of leaking rate-of-change information, but in the example given both parties would already expect the airline's passenger list to be mostly different and the US no-fly list to be mostly the same each time anyway.

--SMQ

Title: Re: Sharing secrets
Post by towr on Jan 26th, 2006, 9:13am
I think you'd have to do a lot of padding if you use that as the main method to obfuscate your list size.
But it seems like a good way to check whether the other party isn't just pretending to know everything already, to get a better idea of what you know. Because you can safely assume he doesn't know lies.

Also, bear in mind you need a somewhat efficient comparison, for list of size M and N, you probably wouldn't want M*N comparisons (and you don't need to).
I suppose for checking a passenger list it's not likely to be a problem (at most 600 passengers), but in general if both of you have lists in excess of a million secrets and want to know what subset you share (so you can gossip about it), it starts to take a lot of time to compare every pair seperatedly.

Title: Re: Sharing secrets
Post by towr on Jan 26th, 2006, 9:18am
Oh, and one thing I always forget about in the cryptography problems, the eavesdropper.
You don't want your list to fall in her hands so she can find out what secrets you and she share. (At least not beyond what all three of you share)
That's to say the communication channel isn't secure.

Title: Re: Sharing secrets
Post by Grimbal on Jan 27th, 2006, 12:25am
I don't see how you can prevent the CIA to encode the whole world population as "suspect" and wait for positives to get back to know exactly who is on the plane.  If the airline complains the list is too long, the CIA will pretend it is padded, and indeed, only with a list as long as it possibly could be there is no information leaked about the nb of peoples on the suspect list.

We must at least assume a reasonable upper limit on the list sizes.

Title: Re: Sharing secrets
Post by towr on Jan 27th, 2006, 1:32am

on 01/27/06 at 00:25:28, Grimbal wrote:
I don't see how you can prevent the CIA to encode the whole world population as "suspect" and wait for positives to get back to know exactly who is on the plane.
I think you can detect it.
f.i. put some people on your list that aren't on your plane and can't be on the no-fly list (or are extremely unlikely to be on it, in any case). If they say they do share that secret, you know they're bullsh*tting you.


Quote:
We must at least assume a reasonable upper limit on the list sizes.
Actually, as I understand it, list size doesn't matter much at all. You don't have to share much more bits of information than the size of the overlap to find out what the overlap is.

Title: Re: Sharing secrets
Post by Grimbal on Jan 27th, 2006, 10:08am
I have an idea.
[hide]
Use the passenger data to create random sequence of integers.

For each step n, both parties publish the set of all nth items of the sequences.  Throw in some extra random values for good measure.

For any number not matched on the other side, remove the secrets that generated that number.  Repeat until there is a consistent match.  (i.e. all values generated by real secrets, not the random padding, are consistently matched on the other side).

The range of the generated sequences would depend on the sizes of the lists.  It should be about twice an upper bound of expected secrets list.  THe random padding should be used to fill it up to half the possible values.
[/hide]

Title: Re: Sharing secrets
Post by towr on Jan 28th, 2006, 8:23am
Yep, that's about what I had in mind for the method. Or rather what the person writing a research paper on it in our faculty has come up with. Although he hasn't worried about the 'cheating' yet.

The first step is accomplished by the hash.
For the second step [hide]you can make use of a prefix tree build from the binary representation of the hashes[/hide]


Title: Re: Sharing secrets
Post by Grimbal on Jan 29th, 2006, 4:36pm
You mean, my solution is worth being published?  8)

Title: Re: Sharing secrets
Post by towr on Jan 30th, 2006, 1:00am
You'd have to work it out further of course. You have to turn it into a complete protocol first, then show it works, and examine its behaviour: average run time, security, etc.

What's the expected size of the prefix tree with N items? And what is the expected size of the overlap between two such trees with M and N secrets respectively?

Title: Re: Sharing secrets
Post by Barukh on Jan 30th, 2006, 1:34am
First, I thought this is the case for Zero-Knowledge Proofs (http://en.wikipedia.org/wiki/Zero-knowledge_proof). However, they are not too practical...

Grimbal, could you please explain your idea in more details? I didn't quite get it. Does every passenger data generates a random number? random sequence?

Title: Re: Sharing secrets
Post by Grimbal on Jan 30th, 2006, 9:54am
Both sides agree on a formula to generate a pseudo-random sequence of integers from a person's id.  That means for each person p, a sequence R[p,n], n=1,2,3,... is defined.

For each step n, n=1,2,..., both sides build the set of collecting the R[p,n], for all p over their secret list.  They discard duplicates and publish the resulting set ordered by value.

Then they compare the sets.  If one side includes a number R[p,n] in its set that the other side does not include, it means the person p is  not in the other side's list.  It can be discarded.

Here the number of possible values for R[p,n] is critical.  It must be large enough that the random values don't cover all possible values, but small enough so that a single value is still shared by many people outside of the secret lists.

The process is repeated until eliminations don't happen any more.

Title: Re: Sharing secrets
Post by towr on Jan 30th, 2006, 10:40am
It might be interesting to have several (one-way) hashes avaliable for calculating R[p,n], and alternately each side choses a hash that will provide a nice distribution of R[p,n] over his set.



Powered by YaBB 1 Gold - SP 1.4!
Forum software copyright © 2000-2004 Yet another Bulletin Board