Author |
Topic: Real-space Hash Functions (Read 721 times) |
|
RandomSam
Newbie
Gender:
Posts: 20
|
|
Real-space Hash Functions
« on: Oct 4th, 2007, 4:39pm » |
Quote Modify
|
A hash function is a many-to-one mapping that is s commonly used to reduce the amount of information you have to handle, so that large sets of data can quickly be compared. Given an arbitrarily large integer, it will return a manageably small positive integer. Inputting the same integer into the function always yields the same output, but inputting a slightly different integer (e.g. a consecutive number) will yield a completely different output. My question: could a hash function be devised that had an infinitely large domain? What about an uncountable domain, e.g. accepting any real number as its input? It would need to have a defined output but an undefined derivative at all points; is that possible? (My apologies if I am posting this in a silly place - let me know if I should repost elsewhere. I almost put it in the 0.999... thread as this would have applications there!)
|
|
IP Logged |
|
|
|
Aryabhatta
Uberpuzzler
Gender:
Posts: 1321
|
|
Re: Real-space Hash Functions
« Reply #1 on: Oct 4th, 2007, 4:59pm » |
Quote Modify
|
So you want a function from Real numbers to integers such that it is one to one? (i.e. there are no collisions)
|
« Last Edit: Oct 4th, 2007, 5:01pm by Aryabhatta » |
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Real-space Hash Functions
« Reply #2 on: Oct 5th, 2007, 12:38am » |
Quote Modify
|
on Oct 4th, 2007, 4:39pm, RandomSam wrote:My question: could a hash function be devised that had an infinitely large domain? |
| Sure, take the set of integers modulo some number. Or if you really want adjacent number in the domain to end up further apart in the range, take pn mod q for some appropriate primes p,q. That should work fairly ok. Quote:What about an uncountable domain, e.g. accepting any real number as its input? |
| I think this might be a problem; I suspect that infinitely many intervals in the domain would have to end up with the same value if you want to map it to a finite set. Quote:It would need to have a defined output but an undefined derivative at all points; is that possible? |
| I think so, but that doesn't guarantee you'd get a proper hash. You could think along the lines of http://mathworld.wolfram.com/DirichletFunction.html
|
|
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
RandomSam
Newbie
Gender:
Posts: 20
|
|
Re: Real-space Hash Functions
« Reply #3 on: Oct 5th, 2007, 10:32am » |
Quote Modify
|
on Oct 4th, 2007, 4:59pm, Aryabhatta wrote:So you want a function from Real numbers to integers such that it is one to one? (i.e. there are no collisions) |
| All hash functions are many-to-one, so there must be collisions, but if the function is difficult to reverse then these collisions will take a long time to find. on Oct 5th, 2007, 12:38am, towr wrote:I suspect that infinitely many intervals in the domain would have to end up with the same value if you want to map it to a finite set. |
| I agree, but can the intervals each be infinitesimally narrow? on Oct 5th, 2007, 12:38am, towr wrote:The Dirichlet function separates the rationals from the irrationals. Given an integer hash function, I can easily enough define a rational-to-integer hash function using e.g. h(b/a):h(h(a)+b). But I'm not sure what to do with the irrational numbers: how can I be sure whether two reals are distinct if no function can distinguish between them?
|
« Last Edit: Oct 5th, 2007, 10:37am by RandomSam » |
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Real-space Hash Functions
« Reply #4 on: Oct 5th, 2007, 11:01am » |
Quote Modify
|
on Oct 5th, 2007, 10:32am, RandomSam wrote:But I'm not sure what to do with the irrational numbers: how can I be sure whether two reals are distinct if no function can distinguish between them? |
| If no function can distinguish between them, then they are not distinct. Besides, between any two distinct irrational numbers, you can find a rational one. Along the lines of: given irrational a<b take s=floor(log2(b-a)) then floor(b/2s)2s should be rational, and between a and b.. maybe.
|
|
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
Aryabhatta
Uberpuzzler
Gender:
Posts: 1321
|
|
Re: Real-space Hash Functions
« Reply #5 on: Oct 5th, 2007, 1:31pm » |
Quote Modify
|
on Oct 5th, 2007, 10:32am, RandomSam wrote: All hash functions are many-to-one, so there must be collisions, but if the function is difficult to reverse then these collisions will take a long time to find. |
| Well, if we take any function from Reals to Integers, there will be some integer to which uncountably many reals map (try proving it). It might be pretty hard to resolve the collision then... Interesting question though, probably research-worthy.
|
« Last Edit: Oct 5th, 2007, 1:32pm by Aryabhatta » |
IP Logged |
|
|
|
SMQ
wu::riddles Moderator Uberpuzzler
Gender:
Posts: 2084
|
|
Re: Real-space Hash Functions
« Reply #6 on: Oct 7th, 2007, 6:13am » |
Quote Modify
|
on Oct 5th, 2007, 1:31pm, Aryabhatta wrote: Informally: assume the opposite -- that no integer has an uncountable infinity of real values mapped to it. But a countable infinity (the integers) of countable infinities (the reals mapped to a given integer) is countable, a contradiction. Therefore the assumption is false. --SMQ
|
|
IP Logged |
--SMQ
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Real-space Hash Functions
« Reply #7 on: Oct 7th, 2007, 7:17am » |
Quote Modify
|
on Oct 5th, 2007, 1:31pm, Aryabhatta wrote:Well, if we take any function from Reals to Integers, there will be some integer to which uncountably many reals map (try proving it). It might be pretty hard to resolve the collision then... |
| You'd want the distribution of reals over the integers (or (finite) subset thereof) to be fairly uniform. So I'd say typically all integers would have a uncountable set of reals mapped to them. If we take N integers (0..N-1), you could consider mapping a real R to floor(N*[R]), where [R] is the fractional part of R. The downside is that's it's a bit predictable. And it depends on the probability distribution of the reals you expect to deal with whether that's a problem. Although obviously there will need to be some pattern to the mapping, preferably it'd be chaotic (which also implies that a small difference between two reals means they are very likely to end up further apart). So perhaps something like floor(N*[PR]), for some P, would be more suitable (exponentiation and modulo are common tricks to promote chaotic behaviour in functions. It's also used in pseudo random generators)
|
|
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
RandomSam
Newbie
Gender:
Posts: 20
|
|
Re: Real-space Hash Functions
« Reply #8 on: Oct 7th, 2007, 11:05am » |
Quote Modify
|
on Oct 7th, 2007, 7:17am, towr wrote:...perhaps something like floor(N*[PR])... |
| The problem with this is the "sawtooth" effect of the [R] operator: its derivative is 1 almost everywhere, so it won't make the hash function difficult to reverse. This attack relies on differentiation, so is not possible when only integers are allowed. on Oct 7th, 2007, 7:17am, towr wrote:...preferably it'd be chaotic... |
| A chaotic function sounds like a brilliant idea: fractals exist which have infinite intricacy. Are there any fractals which are discontinuous everywhere, other than those that classify types of numbers?
|
« Last Edit: Oct 7th, 2007, 11:07am by RandomSam » |
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Real-space Hash Functions
« Reply #9 on: Oct 7th, 2007, 11:25am » |
Quote Modify
|
on Oct 7th, 2007, 11:05am, RandomSam wrote:The problem with this is the "sawtooth" effect of the [R] operator: its derivative is 1 almost everywhere |
| N*[PR] does not have a derivative of 1 nearly everywhere, far from it. If we take P=e, then the derivative is eR nearly everywhere, which is only 1 for R=0. Quote:so it won't make the hash function difficult to reverse. |
| Ok, suppose P is e, N=100, and the function evaluates to 50, what is R? Rather a lot of options, I'd say.
|
|
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
RandomSam
Newbie
Gender:
Posts: 20
|
|
Re: Real-space Hash Functions
« Reply #10 on: Oct 7th, 2007, 2:10pm » |
Quote Modify
|
on Oct 7th, 2007, 11:25am, towr wrote:Ok, suppose P is e, N=100, and the function evaluates to 50, what is R? Rather a lot of options, I'd say. |
| There have to be a lot of options: a hash function must be many-to-one. The problem is that all those options can easily be calculated if we are in real-space, so the hash function can be reversed. In your case, the input can be ln(0.5), ln(1.5), ln(2.5)... whereas the discrete equivalent is non-trivial to reverse with modulo arithmetic, an important property for cryptography. EDIT: I think I can prove that the hash function must be discontinuous everywhere. If the hash function's domain is countable, then its derivative must be 0 at all points where the function is continuous, and the function can't distinguish between almost-identical numbers. In other words, it has a finite-precision input. We might as well round to the nearest <insert small number> and use a conventional hash function. If the function's domain isn't countable, then it must output an infinite amount of data. We can't store or compare that much data, so the function is of little use. Any subsequent truncating of the data means we have a countable function, and so finite precision. I think that proves that a hash function won't work unless it's discontinuous everywhere.
|
« Last Edit: Oct 7th, 2007, 2:38pm by RandomSam » |
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Real-space Hash Functions
« Reply #11 on: Oct 7th, 2007, 2:35pm » |
Quote Modify
|
on Oct 7th, 2007, 2:10pm, RandomSam wrote:There have to be a lot of options: a hash function must be many-to-one. The problem is that all those options can easily be calculated if we are in real-space, so the hash function can be reversed. |
| If you can't bring it back to a single value, what's the problem? Quote:whereas the discrete equivalent is non-trivial to reverse with modulo arithmetic, an important property for cryptography. |
| Meh, then just round the darn real, and throw it in the discrete equivalent.
|
|
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
rmsgrey
Uberpuzzler
Gender:
Posts: 2873
|
|
Re: Real-space Hash Functions
« Reply #12 on: Oct 8th, 2007, 10:47am » |
Quote Modify
|
How precise is the real you're starting with anyway?
|
« Last Edit: Oct 8th, 2007, 10:47am by rmsgrey » |
IP Logged |
|
|
|
RandomSam
Newbie
Gender:
Posts: 20
|
|
Re: Real-space Hash Functions
« Reply #13 on: Oct 8th, 2007, 4:24pm » |
Quote Modify
|
on Oct 8th, 2007, 10:47am, rmsgrey wrote:How precise is the real you're starting with anyway? |
| Finite precision numbers are easy - just multiply by the threshold of accuracy, round to an integer and use a normal hash function. An exact expression that is known to be in its simplest form could be typed as ASCII text and hashed, although I'm not sure how to check if a simpler form exists: h("0.5") won't equal h("1/2"). What I can't seem to devise is a function with a real-valued domain and a countable number of output states, for which it is computationally difficult to discover collisions.
|
« Last Edit: Oct 8th, 2007, 4:25pm by RandomSam » |
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Real-space Hash Functions
« Reply #14 on: Oct 9th, 2007, 1:01am » |
Quote Modify
|
on Oct 8th, 2007, 4:24pm, RandomSam wrote:What I can't seem to devise is a function with a real-valued domain and a countable number of output states, for which it is computationally difficult to discover collisions. |
| How about using a polynomial of degrees greater than 4 as an intermediate step. It would in general be difficult to find roots for it. (Besides, I don't think this problem is any less for mapping integers to integers if all parameters of the function are known)
|
|
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
RandomSam
Newbie
Gender:
Posts: 20
|
|
Re: Real-space Hash Functions
« Reply #15 on: Oct 13th, 2007, 11:40am » |
Quote Modify
|
I've found a silly function whose output is only defined for irrational input. The ideal function seems to be non-trivial to reverse (e.g. I don't know how to find an x such that a < silly(x) < b, except by brute force) so if I round the output to a finite precision, I'll have an ideal hash function. In practice it's not possible to do the infinite sum, and the approximation using the first few terms has a well-defined slope almost everywhere. This means it's reversible to arbitrary accuracy using numerical methods, so if I round the output, collisions can easily be produced. EDIT: Found another function with similar properties but a finite expectation, which should give a stronger guarantee of convergence, although I'm not exactly sure what that means in this context. I think that's as far as I can go with this idea!
|
« Last Edit: Oct 13th, 2007, 1:54pm by RandomSam » |
IP Logged |
|
|
|
|