|
||||||
Title: Real-space Hash Functions Post by RandomSam on Oct 4th, 2007, 4:39pm A hash function is a many-to-one mapping that is s commonly used to reduce the amount of information you have to handle, so that large sets of data can quickly be compared. Given an arbitrarily large integer, it will return a manageably small positive integer. Inputting the same integer into the function always yields the same output, but inputting a slightly different integer (e.g. a consecutive number) will yield a completely different output. My question: could a hash function be devised that had an infinitely large domain? What about an uncountable domain, e.g. accepting any real number as its input? It would need to have a defined output but an undefined derivative at all points; is that possible? (My apologies if I am posting this in a silly place - let me know if I should repost elsewhere. I almost put it in the 0.999... thread as this would have applications there!) |
||||||
Title: Re: Real-space Hash Functions Post by Aryabhatta on Oct 4th, 2007, 4:59pm So you want a function from Real numbers to integers such that it is one to one? (i.e. there are no collisions) |
||||||
Title: Re: Real-space Hash Functions Post by towr on Oct 5th, 2007, 12:38am on 10/04/07 at 16:39:34, RandomSam wrote:
Quote:
Quote:
You could think along the lines of http://mathworld.wolfram.com/DirichletFunction.html |
||||||
Title: Re: Real-space Hash Functions Post by RandomSam on Oct 5th, 2007, 10:32am on 10/04/07 at 16:59:32, Aryabhatta wrote:
on 10/05/07 at 00:38:20, towr wrote:
on 10/05/07 at 00:38:20, towr wrote:
|
||||||
Title: Re: Real-space Hash Functions Post by towr on Oct 5th, 2007, 11:01am on 10/05/07 at 10:32:48, RandomSam wrote:
Besides, between any two distinct irrational numbers, you can find a rational one. Along the lines of: given irrational a<b take s=floor(log2(b-a)) then floor(b/2s)2s should be rational, and between a and b.. maybe. |
||||||
Title: Re: Real-space Hash Functions Post by Aryabhatta on Oct 5th, 2007, 1:31pm on 10/05/07 at 10:32:48, RandomSam wrote:
Well, if we take any function from Reals to Integers, there will be some integer to which uncountably many reals map (try proving it). It might be pretty hard to resolve the collision then... Interesting question though, probably research-worthy. |
||||||
Title: Re: Real-space Hash Functions Post by SMQ on Oct 7th, 2007, 6:13am on 10/05/07 at 13:31:46, Aryabhatta wrote:
Informally: [hide]assume the opposite -- that no integer has an uncountable infinity of real values mapped to it. But a countable infinity (the integers) of countable infinities (the reals mapped to a given integer) is countable, a contradiction. Therefore the assumption is false.[/hide] --SMQ |
||||||
Title: Re: Real-space Hash Functions Post by towr on Oct 7th, 2007, 7:17am on 10/05/07 at 13:31:46, Aryabhatta wrote:
If we take N integers (0..N-1), you could consider mapping a real R to floor(N*[R]), where [R] is the fractional part of R. The downside is that's it's a bit predictable. And it depends on the probability distribution of the reals you expect to deal with whether that's a problem. Although obviously there will need to be some pattern to the mapping, preferably it'd be chaotic (which also implies that a small difference between two reals means they are very likely to end up further apart). So perhaps something like floor(N*[PR]), for some P, would be more suitable (exponentiation and modulo are common tricks to promote chaotic behaviour in functions. It's also used in pseudo random generators) |
||||||
Title: Re: Real-space Hash Functions Post by RandomSam on Oct 7th, 2007, 11:05am on 10/07/07 at 07:17:05, towr wrote:
on 10/07/07 at 07:17:05, towr wrote:
|
||||||
Title: Re: Real-space Hash Functions Post by towr on Oct 7th, 2007, 11:25am on 10/07/07 at 11:05:20, RandomSam wrote:
Quote:
|
||||||
Title: Re: Real-space Hash Functions Post by RandomSam on Oct 7th, 2007, 2:10pm on 10/07/07 at 11:25:49, towr wrote:
EDIT: I think I can prove that the hash function must be discontinuous everywhere. If the hash function's domain is countable, then its derivative must be 0 at all points where the function is continuous, and the function can't distinguish between almost-identical numbers. In other words, it has a finite-precision input. We might as well round to the nearest <insert small number> and use a conventional hash function. If the function's domain isn't countable, then it must output an infinite amount of data. We can't store or compare that much data, so the function is of little use. Any subsequent truncating of the data means we have a countable function, and so finite precision. I think that proves that a hash function won't work unless it's discontinuous everywhere. |
||||||
Title: Re: Real-space Hash Functions Post by towr on Oct 7th, 2007, 2:35pm on 10/07/07 at 14:10:01, RandomSam wrote:
Quote:
|
||||||
Title: Re: Real-space Hash Functions Post by rmsgrey on Oct 8th, 2007, 10:47am How precise is the real you're starting with anyway? |
||||||
Title: Re: Real-space Hash Functions Post by RandomSam on Oct 8th, 2007, 4:24pm on 10/08/07 at 10:47:37, rmsgrey wrote:
An exact expression that is known to be in its simplest form could be typed as ASCII text and hashed, although I'm not sure how to check if a simpler form exists: h("0.5") won't equal h("1/2"). What I can't seem to devise is a function with a real-valued domain and a countable number of output states, for which it is computationally difficult to discover collisions. |
||||||
Title: Re: Real-space Hash Functions Post by towr on Oct 9th, 2007, 1:01am on 10/08/07 at 16:24:54, RandomSam wrote:
(Besides, I don't think this problem is any less for mapping integers to integers if all parameters of the function are known) |
||||||
Title: Re: Real-space Hash Functions Post by RandomSam on Oct 13th, 2007, 11:40am I've found a silly function whose output is only defined for irrational input. http://www.mr.ethz.ch/free_cgi-bin/formula.gif?expr=%5Cmbox%7Bsilly%7D%28x%29%20%3D%20%5Cmathop%7B%5Csum%7D_%7Bn%3D0%7D%5E%7B%5Cinfty%7D%7B%5Cfrac%7B2%5E%7B-n%7D%7D%7B%5Csin%5E%7B2%7D%28%5Cpi%7B%7Dn%21x%29%7D%7D&size=300 The ideal function seems to be non-trivial to reverse (e.g. I don't know how to find an x such that a < silly(x) < b, except by brute force) so if I round the output to a finite precision, I'll have an ideal hash function. In practice it's not possible to do the infinite sum, and the approximation using the first few terms has a well-defined slope almost everywhere. This means it's reversible to arbitrary accuracy using numerical methods, so if I round the output, collisions can easily be produced. EDIT: Found another function with similar properties but a finite expectation, which should give a stronger guarantee of convergence, although I'm not exactly sure what that means in this context. http://www.mr.ethz.ch/free_cgi-bin/formula.gif?expr=%5Cmbox%7Bsillier%7D%28x%29%20%3D%20%5Cmathop%7B%5Csum%7D_%7Bn%3D0%7D%5E%7B%5Cinfty%7D%7B%5Cfrac%7B2%5E%7B-n%7D%7D%7B%5Csqrt%7B%5Bn%21x%5D%7D%7D%7D&size=300 I think that's as far as I can go with this idea! |
||||||
Powered by YaBB 1 Gold - SP 1.4! Forum software copyright © 2000-2004 Yet another Bulletin Board |