|
||
Title: Efficient Datastructure Post by techforn10 on Jul 7th, 2007, 10:00am Company A serves millions of requests per day. Give a efficient method to return two character string country code given IP address as input. |
||
Title: Re: Efficient Datastructure Post by towr on Jul 7th, 2007, 11:16am Well, presumedly the input and output is known in advance, so you can spend quite some power to find the most efficient mapping. I'd be hardpressed to say a priori what that mapping will be. |
||
Title: Re: Efficient Datastructure Post by aki_scorpion on Jul 7th, 2007, 11:28am hmmm If the input and output are known is advance and if they are limited why can we use 2 hash functions for every ip so that query will take only o(1) time. |
||
Title: Re: Efficient Datastructure Post by towr on Jul 7th, 2007, 11:50am The range of ip numbers is limited, that much is a given. But O(1) doesn't say much in itself; if each request takes a second it's too long, even if it's a constant amount of time. We have some 2^32 IPv4 addresses, and some 200 countries. Hmm, I wonder if there's a list somewhere, so we can actually do it, rather than only try to figure out an approach. |
||
Title: Re: Efficient Datastructure Post by Grimbal on Jul 7th, 2007, 4:00pm I think IP addresses are attributed to countries in ranges. Storing each IP address would be overkill. Maybe a Btree would be appropriate? |
||
Title: Re: Efficient Datastructure Post by techforn10 on Jul 8th, 2007, 9:16am How about using Trie?? Vague idea: I was thinking of building a Trie of network part of IP addresses separated by 2 letter country code. So on query just traverse through the Trie matching network part of octets then just return the data in leaf node which contains country code. |
||
Title: Re: Efficient Datastructure Post by towr on Jul 8th, 2007, 10:22am The (free) geolite database from maxmind contains just under 100000 ip-range - country pairs. And boasts 98% accuracy. (Their commercial database has over 99% accuracy) So, that's about 500 ranges per country. A tree of some sort would seem an obvious solution. But I wonder if it can't be done with fewer costly hops through memory. |
||
Title: Re: Efficient Datastructure Post by mad on Mar 9th, 2008, 3:44am But, using a trie to store it as digits from 0-9 rather than binary tree might be efficient. Or can we store a tree of intervals and check if a given IP address falls in a given range/interval and output the corresponding country. |
||
Title: Re: Efficient Datastructure Post by Grimbal on Mar 12th, 2008, 10:47am on 03/09/08 at 03:44:18, mad wrote:
IP addresses being 4 bytes, you should rather go 256-way at each level, or maybe 16-way. |
||
Title: Re: Efficient Datastructure Post by Hippo on Mar 12th, 2008, 1:18pm on 07/07/07 at 11:28:37, aki_scorpion wrote:
I should comment this ... there is a big difference between O(1) and o(1) ... There is no algorithm working in o(1) time. (At least you cannot call such algorithm.) |
||
Powered by YaBB 1 Gold - SP 1.4! Forum software copyright © 2000-2004 Yet another Bulletin Board |