Author |
Topic: Closest to the median (Read 2817 times) |
|
gotit
Uberpuzzler
Gender:
Posts: 804
|
|
Closest to the median
« on: Aug 20th, 2007, 1:59am » |
Quote Modify
|
We are given an array containing n integers and a number k.We have to find the k numbers in the array that are closest to the median of the array. Can anyone suggest an algo with O(n) time complexity When i say closest,I mean closest by value, not by position.
|
|
IP Logged |
All signatures are false.
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Closest to the median
« Reply #1 on: Aug 20th, 2007, 3:18am » |
Quote Modify
|
I think to start you'd want to find the median, which can be done in O(n) (either stochastically, which is typically fastest, I think; or guaranteed linear using the median of medians approach) For the next step you could e.g. think of a tree with k nodes (if k log k < n, then it's still linear); if there are less than k nodes in it, add the next node, if there are k nodes, replace the largest or smallest when appropriate by a new one. Partial sorting is also an option.
|
|
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
TenaliRaman
Uberpuzzler
I am no special. I am only passionately curious.
Gender:
Posts: 1001
|
|
Re: Closest to the median
« Reply #2 on: Aug 20th, 2007, 4:27am » |
Quote Modify
|
on Aug 20th, 2007, 3:18am, towr wrote:Partial sorting is also an option. |
| Yeah, i would suggest doing quick sort on the median and then choosing only the halves which are closer the median everytime during sort. -- AI
|
|
IP Logged |
Self discovery comes when a man measures himself against an obstacle - Antoine de Saint Exupery
|
|
|
Skandh
Junior Member
Gender:
Posts: 77
|
|
Re: Closest to the median
« Reply #3 on: Aug 20th, 2007, 7:09am » |
Quote Modify
|
on Aug 20th, 2007, 4:27am, TenaliRaman wrote: Yeah, i would suggest doing quick sort on the median and then choosing only the halves which are closer the median everytime during sort. -- AI |
| By performing quick sort type of algo ( known as quick select, to find median) , we can find median, but what after that, because k closest numbers can be on both side of the median.
|
|
IP Logged |
I wanna pull by legs!!!
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Closest to the median
« Reply #4 on: Aug 20th, 2007, 7:14am » |
Quote Modify
|
on Aug 20th, 2007, 7:09am, Skandh wrote:By performing quick sort type of algo ( known as quick select, to find median) , we can find median, but what after that, because k closest numbers can be on both side of the median. |
| You could get the k closest smaller than the median and the k closest larger; then somehow sort out these 2k numbers further. (And if O(k log k) < O(n), you could just sort them without losing linearity of the overal algorithm. Once properly sorted it's easy enough to find the k closest ones.)
|
|
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
Grimbal
wu::riddles Moderator Uberpuzzler
Gender:
Posts: 7527
|
|
Re: Closest to the median
« Reply #5 on: Aug 20th, 2007, 8:46am » |
Quote Modify
|
Assuming n tends to infininty and k is constant, sorting k elements is O(1). Maybe you could find the n/2+k largest elements using some divide and conquer scheme that should be O(n) on average. Then, get the 2k smallest elements from that. Then, finding the median of the 2k central elements and identifying the k closest items to it is O(1).
|
|
IP Logged |
|
|
|
KWillets
Junior Member
Posts: 84
|
|
Re: Closest to the median
« Reply #6 on: Aug 20th, 2007, 10:45am » |
Quote Modify
|
on Aug 20th, 2007, 7:09am, Skandh wrote: By performing quick sort type of algo ( known as quick select, to find median) , we can find median, but what after that, because k closest numbers can be on both side of the median. |
| Do k-smallest on the absolute value of the difference from the median.
|
|
IP Logged |
|
|
|
gotit
Uberpuzzler
Gender:
Posts: 804
|
|
Re: Closest to the median
« Reply #7 on: Aug 20th, 2007, 11:42am » |
Quote Modify
|
Regarding towr's first solution of creating a k-node tree, I think he is considering of putting the actual elements of the array in the tree.What if I put the abs value of the difference of the elements and the median.In that case,we need only to compare the new elements with the largest in the tree and replace the largest with it if it is smaller. Another question:How does the process of inserting in the k-node tree take O(klogk) time?I think it takes O(nlogk) time(in the worst case when all the n-k farthest elemets in the array occur before all the k closest elements).
|
|
IP Logged |
All signatures are false.
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Closest to the median
« Reply #8 on: Aug 20th, 2007, 1:07pm » |
Quote Modify
|
on Aug 20th, 2007, 11:42am, gotit wrote:Regarding towr's first solution of creating a k-node tree, I think he is considering of putting the actual elements of the array in the tree.What if I put the abs value of the difference of the elements and the median.In that case,we need only to compare the new elements with the largest in the tree and replace the largest with it if it is smaller. |
| Yeah, that's better. I really should have thought of using the absolute difference before KWillets mentioned it in the post before yours, that makes things so much simpler; and it should have been obvious. Quote:Another question:How does the process of inserting in the k-node tree take O(klogk) time? I think it takes O(nlogk) time(in the worst case when all the n-k farthest elements in the array occur before all the k closest elements). |
| Yup, right again; so I guess that was a really bad suggestion on my part The obvious give away is that it should be at least O(n) because you have to consider every element in the array.
|
|
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
wangzhen
Newbie
Posts: 21
|
|
Re: Closest to the median
« Reply #9 on: Sep 6th, 2007, 3:07am » |
Quote Modify
|
My Method: 1st step: Delete the top (n-k)/2 smallest data. 2st step: Delete the top (n-k)/2 largest data. Take the top (n-k)/2 smallest data for example: Select a data (as pivot similar with quick sort does) randomly. Do the partition as quick sort does to let the left of pivot are all smaller than it, and the right are all larger than it. This operation are O(n). Denote p as the current subscript (or postion) of the pivot. Then, (p-1) means the number of the left, and (n-p) means the number of the right. If the (p-1)>(n-k)/2, Then Delete (n-k)/2 smallest in the left (p-1). This is a process of recursion. else if (p-1) == (n-k)/2, Then delete the left part and return. else , Then Delete the left part and pivot, and Delete (n-k)/2 - p smallest in the right part. This is a process of recursion. Calculate the complexityWhich we define F(n)) F(n) = F(n/2) + n; Then F(n) = F(n/4) + n/2 +n = ... = n+ n/2 +n/4 +... We can know that the result is smaller than 2*n, So complexity is O(n). Similarly, We Delete the Largest (n-k)/2 in the remaining array, and eventually the final data are the answer. Then the average complexity are O(n). But as similar as Quick sort, this method have a complexity of O(n^2) in the most bad situation.
|
|
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Closest to the median
« Reply #10 on: Sep 6th, 2007, 5:26am » |
Quote Modify
|
on Sep 6th, 2007, 3:07am, wangzhen wrote:My Method: 1st step: Delete the top (n-k)/2 smallest data. 2st step: Delete the top (n-k)/2 largest data. |
| That might not leave the k elements closest to the median. Consider for example the array [1,2,3,6,7] with k=3. 3 is the median, and {1,2,3} are the three closest elements; your method however would give {2,3,6}.
|
|
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
wangzhen
Newbie
Posts: 21
|
|
Re: Closest to the median
« Reply #11 on: Sep 6th, 2007, 8:03pm » |
Quote Modify
|
Thanks for towr's remind. I have misunderstood it. Then, I also have a new method: suppose the array are a[]. Firstly, find the median in the array, I think this is O(n) time complexity. Then, new an array b[] to store the values: for(i=0; i<n; ++i) b[i] = abs(a[i]-median) Then in the array b[i], find the top k smallest (O(n)), and output them after adding median. So, this method is also O(n). on Sep 6th, 2007, 5:26am, towr wrote: That might not leave the k elements closest to the median. Consider for example the array [1,2,3,6,7] with k=3. 3 is the median, and {1,2,3} are the three closest elements; your method however would give {2,3,6}. |
|
|
|
IP Logged |
|
|
|
wangzhen
Newbie
Posts: 21
|
|
Re: Closest to the median
« Reply #12 on: Sep 6th, 2007, 8:07pm » |
Quote Modify
|
Sorry, in the end, not adding the median, because it is abs(). should remember the each b[i] 's origin and recover the data.
|
|
IP Logged |
|
|
|
|