Author |
Topic: The hardest interview question I met so far (Read 37839 times) |
|
Frank
Guest
|
|
The hardest interview question I met so far
« on: Nov 16th, 2005, 9:22pm » |
Quote Modify
Remove
|
Given two sorted postive integer arrays A[n] and B[n] (W.L.O.G, let's say they are decreasingly sorted), we define a set S = {(a,b) | a \in A and b \in B}. Obviously there are n^2 elements in S. The value of such a pair is defined as Val(a,b) = a + b. Now we want to get the n pairs from S with largest values. The tricky part is that we need an O(n) algorithm. I got this question from Google onsite interview. I finally got an O(n \log n) algorithm with hints. Any ideas? Thanks !
|
|
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: The hardest interview question I met so far
« Reply #1 on: Nov 17th, 2005, 3:14am » |
Quote Modify
|
I can see how to get an O(n log n) algorithm, but a step to O(n) still refuses to show itself. O(n log n): hidden: | The largest value will be (A[0],B[0]). Given the M largest values, the M+1st largest value will be a (4-connected) neighbour of one of the larger values. So use a balanced binary tree initialized with just (A[0], B[0]), then repeatedly extract the pair with the largest value, and insert the two neighbours (except when duplicate). There will always be at most two new neighbours, so the size of the BBT is at most 2n (and in fact generally much less) | There are probably more efficient ways though.
|
« Last Edit: Nov 17th, 2005, 3:19am by towr » |
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
Jelani Nelson
Guest
|
|
Re: The hardest interview question I met so far
« Reply #2 on: Nov 17th, 2005, 4:42pm » |
Quote Modify
Remove
|
I can do this in randomized linear time, using something very similar to randomized selection of the kth largest element in an array. At each step, I pick a random pair from what's left, and partition around it. The key is that the partioning can be done in linear time if you start at the last index of A and work your way to the front (you don't explicitly write the pairs down since that would take too much time -- you just keep for each index i in A the index j in B such that (A[i], B[k]) is in the same partition for all k<=j). You know that if (A[i], B[j]) is in the partition of bigger elements, then so is (A[k],B[j]) for k < i. So, you know when to stop working at A[i] decrement i. There is the issue of, what do you do in later depths of the recursion when you pick a random index out of what's left. You don't know exactly which pair that gives you to partition around since the pairs are not written explicitly. That's fine though -- you can just binary search through the indices of A to figure out the A[i] that goes into the partition pair's first entry. From there it's easy to figure out the B[j]. Sorry...I came up with this sort of quickly and didn't have time to organize my thoughts well and write things down in an easy-to-read way. I hope what I said above makes sense.
|
|
IP Logged |
|
|
|
KWillets
Junior Member
Posts: 84
|
|
Re: The hardest interview question I met so far
« Reply #3 on: Nov 20th, 2005, 1:01pm » |
Quote Modify
|
The lists are ordered, so it's more like ordered list merge. hidden: | Take each list A, B and difference-encode it into arrays DA and DB, so that DA[i] = A[i] - A[i-1] (skip 0). Start with (A[0], B[0]) as the max. Let (i,j) = (0,0). Now iteratively pick the min(DA[i+1], DB[j+1]), incrementing i or j correspondingly on each iteration. The sequence of (i,j) pairs produced is the correct order. For the top n pairs, this is O(n). For more than n, the loop has to be modified to wrap around when i or j reaches n, incrementing the non-wrapped index. And, of course, DA and DB aren't necessary, but they clarify things a bit. |
|
|
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: The hardest interview question I met so far
« Reply #4 on: Nov 20th, 2005, 2:47pm » |
Quote Modify
|
I can't see how that works if always i or j are incremented, but never decremented. Not all elements you want are along a nondecreasing line..
|
|
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
KWillets
Junior Member
Posts: 84
|
|
Re: The hardest interview question I met so far
« Reply #5 on: Nov 20th, 2005, 6:56pm » |
Quote Modify
|
Yes, I see the problem.
|
|
IP Logged |
|
|
|
Maniac
Guest
|
|
Re: The hardest interview question I met so far
« Reply #6 on: Nov 23rd, 2005, 10:28am » |
Quote Modify
Remove
|
What do you mean when you say a / in A? That a is NOT in A?
|
|
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: The hardest interview question I met so far
« Reply #7 on: Nov 23rd, 2005, 2:06pm » |
Quote Modify
|
on Nov 23rd, 2005, 10:28am, Maniac wrote:What do you mean when you say a / in A? That a is NOT in A? |
| \in is a command in the markup language LaTeX, which is often used for mathematical text. Some forums even process it to give proper symbols. So in short he just meant that a is in A. (We used to have a feature that [in] would show the right symbol, but it didn't survive the last forum update)
|
|
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
Icarus
wu::riddles Moderator Uberpuzzler
Boldly going where even angels fear to tread.
Gender:
Posts: 4863
|
|
Re: The hardest interview question I met so far
« Reply #8 on: Nov 24th, 2005, 8:44pm » |
Quote Modify
|
I occasionally use c Code: It looks like the element sign, if you squint hard enough!
|
|
IP Logged |
"Pi goes on and on and on ... And e is just as cursed. I wonder: Which is larger When their digits are reversed? " - Anonymous
|
|
|
ChunkTug
Full Member
Gender:
Posts: 166
|
|
Re: The hardest interview question I met so far
« Reply #9 on: Nov 25th, 2005, 8:35pm » |
Quote Modify
|
Sri, this is the same solution proposed by KWillets and has the same flaw. Consider A = {3,2,1,0} and B = {6,4,2,0}. Here m=1 and n=2 for each step. We want: (3,6) (2,6) (1,6) (3,4). However, by the algorithm we get: (3,6) (2,6) (1,6) (0,6).
|
« Last Edit: Nov 25th, 2005, 8:36pm by ChunkTug » |
IP Logged |
|
|
|
Frank
Guest
|
|
Re: The hardest interview question I met so far
« Reply #10 on: Nov 27th, 2005, 6:13pm » |
Quote Modify
Remove
|
I want to clear the problem a little bit. It is for sure that A[0]+B[0] is the largest one. The candidates for the 2nd largest are A[0]+B[1], A[1]+B[0]. We got two candidates. Let's suppose the above two candidates turn out to be the 2nd and 3rd largest ones (which is which does not really matter here). How many candidates we have for the 4th largest value? We have three: A[0]+B[2], A[1]+B[1], and A[2]+B[0]. Let's continue to assume that those three take the seats for the 4th, 5th, and 6th largest values. How about the 7th largest value candidates? We got FOUR now. Indeed, we cannot say who is the largest without comparison among A[0]+B[3], A[1]+B[2], A[2]+B[1], and A[3]+B[0]. .... I think the point is clear now. The number of candidates could grow linearly. (I guess towr might miss this in his post by saying "four candidates". Correct me if I misunderstood, towr.) This is the hard part of composing a linear algorithm, which means that we have to pick out the best one from O(N) candidates in constant time. Hope this helps.
|
|
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: The hardest interview question I met so far
« Reply #11 on: Nov 28th, 2005, 6:36am » |
Quote Modify
|
on Nov 27th, 2005, 6:13pm, Frank wrote:I think the point is clear now. The number of candidates could grow linearly. (I guess towr might miss this in his post by saying "four candidates". Correct me if I misunderstood, towr.) |
| Actually I limited it to two candidates being added each step, since the one to the left and above are known to be bigger, so they would have been picked already. But it's only an upper bound. You never have to consider more than N candidates, because you only have to look at the maximum element from each row (or column) that hasn't been chosen yet. One of those has to be the largest one overall.
|
« Last Edit: Nov 28th, 2005, 6:38am by towr » |
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
KWillets
Junior Member
Posts: 84
|
|
Re: The hardest interview question I met so far
« Reply #12 on: Nov 28th, 2005, 11:00am » |
Quote Modify
|
It's easy to visualize if you plot A x B on a plane, and (doing smallest first, wlog), slide the line x + y = C from C = 0 to +infinity. The AxB pairs form an irregularly spaced grid, and the line will cross them in smallest-to-largest order. The problem is that the frontier can range up to n points, so picking the next candidate seems like a greater-than-O(1) operation. IIRC the problem was to find the top n, not necessarily in order(?), so it may be a trick in placing the line with exactly n points below it.
|
|
IP Logged |
|
|
|
Jelani Nelson
Guest
|
|
Re: The hardest interview question I met so far
« Reply #13 on: Nov 29th, 2005, 5:33am » |
Quote Modify
Remove
|
Is the solution I posted flawed? I'm just curious since no one has said anything about it. : )
|
|
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: The hardest interview question I met so far
« Reply #14 on: Nov 29th, 2005, 6:00am » |
Quote Modify
|
on Nov 29th, 2005, 5:33am, Jelani Nelson wrote:Is the solution I posted flawed? I'm just curious since no one has said anything about it. : ) |
| It is a little hard to understand what you're doing. At the time I was probably hoping someone else would find the merit in it. I'll think on it some more. But I'm still not sure about how partial sorting would work here.
|
|
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
JohanC
Senior Riddler
Posts: 460
|
|
Re: The hardest interview question I met so far
« Reply #15 on: Nov 29th, 2005, 7:07am » |
Quote Modify
|
on Nov 29th, 2005, 6:00am, towr wrote: It is a little hard to understand what you're doing. |
| Both Jenali's and Kwillets' solutions are incomplete, but they seem to have the same basic ideas. If you add Kwillets' geometric view to Jenali's description, it seems to make sense. If you have a line a[i]+b[j] < c you can count the number of solutions in a time proportional to Max(i,j), which in this case moves around sqrt(2N). To me, that's the main observation made by both ("counting is much faster than generating" and "a complete sort is not asked for"). Practical proving O(N) time (or expected time) can be a little bit tricky. On the other hand, a practical implementation would know about typical sizes and typical distributions, for which it could be highly optimized. Probably Google uses this kind of algorithm to combine search results for different keywords (the same idea generalizes if you have more than 2 arrays with "page rankings"). Does this makes sense?
|
|
IP Logged |
|
|
|
KWillets
Junior Member
Posts: 84
|
|
Re: The hardest interview question I met so far
« Reply #16 on: Nov 29th, 2005, 12:40pm » |
Quote Modify
|
Jelani's idea makes some sense, but I don't see how it's all supposed to pull together. As far as I can tell it starts with picking a random (i,j) and tracing the grid from that point to find how many points are on or within the line x + y = A[i]+B[j]. Let's call that count C(i,j). That's O(n) on average (?). However C(i,j) >= (i+1)(j+1), so i = j = sqrt(n) is probably a good upper bound starting point, and there may be ways to constrain the numbers at each step.
|
|
IP Logged |
|
|
|
Jelani Nelson
Guest
|
|
Re: The hardest interview question I met so far
« Reply #17 on: Nov 30th, 2005, 7:27pm » |
Quote Modify
Remove
|
>>Let's call that count C(i,j). That's O(n) on average (?). The idea is that when you're partioning around the pivot you randomly selected, if you draw the pairs in a grid (with A's being the y axis and B's being the x axis), the points on the same side of the partition form something that looks like stairs. What I described above is a method for figuring out the shape of the stairs in O(n) time. Then using the same analysis as randomized select, you can show that the total expected running time is O(n). Does this clear things up?
|
|
IP Logged |
|
|
|
Siva Basava K
Guest
|
|
Re: The hardest interview question I met so far
« Reply #18 on: Dec 1st, 2005, 1:32am » |
Quote Modify
Remove
|
Hi Guys, Here is the sollution. print the (a[0],b[0]) it will be a sure pair. start two indices p1,p2 point to a[1] and b[1] respectively. always try to add p1+the ohter array b[0] and viceversa. pick up the max one. Do till u get all the pairs. I have the following program which does this . Any flaws in my code. And ofcourse it is O(n) #include <stdio.h> #include <stdlib.h> int main() { /* Read in the number of element in the two arrays */ int n1; int i; int * arr1, * arr2; int num; int curJ, curI, exhaustedJ, exhaustedI; fscanf(stdin, "%d", &n1); arr1 = (int *)malloc(sizeof(int) * n1 + 1); arr2 = (int *)malloc(sizeof(int) * n1 + 1); for( i = 1; i <= n1; i ++) fscanf(stdin, "%d", &arr1[i]); for( i = 1; i <= n1; i ++) fscanf(stdin, "%d", &arr2[i]); arr1[0] = -10000; arr2[0] = -10000; /* * Here goes the actual algorithm for the same. */ num = 1; curJ = 2; curI = 2; exhaustedJ = 0; exhaustedI = 0; printf("List is as follows --- \n"); printf("(%d, %d)#\n", arr1[1], arr2[1]); while(num < n1) { // Calculate the three sums. int cIcJsum = arr1[curI] + arr2[curJ]; int eIcJsum = arr1[exhaustedI + 1] + arr2[curJ]; int cIeJsum = arr1[curI] + arr2[exhaustedJ + 1]; if ((cIcJsum > eIcJsum) && (cIcJsum > cIeJsum)) { printf("(%d, %d)#\n", arr1[curI], arr2[curJ]); num ++; curJ++; curI++; } else if ((eIcJsum >= cIcJsum) && (eIcJsum >= cIeJsum)) { printf("(%d, %d)#\n", arr1[exhaustedI+1], arr2[curJ]); num ++; curJ++; } else { printf("(%d, %d)#\n", arr1[curI], arr2[exhaustedJ + 1]); num ++; curI++; } } return 0; }
|
|
IP Logged |
|
|
|
Jelani Nelson
Guest
|
|
Re: The hardest interview question I met so far
« Reply #19 on: Dec 1st, 2005, 7:13am » |
Quote Modify
Remove
|
Sorry, I meant to reply to this above but since I can't modify my posts (not registered), I have to make a new post! >>Let's call that count C(i,j). That's O(n) on average (?). No. C(i,j) is something like n^2 on average during the first iteration. In general it's expected to be half of the remaining pairs. You're picking a random pivot point, which serves as your approximate median, then partitioning all remaining points about that pivot. Then you recurse on the half that has the element that has the rank you desire (which you can tell since you know how many points went on each side of the pivot). This isn't exactly legit since there is randomization involved, but if we had an actual median in our hands each time the recurrence would look something like: T(n^2) = T(n^2 / 2) + O(n) which when doing a substitution turns into: T(m) = T(m/2) + O(sqrt(m)) The solution to the second recurrence is O(sqrt(m)), and doing backsubstitution that is O(n). Since there's randomization involved, more work is needed for analysis. But, you can use an analysis very similar to that of randomized select (see e.g. clrs).
|
|
IP Logged |
|
|
|
KWillets
Junior Member
Posts: 84
|
|
Re: The hardest interview question I met so far
« Reply #20 on: Dec 1st, 2005, 11:27am » |
Quote Modify
|
Sorry, I meant that finding C(i,j) is O(n) complexity on average, tracing the "stairs" you mention. Yes, the value is O(n^2). Sorry for the shorthand.
|
|
IP Logged |
|
|
|
fanduBoy
Guest
|
|
Re: The hardest interview question I met so far
« Reply #21 on: Dec 2nd, 2005, 4:04pm » |
Quote Modify
Remove
|
Here is a solution that works: using System; using System.Collections.Generic; using System.Text; namespace ProgrammingProblems { class HighestPairs { static int N = 20; static void Main(string[] args) { int[] A = { 21, 20, 19, 18, 17, 16, 15, 14, 13, 11, 10, 9, 7, 6, 5, 4, 3, 2, 1, 0 }; int[] B = { 20, 19, 18, 17, 16, 15, 14, 13, 12, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0 }; NSquareComplexity(A, B); Console.WriteLine(); NComplexity(A, B); } private static void NComplexity(int[] A, int[] B) { int[] AmaxB = new int[A.Length]; int[] BmaxA = new int[B.Length]; for (int i = 0; i < A.Length; i++) { AmaxB[i] = 0; } for (int i = 0; i < B.Length; i++) { BmaxA[i] = 0; } int count = 0; Pair[] pairs = new Pair[N+1]; pairs[count] = new Pair(0, 0, A[0] + B[0]); while (count < N) { count++; int i=0, maxA = 0, maxB=0, maxSoFar = 0, a=0, b=0, max=0; while (true) { if ((i < AmaxB.Length) && ((AmaxB[i] + 1)<B.Length)) a = A[i] + B[AmaxB[i] + 1]; else a = 0; if ((i < BmaxA.Length) && ((BmaxA[i] + 1)<A.Length)) b = B[i] + A[BmaxA[i] + 1]; else b = 0; max = Math.Max(a, b); if (maxSoFar <= max) { maxSoFar = max; maxA = a; maxB = b; i++; } else { if (i>0) i--; break; } } if (maxA > maxB) { AmaxB[i]++; BmaxA[AmaxB[i]] = i; pairs[count] = new Pair(i, AmaxB[i], A[i] + B[AmaxB[i]]); } else { BmaxA[i]++; AmaxB[BmaxA[i]] = i; pairs[count] = new Pair(BmaxA[i], i, A[BmaxA[i]] + B[i]); } } foreach (Pair pair in pairs) Console.WriteLine(pair.ToString()); } private static void NSquareComplexity(int[] A, int[] B) { Pair[] pairs = new Pair[A.Length * B.Length]; int counter = 0; for (int i = 0; i < A.Length; i++) { for (int j = 0; j < B.Length; j++) { pairs[counter++] = new Pair(i, j, A[i] + B[j]); } } Array.Sort(pairs); foreach(Pair pair in pairs) Console.WriteLine(pair.ToString()); } } class Pair : IComparable { int a; int b; public int sum; public Pair(int a, int b, int sum) { this.a = a; this.b = b; this.sum = sum; } public override string ToString() { return string.Format("{0}\t{1}\tSum=\t{2}", a, b, sum); } public int CompareTo(object obj) { return -this.sum.CompareTo(((Pair)obj).sum); } } }
|
|
IP Logged |
|
|
|
fanduBoy
Newbie
Posts: 1
|
|
Re: The hardest interview question I met so far
« Reply #22 on: Dec 13th, 2005, 11:57am » |
Quote Modify
|
Does the above solution look ok to everyone? Any feedback/comments?
|
|
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: The hardest interview question I met so far
« Reply #23 on: Dec 13th, 2005, 12:58pm » |
Quote Modify
|
To be honest I was hoping someone else would comment on it first, so I could get an idea what's happening without having to spend too much time on it myself. [e]I can't seem to compile it. Isn't it Java?[/e]
|
« Last Edit: Dec 13th, 2005, 1:05pm by towr » |
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
JohanC
Senior Riddler
Posts: 460
|
|
Re: The hardest interview question I met so far
« Reply #24 on: Dec 13th, 2005, 1:39pm » |
Quote Modify
|
Without extra explanation, it is hard to understand what exactly is going on. I have the impression that the internal while-loop of your "NComplexity" procedure makes it at least O(N*sqrt(N)). Do you have some proof (or estimation) about how many times that loop is executed? (The loop could be executed up till N times.) I also have the impression that your program will not find all maxima in the correct order, as it stops as soon as the subsequent MaxSoFar aren't increasing anymore. Did you test it with many different inputs?
|
|
IP Logged |
|
|
|
|