||||||
Title: The hardest interview question I met so far Post by Frank on Nov 16th, 2005, 9:22pm Given two sorted positive integer arrays A[n] and B[n] (WLOG, let's say they are sorted in decreasing order), we define a set S = {(a,b) | a \in A and b \in B}. Obviously there are n^2 elements in S. The value of such a pair is defined as Val(a,b) = a + b. Now we want to get the n pairs from S with the largest values. The tricky part is that we need an O(n) algorithm. I got this question at a Google onsite interview. I finally got an O(n log n) algorithm with hints. ??? Any ideas? Thanks! |
||||||
Title: Re: The hardest interview question I met so far Post by towr on Nov 17th, 2005, 3:14am I can see how to get an O(n log n) algorithm, but a step to O(n) still refuses to show itself. O(n log n): [hideb]The largest value will be (A[0],B[0]). Given the M largest values, the M+1st largest value will be a (4-connected) neighbour of one of the larger values. So use a balanced binary tree initialized with just (A[0], B[0]), then repeatedly extract the pair with the largest value, and insert the two neighbours (except when duplicate). There will always be at most two new neighbours, so the size of the BBT is at most 2n (and in fact generally much less)[/hideb] There are probably more efficient ways though. |
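towr's extract-and-expand scheme can be sketched in C++ with a max-heap standing in for the balanced binary tree. This is a rough rendering, not towr's code; the function name `topNSums` and the `seen` set are my choices. Both arrays are assumed sorted in decreasing order.

```cpp
#include <queue>
#include <set>
#include <utility>
#include <vector>

// Return the n largest pair-sums from A x B, with A and B sorted
// in decreasing order. Start from (0,0); each pop of the current
// maximum pushes its two (4-connected) neighbours, skipping any
// pair already enqueued, so the heap stays O(n) in size.
std::vector<int> topNSums(const std::vector<int>& A, const std::vector<int>& B) {
    const int n = static_cast<int>(A.size());
    using Entry = std::pair<int, std::pair<int, int>>;  // (sum, (i, j))
    std::priority_queue<Entry> heap;                    // max-heap on sum
    std::set<std::pair<int, int>> seen;                 // dedupe neighbours
    heap.push({A[0] + B[0], {0, 0}});
    seen.insert({0, 0});
    std::vector<int> result;
    while (static_cast<int>(result.size()) < n) {
        Entry top = heap.top();
        heap.pop();
        int sum = top.first, i = top.second.first, j = top.second.second;
        result.push_back(sum);
        // Expand the two neighbours; at most 2 pushes per pop.
        if (i + 1 < n && seen.insert({i + 1, j}).second)
            heap.push({A[i + 1] + B[j], {i + 1, j}});
        if (j + 1 < n && seen.insert({i, j + 1}).second)
            heap.push({A[i] + B[j + 1], {i, j + 1}});
    }
    return result;
}
```

Each of the n extractions costs O(log n) heap and set work, giving the O(n log n) bound described above.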
||||||
Title: Re: The hardest interview question I met so far Post by Jelani Nelson on Nov 17th, 2005, 4:42pm I can do this in randomized linear time, using something very similar to randomized selection of the kth largest element in an array. At each step, I pick a random pair from what's left and partition around it. The key is that the partitioning can be done in linear time if you start at the last index of A and work your way to the front (you don't explicitly write the pairs down, since that would take too much time -- you just keep, for each index i in A, the index j in B such that (A[i], B[k]) is in the same partition for all k <= j). You know that if (A[i], B[j]) is in the partition of bigger elements, then so is (A[k], B[j]) for k < i. So you know when to stop working at A[i] and decrement i. There is the issue of what to do at later depths of the recursion when you pick a random index out of what's left: you don't know exactly which pair that gives you to partition around, since the pairs are not written down explicitly. That's fine though -- you can just binary search through the indices of A to figure out the A[i] that goes into the partition pair's first entry. From there it's easy to figure out the B[j]. Sorry... I came up with this sort of quickly and didn't have time to organize my thoughts well and write things down in an easy-to-read way. I hope what I said above makes sense. |
||||||
Title: Re: The hardest interview question I met so far Post by KWillets on Nov 20th, 2005, 1:01pm The lists are ordered, so it's more like ordered list merge. [hideb] Take each list A, B and difference-encode it into arrays DA and DB, so that DA[i] = A[i] - A[i-1] (skip 0). Start with (A[0], B[0]) as the max. Let (i,j) = (0,0). Now iteratively pick the min(DA[i+1], DB[j+1]), incrementing i or j correspondingly on each iteration. The sequence of (i,j) pairs produced is the correct order. For the top n pairs, this is O(n). For more than n, the loop has to be modified to wrap around when i or j reaches n, incrementing the non-wrapped index. And, of course, DA and DB aren't necessary, but they clarify things a bit. [/hideb] |
||||||
Title: Re: The hardest interview question I met so far Post by towr on Nov 20th, 2005, 2:47pm I can't see how that works if i or j is always incremented, but never decremented. Not all the pairs you want lie along a nondecreasing path. |
||||||
Title: Re: The hardest interview question I met so far Post by KWillets on Nov 20th, 2005, 6:56pm Yes, I see the problem. |
||||||
Title: Re: The hardest interview question I met so far Post by Maniac on Nov 23rd, 2005, 10:28am What do you mean when you say a / in A? That a is NOT in A? |
||||||
Title: Re: The hardest interview question I met so far Post by towr on Nov 23rd, 2005, 2:06pm on 11/23/05 at 10:28:15, Maniac wrote:
So in short he just meant that a is in A. (We used to have a feature that [in] would show the right symbol, but it didn't survive the last forum update) |
||||||
Title: Re: The hardest interview question I met so far Post by Icarus on Nov 24th, 2005, 8:44pm I occasionally use Code:
It looks like the element sign, if you squint hard enough! |
||||||
Title: Re: The hardest interview question I met so far Post by ChunkTug on Nov 25th, 2005, 8:35pm Sri, this is the same solution proposed by KWillets and has the same flaw. Consider A = {3,2,1,0} and B = {6,4,2,0}. Here m=1 and n=2 for each step. We want: (3,6) (2,6) (1,6) (3,4). However, by the algorithm we get: (3,6) (2,6) (1,6) (0,6). |
||||||
Title: Re: The hardest interview question I met so far Post by Frank on Nov 27th, 2005, 6:13pm I want to clarify the problem a little bit. It is certain that A[0]+B[0] is the largest sum. The candidates for the 2nd largest are A[0]+B[1] and A[1]+B[0], so we have two candidates. Let's suppose those two candidates turn out to be the 2nd and 3rd largest ones (which is which does not really matter here). How many candidates do we have for the 4th largest value? We have three: A[0]+B[2], A[1]+B[1], and A[2]+B[0]. Let's continue to assume that those three take the seats for the 4th, 5th, and 6th largest values. How about candidates for the 7th largest value? We have FOUR now. Indeed, we cannot say which is the largest without comparing A[0]+B[3], A[1]+B[2], A[2]+B[1], and A[3]+B[0]. .... I think the point is clear now: the number of candidates can grow linearly. (I guess towr might have missed this in his post by saying "four candidates". Correct me if I misunderstood, towr.) This is the hard part of composing a linear algorithm: we have to pick out the best of O(N) candidates in constant time. Hope this helps. |
||||||
Title: Re: The hardest interview question I met so far Post by towr on Nov 28th, 2005, 6:36am on 11/27/05 at 18:13:50, Frank wrote:
But it's only an upper bound. You never have to consider more than N candidates, because you only have to look at the maximum element from each row (or column) that hasn't been chosen yet. One of those has to be the largest one overall. |
||||||
Title: Re: The hardest interview question I met so far Post by KWillets on Nov 28th, 2005, 11:00am It's easy to visualize if you plot A x B on a plane, and (doing smallest first, wlog), slide the line x + y = C from C = 0 to +infinity. The AxB pairs form an irregularly spaced grid, and the line will cross them in smallest-to-largest order. The problem is that the frontier can range up to n points, so picking the next candidate seems like a greater-than-O(1) operation. IIRC the problem was to find the top n, not necessarily in order(?), so it may be a trick in placing the line with exactly n points below it. |
||||||
Title: Re: The hardest interview question I met so far Post by Jelani Nelson on Nov 29th, 2005, 5:33am Is the solution I posted flawed? I'm just curious since no one has said anything about it. : ) |
||||||
Title: Re: The hardest interview question I met so far Post by towr on Nov 29th, 2005, 6:00am on 11/29/05 at 05:33:41, Jelani Nelson wrote:
At the time I was probably hoping someone else would find the merit in it. I'll think on it some more. But I'm still not sure about how partial sorting would work here. |
||||||
Title: Re: The hardest interview question I met so far Post by JohanC on Nov 29th, 2005, 7:07am on 11/29/05 at 06:00:52, towr wrote:
Both Jelani's and KWillets' solutions are incomplete, but they seem to share the same basic ideas. If you add KWillets' geometric view to Jelani's description, it seems to make sense. If you have a line a[i]+b[j] < c, you can count the number of solutions in time proportional to Max(i,j), which in this case moves around sqrt(2N). To me, that's the main observation both of them made ("counting is much faster than generating" and "a complete sort is not asked for"). Proving O(N) time (or expected time) in practice can be a little bit tricky. On the other hand, a practical implementation would know about typical sizes and typical distributions, for which it could be highly optimized. Probably Google uses this kind of algorithm to combine search results for different keywords (the same idea generalizes if you have more than 2 arrays with "page rankings"). Does this make sense? |
||||||
Title: Re: The hardest interview question I met so far Post by KWillets on Nov 29th, 2005, 12:40pm Jelani's idea makes some sense, but I don't see how it's all supposed to pull together. As far as I can tell it starts with picking a random (i,j) and tracing the grid from that point to find how many points are on or within the line x + y = A[i]+B[j]. Let's call that count C(i,j). That's O(n) on average (?). However C(i,j) >= (i+1)(j+1), so i = j = sqrt(n) is probably a good upper bound starting point, and there may be ways to constrain the numbers at each step. |
||||||
Title: Re: The hardest interview question I met so far Post by Jelani Nelson on Nov 30th, 2005, 7:27pm >>Let's call that count C(i,j). That's O(n) on average (?). The idea is that when you're partitioning around the pivot you randomly selected, if you draw the pairs in a grid (with A's being the y axis and B's being the x axis), the points on the same side of the partition form something that looks like stairs. What I described above is a method for figuring out the shape of the stairs in O(n) time. Then, using the same analysis as randomized select, you can show that the total expected running time is O(n). Does this clear things up? |
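The staircase-tracing step can be sketched as a two-pointer scan. This is a hedged reconstruction, not Jelani's actual code, and `countAtLeast` is my name: given a pivot value v, it counts the pairs with sum at least v in O(n), walking from the smallest element of A upward while the threshold into B only advances.

```cpp
#include <vector>

// Count pairs (i, j) with A[i] + B[j] >= v, for A and B sorted in
// decreasing order. Walking i from the last index of A toward the front,
// the number k of leading B-entries that pair with A[i] only grows,
// so the partition's "stairs" shape is traced in O(n) total work.
long long countAtLeast(const std::vector<int>& A,
                       const std::vector<int>& B, int v) {
    long long count = 0;
    std::size_t k = 0;
    for (std::size_t i = A.size(); i-- > 0; ) {        // smallest A[i] first
        while (k < B.size() && A[i] + B[k] >= v) ++k;  // k never moves back
        count += k;                                    // B[0..k-1] all qualify
    }
    return count;
}
```

Since A[i] only grows as the loop proceeds, every pair counted at an earlier (smaller) A[i] still qualifies, which is why k never needs to retreat.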
||||||
Title: Re: The hardest interview question I met so far Post by Siva Basava K on Dec 1st, 2005, 1:32am Hi guys, here is the solution. Print (a[0], b[0]); it is a sure pair. Start two indices p1, p2 pointing to a[1] and b[1] respectively. Always try adding p1 to the other array's b[0], and vice versa, and pick the larger. Do this till you get all the pairs. I have the following program which does this :). Any flaws in my code? And of course it is O(n).

#include <stdio.h>
#include <stdlib.h>

int main() {
    /* Read in the number of elements in the two arrays */
    int n1;
    int i;
    int *arr1, *arr2;
    int num;
    int curJ, curI, exhaustedJ, exhaustedI;

    fscanf(stdin, "%d", &n1);
    arr1 = (int *)malloc(sizeof(int) * (n1 + 1));
    arr2 = (int *)malloc(sizeof(int) * (n1 + 1));
    for (i = 1; i <= n1; i++)
        fscanf(stdin, "%d", &arr1[i]);
    for (i = 1; i <= n1; i++)
        fscanf(stdin, "%d", &arr2[i]);
    arr1[0] = -10000;
    arr2[0] = -10000;

    /* Here goes the actual algorithm. */
    num = 1;
    curJ = 2;
    curI = 2;
    exhaustedJ = 0;
    exhaustedI = 0;
    printf("List is as follows --- \n");
    printf("(%d, %d)#\n", arr1[1], arr2[1]);
    while (num < n1) {
        /* Calculate the three sums. */
        int cIcJsum = arr1[curI] + arr2[curJ];
        int eIcJsum = arr1[exhaustedI + 1] + arr2[curJ];
        int cIeJsum = arr1[curI] + arr2[exhaustedJ + 1];
        if ((cIcJsum > eIcJsum) && (cIcJsum > cIeJsum)) {
            printf("(%d, %d)#\n", arr1[curI], arr2[curJ]);
            num++; curJ++; curI++;
        } else if ((eIcJsum >= cIcJsum) && (eIcJsum >= cIeJsum)) {
            printf("(%d, %d)#\n", arr1[exhaustedI + 1], arr2[curJ]);
            num++; curJ++;
        } else {
            printf("(%d, %d)#\n", arr1[curI], arr2[exhaustedJ + 1]);
            num++; curI++;
        }
    }
    return 0;
} |
||||||
Title: Re: The hardest interview question I met so far Post by Jelani Nelson on Dec 1st, 2005, 7:13am Sorry, I meant to reply to this above but since I can't modify my posts (not registered), I have to make a new post! >>Let's call that count C(i,j). That's O(n) on average (?). No. C(i,j) is something like n^2 on average during the first iteration. In general it's expected to be half of the remaining pairs. You're picking a random pivot point, which serves as your approximate median, then partitioning all remaining points about that pivot. Then you recurse on the half that has the element that has the rank you desire (which you can tell since you know how many points went on each side of the pivot). This isn't exactly legit since there is randomization involved, but if we had an actual median in our hands each time the recurrence would look something like: T(n^2) = T(n^2 / 2) + O(n) which when doing a substitution turns into: T(m) = T(m/2) + O(sqrt(m)) The solution to the second recurrence is O(sqrt(m)), and doing backsubstitution that is O(n). Since there's randomization involved, more work is needed for analysis. But, you can use an analysis very similar to that of randomized select (see e.g. clrs). |
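Written out, the back-substitution in the argument above goes like this (a sketch of the arithmetic only, with c an unspecified constant):

```latex
% Substitute m = n^2 into T(n^2) = T(n^2/2) + O(n), noting n = sqrt(m):
\begin{aligned}
T(m) &= T(m/2) + c\sqrt{m} \\
     &\le c\left(\sqrt{m} + \sqrt{m/2} + \sqrt{m/4} + \cdots\right) \\
     &= c\sqrt{m}\sum_{k \ge 0} 2^{-k/2} \;=\; O(\sqrt{m}) \;=\; O(n).
\end{aligned}
```

The geometric series converges to a constant, which is where the O(sqrt(m)) bound comes from.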
||||||
Title: Re: The hardest interview question I met so far Post by KWillets on Dec 1st, 2005, 11:27am Sorry, I meant that finding C(i,j) is O(n) complexity on average, tracing the "stairs" you mention. Yes, the value is O(n^2). Sorry for the shorthand. |
||||||
Title: Re: The hardest interview question I met so far Post by fanduBoy on Dec 2nd, 2005, 4:04pm Here is a solution that works:

using System;
using System.Collections.Generic;
using System.Text;

namespace ProgrammingProblems
{
    class HighestPairs
    {
        static int N = 20;

        static void Main(string[] args)
        {
            int[] A = { 21, 20, 19, 18, 17, 16, 15, 14, 13, 11, 10, 9, 7, 6, 5, 4, 3, 2, 1, 0 };
            int[] B = { 20, 19, 18, 17, 16, 15, 14, 13, 12, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0 };
            NSquareComplexity(A, B);
            Console.WriteLine();
            NComplexity(A, B);
        }

        private static void NComplexity(int[] A, int[] B)
        {
            int[] AmaxB = new int[A.Length];
            int[] BmaxA = new int[B.Length];
            for (int i = 0; i < A.Length; i++) { AmaxB[i] = 0; }
            for (int i = 0; i < B.Length; i++) { BmaxA[i] = 0; }
            int count = 0;
            Pair[] pairs = new Pair[N + 1];
            pairs[count] = new Pair(0, 0, A[0] + B[0]);
            while (count < N)
            {
                count++;
                int i = 0, maxA = 0, maxB = 0, maxSoFar = 0, a = 0, b = 0, max = 0;
                while (true)
                {
                    if ((i < AmaxB.Length) && ((AmaxB[i] + 1) < B.Length))
                        a = A[i] + B[AmaxB[i] + 1];
                    else
                        a = 0;
                    if ((i < BmaxA.Length) && ((BmaxA[i] + 1) < A.Length))
                        b = B[i] + A[BmaxA[i] + 1];
                    else
                        b = 0;
                    max = Math.Max(a, b);
                    if (maxSoFar <= max)
                    {
                        maxSoFar = max; maxA = a; maxB = b; i++;
                    }
                    else
                    {
                        if (i > 0) i--;
                        break;
                    }
                }
                if (maxA > maxB)
                {
                    AmaxB[i]++;
                    BmaxA[AmaxB[i]] = i;
                    pairs[count] = new Pair(i, AmaxB[i], A[i] + B[AmaxB[i]]);
                }
                else
                {
                    BmaxA[i]++;
                    AmaxB[BmaxA[i]] = i;
                    pairs[count] = new Pair(BmaxA[i], i, A[BmaxA[i]] + B[i]);
                }
            }
            foreach (Pair pair in pairs)
                Console.WriteLine(pair.ToString());
        }

        private static void NSquareComplexity(int[] A, int[] B)
        {
            Pair[] pairs = new Pair[A.Length * B.Length];
            int counter = 0;
            for (int i = 0; i < A.Length; i++)
                for (int j = 0; j < B.Length; j++)
                    pairs[counter++] = new Pair(i, j, A[i] + B[j]);
            Array.Sort(pairs);
            foreach (Pair pair in pairs)
                Console.WriteLine(pair.ToString());
        }
    }

    class Pair : IComparable
    {
        int a;
        int b;
        public int sum;

        public Pair(int a, int b, int sum)
        {
            this.a = a; this.b = b; this.sum = sum;
        }

        public override string ToString()
        {
            return string.Format("{0}\t{1}\tSum=\t{2}", a, b, sum);
        }

        public int CompareTo(object obj)
        {
            return -this.sum.CompareTo(((Pair)obj).sum);
        }
    }
} |
||||||
Title: Re: The hardest interview question I met so far Post by fanduBoy on Dec 13th, 2005, 11:57am Does the above solution look ok to everyone? Any feedback/comments? |
||||||
Title: Re: The hardest interview question I met so far Post by towr on Dec 13th, 2005, 12:58pm To be honest I was hoping someone else would comment on it first, so I could get an idea what's happening without having to spend too much time on it myself. [e]I can't seem to compile it. Isn't it Java?[/e] |
||||||
Title: Re: The hardest interview question I met so far Post by JohanC on Dec 13th, 2005, 1:39pm Without extra explanation, it is hard to understand what exactly is going on. I have the impression that the internal while-loop of your "NComplexity" procedure makes it at least O(N*sqrt(N)). Do you have some proof (or estimation) about how many times that loop is executed? (The loop could be executed up till N times.) I also have the impression that your program will not find all maxima in the correct order, as it stops as soon as the subsequent MaxSoFar aren't increasing anymore. Did you test it with many different inputs? |
||||||
Title: Re: The hardest interview question I met so far Post by mathews1978 on Mar 18th, 2006, 7:04pm How about this solution? _________________________________________

#include <iostream>
using std::cin;
using std::cout;

int main() {
    int A[10];
    int B[10];
    bool atraverse = true;
    int sumToBeatAIndex = 0;
    int sumToBeatBIndex = 1;
    int j = 1;
    int k = 0;
    int i;
    int sumtobeat;
    int n;
    int tempBIndex;
    int tempAIndex;

    cout << "\nEnter the size\n";
    cin >> n;
    cout << "\nEnter the numbers in A (sorted)\n";
    for (i = 0; i < n; i++) cin >> A[i];
    cout << "\nEnter the numbers in B (sorted)\n";
    for (i = 0; i < n; i++) cin >> B[i];
    cout << "(" << B[0] << "," << A[0] << ") ";
    sumtobeat = A[0] + B[1];
    for (i = 1; i < n; i++) {
        if (B[k] + A[j] > sumtobeat)
            cout << "(" << B[k] << "," << A[j] << ") ";
        else {
            sumtobeat = B[k] + A[j];
            tempBIndex = k;
            tempAIndex = j;
            j = sumToBeatAIndex;
            k = sumToBeatBIndex;
            sumToBeatAIndex = tempAIndex;
            sumToBeatBIndex = tempBIndex;
            cout << "(" << B[k] << "," << A[j] << ") ";
            atraverse = !atraverse;
        }
        if (atraverse) j++;
        else k++;
    }
    cout << "\n";
    return 0;
}

_________________________________________ |
||||||
Title: Re: The hardest interview question I met so far Post by ChunkTug on Mar 19th, 2006, 3:27am Consider the case: A = {2,1,0} B = {2,1,0} The desired pairs are: (2,2) (2,1) (1,2) Your algorithm won't add both (2,1) and (1,2). At each step either j or k is incremented and/or swapped. The sum j+k increases by one each iteration. The only cases where this algorithm will produce the correct answer are when the solution is: {(A[0],B[0]), (A[0],B[1]), ... , (A[0],B[N-1])} ...or... {(A[0],B[0]), (A[1],B[0]), ... , (A[N-1],B[0])} |
||||||
Title: Re: The hardest interview question I met so far Post by mathews1978 on Mar 19th, 2006, 10:15am Try compiling and running this; it will give you the correct answer for the case you mentioned. If you look at the code, j and k are not always incremented: either j or k is incremented -- but not both -- during each pass through the loop. So you will get the correct pairs. Just copy the code into Visual Studio, compile, run, and see. |
||||||
Title: Re: The hardest interview question I met so far Post by mathews1978 on Mar 19th, 2006, 10:20am And I am not incrementing j or k in each pass through the loop. If you look at the first else block inside the loop, I set j and k to sumToBeatAIndex and sumToBeatBIndex. Those could be less than the current values of j and k, so j and k don't always get incremented. |
||||||
Title: Re: The hardest interview question I met so far Post by ChunkTug on Mar 19th, 2006, 4:37pm Yeah, I was misreading it. However, there is a problem with the traversal: it will only add pairs where either j=0 or k=0. A=5,4,1,0 B=5,3,1,0 Desired output: (5,5) (5,4) (3,5) (4,3) Program's output: (5,5) (5,4) (3,5) (5,1) If you were to plot AxB on a grid, the possible solutions form a "stairstep"-like pattern. A problem arises when you find the next highest pair: the list of candidates for the next pair can grow linearly. The total number of solutions (collections of pairs) for a given N is equal to P(N), the Partition Function P (http://mathworld.wolfram.com/PartitionFunctionP.html). Each solution can be put into a 1-1 mapping with a partition of N (e.g. the partition of 6 into 3+2+1 would give 3 pairs from the first column, 2 from the second, 1 from the third). See the image below for the possible solutions when N=6. http://myspace-624.vo.llnwd.net/00581/42/61/581261624_l.jpg I suspect that an O(N) solution does not exist for this problem. If I recall correctly, a randomized linear solution was proposed by KWillets & Jelani, but was incomplete. Also, fanduBoy proposed an algorithm he believed was O(N) that no one could refute because we couldn't tell what the frack was going on :P |
||||||
Title: Re: The hardest interview question I met so far Post by KWillets on Mar 21st, 2006, 10:04am Thanks for the link to P(n). From the estimates it looks like the output size is still O(n) bits, but we don't have any further insight on the inner workings. I was working on some code to test it empirically, using a probabilistic approach, but I haven't had time to finish it. The approach was to move upper and lower stairstep boundaries towards the correct one, picking a new pivot point within the bounds each iteration. |
||||||
Title: Re: The hardest interview question I met so far Post by Ivan on Apr 15th, 2006, 4:53pm Hmm. I agree. I think that the point is we only need to give out the n pairs, NOT NECESSARILY in any order. BTW, I am really happy to see people interested in this problem. :) |
||||||
Title: Re: The hardest interview question I met so far Post by delphos on Apr 18th, 2006, 7:42am Please consider my solution: Code:
|
||||||
Title: Re: The hardest interview question I met so far Post by srinath on Jun 2nd, 2006, 10:31am I suppose delphos' solution is O(n) and probably the required one... if not, please point out the flaw in it... |
||||||
Title: Re: The hardest interview question I met so far Post by SMQ on Jun 2nd, 2006, 11:33am See ChunkTug's post above -- delphos' solution fails in the same way: s[0] = 5 + 5; ai = 0; bi = 0; at = 0; bt = 0 s[1] = 4 + 5; ai = 1; bi = 0; at = 0; bt = 0 s[2] = 5 + 3; ai = 1; bi = 1; at = 0; bt = 0 s[3] = 1 + 5 where s[3] should be 3 + 4, but that sum is not considered by the comparison --SMQ |
||||||
Title: Re: The hardest interview question I met so far Post by srinath on Jun 2nd, 2006, 11:55am I think SMQ's simulation is wrong at s[2]. It is actually s[2]=5+3; ai=1; bi=0; at=1; bt=0; so... s[3] = 4+3; ai=1; bi=1; at=0; bt=0. This indeed seems to be a good solution: only two elements are compared at each step, and the elements to be compared are kept track of using the counters 'at' and 'bt'. Correct me if I'm wrong... |
||||||
Title: Re: The hardest interview question I met so far Post by SMQ on Jun 2nd, 2006, 1:38pm :-[ That's what I get for executing code in my head instead of on a computer. I'm still fairly certain it will fail in a similar manner -- where the two elements it's considering don't include the highest sum -- let me look for a failure case and I'll get back to you... --SMQ |
||||||
Title: Re: The hardest interview question I met so far Post by SMQ on Jun 2nd, 2006, 2:24pm OK, take the same example ChunkTug gave: A = {5, 4, 1, 0 } B = {5, 3, 1, 0 } but look at the first nine elements instead of just the first four: s[0] = 5 + 5; ai = 0; bi = 0; at = 0; bt = 0 s[1] = 4 + 5; ai = 1; bi = 0; at = 0; bt = 0 s[2] = 5 + 3; ai = 1; bi = 0; at = 1; bt = 0 s[3] = 4 + 3; ai = 1; bi = 1; at = 0; bt = 0 s[4] = 5 + 1; ai = 1; bi = 1; at = 1; bt = 0 s[5] = 1 + 5; ai = 1; bi = 1; at = 1; bt = 1 s[6] = 4 + 1; ai = 1; bi = 2; at = 0; bt = 1 s[7] = 5 + 0; ai = 1; bi = 2; at = 1; bt = 1 s[8] = 4 + 0 s[8] should have been 0 + 5, but that sum wasn't considered by the comparison. --SMQ |
||||||
Title: Re: The hardest interview question I met so far Post by srinath on Jun 7th, 2006, 11:35pm SMQ is correct that there is a flaw in delphos' solution. That is because delphos considered only the condition of the two pointer pairs (ai,at), (bi,bt) crossing each other, but there is another constraint which he failed to check: the movement of the pointers is also based on the relative differences between the elements. Thus this code would solve the problem in delphos' solution. Here da[] and db[] are the difference-encoded arrays, and the last element of each is initialised to +infinity, i.e. the integer limit.

printf("%d %d\n", aa[0], bb[0]);
for (i = 1; i < 2*n + 4; i++) {
    if (aa[at] + bb[bi] > bb[bt] + aa[ai]) {
        printf("%d %d\n", aa[at], bb[bi]);
        if (at == bt && bi == ai && bi != n-1 && at != 0) {
            at = bt = 0; bi++; ai++;
        } else {
            if (at < bi && da[at] <= db[bi]) at++;
            else { at = 0; bi++; }
        }
    } else {
        printf("%d %d\n", aa[ai], bb[bt]);
        if (at == bt && bi == ai && bi != n-1 && at != 0) {
            at = bt = 0; bi++; ai++;
        } else {
            if (bt < ai && db[bt] <= da[ai]) bt++;
            else { bt = 0; ai++; }
        }
    }
}

Though this code solves the problem in delphos' solution, it has its own flaw: the pointer bt is reset to 0 when the relative difference da[ai] is less than db[bt]. That is fine when bt == ai; but when bt < ai, there are some more combinations of (bt+1 ... ai) and ai which could result in a larger value later in the run. This can be taken care of by using a queue to store the values of bt and ai whenever bt needs to be reset; then, when bt needs to be incremented at a later stage, the previous state can be restored and allowed to proceed till bt == ai, then the next stored state taken from the queue, and so on. This procedure applies to both (bt,ai) and (at,bi). I'll code it and upload it soon. Correct me if I'm wrong or unclear anywhere. |
||||||
Title: Re: The hardest interview question I met so far Post by srinath on Jun 7th, 2006, 11:39pm You may avoid bothering about if(at==bt&&bi==ai&&bi!=n-1&&at!=0) {at=bt=0;bi++;ai++;} after the two printfs, as those checks are only there to avoid the same elements being considered twice... |
||||||
Title: Re: The hardest interview question I met so far Post by Brian on Jul 10th, 2006, 12:46pm Description of solution: [hide] 1. Pretend like we are doing an O(N^2) loop: One loop iterates through all elements of A, where an inner loop iterates through all elements of B. Short circuit the inner loop if it turns out we'd get a higher value by jumping to the next iteration of the outer loop. 2. Make two such crawls in parallel - the other crawl reverses the loops. 3. Get N results from the crawl. Because the crawls may overlap, running time is 2N, but of course this is still O(N). [/hide] Code:
|
||||||
Title: Re: The hardest interview question I met so far Post by SMQ on Jul 10th, 2006, 4:17pm on 07/10/06 at 12:46:47, Brian wrote:
An interesting approach -- too bad it doesn't work. ;) Consider a = b = {5, 4, 3, 2, 1}. Your program returns (5,5) (5,4) (4,5) (4,4) (3,5) (5,3) (3,4) (2,5) (4,3) (5,2) (2,4) (1,5) (4,2) (5,1) (1,4) ... Oops: it missed (3,3) because neither loop was considering it. Plus, it looks like checking for pairs you've already seen (visited) takes worst-case either N^2 extra memory or log2(N) time per check, pushing you past O(N) overall. --SMQ |
||||||
Title: Re: The hardest interview question I met so far Post by Brian on Jul 10th, 2006, 4:42pm Quote:
Hrm? I see the output as: 5,5 5,4 4,5 4,4 3,5 Note the original problem asks for the top N pairs, where N is the length of A and B. 3,3 shouldn't be in the list of the top 5 here (you can see yourself that all of the above have a sum greater than 6). |
visited is a dictionary (i.e. hashmap). insert and search are O(1) operations, and I do O(N) of such operations. It also uses only O(N) extra memory. |
||||||
Title: Re: The hardest interview question I met so far Post by SMQ on Jul 10th, 2006, 5:09pm Hrm, I guess the problem statement does use the same N for the size of the arrays and the size of the results. OK, try a = b = {5, 4, 3, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} ;) As for the hashmap, worst-case insert and/or search are still O(log2(N)) (or worse, depending on the implementation). --SMQ |
||||||
Title: Re: The hardest interview question I met so far Post by Brian on Jul 10th, 2006, 6:27pm Quote:
A and B are sets - they can't contain duplicate items :) Quote:
Not if you use a perfect hash table. In any event, you could use an N^2-sized array. Yes, a lot of extra storage, but I think it solves the problem. |
||||||
Title: Re: The hardest interview question I met so far Post by SMQ on Jul 10th, 2006, 8:26pm on 07/10/06 at 18:27:18, Brian wrote:
And you, sir, are missing the point. :) a = b = {15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1} still exhibits the problem. 15,15 15,14 14,15 14,14 13,15 15,13 13,14 12,15 14,13 15,12 12,14 11,15 14,12 15,11 11,14 ... whoops, where is 13,13? My point is that given a sufficiently large, properly constructed input, there are areas of the A x B matrix which neither crawlAB nor crawlBA traverse. --SMQ |
||||||
Title: Re: The hardest interview question I met so far Post by Brian on Jul 12th, 2006, 3:24pm This is a fun forum. OK, what about if, when we short-circuited the inner loop: instead of discarding the rest of the choices, we push the pair onto a stack. Then, every time we consider a pair from our original algorithm, we check whether the value on top of the stack is greater. If so, we print that (and also continue checking the rest of the pairs from the original loop; if we get a value smaller than the one from the original loop, we push THAT pair back onto the stack). Checking the stack's top value is fast, and doesn't make the whole thing slower than O(N). The only problem is making sure we don't end up pushing everything onto the stack. May not work. |
||||||
Title: Re: The hardest interview question I met so far Post by James Fingas on Jul 25th, 2006, 6:15am Done! This is a pretty good puzzle. You can actually find the cutoff and a description of exactly which pairs are in the solution in less than O(N) time, but outputting all the solutions takes O(N) time, so it doesn't help you overall. This won't be much of a hint, but finding the cutoff point can be done in [hide]O(sqrt(N) log^2 N)[/hide] time. Forgive me for not writing code, but this gets pretty nitty-gritty in spots. I think of the problem as a two-dimensional array, S(i,j), of numbers formed by the addition S(i,j) = A[i] + B[j]. A cutoff value C defines a cutoff line through the array so that above the cutoff line S(i,j) >= C and below the line S(i,j) <= C. The particular value C0 that we want gives S(i,j) >= C0 for at least N entries in the array, and S(i,j) <= C0 for at least N(N-1) entries in the array. Here is a better hint, posed as a question: [hideb]Noting that the problem is to find the partition that contains the highest N entries in the square, devise a method for describing a generic partition containing N items that takes less than O(N) memory.[/hideb] Answer: [hideb]We describe a partition as a set of "partition points" that specify the corners of the partition, such that the partition is exactly those S(i,j) for which, for some partition point with coordinates (i0,j0), i <= i0 and j <= j0. Note, however, that the value S(sqrt(N)+1,sqrt(N)+1) cannot be in the result, so the possible partitions are all skinny, spread along the i=1 and j=1 sides of the array. Therefore, we divide the partition into two pieces along the line i=j. Now both pieces of the partition have a height or width of at most sqrt(N). Therefore, we need to allocate space for only 2*sqrt(N) partition points. 
[/hideb] Continuing: [hideb]Now our general aim is to find the particular cutoff C0 such that the partition defined by C0 contains exactly N items (I'll assume all the items have different values, but no fundamental problem is introduced otherwise). First, note that any C0 must be less than A[1] + B[1], and must be greater than or equal to A[sqrt(N)+1] + B[sqrt(N)+1]. We therefore do a binary search on this range of values to find C0. As we do our binary search, we must determine how many items are above a trial cutoff C. Doing this is not particularly difficult. First, we move each partition point to the value in its row or column corresponding to the smallest sum larger than C. After all partition points are placed (which takes O(log N) per partition point, for O(sqrt(N) log N) total), we add up the number of elements larger than C in the entire array. This also takes O(sqrt(N) log N) time. Now we just have to try this with various C until we get C0, our ideal value.[/hideb] Now for a wrinkle: [hideb]My first thought was that it would take O(log N) trials of C to get our ideal value, but that is not entirely true. The number of trials depends on the data type of the values we are comparing, and their distribution in that data type. Using integers, it takes a maximum of 16 steps to find a value with precision 1. But if A[i] and B[j] are real numbers, with large values but potentially small differences between them, it could take much longer. So the order of the naive algorithm is O(sqrt(N) log N log K), where K is a measure of the granularity of the numbers.[/hideb] A refinement: [hideb]In order to have a proper binary search, so that we take O(log N) trial values of C, we must use values that are within the array. The way to do this is to keep track of more information. For each partition point, we will keep three values: one value is known to be too high, one value is known to be too low, and one value is our current guess (which will be proven too high or too low). 
Putting all the partitions together, we now have something like O(Nsqrt(N)) candidate numbers. My initial thought was to choose a new C from the largest row in the array (corresponding to i=1). But what if we find that all the values in the largest row are either too large or too small? Then we must select a value from a different row or column. We might have to repeat this O(sqrt(N)) times, which would give us O(NlogN) performance overall. So instead, we can use a median-of-medians approach: choose the median of each potential partition point range, then choose the median of those medians as the new trial C. This should give fairly good performance. Choosing the median of each range can be done in constant time, and we need O(sqrt(N)) of them; finding the median of those medians can be done by sorting and indexing, taking O(sqrt(N)log(sqrt(N))) = O(sqrt(N)logN) time, which is the same as the time it takes to evaluate the new C.[/hideb] The second wrinkle: [hideb]But another wrinkle crops up, because a median-of-medians calculation guarantees performance by guaranteeing that a certain fraction of the total values lie above the median, and a similar fraction below it. Because the potential partition point ranges have different sizes, we cannot make a similar guarantee for this method. So we actually fail to ensure that we only need O(logN) trial values of C.[/hideb] A further refinement: [hideb]One simple way to address this shortcoming is to store the size of each potential partition point range along with its median, sort these as pairs, and then select a weighted median based on the range sizes. This guarantees a minimum fraction of values both above and below the chosen value. Now we know it will take only O(logN) trial values of C to find C0. This gives a total of O(logN(sqrt(N)logN + sqrt(N)logN)) = O(sqrt(N)log^2(N)) time to find C0. It then takes O(N + sqrt(N)) = O(N) to output the values.[/hideb] |
||||||
Title: Re: The hardest interview question I met so far Post by KWillets on Jan 3rd, 2007, 1:42pm I had some time over the holidays and actually coded a solution to this. I wanted to do some empirical testing, and it works surprisingly well. The main points:
Testing on random data seems to show that it's sublinear on average. I haven't tried to work out what bound it should have, but the average number of pairs visited seems to decline relative to n as n increases. The number of pairs visited also drops below n above 1000 or so, declining to 0.2n around 100,000. I'll post the code soon; I don't have it with me at the moment. |
||||||
Title: Re: The hardest interview question I met so far Post by KWillets on Jan 4th, 2007, 8:35am Here's the code (OS X BSD, but hopefully portable). Paradoxically, this is C++ (I started with several objects, then deleted them):
|
||||||
Title: Re: The hardest interview question I met so far Post by sumitmal on Mar 18th, 2008, 9:23am I think I got this one. Take two pointers starting from A[0] and B[0]. Now compare A[0] + B[1] with A[1] + B[0], and run the iterator along whichever is greater, as follows. Assume A[1] + B[0] exceeds the other: check A[i] + B[j], incrementing j while the sum stays above A[0] + B[1]. Once it drops, store j for the next iteration, then set j = 0 and i = 1, and iterate i while the sum stays above A[0] + B[nextiterator], or until the number of pairs reaches n. For the next iteration, resume j and i where the loops above ended. The number of comparisons will be O(n). |
||||||
Title: Re: The hardest interview question I met so far Post by Hippo on Mar 19th, 2008, 2:10am It's always a pleasure to read James Fingas's posts. I have read the thread just now ... of course you cannot give the n numbers sorted; that would cost n log n. So the pivot approach is required when trying to make an O(n) solution. The sqrt compression method is a very nice trick! I should think more about choosing the correct pivot from medians of different-sized samples, but the solution looks good. I would start by choosing pivots on the diagonal by binary search, until two neighbouring pivots are found such that the n-th pair lies between their values. Now we are trying to find the n-th pair of all, which is the ℓ-th after the last smaller pivot chosen; we ignore all pairs outside the interval between the last smaller and last larger pivot. If the number of remaining pairs N ∈ O(n), we can find the n-th value in the O(n) remaining time by the standard k-th-of-N selection algorithm. In our case the number of remaining pairs can be Θ(n√n). [edit]Oops, only 2n(1+1/2+1/3+1/4+...+1/√n)−n = n(1+1/2+1/3+...+1/n) ≈ 2n ln √n[/edit] OK, if you give up on finding the pivot in O(√n) time and O(√n log n) is sufficient (actually this is a good choice, as the pivot test costs Θ(√n log n)), you can easily sort all sample medians and traverse them from smallest to highest. 
After that you can count, in O(1) amortized time per sample median, how many pairs are guaranteed smaller and how many guaranteed larger than that median. I would choose the sample median such that the number guaranteed smaller is less than ℓ, and the number guaranteed larger is as close to the number guaranteed smaller as possible. Actually, if N ∈ ω(n), the chosen pivot would be the largest one which guarantees fewer than ℓ smaller values. If the chosen pivot turns out to be higher than the n-th pair, we have sped up the search a lot: the remaining N' would be ℓ/2 + O(N/2), and each subinterval longer than ℓ is halved. If the chosen pivot turns out to be less than the n-th pair, we can choose as the next pivot its higher neighbour in the list, which is surely bigger than the n-th pair. In that case we have shortened the interval using sample medians as much as possible: it becomes O(N/2), and all the subintervals are halved. So after at most log N steps the interval of remaining pairs is O(n) long. I should analyse it better ... [edit]And I did here (http://www.ocf.berkeley.edu/~wwu/cgi-bin/yabb/YaBB.cgi?board=riddles_cs;action=display;num=1225416544;start=0#13) and here (http://www.ocf.berkeley.edu/~wwu/cgi-bin/yabb/YaBB.cgi?board=riddles_cs;action=display;num=1241589953)[/edit] |
||||||
Title: Re: The hardest interview question I met so far Post by scm007 on Dec 25th, 2008, 2:35pm Isn't this solution as simple as keeping track of wherever we branch while searching for the solution, then reverting back to that branch if it is greater than the current one? I have attached C++ code; no comments, so if you can't figure it out just ask. I coded this up in about 5 minutes so it is ugly and slow.

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

using namespace std;

int sum(pair<int, int> p, vector<int> A, vector<int> B) {
    return A[p.first] + B[p.second];
}

vector< pair<int, int> > maxPairs(vector<int> A, vector<int> B, int n) {
    vector< pair<int, int> > pairs;
    pair<int, int> sec, cur, temp;
    cur = make_pair(0, 0);
    pairs.push_back(cur);
    if (sum(make_pair(cur.first, cur.second + 1), A, B) >
        sum(make_pair(cur.first + 1, cur.second), A, B)) {
        cur = make_pair(0, 1);
        sec = make_pair(1, 0);
    } else {
        cur = make_pair(1, 0);
        sec = make_pair(0, 1);
    }
    pairs.push_back(cur);
    for (int i = 2; i < n; i++) {
        temp = cur;
        if (sum(make_pair(cur.first, cur.second + 1), A, B) >
            sum(make_pair(cur.first + 1, cur.second), A, B)) {
            cur.second++;
        } else {
            cur.first++;
        }
        if (sum(sec, A, B) > sum(cur, A, B)) {
            temp = cur;
            cur = sec;
            sec = temp;
        }
        pairs.push_back(cur);
    }
    return pairs;
}

int main() {
    vector< pair<int, int> > ret;
    vector<int> A, B;
    for (int i = 0; i < 10; i++) {
        A.push_back(5 * (i + 1));
        B.push_back(3 * (i + 2));
    }
    sort(A.begin(), A.end());
    sort(B.begin(), B.end());
    reverse(A.begin(), A.end());
    reverse(B.begin(), B.end());
    ret = maxPairs(A, B, 10);
    for (int i = 0; i < ret.size(); i++) {
        cout << '(' << ret[i].first << ',' << ret[i].second << ')' << ' ';
    }
    cout << endl;
    for (int i = 0; i < ret.size(); i++) {
        cout << sum(ret[i], A, B) << ' ';
    }
    cout << endl;
    for (int i = 0; i < A.size(); i++) {
        cout << A[i] << ' ';
    }
    cout << endl;
    for (int i = 0; i < B.size(); i++) {
        cout << B[i] << ' ';
    }
    cout << endl;
    return 0;
}
|
||||||
Title: Re: The hardest interview question I met so far Post by scm007 on Jan 20th, 2009, 9:13pm Can someone verify that my solution is correct? Logically I think it is, and it passes all of the test cases supplied thus far. |
||||||
Title: Re: The hardest interview question I met so far Post by computer on Jan 21st, 2009, 8:29am Can somebody elaborate a bit more on the O(n log n) solution? What is meant by 4-connected here, and which balanced tree is towr referring to? |
||||||
Title: Re: The hardest interview question I met so far Post by towr on Jan 21st, 2009, 8:55am on 01/21/09 at 08:29:24, computer wrote:
Quote:
on 01/20/09 at 21:13:18, scm007 wrote:
Is it supposed to be O(n), or merely solve the problem? The use of recursion makes me suspect runtime complexity may not be that good. |
||||||
Title: Re: The hardest interview question I met so far Post by scm007 on Jan 21st, 2009, 6:50pm It's O(n) and has no recursion; I think you guys have made this problem way too hard for yourselves. I'm pretty sure it's correct. I tried to explain what I was doing but didn't comment very well; compile it and run it, I suppose. I mean, it was an interview question, and it seemed to me to be pretty easy, but I have been wrong before :) Stephen |
||||||
Title: Re: The hardest interview question I met so far Post by SMQ on Jan 22nd, 2009, 5:51am on 01/21/09 at 18:50:42, scm007 wrote:
I always love to see that -- it's even occasionally true. ;) In this case, though, I don't think so. Don't get me wrong -- it's a good effort -- but I believe it's flawed: when you make the first choice (before the loop) you remember the other option in sec, but you don't do this for subsequent choices (the corresponding if block within the loop). Because you don't remember all not-taken choices, you lose track of some possible paths. I don't have a failure case off the top of my head, but I'm confident I can come up with one: Edit -- a failure case: A = B = {6, 5, 4, 3, 2, 1} Your program picks: (0,0) (1,0) (0,1) (1,1) (2,0) (3,0) = 12, 11, 11, 10, 10, 9 One correct answer: (0,0) (1,0) (0,1) (1,1) (2,0) (0,2) = 12, 11, 11, 10, 10, 10 Another failure case -- revisiting an element already chosen: A = B = {6, 5, 3, 2, 1} Your program picks: (0,0) (1,0) (0,1) (1,1) (1,1) = 12 11 11 10 10 One correct answer: (0,0) (1,0) (0,1) (1,1) (2,0) = 12 11 11 10 9 --SMQ |
||||||
Title: Re: The hardest interview question I met so far Post by eklavya on Jun 11th, 2009, 3:14am on 11/17/05 at 03:14:59, towr wrote:
Correct me if I am wrong, but won't the highest n sums definitely contain a[0] or b[0] as one component? If so, then the problem becomes: for (m, n from 1 to n) { nexthighest[k] = max(A[0]+B[m], B[0]+A[n]); if the 1st was larger m++, else n++ }. What's your take on this? |
||||||
Title: Re: The hardest interview question I met so far Post by towr on Jun 11th, 2009, 3:44am on 06/11/09 at 03:14:18, eklavya wrote:
For example, take [10, 9, 5, 4, 3, 2] and [12, 11, 9, 6, 1, 0]
Here two of the six largest sums contain neither 10 nor 12, and four do. And as the two arrays get larger, so does the upper-left triangle of large sums, which means a smaller fraction of its elements lie on its top and left edges. (But I don't really want to draw larger matrices.) |
||||||
Title: Re: The hardest interview question I met so far Post by singhar on Apr 6th, 2010, 11:00am Did anyone find the O(n) solution for this question? I don't understand James Fingas's and Hippo's solutions at all. Could anyone explain in a bit more detail or give some pseudocode? towr, wouldn't scanning the anti-diagonal elements (in the bottom-left to top-right direction) lead to an O(n log n) solution, since for the pair with the kth largest sum we check at most k pairs? Am I right? |
||||||
Title: Re: The hardest interview question I met so far Post by towr on Apr 6th, 2010, 12:25pm on 04/06/10 at 11:00:20, singhar wrote:
|
||||||
Powered by YaBB 1 Gold - SP 1.4! Forum software copyright © 2000-2004 Yet another Bulletin Board |