wu :: forums - Print Page


    
      
        wu :: forums
        (http://www.ocf.berkeley.edu/~wwu/cgi-bin/yabb/YaBB.cgi)
      

        riddles >> hard >> Industrial/Distributed Computing
        
(Message started by: k2xl on Feb 18^th, 2008, 10:48am)

Title: Industrial/Distributed Computing
Post by k2xl on Feb 18^th, 2008, 10:48am

I'm a student at Georgia Tech working with distributed systems. This is a unique problem I came up with in my research.

This problem came from me explaining the concept of distributed computing to some friends. I've done many experiments with it but have come up empty. I've shown various professors this problem but nobody has been able to generalize it.

Let's say you want to calculate the sum of 10 numbers. Let's also say it takes you 1 second to add up two numbers together.
Therefore:
10 Numbers with 1 calculator would take 9 seconds to calculate.

Now let's say you give a second calculator to a friend. You give the second calculator 5 calculations to calculate and yourself 5. Working simultaneously, you would take 4 seconds and your friend 4 seconds. Then you could combine these two sums (which would take 1 second) to get the overall sum in a total of 5 seconds. Therefore:

10 Numbers to sum with 2 calculators would take 5 seconds.

Now let's add a third calculator. Since 10 doesn't divide evenly by 3, one calculator will do more work than the other two.
Calculator A: 3 sums
Calculator B: 3 sums
Calculator C: 4 sums

Therefore:
Calculator A: Takes 2 seconds (must wait 1 second)
Calculator B: Takes 2 seconds (must wait 1 second)
Calculator C: Takes 3 seconds

Adding all the sums takes an additional 2 seconds.
This means:

10 Numbers to sum with 3 calculators would take 5 seconds.

So here is the problem. Using 3 calculators doesn't save you any time. It's cheaper to use 2 calculators.

So given N numbers to sum, find the minimum C calculators to use to obtain the lowest T time.

A quick formula that I came up with to get the minimum time is:
(N/C - 1) + (N-1) = N/C + N - 2

Note: These calculators work independently. Meaning they can't pass along sums for the other calculators to calculate to save time. This means if one calculator has more work than others the others can't do more work while it finishes.
Note 2: This means that the calculators do not communicate with each other. Assume that a central server sends data to nodes and waits for all of the data to be returned and then compiles all the data.

If I didn't explain this well please post what needs more clarification. Thanks!

Title: Re: Industrial/Distributed Computing
Post by k2xl on Feb 18^th, 2008, 11:14am

on 02/18/08 at 11:13:48, towr wrote:

Yeah, I just noticed.. sorry about that; and I deleted it before I saw your response.. (albeit relatively long after I posted it)

Removed my post following it too. No problems :-)

Title: Re: Industrial/Distributed Computing
Post by towr on Feb 18^th, 2008, 11:19am

Ok, let's see if I can contribute anything useful..
As I understand it, we want to minimize http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/lceil.gifN/C - 1http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/rceil.gif + (C-1), dependent on C
So, pretending it's continuous we can take the derivative wrt C and see when it's zero.

-N/C²+1=0
C²=N
C=http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/surd.gifN

That should be in the right neighbourhood.

Title: Re: Industrial/Distributed Computing
Post by k2xl on Feb 18^th, 2008, 11:42am

Unfortunately, Sqrt(N) is incorrect.

Sqrt(N) is what I got when doing the derivative too.

But look at 10.

Sqrt(10) is around 3 (rounded up is 4).

But as we just pointed, 2 calculators can find the minimum time (cheaper than using 3)

Title: Re: Industrial/Distributed Computing
Post by towr on Feb 18^th, 2008, 11:53am

I only claimed it was in the right neighbourhood, not that it was in the right house ;)
Obviously the ceiling function throws a spanner in the works.

Title: Re: Industrial/Distributed Computing
Post by towr on Feb 18^th, 2008, 12:20pm

Quite the horrible little function.. If we ever figure it out, consider submitting it to http://www.research.att.com/~njas/sequences/index.html

Here's a few values to work on.. (hopefully I haven't made any mistakes in generating it)

------------
1 1
2 1
3 1
4 2
5 2
6 2
7 2
8 2
9 3
10 2
11 3
12 3
13 3
14 3
15 3
16 4
17 3
18 3
19 4
20 4
21 3
22 4
23 4
24 4
25 5
------------

------------
26 4
27 4
28 4
29 5
30 5
31 4
32 4
33 5
34 5
35 5
36 6
37 5
38 5
39 5
40 5
41 6
42 6
43 5
44 5
45 5
46 6
47 6
48 6
49 7
50 5
------------

------------
51 6
52 6
53 6
54 6
55 7
56 7
57 6
58 6
59 6
60 6
61 7
62 7
63 7
64 8
65 6
66 6
67 7
68 7
69 7
70 7
71 8
72 8
73 7
74 7
75 7
------------

------------
76 7
77 7
78 8
79 8
80 8
81 9
82 7
83 7
84 7
85 8
86 8
87 8
88 8
89 9
90 9
91 7
92 8
93 8
94 8
95 8
96 8
97 9
98 9
99 9
100 10
------------

------------
101 8
102 8
103 8
104 8
105 9
106 9
107 9
108 9
109 10
110 10
111 8
112 8
113 9
114 9
115 9
116 9
117 9
118 10
119 10
120 10
121 11
122 9
123 9
124 9
125 9
------------

------------
126 9
127 10
128 10
129 10
130 10
131 11
132 11
133 9
134 9
135 9
136 10
137 10
138 10
139 10
140 10
141 11
142 11
143 11
144 12
145 10
146 10
147 10
148 10
149 10
150 10
------------

------------
151 11
152 11
153 11
154 11
155 12
156 12
157 10
158 10
159 10
160 10
161 11
162 11
163 11
164 11
165 11
166 12
167 12
168 12
169 13
170 10
171 11
172 11
173 11
174 11
175 11
------------

------------
176 11
177 12
178 12
179 12
180 12
181 13
182 13
183 11
184 11
185 11
186 11
187 11
188 12
189 12
190 12
191 12
192 12
193 13
194 13
195 13
196 14
197 11
198 11
199 12
200 12
------------

Title: Re: Industrial/Distributed Computing
Post by k2xl on Feb 18^th, 2008, 1:21pm

What exactly is this sequence towr? Because the minimum increases or stays the same as N increases

I have some spreadsheets, I can post them if anyone is interested.

One thing I remember is that there is NEVER a case in which you should use 9 calculators... 100 uses 10, 101 uses 8! Isn't that interesting?

Title: Re: Industrial/Distributed Computing
Post by towr on Feb 18^th, 2008, 1:32pm

on 02/18/08 at 13:21:46, k2xl wrote:

What exactly is this sequence towr? Because the minimum increases or stays the same as N increases

This is a sequence of what (smallest) C give the minimum time; that's what we want to find, isn't it?

Quote:

I have some spreadsheets, I can post them if anyone is interested.

It can't hurt.

Quote:

One thing I remember is that there is NEVER a case in which you should use 9 calculators... 100 uses 10, 101 uses 8! Isn't that interesting?

The first one where you should use 9 calculators is 81..
For every N=K², you should use K calculators..

Title: Re: Industrial/Distributed Computing
Post by k2xl on Feb 18^th, 2008, 1:44pm

Ah right.
Btw, how did you generate your list?

Title: Re: Industrial/Distributed Computing
Post by towr on Feb 18^th, 2008, 1:50pm

on 02/18/08 at 13:44:53, k2xl wrote:

Ah right.
Btw, how did you generate your list?

brute force javascript

Code:

javascript:
for(i=1; i<200; i++)
{
m1=200;
m2=0;
for(j=1; j<200; j++)
{
t = Math.ceil(i/j)+j-2;
if(t < m1)
{
m1=t;
m2=j;
}
}
document.write(i+" "+m2+"<br>");
}

(If you put it on one line, it'll run in your browser, or if you have linux and firefox you can paste it as it is.)

Title: Re: Industrial/Distributed Computing
Post by k2xl on Feb 18^th, 2008, 1:57pm

Awesome, okay. So now just to find a formula :-)

Title: Re: Industrial/Distributed Computing
Post by Icarus on Feb 18^th, 2008, 3:30pm

Minimizing http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/lceil.gifN/C - 1http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/rceil.gif + (C-1) might be tough enough, but I don't think this exactly describes the situation: Suppose that you have 200 numbers to add up and 100 calculators. You can distribute the 200 numbers, get your 100 results, but it would be ridiculous at this point to use a single calculator to add up the 100 numbers. Instead, you distribute the results to 50 calculators, then those results to 25, etc. So you should be minimizing

http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/lceil.gifN/C - 1http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/rceil.gif + http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/lceil.giflog₂ C http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/rceil.gif

I think.

Title: Re: Industrial/Distributed Computing
Post by Eigenray on Feb 18^th, 2008, 5:13pm

It's not too hard to minimize just http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/lceil.gifn/chttp://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/rceil.gif + c:

Let f(x) = n/x + x, F(x) = http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/lceil.giff(x)http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/rceil.gif. Since f(x) is minimized at x=http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/surd.gifn, it follows that the minimum value of F(x) is either F(http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/lfloor.gifhttp://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/surd.gifnhttp://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/rfloor.gif) or F(http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/lceil.gifhttp://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/surd.gifnhttp://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/rceil.gif). If n = k², then it's easy to see F(k) = 2k, while F(k-1) > 2k, so C=k.

If n is not a square, write n = k² + r, 0 < r http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/le.gif 2k. Then

2k < f(k+1) = (k²+r)/(k+1) + (k+1) http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/le.gif 2k+2,

so F(k+1) is 2k+1 when r http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/le.gif k, and 2k+2 when r > k. Similarly

2k < f(k) = (k²+r)/k + k http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/le.gif 2k+2,

and so in fact F(k) = F(k+1). It follows that F(x) http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/ge.gif F(k) for all x, and the question is simply: what is the smallest x for which F(x) = F(k)?

If r http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/le.gif k, then F(x) = F(k) when f(x) = n/x + x http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/le.gif 2k+1, or

k+1/2 - http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/surd.gif{k-r+1/4} http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/le.gif x http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/le.gif k+1/2 + http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/surd.gif{k-r+1/4},

so the smallest such integer x is C = http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/lceil.gifk+1/2 - http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/surd.gif{k-r+1/4}http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/rceil.gif.

Similarly, if r > k, then F(x) = F(k) when n/x + x http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/le.gif 2k+2, and the smallest such integer x is C = http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/lceil.gifk+1-http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/surd.gif{2k-r+1]http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/rceil.gif.

However, it looks like what you really want to do is recursively let F(n) = min{ http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/lceil.gifn/c - 1http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/rceil.gif+ F(c) : c < n }?

Title: Re: Industrial/Distributed Computing
Post by k2xl on Feb 18^th, 2008, 7:15pm

on 02/18/08 at 15:30:39, Icarus wrote:

Please see the note. The calculators are independent of each other. This system doesn't follow this scheme. What your suggesting is a classic distributed computing model. But many models do not function in this way.

The problem is basically minimizing time and then number of calculators.

Title: Re: Industrial/Distributed Computing
Post by Icarus on Feb 18^th, 2008, 10:54pm

I don't follow this. If the calculators cannot pass their results, then they cannot cooperatively sum the numbers at all, because the result of one cannot be passed to another. So in the end you are left with independent calculators each with the sum of their allotted portion, and no means to collect the totals together to find the grand total.

I suppose you can demand that only one calculator can sum up the individual totals, but the only reason I could see you doing it this way in the real world is to free up the other calculators for different tasks, while one finishes off this task (whose output is not needed by other processes until the sole remaining calculator has time to finish the job).

Title: Re: Industrial/Distributed Computing
Post by Eigenray on Feb 19^th, 2008, 12:47am

on 02/18/08 at 17:13:27, Eigenray wrote:

However, it looks like what you really want to do is recursively let F(n) = min{ http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/lceil.gifn/c - 1http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/rceil.gif + F(c) : c < n }?

Actually, this works out to F(n) = http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/lceil.giflg nhttp://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/rceil.gif, the same as Icarus's suggestion: http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/lceil.gifn/c-1http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/rceil.gif + http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/lceil.giflg chttp://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/rceil.gif is minimized at c=http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/lceil.gifn/2http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/rceil.gif or c=n (among possibly other values, but the presence of two separate ceilings makes it a lot trickier to analyze).

Title: Re: Industrial/Distributed Computing
Post by towr on Feb 19^th, 2008, 1:16am

on 02/18/08 at 22:54:51, Icarus wrote:

I suppose you can demand that only one calculator can sum up the individual totals, but the only reason I could see you doing it this way in the real world is to free up the other calculators for different tasks, while one finishes off this task (whose output is not needed by other processes until the sole remaining calculator has time to finish the job).

Typically, in distributed computing (afaik) you have one node distributing the task and collecting the results. Redistributing the intermediary results for further work is often not worth the bother (Of course this doesn't fit the current model either, because here distributing tasks seems to be free)

Title: Re: Industrial/Distributed Computing
Post by Grimbal on Feb 19^th, 2008, 1:20am

on 02/18/08 at 11:42:28, k2xl wrote:

I don't understand. With 3 calculators you need only 4 seconds, not 5. While the 3rd calculator does its 3rd sum, the first and 2nd calculator can combine their sums.

Title: Re: Industrial/Distributed Computing
Post by k2xl on Feb 19^th, 2008, 5:20am

on 02/19/08 at 01:20:57, Grimbal wrote:

I don't understand. With 3 calculators you need only 4 seconds, not 5. While the 3rd calculator does its 3rd sum, the first and 2nd calculator can combine their sums.

Please see the note.

This model is simple.

Think of it like this:
You are a central server. You distribute data to nodes. You wait until all the nodes return you results and then you compile the results.

Title: Re: Industrial/Distributed Computing
Post by rmsgrey on Feb 19^th, 2008, 5:21am

on 02/19/08 at 01:20:57, Grimbal wrote:

I don't understand. With 3 calculators you need only 4 seconds, not 5. While the 3rd calculator does its 3rd sum, the first and 2nd calculator can combine their sums.

That's ruled out by a condition stated near the end of the original post - that all calculators need to complete their initial batch of calculations before the final summation can take place...

Title: Re: Industrial/Distributed Computing
Post by Grimbal on Feb 19^th, 2008, 6:06am

Sorry, it was indeed very clear in the initial post.

Note to myself: always read the fine print at the end of the page.

Title: Re: Industrial/Distributed Computing
Post by Icarus on Feb 19^th, 2008, 3:51pm

on 02/19/08 at 01:16:58, towr wrote:

That depends on how much work is left to do. It seems to me that many tasks require intermediary results be shared even though the bulk of the computing remains. Collecting and redistributing should be a common enough approach. Only when you are near the end of the calculation does it make sense to leave the rest to the controller. Certainly here, if C is large enough, significant time can be saved by redistributing.

I assumed that the use of summing as the task was merely because it makes it easy to understand the process. In a more realistic application, the task involved would be considerably more complex, which is why by comparison, distributing tasks is effectively free (just like you don't consider adding and subtracting when counting flops). In that case, correlating a large number of sub-processor results can still take significant time if performed by only one processor. Unless there is a great need to free up the slave processors for other tasks, it makes sense to use as many as necessary to speed up the final calculation.

But this is k2xl's question, so he controls what the task is. I'll give over my grouching.

Title: Re: Industrial/Distributed Computing
Post by towr on Feb 20^th, 2008, 12:58am

on 02/19/08 at 15:51:09, Icarus wrote:

I assumed that the use of summing as the task was merely because it makes it easy to understand the process. In a more realistic application, the task involved would be considerably more complex, which is why by comparison, distributing tasks is effectively free (just like you don't consider adding and subtracting when counting flops).

Well yes, it wouldn't make sense to distribute tasks unless it's effectively free (or at least cheaper than not distributing it). But the delegated tasks have to be pretty big for this to be the case, because communication is several orders of magnitude more costly than memory access.
I suppose the biggest issue is whether the task remains the same size (e.g. weather simulation), or gets smaller over time (adding, or other 'data aggregation'(?)). In the latter case communication very quickly becomes more costly compared to the gain from distributing.
Most of the examples I've dealt with are of the latter type, where you have to work through huge amounts of data and join the results together.

Title: Re: Industrial/Distributed Computing
Post by k2xl on Feb 29^th, 2008, 12:26pm

Bumping to see if anyone wants to give another shot :-)

Title: Re: Industrial/Distributed Computing
Post by k2xl on Feb 29^th, 2008, 12:45pm

Btw, here's a spreadsheet (from data points 1-200 as posted in an earlier post)

http://www.mediafire.com/?nexjsg99ho4

Title: Re: Industrial/Distributed Computing
Post by Eigenray on Feb 29^th, 2008, 1:26pm

on 02/29/08 at 12:26:29, k2xl wrote:

Bumping to see if anyone wants to give another shot :-)

What was wrong with my first one?