|
||||
Title: Pandigital expressions Post by Leonid Broukhis on Sep 17th, 2007, 9:29pm What is the min positive integer that cannot be represented as an expression that uses only digits 1 to 9 - each one exactly once in any order, four arithmetic operations (+, -, *, /) and parentheses? Concatenating digits is not allowed. E.g. 1 = (3+5+7-2-9)/8*(6-4)*1 10 = 3*5*1+6-9+8+4-7*2 100 = 4*1+(2+7)*8+6+9*(5-3) 1000 = 8*(3+7)*(6+9-2*5/4/1) 10000 = (7*(((2+3)*9)*(5-1))-6-4) * 8 |
||||
Title: Re: Pandigital expressions Post by Grimbal on Sep 18th, 2007, 8:27am I wouldn't try that without a computer. |
||||
Title: Re: Pandigital expressions Post by SMQ on Sep 18th, 2007, 8:33am I'm working on trying it with a computer, but there are a huge number (tens of trillions) of legal expressions, with likely multiple billions of unique --SMQ |
||||
Title: Re: Pandigital expressions Post by Grimbal on Sep 18th, 2007, 9:25am on 09/18/07 at 08:33:23, SMQ wrote:
Or maybe 9!*3/2. ;) |
||||
Title: Re: Pandigital expressions Post by Leonid Broukhis on Sep 18th, 2007, 10:04am Quote:
Exactly! SMQ: I have not started writing such a program, but the base data structure should be the bitmask of used digits, the value, top arith op, and two links to the operands. This way all equivalences are automatically avoided. |
||||
Title: Re: Pandigital expressions Post by SMQ on Sep 18th, 2007, 10:19am Actually, after some thought, I think I'm going to use common subexpression elimination "in reverse." Two subexpressions are unique iff they have different values or have the same values but utilize different terms. Given all all unique subexpressions of lengths 1 through n-1 it is fairly straightforward to calculate all subexpressions of length n, and by eliminating duplicate values early in the process the number of calculations is greatly reduced. Together with a few other tricks, I think that can be made to run in a reasonable amount of time. --SMQ |
||||
Title: Re: Pandigital expressions Post by Barukh on Sep 18th, 2007, 10:46am I've got a program (written for another thread long time ago). It's highly non-optimized - takes a long time to run! :( |
||||
Title: Re: Pandigital expressions Post by Leonid Broukhis on Sep 18th, 2007, 11:11am Another interesting question: what is the expected value of such an expression (assuming all 4 arith ops are equiprobable, and omitting division by zero). |
||||
Title: Re: Pandigital expressions Post by Barukh on Sep 19th, 2007, 2:18am Just for the record: Digits Min non-rep ----------------------------- 1-5 76 1-6 284 1-7 1413 |
||||
Title: Re: Pandigital expressions Post by Leonid Broukhis on Sep 19th, 2007, 2:28am Thank you. http://www.research.att.com/~njas/sequences/A060315 |
||||
Title: Re: Pandigital expressions Post by Grimbal on Sep 19th, 2007, 2:37am It is not exactly the same. The Integer Sequence Encyclopedia includes zero, which could be used to cancel some terms. It doesn't seem to affect the result much, though. |
||||
Title: Re: Pandigital expressions Post by SMQ on Sep 19th, 2007, 6:16am Also, the Sloanes sequence doesn't require you to use all the digits (see the example there). My program looks like it'll be reasonably fast, but it runs out of memory, so I have to rewrite parts of it to be able to store intermediate results to disk... --SMQ |
||||
Title: Re: Pandigital expressions Post by Barukh on Sep 19th, 2007, 6:51am SMQ, I think that restriction won't change the final result. But computing the result in a reasonable amount of time is an interesting and non-trivial task in its own right. My naive program would compute it in more than 27 years on a single machine... :o |
||||
Title: Re: Pandigital expressions Post by SMQ on Sep 19th, 2007, 2:44pm I can now confirm that the smallest positive integer which cannot be represented as a pandigital expression is 38103 (matching the sequence from Sloanes). 8) My program takes a bit over 1/2 hour (on a 3.2 GHz Pentium 4) to generate all unique integer values of pandigital expressions (there are 205901 of them). I'll work on the expected value question tomorrow. ;) --SMQ |
||||
Title: Re: Pandigital expressions Post by Leonid Broukhis on Sep 19th, 2007, 3:16pm SMQ: Disallow division, and things should be much faster :) This will let you answer the question what is the min number that really needs division, and how many of them out of 205901 do. |
||||
Title: Re: Pandigital expressions Post by Barukh on Sep 20th, 2007, 6:49am Very impressive, SMQ! I would like to learn about your program more than you outlined in one the previous posts. |
||||
Title: Re: Pandigital expressions Post by Hippo on Sep 20th, 2007, 7:02am Hmm, so far I count the number of expressions as 9! 48(C714-C614). Which is of order around 1013. So you must optimise the enumeration as you have limited time... |
||||
Title: Re: Pandigital expressions Post by SMQ on Sep 20th, 2007, 7:30am on 09/20/07 at 06:49:36, Barukh wrote:
Thanks. :) The primary data structure is a 512-element array of sets of rationals in lowest terms (one set of rationals for each set of digits used). After populating the trivial sets for single-digit subexpressions, the main loop is as follows: For each n from 2 through 9 For each set S of n digits For each pair of (nonempty) sets of digits A, B, A http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/cup.gif B = S, A http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/cap.gif B = http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/varnothing.gif For each rational a in the set of rationals associated with A For each rational b in the set of rationals associated with B Add (iff not already present) to the set of rationals associated with S, each of: a + b, a - b, b - a, a * b, a / b (iff b http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/ne.gif 0), and b / a (iff a http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/ne.gif 0). By keeping the sets of rationals in lowest terms, duplicate subexpression values are eliminated early in the process, bringing the number of calculations down by a factor of several thousand. To reach the 1/2 hour timeframe I additionally only consider integer results for n = 9, rather than deriving the entire set of unique rational results. The rest of the details are in caching large sets of rationals to disk in order to keep the RAM requirements reasonable for my platform (under 1GB). --SMQ |
||||
Title: Re: Pandigital expressions Post by SMQ on Sep 20th, 2007, 7:39am on 09/20/07 at 07:02:59, Hippo wrote:
You must be eliminating some obvious duplicates, as I get the number of lexicographically-distinct expressions as (17C8/17) http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/cdot.gif 9! http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/cdot.gif 48 = 16C8 http://www.ocf.berkeley.edu/~wwu/YaBBImages/symbols/cdot.gif 48. --SMQ |
||||
Title: Re: Pandigital expressions Post by Hippo on Sep 21st, 2007, 6:42am I enumerate all expresions in polish notation. The hardest question was to count positions where can the operators be placed. (The 14C7-14C6 term). BTW: I have thought about the algoritmn meanwhile ... I would use another set of 4 operations: +/- as one of them, *, / , \ the others. I would apply less complex expression to more complex therefore two kinds of division should be used /, \. I would not store rational results if quotient is not multiple of not used numbers. I would store results without sign with flag denoting if both signs are reachable. If the same result is obtained for the given set of numbers, the flags are ored. (*,/,\ or's the flag's too, +/- ... the flag rules are more complicated but easy ... it generates two results, one with flag set (-), the other is ored flag of the originals (+)) Rest will be simillar algorithm to yours, but I will generate expressions in order of growing number of used numbers. ... I haven't start codding yet (and I am not sure I will). |
||||
Title: Re: Pandigital expressions Post by Grimbal on Sep 21st, 2007, 7:09am I thought that if I were to write the program, I would restrict the expressions to something like expression = term ['+' term]* ['-' term]* term = factor ['*' factor]* ['/' factor]* factor = number | '(' expression ')' where the terms and factors are in increasing order within each operation. As a result, for instance, I would not try A+B if B is already a sum or if A is the result of a subtraction. This way, I would not try the same sum in all possible permutations. But I later realized that SMQ's method of pairing intermediate results in all possible ways and checking whether the value exists already for the same set of digits is just as efficient to prune these cases. The only difference is that the result needs to be actually computed. So in the end, I don't think it would be much faster. |
||||
Title: Re: Pandigital expressions Post by SMQ on Sep 21st, 2007, 7:45am on 09/21/07 at 06:42:25, Hippo wrote:
As did I, but I'm pretty sure that term should be Cat(8) (where Cat(n) is the nth Catalan Number) = 17C8/17 = 16C8 - 16C7 = 1430 rather than Cat(7) = 15C7/15 = 14C7 - 14C6 = 429. My program to enumerate the complete set of result values, along with their multiplicities and an example expression, looks like it should finish sometime tomorrow (the relative slowness due mainly to my inefficient algorithm to merge partial results in memory with the 3GB+ file of results on disk). Then I'll be able to answer the expected value question. ;) --SMQ |
||||
Title: Re: Pandigital expressions Post by Grimbal on Sep 21st, 2007, 8:28am I think that a mean value has no meaning unless you tell exactly what expression are considered as equal. For instance, do "1-(2-3)", "(1-2)+3" and (3+1)-2 count as different expressions? Co "(1+2)+3" and "1+(2+3)" count as the same? |
||||
Title: Re: Pandigital expressions Post by SMQ on Sep 21st, 2007, 8:46am I would consider all of those distinct expressions, some of which have the same value as others. My reasoning being that when written in postfix notation or graphed as an operator tree they would have distinct representations. That's the definition of multiplicity my currently-running program is using. ... or would be if I had remembered to multiply by 2 for addition and multiplication, gol durn it! >:( :-[ Now I'm going to have to start it over. Maybe I can improve on the merge algorithm while I'm at it. (That definition is useful in this case as the total number of distinct expressions is fairly easily calculated (although Hippo and I seem do disagree on which Catalan number to use) and can serve as a sanity check on the results of the program...) --SMQ |
||||
Title: Re: Pandigital expressions Post by Leonid Broukhis on Sep 21st, 2007, 9:05am on 09/21/07 at 06:42:25, Hippo wrote:
And that would be your mistake. Note, for example, that 21 = 6 / (1 - 5/7) |
||||
Title: Re: Pandigital expressions Post by Hippo on Sep 21st, 2007, 10:26am Leonid: Oops, thanks ... so this one optimisation does not work ... so I should stay just with +/- trick. SMQ: I was not sure ... is this Catalanyan number or not. ... I have solved the recurence. May be I have made a +/-1 mistake in indexes. I have heard about Catalanyan numbers just one small talk ... I have never realised so far that you obtain them from Pascall triangle "by subtracting its mirror image". Yes, you are right ... I am one number off. I would compute that 2C1-2C0=1 is the number of possibilities how to place two operators around last one operand (into 3 operands string). So nXk = n+k+1Ck+1 - n+k+1Ck is corrected expression for the number of possibilities how to place n operators around last k operands. OK, I had a bit of time so I made a code. First observation: To represent results for each set of operands separately is must. If you use one set and filter it, you spend majority of time filtering. Second observation: Not saving results of form a/b for eight operands when remains number c and neither b|c nor a|c can be helpful. For full set of operands just a/1 results are required. Even with these tricks there are more than 3 000 000 results for 1,...,8 operands. There can be one more trick applied ... as an aditive flag compresses representation of -a/b to a/b (and speeds up *,/,\ oprations by doing them in parallel). We can use a multiplicative flag to compress representatin of +/-b/a to +/-a/b. In that case the +/- operation will have at most 8 results, but operations *,/,\ can be replaced by one operation \*/ having just 2 results. This will parallelize the process even more. (There is small problem here ... the flag is not enough ... you need 3 states a/b means a/b in the first state, it means b/a in the second state and either a/b or b/a in the third state. This allows you to store a/b such that a<b) Oops ... how to represent 1/5 and +/-5 at the same time ... I should use distinct aditive flags. So far the \*/ operation has 2 results, but the rules for flags are more complicated. The +/- operation generates at most 2,4 or 8 results depending on multiplicative flags set (its results have to unset multiplicative flag). The speed-up by the parallelisation is notable, (and the size of the sets fits to my remaining HDD space now). OK: done :) ... 38103,38201,38618,... Now I have read the describtion of SMQ's algorithm in details ... I suppose you coded it in rather low level language. On my NOTEBOOK (1.6GHz, 512MB)it took around half a day time (But I didn't implement hash tables and I used indexed tables in FoxPro instead so the multiplicative "constant" is notable (for upto 5 operands I use cursors ... tables in memory). Just created table (corresponding to S) was indexed.). Filtering writes on 8th level saved a lot of space, too. Oops, the filtering trick was implemented wrongly and didn't filter at all. With the filtering the file size shrinks around 10 times. May be the trick with multiplicative and additive flags should be described better: I store items in a/b form where a<=b and LCD(a,b)=1. With each a/b I store 4 bits. bit 0 seys -b/a is reachable, bit 1 seys b/a is reachable, bit 2 seys -a/b is reachable and bit 3 seys a/b is reachable. When working with an item I first compare bits 32 with bits 10 and reorder them such that bits 32 are higher than bits 10. I swap a and b as well. Now There ar 5 possible bitmaps: 1111,1110,1100,1010 and 1000. I have precomputed matrix of corresponding bits for * and \*/ operations. (Yes you can use larger matrices and not to swap but this was easier for me.) ... instead of bits the counts should be used when counting results and the matrices holding four flags results cannot be easily adopted for counting purposes. You must multiply corresponding multicities and sum them together. So you loose the parallelisation in this case, the trick only saves space, and as the ratio of info/key size increases, the save is smaller. If eight bytes for a count are required, counting would consume 4 times much HDD space ... around 2GB. (And with the shrinking on 8th level ...) ... yes its feasible, but I am not sure I am interested in it. Wow ... I have now realised that the original computation would fit into memory ... but who can expected it. |
||||
Title: Re: Pandigital expressions Post by SMQ on Sep 26th, 2007, 8:53pm Nice work, Hippo! As for my own program, after fixing the multiplicity bug and overhauling the file merge code, I can now announce, after a run of only 4.5 hours: {1,2,3,4,5,6,7,8,9}: 115521109 unique values, ints to 38103 34,007,836,262,400 distinct expressions 33,854,394,991,592 rational expressions 153,441,270,808 divisions by 0 expected value: 151.409 :D --SMQ |
||||
Title: Re: Pandigital expressions Post by Leonid Broukhis on Sep 26th, 2007, 8:59pm SMQ, please run it again with disabled division! |
||||
Title: Re: Pandigital expressions Post by Hippo on Sep 27th, 2007, 3:11am on 09/26/07 at 20:53:25, SMQ wrote:
Thanks, the same to you SMQ. ... I would never do such stupid experiment if not excited by your work ;). Can you do something for me? Try to do the +/-,\*/ trick on your platform and compare the time to the original one (Not to the counting one). It should take much less then an half of an hour I expected. Usability of not saving the 8 opernad results when hashing and having sufficient amount of memory is discutable ... I had other thoughts about saveing memory ... to made a count how many times you will need the resulting set of 7 or 8 operands in the future and delete them as early as possible ... Do you hash keys or you find a fast injection to memory and you maintain a separate list of results? |
||||
Title: Re: Pandigital expressions Post by SMQ on Sep 27th, 2007, 6:06am on 09/26/07 at 20:59:17, Leonid Broukhis wrote:
{1,2,3,4,5,6,7,8,9} 205898 unique values, ints to 38103 3,404,623,622,400 distinct expressions expected value: 746.466 (in under 2 minutes) --SMQ |
||||
Title: Re: Pandigital expressions Post by SMQ on Sep 27th, 2007, 1:23pm on 09/27/07 at 03:11:41, Hippo wrote:
I'm not sure it's strictly comparable, as implementation details of my program have changed quite a bit since that first run, but it looks like about 9 minutes. Quote:
I hash them, based on a random number derived (by table lookup) from the numerator and denominator. Checks for equality are a fair bit quicker than checks for order (two comparisons and a boolean AND vs. two signed multiplies and a comparison), so I never bother to sort the intermediate results, just the final integers. --SMQ |
||||
Title: Re: Pandigital expressions Post by Hippo on Sep 27th, 2007, 2:24pm Of course, sorting of subresults while hashing is a nonsense. The list in question was not sorted ... if the lookup table is much larger than represented set the unsorted list can be helpful. Nice work SMQ, thanks. It seems to be closed for me now. |
||||
Powered by YaBB 1 Gold - SP 1.4! Forum software copyright © 2000-2004 Yet another Bulletin Board |