Author |
Topic: Comparing sudokus (Read 636 times) |
|
Grimbal
wu::riddles Moderator Uberpuzzler
Gender:
Posts: 7527
|
|
Comparing sudokus
« on: Aug 30th, 2007, 5:35am » |
Quote Modify
|
I'll assume you all know what a sudoku is. And if you want the short story, read the last 2 paragraphs. We have a couple of free newspapers around my place, and like almost every newspaper, they have a daily sudoku. I once was doing a difficult sudoku, and at some point needed to resort to some non-trivial reasoning to continue. But I noticed that I did a very similar reasoning in the sudoku of the other newspaper of the same day. I investigated and found that the 2 sudokus were actually the same, with just a permutation of rows, columns and digits. Further investigation revealed that all difficult sudokus in 3 newspapers were permutations of a single template! The medium sudokus had some 6 templates, for the easy ones I found 2 template among 3 sudokus. For that investigation I recorded the sudokus from these newspapers and wrote a program to compare them agains the templates I identified. Some sudokus did not match, but it appeared that it was because of typos (by the newspaper). One time it was a digit in the wrong place and once there were 2 digits missing. So I needed to compare not the equivalence of 2 sudokus, but some kind of distance between 2 sudokus. This was to be able to match a sudoku against a template even if there are a few typos. The comparator for equality could rely on the fact that a row still has the same number of digits after permutation, and on the fact that once a digit has been matched, all other digits must match in the same way. That was speeding things up. But that is not possible any more if we want to allow for errors. So, the question is: given 2 sudokus, how to best compute the minimum distance between them? The difference between 2 sudokus is defined as the minimum number of digits that are different over all possible transformations of one of them by a permutation of the rows (compatible with sudoku rules), a permutation of the columns, a permutation of the digits and possibly a transposition (exchange of rows and columns). A brute-force approach requires 64·64·9!·2 = 1.2e12 comparisons of 2 grids, which is a bit slow if you have many sudokus to compare. Is there a faster way? I have some ideas, but I'd like to see yours before.
|
« Last Edit: Aug 30th, 2007, 10:02am by Grimbal » |
IP Logged |
|
|
|
mikedagr8
Uberpuzzler
A rich man is one who is content; not wealthy.
Gender:
Posts: 1105
|
|
Re: Comparing sudokus
« Reply #1 on: Aug 30th, 2007, 5:43am » |
Quote Modify
|
Wouldn't some be just rotations of others? This would narrow it down, unless you already calculated this. I'm off to bed now. That's a nice problem though. I'll discuss with my maths teacher to see what he says. If he has the time though.
|
« Last Edit: Aug 30th, 2007, 5:43am by mikedagr8 » |
IP Logged |
"It's not that I'm correct, it's that you're just not correct, and so; I am right." - M.P.E.
|
|
|
Barukh
Uberpuzzler
Gender:
Posts: 2276
|
|
Re: Comparing sudokus
« Reply #2 on: Aug 30th, 2007, 6:05am » |
Quote Modify
|
Grimbal, could you send two instances of the same template you encountered?
|
|
IP Logged |
|
|
|
Grimbal
wu::riddles Moderator Uberpuzzler
Gender:
Posts: 7527
|
|
Re: Comparing sudokus
« Reply #3 on: Aug 30th, 2007, 6:12am » |
Quote Modify
|
Tonight.
|
|
IP Logged |
|
|
|
Grimbal
wu::riddles Moderator Uberpuzzler
Gender:
Posts: 7527
|
|
Re: Comparing sudokus
« Reply #4 on: Aug 30th, 2007, 6:17am » |
Quote Modify
|
on Aug 30th, 2007, 5:43am, mikedagr8 wrote:Wouldn't some be just rotations of others? This would narrow it down, unless you already calculated this. I'm off to bed now. That's a nice problem though. I'll discuss with my maths teacher to see what he says. If he has the time though. |
| Yes, this is one special case of the transformations I consider. But It still needs to be checked.
|
|
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Comparing sudokus
« Reply #5 on: Aug 30th, 2007, 6:47am » |
Quote Modify
|
Are we talking about completely filled in sudokus? It seems it should be possible to do some sort of normalizing process. Although exchanging digits is a bit of a problem (transforming the sudoku so you always have a 1 in the top-left doesn't make sense if you can just exchange two digits throughout)
|
« Last Edit: Aug 30th, 2007, 6:53am by towr » |
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
pex
Uberpuzzler
Gender:
Posts: 880
|
|
Re: Comparing sudokus
« Reply #6 on: Aug 30th, 2007, 6:54am » |
Quote Modify
|
on Aug 30th, 2007, 6:47am, towr wrote:transforming the sudoku so you always have a 1 in the top-left doesn't make sense if you can just exchange two digits throughout |
| But you could transform it so that the top row is 123456789.
|
|
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Comparing sudokus
« Reply #7 on: Aug 30th, 2007, 7:04am » |
Quote Modify
|
on Aug 30th, 2007, 6:54am, pex wrote:But you could transform it so that the top row is 123456789. |
| Yes, but you can make any of the rows the top one, exchange digits throughout to get 123456789; and you can still do it after permuting columns and after rotation. So that's a lot of possible normal forms starting from just one sudoku. (You could pick whichever is most ordered, but then you'd still have to create them all.)
|
|
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
TenaliRaman
Uberpuzzler
I am no special. I am only passionately curious.
Gender:
Posts: 1001
|
|
Re: Comparing sudokus
« Reply #8 on: Aug 30th, 2007, 7:12am » |
Quote Modify
|
An idea. Let X be a sudoku grid and let T1, T2, T3, T4 be the set of transformations. Consider all the possible sudoku grids that can be obtained by applying the transformations. We can graph them. X1 is connected to X2 in this graph, if there is a transformation Ti such that Ti(X1) = X2. Now, if we call X as the start node and Y as the goal node, then we should be able to do a bit of modified A* search and obtain the desired min distance (err well, atleast i think we should be able to obtain optimal). -- AI
|
|
IP Logged |
Self discovery comes when a man measures himself against an obstacle - Antoine de Saint Exupery
|
|
|
Barukh
Uberpuzzler
Gender:
Posts: 2276
|
|
Re: Comparing sudokus
« Reply #9 on: Aug 30th, 2007, 8:31am » |
Quote Modify
|
IMHO, first of all, we need to define all the "equivalence" transformations. Rotations and digit permutations are easy. Row/column/block permutations are a bit trickier. Can we come with an exact formulation that covers all such transformations? I didn't understand the "translation" part.
|
|
IP Logged |
|
|
|
SMQ
wu::riddles Moderator Uberpuzzler
Gender:
Posts: 2084
|
|
Re: Comparing sudokus
« Reply #10 on: Aug 30th, 2007, 8:46am » |
Quote Modify
|
on Aug 30th, 2007, 8:31am, Barukh wrote:Row/column/block permutations are a bit trickier. Can we come with an exact formulation that covers all such transformations? |
| - Rows 1, 2, and 3 may be freely permuted with one another. Likewise rows 4, 5, and 6 and rows 7, 8 and 9. Likewise the correspondingly-numbered columns. - The three groups of: rows 1, 2, and 3; rows 4, 5, and 6; and rows 7, 8, and 9 may be freely permuted with one another. Likewise the correspondingly numbered columns. I believe that in combination those rules cover all row and column permutations which preserve the sudoku, and that all legal block permutations can be expressed as row group and/or column group permutations. --SMQ
|
|
IP Logged |
--SMQ
|
|
|
Grimbal
wu::riddles Moderator Uberpuzzler
Gender:
Posts: 7527
|
|
Re: Comparing sudokus
« Reply #11 on: Aug 30th, 2007, 10:01am » |
Quote Modify
|
on Aug 30th, 2007, 6:47am, towr wrote:Are we talking about completely filled in sudokus? |
| In my case it is non-filled sudokus (i.e. as published in a newspaper). This helps for an exact match, because you need to match rows/columns only where they have the right number of cells filled. on Aug 30th, 2007, 7:12am, TenaliRaman wrote:Let X be a sudoku grid and let T1, T2, T3, T4 be the set of transformations. Consider all the possible sudoku grids that can be obtained by applying the transformations. We can graph them. X1 is connected to X2 in this graph, if there is a transformation Ti such that Ti(X1) = X2. |
| I have to look into it. But I don't feel it is a graph problem. The set of transformations is quite orthogonal. on Aug 30th, 2007, 8:31am, Barukh wrote:I didn't understand the "translation" part. |
| Sorry, I meant transposition. on Aug 30th, 2007, 8:31am, Barukh wrote:Row/column/block permutations are a bit trickier. Can we come with an exact formulation that covers all such transformations? |
| I think row permutations only need to keep the groups of 3 rows together. I.e. you make a permutation of the groups and a permutation of the rows within the groups. Same for the columns. PS: as SMQ proposes
|
|
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Comparing sudokus
« Reply #12 on: Aug 30th, 2007, 10:34am » |
Quote Modify
|
on Aug 30th, 2007, 10:01am, Grimbal wrote: In my case it is non-filled sudokus (i.e. as published in a newspaper). This helps for an exact match, because you need to match rows/columns only where they have the right number of cells filled. |
| Ah, that improves things a lot. To normalize the sudoku you could try to pick a block with a (least) unique number of filled cells for a start; there won't always be one, but perhaps often there is. And you could do something with how many different numbers there are in a group of three rows/columns. I'll give it some more thought; there's a lot more to work with when most of it is empty..
|
|
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
Grimbal
wu::riddles Moderator Uberpuzzler
Gender:
Posts: 7527
|
|
Re: Comparing sudokus
« Reply #13 on: Aug 30th, 2007, 1:41pm » |
Quote Modify
|
on Aug 30th, 2007, 6:05am, Barukh wrote:Grimbal, could you send two instances of the same template you encountered? |
| These 3 are equivalent and from 3 different newspapers. Le Matin bleu, Mercredi 22/8/2007 Superieur ..4.53... .9.6.72.1 .......4. ....8.... 2.....96. .5.7.64.3 6...9.1.. ..3..2.7. 5.8...... 20 minutes, Mercredi 22/8/2007 Superieur .27.9..86 4.1..3... ......3.. 9..2....8 ..6..17.. ...4.5... ...6..2.9 5........ .72.4..13 La Cote, Jeudi 23/8/2007 Difficile ..4..798. .8.2..1.. ..7....3. 8..35.... ....614.. ..9....6. ..1....2. 6....9..5 3.24...1.
|
|
IP Logged |
|
|
|
Grimbal
wu::riddles Moderator Uberpuzzler
Gender:
Posts: 7527
|
|
Re: Comparing sudokus
« Reply #14 on: Aug 30th, 2007, 2:26pm » |
Quote Modify
|
on Aug 30th, 2007, 10:34am, towr wrote: Ah, that improves things a lot. To normalize the sudoku you could try to pick a block with a (least) unique number of filled cells for a start; there won't always be one, but perhaps often there is. And you could do something with how many different numbers there are in a group of three rows/columns. |
| Yes, for an exact match, it helps a lot. The problem is when you want to consider the possibility of errors. You cannot just match for instance the 2 single 2-cell rows, because one of them could be a 3-cell row with a cell missing.
|
|
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Comparing sudokus
« Reply #15 on: Aug 30th, 2007, 2:39pm » |
Quote Modify
|
One place to start, counting all the number. In your examples you'd get 233334323 343323323 433323233 So the 6, 2 and 1 from the three should respectively match up, as the count of 4 is unique. (And of course the each count should occur equally often in each case, which it does) You can do it similarly for groups of rows/columns (with rotation you have some extra choice, rows in one case can be matched with column in another; a global check first can weed out some non-matches)
|
|
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Comparing sudokus
« Reply #16 on: Aug 30th, 2007, 2:41pm » |
Quote Modify
|
on Aug 30th, 2007, 2:26pm, Grimbal wrote:Yes, for an exact match, it helps a lot. The problem is when you want to consider the possibility of errors. You cannot just match for instance the 2 single 2-cell rows, because one of them could be a 3-cell row with a cell missing. |
| Yes, 'errors' is a bit of a problem. Things may have been added as well as left out.
|
|
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
Grimbal
wu::riddles Moderator Uberpuzzler
Gender:
Posts: 7527
|
|
Re: Comparing sudokus
« Reply #17 on: Aug 31st, 2007, 5:20am » |
Quote Modify
|
The best I did so far for the case with errors is to try all permutations of rows and columns on one sudoku. For a permutation I count the frequency of each pair of digits, i.e. for each pair (a,b) of digits, (in range 0..9 to take care of missing digits), I count how many time digit a is found in the first grid where digit b in the second. From there, I search the permutation of digits that get the best score. The score is calculated as freq[0,0]+sum(freq[d,p(d)]) where the sum is over all digits 1..9, freq is the frequency table for digit pairs, p is a permutation of digits 1..9. The distance is 81-score. Finding the best digits permutation seems to be an "assignment problem". Find a mapping of the digits that maximizes the matches observed. I haven't tried implementing that yet. For time being, I try all permutations. Second, I was wondering if the assignment can be used on the whole problem. To find how to assign not only the digits, but also the rows and columns. About the matching of a filled sudoku, it seems it is not that hard. For each permutation of the columns, try to match the first row of the transformed sudoku to each of the rows of the template. The matching specifies the digit permutation. Once you have the digit permutation, you can match all other rows using the leftmost cell. That makes 1296 column permutations times 9 rows to try.
|
« Last Edit: Aug 31st, 2007, 5:29am by Grimbal » |
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Comparing sudokus
« Reply #18 on: Sep 2nd, 2007, 7:36am » |
Quote Modify
|
It might be fruitful to first (partially) solve the sudoku.
|
|
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
Grimbal
wu::riddles Moderator Uberpuzzler
Gender:
Posts: 7527
|
|
Re: Comparing sudokus
« Reply #19 on: Sep 2nd, 2007, 3:23pm » |
Quote Modify
|
I thought of it, but I decided against it. The point with allowing for errors is that the newspaper or myself can have made mistakes when copying it. If I started solving a wrongly stated sudoku, with a wrong digit somewhere, I would probaly add more errors.
|
|
IP Logged |
|
|
|
Sameer
Uberpuzzler
Pie = pi * e
Gender:
Posts: 1261
|
|
Re: Comparing sudokus
« Reply #20 on: Sep 2nd, 2007, 3:30pm » |
Quote Modify
|
How do you determine if there is an error? I am not much of a sudoku player, but each digit has a correlation on a row, a column and a 3x3 square. That means it is encoded by three bits of information. If you were to solve the trivial spaces, would you be able to find the conflicting number or detect an error? If you can, you can always mark that "erroneous" space as suspect and treat it such way when doing the permutation comparison? Can't you? Might need more analysis or maybe some math to figure out the error rate!!
|
« Last Edit: Sep 2nd, 2007, 3:31pm by Sameer » |
IP Logged |
"Obvious" is the most dangerous word in mathematics. --Bell, Eric Temple
Proof is an idol before which the mathematician tortures himself. Sir Arthur Eddington, quoted in Bridges to Infinity
|
|
|
|