wu :: forums
« wu :: forums - Comparing sudokus »

Welcome, Guest. Please Login or Register.
Jan 7th, 2025, 11:03pm

RIDDLES SITE WRITE MATH! Home Home Help Help Search Search Members Members Login Login Register Register
   wu :: forums
   riddles
   cs
(Moderators: ThudnBlunder, Grimbal, william wu, SMQ, towr, Eigenray, Icarus)
   Comparing sudokus
« Previous topic | Next topic »
Pages: 1  Reply Reply Notify of replies Notify of replies Send Topic Send Topic Print Print
   Author  Topic: Comparing sudokus  (Read 636 times)
Grimbal
wu::riddles Moderator
Uberpuzzler
*****






   


Gender: male
Posts: 7527
Comparing sudokus  
« on: Aug 30th, 2007, 5:35am »
Quote Quote Modify Modify

I'll assume you all know what a sudoku is.  And if you want the short story, read the last 2 paragraphs.
 
We have a couple of free newspapers around my place, and like almost every newspaper, they have a daily sudoku.  I once was doing a difficult sudoku, and at some point needed to resort to some non-trivial reasoning to continue.  But I noticed that I did a very similar reasoning in the sudoku of the other newspaper of the same day.  I investigated and found that the 2 sudokus were actually the same, with just a permutation of rows, columns and digits.  Further investigation revealed that all difficult sudokus in 3 newspapers were permutations of a single template!  The medium sudokus had some 6 templates, for the easy ones I found 2 template among 3 sudokus.
 
For that investigation I recorded the sudokus from these newspapers and wrote a program to compare them agains the templates I identified.
 
Some sudokus did not match, but it appeared that it was because of typos (by the newspaper).  One time it was a digit in the wrong place and once there were 2 digits missing.
 
So I needed to compare not the equivalence of 2 sudokus, but some kind of distance between 2 sudokus.  This was to be able to match a sudoku against a template even if there are a few typos.
 
The comparator for equality could rely on the fact that a row still has the same number of digits after permutation, and on the fact that once a digit has been matched, all other digits must match in the same way.  That was speeding things up.  But that is not possible any more if we want to allow for errors.
 
So, the question is: given 2 sudokus, how to best compute the minimum distance between them?  The difference between 2 sudokus is defined as the minimum number of digits that are different over all possible transformations of one of them by a permutation of the rows (compatible with sudoku rules), a permutation of the columns, a permutation of the digits and possibly a transposition (exchange of rows and columns).
 
A brute-force approach requires 64·64·9!·2 = 1.2e12 comparisons of 2 grids, which is a bit slow if you have many sudokus to compare.  Is there a faster way?  I have some ideas, but I'd like to see yours before.
« Last Edit: Aug 30th, 2007, 10:02am by Grimbal » IP Logged
mikedagr8
Uberpuzzler
*****



A rich man is one who is content; not wealthy.

   


Gender: male
Posts: 1105
Re: Comparing sudokus  
« Reply #1 on: Aug 30th, 2007, 5:43am »
Quote Quote Modify Modify

Wouldn't some be just rotations of others? This would narrow it down, unless you already calculated this. I'm off to bed now. That's a nice problem though. I'll discuss with my maths teacher to see what he says. If he has the time though.
« Last Edit: Aug 30th, 2007, 5:43am by mikedagr8 » IP Logged

"It's not that I'm correct, it's that you're just not correct, and so; I am right." - M.P.E.
Barukh
Uberpuzzler
*****






   


Gender: male
Posts: 2276
Re: Comparing sudokus  
« Reply #2 on: Aug 30th, 2007, 6:05am »
Quote Quote Modify Modify

Grimbal, could you send two instances of the same template you encountered?
IP Logged
Grimbal
wu::riddles Moderator
Uberpuzzler
*****






   


Gender: male
Posts: 7527
Re: Comparing sudokus  
« Reply #3 on: Aug 30th, 2007, 6:12am »
Quote Quote Modify Modify

Tonight.
IP Logged
Grimbal
wu::riddles Moderator
Uberpuzzler
*****






   


Gender: male
Posts: 7527
Re: Comparing sudokus  
« Reply #4 on: Aug 30th, 2007, 6:17am »
Quote Quote Modify Modify

on Aug 30th, 2007, 5:43am, mikedagr8 wrote:
Wouldn't some be just rotations of others? This would narrow it down, unless you already calculated this. I'm off to bed now. That's a nice problem though. I'll discuss with my maths teacher to see what he says. If he has the time though.

Yes, this is one special case of the transformations I consider.  But It still needs to be checked.
IP Logged
towr
wu::riddles Moderator
Uberpuzzler
*****



Some people are average, some are just mean.

   


Gender: male
Posts: 13730
Re: Comparing sudokus  
« Reply #5 on: Aug 30th, 2007, 6:47am »
Quote Quote Modify Modify

Are we talking about completely filled in sudokus?
 
It seems it should be possible to do some sort of normalizing process. Although exchanging digits is a bit of a problem (transforming the sudoku so you always have a 1 in the top-left doesn't make sense if you can just exchange two digits throughout)
« Last Edit: Aug 30th, 2007, 6:53am by towr » IP Logged

Wikipedia, Google, Mathworld, Integer sequence DB
pex
Uberpuzzler
*****





   


Gender: male
Posts: 880
Re: Comparing sudokus  
« Reply #6 on: Aug 30th, 2007, 6:54am »
Quote Quote Modify Modify

on Aug 30th, 2007, 6:47am, towr wrote:
transforming the sudoku so you always have a 1 in the top-left doesn't make sense if you can just exchange two digits throughout

But you could transform it so that the top row is 123456789.
IP Logged
towr
wu::riddles Moderator
Uberpuzzler
*****



Some people are average, some are just mean.

   


Gender: male
Posts: 13730
Re: Comparing sudokus  
« Reply #7 on: Aug 30th, 2007, 7:04am »
Quote Quote Modify Modify

on Aug 30th, 2007, 6:54am, pex wrote:
But you could transform it so that the top row is 123456789.
Yes, but you can make any of the rows the top one, exchange digits throughout to get 123456789; and you can still do it after permuting columns and after rotation.
So that's a lot of possible normal forms starting from just one sudoku. (You could pick whichever is most ordered, but then you'd still have to create them all.)
IP Logged

Wikipedia, Google, Mathworld, Integer sequence DB
TenaliRaman
Uberpuzzler
*****



I am no special. I am only passionately curious.

   


Gender: male
Posts: 1001
Re: Comparing sudokus  
« Reply #8 on: Aug 30th, 2007, 7:12am »
Quote Quote Modify Modify

An idea.
 
Let X be a sudoku grid and let T1, T2, T3, T4 be the set of transformations. Consider all the possible sudoku grids that can be obtained by applying the transformations. We can graph them. X1 is connected to X2 in this graph, if there is a transformation Ti such that Ti(X1) = X2.
 
Now, if we call X as the start node and Y as the goal node, then we should be able to do a bit of modified A* search and obtain the desired min distance (err well, atleast i think we should be able to obtain optimal).
 
-- AI
IP Logged

Self discovery comes when a man measures himself against an obstacle - Antoine de Saint Exupery
Barukh
Uberpuzzler
*****






   


Gender: male
Posts: 2276
Re: Comparing sudokus  
« Reply #9 on: Aug 30th, 2007, 8:31am »
Quote Quote Modify Modify

IMHO, first of all, we need to define all the "equivalence" transformations.
 
Rotations and digit permutations are easy.
 
Row/column/block permutations are a bit trickier. Can we come with an exact formulation that covers all such transformations?
 
I didn't understand the "translation" part.
IP Logged
SMQ
wu::riddles Moderator
Uberpuzzler
*****






   


Gender: male
Posts: 2084
Re: Comparing sudokus  
« Reply #10 on: Aug 30th, 2007, 8:46am »
Quote Quote Modify Modify

on Aug 30th, 2007, 8:31am, Barukh wrote:
Row/column/block permutations are a bit trickier. Can we come with an exact formulation that covers all such transformations?

- Rows 1, 2, and 3 may be freely permuted with one another.  Likewise rows 4, 5, and 6 and rows 7, 8 and 9.  Likewise the correspondingly-numbered columns.
- The three groups of: rows 1, 2, and 3; rows 4, 5, and 6; and rows 7, 8, and 9 may be freely permuted with one another. Likewise the correspondingly numbered columns.
 
I believe that in combination those rules cover all row and column permutations which preserve the sudoku, and that all legal block permutations can be expressed as row group and/or column group permutations.
 
--SMQ
IP Logged

--SMQ

Grimbal
wu::riddles Moderator
Uberpuzzler
*****






   


Gender: male
Posts: 7527
Re: Comparing sudokus  
« Reply #11 on: Aug 30th, 2007, 10:01am »
Quote Quote Modify Modify

on Aug 30th, 2007, 6:47am, towr wrote:
Are we talking about completely filled in sudokus?

In my case it is non-filled sudokus (i.e. as published in a newspaper).  This helps for an exact match, because you need to match rows/columns only where they have the right number of cells filled.
 
on Aug 30th, 2007, 7:12am, TenaliRaman wrote:
Let X be a sudoku grid and let T1, T2, T3, T4 be the set of transformations. Consider all the possible sudoku grids that can be obtained by applying the transformations. We can graph them. X1 is connected to X2 in this graph, if there is a transformation Ti such that Ti(X1) = X2.

I have to look into it.  But I don't feel it is a graph problem.  The set of transformations is quite orthogonal.
 
on Aug 30th, 2007, 8:31am, Barukh wrote:
I didn't understand the "translation" part.

Sorry, I meant transposition.
 
on Aug 30th, 2007, 8:31am, Barukh wrote:
Row/column/block permutations are a bit trickier. Can we come with an exact formulation that covers all such transformations?

I think row permutations only need to keep the groups of 3 rows together.  I.e. you make a permutation of the groups and a permutation of the rows within the groups.  Same for the columns.
PS: as SMQ proposes
IP Logged
towr
wu::riddles Moderator
Uberpuzzler
*****



Some people are average, some are just mean.

   


Gender: male
Posts: 13730
Re: Comparing sudokus  
« Reply #12 on: Aug 30th, 2007, 10:34am »
Quote Quote Modify Modify

on Aug 30th, 2007, 10:01am, Grimbal wrote:

In my case it is non-filled sudokus (i.e. as published in a newspaper).  This helps for an exact match, because you need to match rows/columns only where they have the right number of cells filled.
Ah, that improves things a lot.
To normalize the sudoku you could try to pick a block with a (least) unique number of filled cells for a start; there won't always be one, but perhaps often there is.
And you could do something with how many different numbers there are in a group of three rows/columns.
 
I'll give it some more thought; there's a lot more to work with when most of it is empty..
IP Logged

Wikipedia, Google, Mathworld, Integer sequence DB
Grimbal
wu::riddles Moderator
Uberpuzzler
*****






   


Gender: male
Posts: 7527
Re: Comparing sudokus  
« Reply #13 on: Aug 30th, 2007, 1:41pm »
Quote Quote Modify Modify

on Aug 30th, 2007, 6:05am, Barukh wrote:
Grimbal, could you send two instances of the same template you encountered?

These 3 are equivalent and from 3 different newspapers.
Le Matin bleu, Mercredi 22/8/2007 
Superieur
..4.53...
.9.6.72.1
.......4.
....8....
2.....96.
.5.7.64.3
6...9.1..
..3..2.7.
5.8......
 
20 minutes, Mercredi 22/8/2007
Superieur
.27.9..86
4.1..3...
......3..
9..2....8
..6..17..
...4.5...
...6..2.9
5........
.72.4..13
 
La Cote, Jeudi 23/8/2007
Difficile
..4..798.
.8.2..1..
..7....3.
8..35....
....614..
..9....6.
..1....2.
6....9..5
3.24...1.
IP Logged
Grimbal
wu::riddles Moderator
Uberpuzzler
*****






   


Gender: male
Posts: 7527
Re: Comparing sudokus  
« Reply #14 on: Aug 30th, 2007, 2:26pm »
Quote Quote Modify Modify

on Aug 30th, 2007, 10:34am, towr wrote:

Ah, that improves things a lot.
To normalize the sudoku you could try to pick a block with a (least) unique number of filled cells for a start; there won't always be one, but perhaps often there is.
And you could do something with how many different numbers there are in a group of three rows/columns.

Yes, for an exact match, it helps a lot.  The problem is when you want to consider the possibility of errors.  You cannot just match for instance the 2 single 2-cell rows, because one of them could be a 3-cell row with a cell missing.
IP Logged
towr
wu::riddles Moderator
Uberpuzzler
*****



Some people are average, some are just mean.

   


Gender: male
Posts: 13730
Re: Comparing sudokus  
« Reply #15 on: Aug 30th, 2007, 2:39pm »
Quote Quote Modify Modify

One place to start, counting all the number.
In your examples you'd get
233334323
343323323
433323233
So the 6, 2 and 1 from the three should respectively match up, as the count of 4 is unique. (And of course the each count should occur equally often in each case, which it does)
 
You can do it similarly for groups of rows/columns (with rotation you have some extra choice, rows in one case can be matched with column in another; a global check first can weed out some non-matches)
IP Logged

Wikipedia, Google, Mathworld, Integer sequence DB
towr
wu::riddles Moderator
Uberpuzzler
*****



Some people are average, some are just mean.

   


Gender: male
Posts: 13730
Re: Comparing sudokus  
« Reply #16 on: Aug 30th, 2007, 2:41pm »
Quote Quote Modify Modify

on Aug 30th, 2007, 2:26pm, Grimbal wrote:
Yes, for an exact match, it helps a lot.  The problem is when you want to consider the possibility of errors.  You cannot just match for instance the 2 single 2-cell rows, because one of them could be a 3-cell row with a cell missing.
Yes, 'errors' is a bit of a problem. Things may have been added as well as left out.
IP Logged

Wikipedia, Google, Mathworld, Integer sequence DB
Grimbal
wu::riddles Moderator
Uberpuzzler
*****






   


Gender: male
Posts: 7527
Re: Comparing sudokus  
« Reply #17 on: Aug 31st, 2007, 5:20am »
Quote Quote Modify Modify

The best I did so far for the case with errors is to try all permutations of rows and columns on one sudoku.  For a permutation I count the frequency of each pair of digits, i.e. for each pair (a,b) of digits, (in range 0..9 to take care of missing digits), I count how many time digit a is found in the first grid where digit b in the second.  From there, I search the permutation of digits that get the best score.  The score is calculated as freq[0,0]+sum(freq[d,p(d)]) where the sum is over all digits 1..9, freq is the frequency table for digit pairs, p is a permutation of digits 1..9.  The distance is 81-score.
 
Finding the best digits permutation seems to be an "assignment problem".  Find a mapping of the digits that maximizes the matches observed.  I haven't tried implementing that yet.  For time being, I try all permutations.
 
Second, I was wondering if the assignment can be used on the whole problem.  To find how to assign not only the digits, but also the rows and columns.
 
 
About the matching of a filled sudoku, it seems it is not that hard.  For each permutation of the columns, try to match the first row of the transformed sudoku to each of the rows of the template.  The matching specifies the digit permutation.  Once you have the digit permutation, you can match all other rows using the leftmost cell.  That makes 1296 column permutations times 9 rows to try.
« Last Edit: Aug 31st, 2007, 5:29am by Grimbal » IP Logged
towr
wu::riddles Moderator
Uberpuzzler
*****



Some people are average, some are just mean.

   


Gender: male
Posts: 13730
Re: Comparing sudokus  
« Reply #18 on: Sep 2nd, 2007, 7:36am »
Quote Quote Modify Modify

It might be fruitful to first (partially) solve the sudoku.
IP Logged

Wikipedia, Google, Mathworld, Integer sequence DB
Grimbal
wu::riddles Moderator
Uberpuzzler
*****






   


Gender: male
Posts: 7527
Re: Comparing sudokus  
« Reply #19 on: Sep 2nd, 2007, 3:23pm »
Quote Quote Modify Modify

I thought of it, but I decided against it.  The point with allowing for errors is that the newspaper or myself can have made mistakes when copying it.  If I started solving a wrongly stated sudoku, with a wrong digit somewhere, I would probaly add more errors.
IP Logged
Sameer
Uberpuzzler
*****



Pie = pi * e

   


Gender: male
Posts: 1261
Re: Comparing sudokus  
« Reply #20 on: Sep 2nd, 2007, 3:30pm »
Quote Quote Modify Modify

How do you determine if there is an error? I am not much of a sudoku player, but each digit has a correlation on a row, a column and a 3x3 square. That means it is encoded by three bits of information. If you were to solve the trivial spaces, would you be able to find the conflicting number or detect an error? If you can, you can always mark that "erroneous" space as suspect and treat it such way when doing the permutation comparison? Can't you? Might need more analysis or maybe some math to figure out the error rate!!  Undecided
« Last Edit: Sep 2nd, 2007, 3:31pm by Sameer » IP Logged

"Obvious" is the most dangerous word in mathematics.
--Bell, Eric Temple

Proof is an idol before which the mathematician tortures himself.
Sir Arthur Eddington, quoted in Bridges to Infinity
Pages: 1  Reply Reply Notify of replies Notify of replies Send Topic Send Topic Print Print

« Previous topic | Next topic »

Powered by YaBB 1 Gold - SP 1.4!
Forum software copyright © 2000-2004 Yet another Bulletin Board