Sudoku Solver!
Posted: 2021-05-14
Last Edited: 2021-05-14
Solving a Sudoku
A while ago, for a university assignment, we were asked to write a sudoku solver. My first implementation used the classical brute-force approach, written in Python. It worked well enough for simple grids, but took ridiculously long on more complex grids with fewer given values. It also had a pathological case which still hadn't completed after 3 hours of runtime:
. . . | . . . | . . .
. . . | . . 3 | . 8 5
. . 1 | . 2 . | . . .
======#=======#======
. . . | 5 . 7 | . . .
. . 4 | . . . | 1 . .
. 9 . | . . . | . . .
======#=======#======
5 . . | . . . | . 7 3
. . 2 | . 1 . | . . .
. . . | . 4 . | . . 9
This case is pathological because the first row of the solution consists of the numbers 9 down to 1. This means the algorithm takes the longest possible time on the first row, having to test on the order of 9! possibilities: it starts with 1, attempts to fill in the rest of the grid, and if that fails it backtracks and tries 2, then 3, and so on. This poor performance motivated me to write a better solver.
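For concreteness, here is a minimal Python sketch of the brute-force approach described above (my own reconstruction, not the original assignment code): find the first empty cell, try each value in ascending order, recurse, and backtrack on failure.

```python
# Brute-force backtracking sketch. A grid is a 9x9 list of lists,
# with 0 marking an empty cell.

def value_fits(grid, r, c, v):
    """Check that value v can legally go at row r, column c."""
    if v in grid[r]:                              # row constraint
        return False
    if any(grid[i][c] == v for i in range(9)):    # column constraint
        return False
    br, bc = 3 * (r // 3), 3 * (c // 3)           # top-left of the 3x3 cell
    return all(grid[br + i][bc + j] != v
               for i in range(3) for j in range(3))

def solve(grid):
    """Fill zeros in place; return True if a solution was found."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                for v in range(1, 10):            # try 1, 2, ..., 9 in order
                    if value_fits(grid, r, c, v):
                        grid[r][c] = v
                        if solve(grid):
                            return True
                        grid[r][c] = 0            # undo and backtrack
                return False                      # no value fits: dead end
    return True                                   # no empty cells left
```

The ascending trial order is exactly what makes the grid above pathological: every wrong prefix of the first row has to be exhausted before the right one is found.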
Sudoku and Exact Covers
During a trip down a particular internet rabbit hole concerning sudoku solvers, I ran across the so-called Algorithm X, invented by Donald E. Knuth to solve exact cover problems. An exact cover problem is one where you have a set X of possible values, and another set S whose elements are subsets of X.
For example:
X = { 0, 1, 2, 3, 4, 5, 6 }
S = { A, B, C, D }
A = { 0, 1, 3 }
B = { 2, 4, 5 }
C = { 2, 6 }
D = { 6 }
In an exact cover problem, you are trying to find a subset of S such that each element of the universe set X appears in exactly one of the chosen sets. Luckily for us, sudoku can be expressed as exactly this kind of problem!
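To make the definition concrete, here is a small Python check, using the example sets above, that a chosen sub-collection of S covers every element of X exactly once:

```python
# Exact cover check: the chosen sets must be pairwise disjoint and
# their union must be all of X.
from collections import Counter

X = {0, 1, 2, 3, 4, 5, 6}
S = {
    "A": {0, 1, 3},
    "B": {2, 4, 5},
    "C": {2, 6},
    "D": {6},
}

def is_exact_cover(names):
    """True if every element of X appears exactly once across the chosen sets."""
    counts = Counter(x for name in names for x in S[name])
    return set(counts) == X and all(n == 1 for n in counts.values())

print(is_exact_cover({"A", "B", "D"}))  # True: every element covered once
print(is_exact_cover({"A", "B", "C"}))  # False: 2 and 6 are covered twice
```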
In sudoku, you have a 2D grid. This grid usually has 9 rows and 9 columns, and is arranged as a 3x3 grid of 3x3 cells. To satisfy a sudoku, you must have exactly one of each number between 1 and 9 (inclusive) in each row, in each column, and in each cell. See the link? We have 4 sets of constraints:
value-constraints: (i, j) = v
row-constraints: v in row j
column-constraints: v in column i
cell-constraints: v in cell (i', j')
Our universe set (aka constraint set) X is therefore the union of these 4 sets, and our possibility set S will be the set of all number-position pairs.
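As an illustration, here is one possible way to number the 4 × 81 = 324 constraint columns in code. The exact numbering is an arbitrary choice of mine, not a standard; any consistent scheme works.

```python
# Map a candidate "value v at column i, row j" to the four constraint
# columns it satisfies, out of 324 columns total (81 per constraint family).

def constraint_columns(i, j, v):
    """Columns covered by placing value v (1-9) at column i, row j (0-8)."""
    cell = (j // 3) * 3 + (i // 3)      # which 3x3 cell (i, j) belongs to
    return (
        0   + j * 9 + i,                # value-constraint: (i, j) holds a value
        81  + j * 9 + (v - 1),          # row-constraint: v appears in row j
        162 + i * 9 + (v - 1),          # column-constraint: v appears in column i
        243 + cell * 9 + (v - 1),       # cell-constraint: v appears in its cell
    )
```

Each of the 9 × 9 × 9 = 729 candidate rows covers exactly four columns, and each of the 324 columns is covered by exactly nine candidates, which is what makes the matrix so sparse.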
Introducing Algorithm X
Knuth's Algorithm X is an efficient way of solving exact cover problems. It takes in a matrix representation of the problem, with the universe set X as the columns of the matrix, and the possibility set S as the rows. For each row, the presence of a 1 in a given column means that the element represented by the column is present in the set represented by the row, and similarly a 0 in the same column would mean that the element is not present in the set.
Algorithm X works as follows:
Let M be our possibility-constraint matrix
Let M(i,j) denote the value in column i, row j
1. If M has no columns, return true
2. Pick a column c in M
3. Pick a row r in column c such that M(c,r) = 1 (if no such row exists, this branch fails: backtrack)
4. Add the row to a solution set
5. For each column j such that M(j,r) = 1
For each row i such that M(j,i) = 1
Remove row i from the matrix
Remove column j from the matrix
6. Recurse on the new, reduced matrix M
Knuth recommends picking the column with the fewest 1s in step 2. Any consistent rule for picking a column would work, but a poor choice can lead to many more operations and thus a slower solve, so we will stick with the recommended method. Using our previous example problem, here's how Algorithm X would run through it:
X = { 0, 1, 2, 3, 4, 5, 6 }
S = { A, B, C, D }
A = { 0, 1, 3 }
B = { 2, 4, 5 }
C = { 2, 6 }
D = { 6 }
M:
0 1 2 3 4 5 6
A [ 1 1 0 1 0 0 0 ]
B [ 0 0 1 0 1 1 0 ]
C [ 0 0 1 0 0 0 1 ]
D [ 0 0 0 0 0 0 1 ]
Solution = { }
Algorithm:
1. M has columns, continue
2. Pick column 0 (first with fewest 1s)
3. Pick row A
4. Solution += A (Solution = { A })
5. For j in { 0, 1, 3 }
For i in { A }
Remove row i from M
Remove column j from M
M:
2 4 5 6
B [ 1 1 1 0 ]
C [ 1 0 0 1 ]
D [ 0 0 0 1 ]
6. Recurse on the new, reduced M
7. M has columns, continue
8. Pick column 4 (first with fewest 1s)
9. Pick row B
10. Solution += B (Solution = { A, B })
11. For j in { 2, 4, 5 }
For i in { B, C }
Remove row i from M
Remove column j from M
M:
6
D [ 1 ]
12. Recurse on the new, reduced M
13. M has columns, continue
14. Pick column 6 (first with fewest 1s)
15. Pick row D
16. Solution += D (Solution = { A, B, D })
17. For j in { 6 }
For i in { D }
Remove row i from M
Remove column j from M
M:
[ ]
18. Recurse on the new, reduced M
19. M has no columns, return true
Therefore, Solution = { A, B, D }
As we can see, the algorithm is quite efficient. We didn't have to make many tests and comparisons, and arrived at the correct answer quickly.
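The walkthrough above can be reproduced with a compact Python sketch of Algorithm X. It uses dictionaries of sets instead of an explicit 0/1 matrix (a well-known formulation, not Knuth's linked-list version): X maps each remaining column to the set of rows with a 1 in it, and Y maps each row to the list of columns it covers.

```python
# Dict-of-sets sketch of Algorithm X. Step numbers refer to the
# pseudocode above.

def algorithm_x(X, Y, solution=None):
    """X: {column: set of row names}, Y: {row name: list of columns}."""
    if solution is None:
        solution = []
    if not X:                                    # step 1: no columns left
        return list(solution)
    c = min(X, key=lambda col: len(X[col]))      # step 2: fewest 1s
    for r in list(X[c]):                         # step 3: each candidate row
        solution.append(r)                       # step 4
        saved = []
        for j in Y[r]:                           # step 5: cover rows/columns
            for i in X[j]:
                for k in Y[i]:
                    if k != j:
                        X[k].remove(i)
            saved.append(X.pop(j))
        result = algorithm_x(X, Y, solution)     # step 6: recurse
        if result is not None:
            return result
        for j in reversed(Y[r]):                 # undo step 5 (backtrack)
            X[j] = saved.pop()
            for i in X[j]:
                for k in Y[i]:
                    if k != j:
                        X[k].add(i)
        solution.pop()
    return None                                  # no row worked: backtrack

Y = {"A": [0, 1, 3], "B": [2, 4, 5], "C": [2, 6], "D": [6]}
X = {x: {r for r in Y if x in Y[r]} for x in range(7)}
print(algorithm_x(X, Y))  # ['A', 'B', 'D']
```

Note that the restore loop runs in reverse order of the cover loop, so every column is put back before any earlier-covered column tries to re-add rows to it.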
Optimisations
Using a Linked List Matrix
The example given above was fairly trivial: it showed how the algorithm works, but it was a tiny problem. For larger problems, the naive version of Algorithm X (using an array-based matrix) spends a lot of its time iterating over rows and columns looking for 1s. This problem is exacerbated when the matrix is sparse (most of the values are 0). To solve this, Knuth recommends using a linked-list based matrix, which stores only the 1s: each list node marks the presence of a 1.
An example of what the linked-list based matrix would look like for our example:
"Dense" Matrix:
0 1 2 3 4 5 6
A [ 1 1 0 1 0 0 0 ]
B [ 0 0 1 0 1 1 0 ]
C [ 0 0 1 0 0 0 1 ]
D [ 0 0 0 0 0 0 1 ]
"Sparse" Matrix:
| 0 1 2 3 4 5 6
==|=======================================
A | A0 <> A1 <------> A3
B | B2 <------> B4 <> B5
C | C2 <------------------> C6
D | D6
All nodes in a column are doubly-linked to the elements above and below them (e.g. the column header for 2 is linked downwards to B2, and upwards to C2). All nodes in a row are doubly-linked to the elements to their left and right (e.g. A0 is linked to the left to A3, and to the right to A1).
Fun Fact: This 2D doubly-linked list forms a torus topology, where the column headers form the initial 2D ring, and the rows make the ring 3D (giving the doughnut its thickness).
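A node in this four-way linked structure might look like the following Python sketch (the field names are my own choice for illustration; Knuth uses L, R, U and D):

```python
# One node of the sparse matrix: a member of both a circular row ring
# (left/right) and a circular column ring (up/down).

class Node:
    def __init__(self, row_id=None):
        self.left = self.right = self   # a lone node is its own row ring
        self.up = self.down = self      # ... and its own column ring
        self.column = None              # column header this node belongs to
        self.row_id = row_id            # e.g. "A0", "B2" in the example above

def link_right(a, b):
    """Insert b immediately to the right of a in a row ring."""
    b.right = a.right
    b.left = a
    a.right.left = b
    a.right = b
```

Building row A by inserting A1 after A0 and A3 after A1 yields the circular ring A0 → A1 → A3 → A0, so A0's left neighbour wraps around to A3 as in the diagram.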
Dancing Links
Another optimisation that follows naturally from the use of a linked-list matrix is a technique called dancing links. This is a method of implementing backtracking with linked lists: splice an element out (without resetting its previous and next pointers, which leaves it in an invalid and unsafe state), recurse, then splice it back in afterwards using its stored pointers.
Splicing an element out of a linked list:
A <---> B <---> C
After splicing out B:
A <---> C
\_ B _/
Here, B has retained its previous pointer to A and its next pointer to C. It can then be added back to the list by setting its previous element's (A's) next pointer to itself (B), and setting its next element's (C's) previous pointer to itself (B):
A <---> C
\_ B _/
After splicing in B:
A <---> B <---> C
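The splice can be sketched in a few lines of Python; the same trick applies independently to the up/down links of the matrix.

```python
# Dancing links on a doubly-linked list: splice_out removes a node but
# deliberately leaves its own pointers intact, so splice_in can undo it.

class Node:
    def __init__(self, name):
        self.name = name
        self.prev = self.next = self

def splice_out(b):
    """Remove b from its list; b keeps its own prev/next pointers."""
    b.prev.next = b.next
    b.next.prev = b.prev

def splice_in(b):
    """Undo splice_out, using the pointers b retained."""
    b.prev.next = b
    b.next.prev = b
```

Because each undo is just two pointer writes, backtracking costs almost nothing, which is exactly what Algorithm X needs when it uncovers columns.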
Implementation
Although this post has been mostly about Algorithm X and hasn't shown any complete code, my implementation (in C) can be found here.