Sudoku

Sudoku puzzles are a simple and popular amusement given as a nine-by-nine grid of cells, some of them containing digits:

7 _ _   1 _ _   _ _ _
_ 2 _   _ _ _   _ 1 5
_ _ _   _ _ 6   3 9 _
 
2 _ _   _ 1 8   _ _ _
_ 4 _   _ 9 _   _ 7 _
_ _ _   7 5 _   _ _ 3
 
_ 7 8   5 _ _   _ _ _
5 6 _   _ _ _   _ 4 _
_ _ _   _ _ 1   _ _ 2
The challenge is to fill the empty cells with the digits 1 through 9 in such a way that no row, column, or three-by-three sub-grid contains the same digit two or more times.

This note describes a simple sudoku solver based on depth-first search; the solver assumes that the puzzle is well-formed, with exactly one solution. Input is a string of eighty-one characters representing a sudoku grid in row-major order, top-to-bottom and left-to-right, with a zero in empty cells. For instance, the sudoku puzzle shown above is represented as the string 700100000020000015000006390200018000040090070000750003078500000560000040000001002. Output is represented in the same form. The sudoku puzzle is solved by the function (sudoku puzzle), where puzzle is the string containing the puzzle.

(define (sudoku puzzle)
  « utility functions »
  « read and process puzzle »)

The sudoku puzzle is represented internally as two lists, empty and filled, which between them always contain eighty-one cells. Rows are numbered 0 through 8 from top to bottom, columns are numbered 0 through 8 from left to right, and boxes (three-by-three sub-grids) are numbered 0 through 8 from top-to-bottom, left-to-right, in row-major order. Digits are #\1 through #\9, plus #\0 as a placeholder for a cell with a yet-unknown digit. The code below reads a sudoku puzzle as an eighty-one-character string and creates the internal representation of the puzzle in the two lists empty and filled.

« read and process puzzle »≡
(let scan ((s 0) (empty '()) (filled '()))
  (if (< s 81)
      (let* ((row (quotient s 9))
             (col (modulo s 9))
             (box (+ (* (quotient row 3) 3) (quotient col 3)))
             (digit (string-ref puzzle s))
             (cell (vector row col box digit)))
        (if (char=? digit #\0)
            (scan (+ s 1) (cons cell empty) filled)
            (scan (+ s 1) empty (cons cell filled))))
      « solve puzzle and write output »))

The reader scans the input string left-to-right, with s indexing the characters of the string. The row, column and box are computed by various arithmetic on the index s, and a four-slot vector representing the cell is created; then the cell is added to either the empty or filled list, as appropriate. When the entire string has been scanned the puzzle is passed to the solver.

The solver works by moving cells from the empty list to the filled list. At each iteration of the solve loop, the cell at the head of the empty list is examined. If a digit can be added at the current cell without violating any sudoku constraints, the solver advances by moving the cell from the empty list to the filled list. However, if no digit can safely be added at the current cell, the solver backtracks by clearing (resetting the digit to #\0) the top cell on the empty list and moving the head of the filled list back to the head of the empty list.

« solve puzzle and write output »≡
(let solve ((empty empty) (filled filled))
  (if (pair? empty)
      (let try ((cell (car empty)) (digit (next (vector-ref (car empty) 3))))
        (cond ((char<? #\9 digit) ; backtrack
                (solve (cons (car filled) (cons (new cell #\0) (cdr empty))) (cdr filled)))
              ((safe? filled digit cell) ; advance
                (solve (cdr empty) (cons (new cell digit) filled)))
              (else (try cell (next digit))))) ; try next digit
      « write output »))

The inner loop of the solver is the try loop, which tries to place successive digits in the current cell. The cond expression has three clauses. The first clause backtracks when all possible digits have been considered, and the second clause advances when it is safe to place the digit in the current cell. The third clauses advances to the next digit. The try and solve loops are intertwined, doing their work by twiddling cells back and forth between the empty and filled lists until digits have been safely placed in all eighty-one cells and the empty list is null.

The safe? function, which is called in the try loop, returns #t if digit can be placed in cell without violating any of the sudoku constraints, and #f otherwise. It works by scanning the filled list. If any cell on the filled list has the same row and digit as the prospective new cell, or the same column and digit, or the same box and digit, the prospective cell cannot be placed in the puzzle, and safe? returns #f. If the end of the filled list is reached without violating any of the constraints, the prospective cell can be placed in the puzzle, and safe? returns #t.

« utility functions »≡
(define (safe? filled digit cell)
  (cond ((null? filled) #t)
        ((and (= (vector-ref (car filled) 0) (vector-ref cell 0))
              (char=? (vector-ref (car filled) 3) digit)) #f)
        ((and (= (vector-ref (car filled) 1) (vector-ref cell 1))
              (char=? (vector-ref (car filled) 3) digit)) #f)
        ((and (= (vector-ref (car filled) 2) (vector-ref cell 2))
              (char=? (vector-ref (car filled) 3) digit)) #f)
        (else (safe? (cdr filled) digit cell))))

The solver uses two other utility functions. (next digit) calculates the successor to a digit, in the sequence #\1 to #\9. Note that the first successor is #\1 because the reader initializes empty cells with #\0; this is a fine example of the case where properly specifying the form of the input can affect the simplicity of the code. (new old digit) returns a newly-allocated cell with the row, column and box coordinates of the old cell and a new digit.

« utility functions »≡
(define (next digit) (integer->char (+ (char->integer digit) 1)))
(define (new old digit) (vector (vector-ref old 0) (vector-ref old 1) (vector-ref old 2) digit))

When the solver drives the empty list to null, output is written by scanning the filled list, copying the digit from each cell into the appropriate index of the string.

« write output »≡
(let ((str (make-string 81 #\0)))
  (do ((filled filled (cdr filled))) ((null? filled) str)
    (let* ((cell (car filled)) (s (+ (* (vector-ref cell 0) 9) (vector-ref cell 1))))
      (string-set! str s (vector-ref cell 3)))))

It should be obvious now why the solver assumes well-formed puzzles. If the puzzle has multiple solutions, the solver will find one of them, then stop. But if the puzzle has no solution, the solver will backtrack into the original givens of the puzzle, changing them as needed until it can advance to a solution. If desired, it is possible to compare the solution to the original puzzle and reject any where the givens have changed, but in the interest of simplicity, we decide to simple require that all puzzles are well-formed.

The sample puzzle given at the beginning of this note is solved as follows:

> (sudoku "700100000020000015000006390200018000040090070000750003078500000560000040000001002")
"789135624623947815451286397237418569845693271916752483178524936562379148394861752"
Or, in grid form:
7 8 9   1 3 5   6 2 4
6 2 3   9 4 7   8 1 5
4 5 1   2 8 6   3 9 7
 
2 3 7   4 1 8   5 6 9
8 4 5   6 9 3   2 7 1
9 1 6   7 5 2   4 8 3
 
1 7 8   5 2 4   9 3 6
5 6 2   3 7 9   1 4 8
3 9 4   8 6 1   7 5 2
Even though this particular puzzle is rated "evil" according to the web site from which it was taken, the solver finds the solution in less than a second on a recent-vintage personal computer. That's not bad, considering that, with twenty-five non-empty cells in the initial grid, there are (expt 9 (- 81 25)) or 273892744995340833777347939263771534786080723599733441 possible placements of the remaining digits, of which only one is the solution to the puzzle.

— Phil Bewig, 6th June 2007