Sudoku

Sudoku puzzles are a simple and popular amusement given as a nine-by-nine grid of cells, some of them containing digits:
7 _ _   1 _ _   _ _ _
_ 2 _   _ _ _   _ 1 5
_ _ _   _ _ 6   3 9 _
 
2 _ _   _ 1 8   _ _ _
_ 4 _   _ 9 _   _ 7 _
_ _ _   7 5 _   _ _ 3
 
_ 7 8   5 _ _   _ _ _
5 6 _   _ _ _   _ 4 _
_ _ _   _ _ 1   _ _ 2
The challenge is to fill the empty cells with the digits [[1]] through [[9]] in such a way that no row, column, or three-by-three sub-grid contains the same digit two or more times. This note describes a simple sudoku solver based on depth-first search; the solver assumes that the puzzle is well-formed, with exactly one solution. Input is a string of eighty-one characters representing a sudoku grid in row-major order, top-to-bottom and left-to-right, with a zero in empty cells. For instance, the sudoku puzzle shown above is represented as the string [[700100000020000015000006390200018000040090070000750003078500000560000040000001002]]. Output is represented in the same form. The sudoku puzzle is solved by the function [[(sudoku puzzle)]], where [[puzzle]] is the string containing the puzzle. (define (sudoku puzzle) << utility functions >> << read and process puzzle >>) The sudoku puzzle is represented internally as two lists, [[empty]] and [[filled]], which between them always contain eighty-one cells. Rows are numbered [[0]] through [[8]] from top to bottom, columns are numbered [[0]] through [[8]] from left to right, and boxes (three-by-three sub-grids) are numbered [[0]] through [[8]] from top-to-bottom, left-to-right, in row-major order. Digits are [[#\1]] through [[#\9]], plus [[#\0]] as a placeholder for a cell with a yet-unknown digit. The code below reads a sudoku puzzle as an eighty-one-character string and creates the internal representation of the puzzle in the two lists [[empty]] and [[filled]]. << read and process puzzle >>= (let scan ((s 0) (empty '()) (filled '())) (if (< s 81) (let* ((row (quotient s 9)) (col (modulo s 9)) (box (+ (* (quotient row 3) 3) (quotient col 3))) (digit (string-ref puzzle s)) (cell (vector row col box digit))) (if (char=? digit #\0) (scan (+ s 1) (cons cell empty) filled) (scan (+ s 1) empty (cons cell filled)))) << solve puzzle and write output >>)) The reader scans the input string left-to-right, with [[s]] indexing the characters of the string. The [[row]], [[col]]umn and [[box]] are computed by various arithmetic on the index [[s]], and a four-slot vector representing the cell is created; then the cell is added to either the [[empty]] or [[filled]] list, as appropriate. When the entire string has been scanned the puzzle is passed to the solver. The solver works by moving cells from the [[empty]] list to the [[filled]] list. At each iteration of the [[solve]] loop, the cell at the head of the [[empty]] list is examined. If a digit can be added at the current cell without violating any sudoku constraints, the solver advances by moving the cell from the [[empty]] list to the [[filled]] list. However, if no digit can safely be added at the current cell, the solver backtracks by clearing (resetting the digit to [[#\0]]) the top cell on the [[empty]] list and moving the head of the [[filled]] list back to the head of the [[empty]] list. << solve puzzle and write output >>= (let solve ((empty empty) (filled filled)) (if (pair? empty) (let try ((cell (car empty)) (digit (next (vector-ref (car empty) 3)))) (cond ((char>)) The inner loop of the solver is the [[try]] loop, which tries to place successive digits in the current cell. The [[cond]] expression has three clauses. The first clause backtracks when all possible digits have been considered, and the second clause advances when it is safe to place the [[digit]] in the current cell. The third clauses advances to the next [[digit]]. The [[try]] and [[solve]] loops are intertwined, doing their work by twiddling cells back and forth between the [[empty]] and [[filled]] lists until digits have been safely placed in all eighty-one cells and the [[empty]] list is [[null]]. The [[safe?]] function, which is called in the [[try]] loop, returns [[#t]] if [[digit]] can be placed in [[cell]] without violating any of the sudoku constraints, and [[#f]] otherwise. It works by scanning the [[filled]] list. If any cell on the [[filled]] list has the same [[row]] and [[digit]] as the prospective new cell, or the same [[col]]umn and [[digit]], or the same [[box]] and [[digit]], the prospective cell cannot be placed in the puzzle, and [[safe?]] returns [[#f]]. If the end of the [[filled]] list is reached without violating any of the constraints, the prospective cell can be placed in the puzzle, and [[safe?]] returns [[#t]]. << utility functions >>= (define (safe? filled digit cell) (cond ((null? filled) #t) ((and (= (vector-ref (car filled) 0) (vector-ref cell 0)) (char=? (vector-ref (car filled) 3) digit)) #f) ((and (= (vector-ref (car filled) 1) (vector-ref cell 1)) (char=? (vector-ref (car filled) 3) digit)) #f) ((and (= (vector-ref (car filled) 2) (vector-ref cell 2)) (char=? (vector-ref (car filled) 3) digit)) #f) (else (safe? (cdr filled) digit cell)))) The solver uses two other utility functions. [[(next digit)]] calculates the successor to a digit, in the sequence [[#\1]] to [[#\9]]. Note that the first successor is [[#\1]] because the reader initializes empty cells with [[#\0]]; this is a fine example of the case where properly specifying the form of the input can affect the simplicity of the code. [[(new old digit)]] returns a newly-allocated cell with the row, column and box coordinates of the [[old]] cell and a new [[digit]]. << utility functions >>= (define (next digit) (integer->char (+ (char->integer digit) 1))) (define (new old digit) (vector (vector-ref old 0) (vector-ref old 1) (vector-ref old 2) digit)) When the solver drives the [[empty]] list to [[null]], output is written by scanning the [[filled]] list, copying the digit from each cell into the appropriate index of the string. << write output >>= (let ((str (make-string 81 #\0))) (do ((filled filled (cdr filled))) ((null? filled) str) (let* ((cell (car filled)) (s (+ (* (vector-ref cell 0) 9) (vector-ref cell 1)))) (string-set! str s (vector-ref cell 3))))) It should be obvious now why the solver assumes well-formed puzzles. If the puzzle has multiple solutions, the solver will find one of them, then stop. But if the puzzle has no solution, the solver will backtrack into the original givens of the puzzle, changing them as needed until it can advance to a solution. If desired, it is possible to compare the solution to the original puzzle and reject any where the givens have changed, but in the interest of simplicity, we decide to simple require that all puzzles are well-formed. The sample puzzle given at the beginning of this note is solved as follows:
> (sudoku "700100000020000015000006390200018000040090070000750003078500000560000040000001002")
"789135624623947815451286397237418569845693271916752483178524936562379148394861752"
Or, in grid form:
7 8 9   1 3 5   6 2 4
6 2 3   9 4 7   8 1 5
4 5 1   2 8 6   3 9 7
 
2 3 7   4 1 8   5 6 9
8 4 5   6 9 3   2 7 1
9 1 6   7 5 2   4 8 3
 
1 7 8   5 2 4   9 3 6
5 6 2   3 7 9   1 4 8
3 9 4   8 6 1   7 5 2
Even though this particular puzzle is rated "evil" according to the web site from which it was taken, the solver finds the solution in less than a second on a recent-vintage personal computer. That's not bad, considering that, with twenty-five non-empty cells in the initial grid, there are [[(expt 9 (- 81 25))]] or [[273892744995340833777347939263771534786080723599733441]] possible placements of the remaining digits, of which only one is the solution to the puzzle. — Phil Bewig, 6th June 2007