Solving Sudoku Puzzles
1 Jan 2007
In their capacity as a tool, computers will be but a ripple on the surface of our culture. In their capacity as intellectual challenge, they are without precedent in the cultural history of mankind.
—Edsger W. Dijkstra
(This is inspired by Philip Wadler's collection of material introducing computer science, or "Computational Thinking" as he calls it, to a general audience.)
The aim of this essay is to look at solving Sudoku puzzles from my vantage point: as a programmer.
Note: The code below is written in the programming language Objective Caml. This discussion, however, has very little to do with the code itself. If I were writing the code professionally, I would be considerably more explicit about why the code did what I claimed it did, and rather less about why I would want it to do that. (The former is important to understand and maintain the code in the future, while the latter should be reasonably apparent from my set-up of the problem.)
Further, I have used some moderately uncommon techniques in the code. For one thing, OCaml is a good, although uncommon, functional language, cheerfully capable of using higher-order functions (functions which manipulate other functions), and I have not been shy about using those. However, the ideas behind the code, which I am attempting to describe, are fully capable of being expressed in any programming language, if not so concisely.
Sudoku puzzles should be familiar to almost anyone; I first heard of them when they started showing up twice a week in the local paper. According to The Ultimate Sudoku Challenge, presented by Will Shortz,
A sudoku puzzle consists of a 9x9-square grid subdivided into nine 3x3 boxes. Some of the squares contain digits. The object is to fill in the remaining squares so that every row, column, and 3x3 box contains each of the digits from 1 to 9 exactly once.
I am not an "Advanced Solver"; the easy puzzles in the paper frequently defeat me. However, I am a tolerably competent programmer and when I was given a copy of The Ultimate Sudoku Challenge, I beat my head against the first few puzzles for a while, then decided to write some code to solve them instead.
I am not alone there. Solving Sudoku puzzles is a favorite task in Computer Science courses, because the puzzles are ideal toy problems---small and constrained enough to be suitable for the time allowed to an assignment or three, yet complex enough to demonstrate some moderately advanced techniques. There is even a book about it, Programming Sudoku by Wei-Meng Lee.
There are, roughly, two ways to approach this kind of problem: I could focus only on solving the puzzle, for example by using an approach called state space search which would be guaranteed to solve the puzzle if a solution was possible. Or, I could take a "strategic" approach: write a more-or-less general framework for manipulating puzzles, then write a set of strategies each representing a single technique for solving the puzzles, and finish with another general framework for feeding a puzzle to the strategies. The second approach is weaker than the first: it is not guaranteed to find a solution, because (short of having a solid proof that my set of strategies is complete) it is always possible that some puzzle is not solvable without some trick that is not represented in my strategies. However, the second approach is much more likely to teach me more about solving the puzzles: the strategies that I develop are the same techniques that I would use to solve it manually. For me, the interest is in learning to solve the puzzles rather than the programming techniques. (There is an old adage to the effect of: "You know how to do something when you can teach someone else to do it, but you really know how to do something when you can get a computer to do it.")
To start my general framework, I began with some basic constants:
- The lowest number in a cell, lower, is 1.
- Each "square" is 3x3 cells (square_size), and the overall puzzle is 3x3 squares.
These two constants determine the highest number that can appear in a cell (upper), 3*3=9, the number of cells in a "band" (band_size, three squares, used when printing a puzzle), and the total number of cells (nelements), 9*9=81:

    (* Basic constants: lowest number in the puzzle and the size of a small square *)
    let lower = 1
    let square_size = 3

    let upper = square_size * square_size
    let band_size = upper * square_size
    let nelements = upper * upper
One construct that is under-appreciated in Computer Science, and vastly under-appreciated outside it, is that of a mathematical set. A set is a collection of unique objects; in this case, I am using a set to represent each cell, such that the contents of the set represent the possible values in the cell. A completely empty puzzle would have cells each containing all the digits between 1 and 9. A cell with a determined value, one of the hints for example, would be represented by a set containing just that one value.
Each of my strategies, then, is an attempt to reduce the number of elements of one or more cells. For example, if I know that one cell has an 8 in it, then I further know that no other cell in that row can have an 8, and I can remove the 8 from any of the sets representing cells in that row.
A set has a certain group of operations: I can take two sets and compute their union, the set consisting of all the elements in either original set; I can compute their intersection, the set consisting of only those elements in both of the original sets; and so forth. Further on down, I will be making serious use of these operations.

    (*
     * Representation of a cell: a set of integers.
     * A determined cell is a singleton set, while
     * in an undetermined cell, the contents of the
     * set indicate the possible values of the cell.
     *)
    module IntSet = Set.Make(struct type t = int let compare = compare end)
    type cell = IntSet.t
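As a small illustration of those operations, here is a sketch using the same IntSet module on two made-up sets; the names a, b, u, n, and d are only for this example and do not appear in the solver.

```ocaml
(* Same functor application as in the essay. *)
module IntSet = Set.Make (struct type t = int let compare = compare end)

(* Two example sets of digits (illustrative values only). *)
let a = IntSet.of_list [1; 2; 3; 5]
let b = IntSet.of_list [3; 5; 8]

let u = IntSet.union a b   (* elements in either set *)
let n = IntSet.inter a b   (* elements in both sets *)
let d = IntSet.diff a b    (* elements of a that are not in b *)

let show s = String.concat " " (List.map string_of_int (IntSet.elements s))

let () =
  Printf.printf "union: %s\n" (show u);
  Printf.printf "intersection: %s\n" (show n);
  Printf.printf "difference: %s\n" (show d)
```

The difference operation is the one the strategies lean on most: removing the digits already used elsewhere from a cell's possibilities.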
The first thing to see, which I touched on above, is that in the initial set-up I do not know any of the possible values for any cell that does not contain a hint; an empty cell thus starts with all of the digits between 1 and 9 in it:

    (* Construct an empty cell *)
    let empty_cell =
      let rec _init i c =
        if i > upper then c
        else _init (i+1) (IntSet.add i c)
      in
      _init lower IntSet.empty
On the other hand, a cell filled with a hint has just its determined digit as its content:

    (* Construct a determined cell *)
    let filled_cell i = IntSet.singleton i
Finally, when dealing with cells I need a few special operations:
- A test of whether a cell has a determined value, whether it is a singleton set.
- A test of whether a cell is overdetermined, whether its set is empty; this indicates that the cell can have no possible values. If this happens, I am looking at some kind of serious problem, usually an error in determining the value of some previous cell.
- A function to return the value of a cell (or at least the lowest allowable digit in the cell).
- And finally, a couple of ways of printing a cell.
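The essay does not show these cell operations, so the following is only a sketch of how they might look. The name is_determined_cell is the one used in the later code; is_overdetermined_cell, value_of_cell, string_of_cell, and candidates_of_cell are names assumed here for illustration.

```ocaml
module IntSet = Set.Make (struct type t = int let compare = compare end)
type cell = IntSet.t

(* A cell is determined when exactly one candidate remains. *)
let is_determined_cell (c : cell) = IntSet.cardinal c = 1

(* A cell is overdetermined when no candidate remains (assumed name). *)
let is_overdetermined_cell (c : cell) = IntSet.is_empty c

(* Lowest candidate in the cell; raises Not_found on an empty cell
   (assumed name). *)
let value_of_cell (c : cell) = IntSet.min_elt c

(* Pretty form: the digit if determined, otherwise a blank (assumed name). *)
let string_of_cell (c : cell) =
  if is_determined_cell c then string_of_int (value_of_cell c) else " "

(* Detailed form: every remaining candidate, e.g. "{3,5}" (assumed name). *)
let candidates_of_cell (c : cell) =
  "{" ^ String.concat "," (List.map string_of_int (IntSet.elements c)) ^ "}"

let () =
  let c = IntSet.of_list [3; 5] in
  Printf.printf "pretty: [%s]  detailed: %s\n"
    (string_of_cell c) (candidates_of_cell c)
```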
What I have done is to create an abstract data type representing a Sudoku cell. An abstract data type is a combination of a description of data (in this case, a set of integers) and a group of operations on that data (testing whether the cell has a definite value, for example). Another example abstract data type is the ordinary integers and arithmetic operations, such as plus, minus, multiply, and so on.
Because I am lazy, I chose a simple representation of a puzzle, as a simple, linear array of cells. I could have gotten more complex and tried to represent the rows, columns, or squares that make up the structure of the puzzle, but it turns out that the simplest representation is the most flexible and that it does not make any of the ways I need to look at the puzzle any harder. That is an important point in a data structure.

    (* Representation of the entire puzzle is as an array of cells. *)
    type puzzle = cell array
From the choice of an array, I can build some simple functions for puzzles:
- A puzzle is solved if every cell in it is determined.
- A puzzle is over-constrained, and therefore in error, if any cell is overdetermined.
- Two puzzles are equal if every cell in each is equal to the corresponding cell in the other.
- I can print the puzzle in two ways: in a pretty fashion, showing the determined cells with their values and the undetermined cells as empty, and in a more detailed fashion, showing each cell's possible values. Pretty printing is the only place where some of the constants I started with, such as band_size, are used.
- Finally, I can filter an array of cells, to find a list of the cells that satisfy some question. Let's say I need to find all of the determined cells in a puzzle, and I don't particularly need to know where in the puzzle each determined cell is located. I can use filter_cells with the question, or predicate, "Is this cell determined?" to get a list of determined cells by putting each cell that answers "yes" to the predicate into my list.
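None of these puzzle functions appear in the essay, so what follows is a sketch under assumptions. The names solved, overconstrained, equal, and filter_cells are taken from the later code. The later calls hand filter_cells a predicate on a cell's location rather than on its contents, so this sketch assumes that form; a question about contents can then be asked of the resulting list.

```ocaml
module IntSet = Set.Make (struct type t = int let compare = compare end)
type cell = IntSet.t
type puzzle = cell array

let is_determined_cell (c : cell) = IntSet.cardinal c = 1

(* Solved: every cell is a singleton. *)
let solved (p : puzzle) = Array.for_all is_determined_cell p

(* Over-constrained: some cell has no possible value left. *)
let overconstrained (p : puzzle) = Array.exists IntSet.is_empty p

(* Equal: same length, and corresponding cells hold the same sets. *)
let equal (p : puzzle) (q : puzzle) =
  Array.length p = Array.length q
  && List.for_all2 IntSet.equal (Array.to_list p) (Array.to_list q)

(* Keep, in order, the cells whose location satisfies pred.
   Assumption: pred takes the index of the cell, not the cell itself. *)
let filter_cells pred (p : puzzle) =
  let acc = ref [] in
  for i = Array.length p - 1 downto 0 do
    if pred i then acc := p.(i) :: !acc
  done;
  !acc

let () =
  let full = IntSet.of_list [1; 2; 3; 4; 5; 6; 7; 8; 9] in
  let p = Array.make 81 full in
  p.(0) <- IntSet.singleton 4;
  let top_row = filter_cells (fun i -> i / 9 = 0) p in
  Printf.printf "cells in top row: %d, determined: %d\n"
    (List.length top_row)
    (List.length (List.filter is_determined_cell top_row))
```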
An array of cells seems like a useless structure for a puzzle. What if I need to know which row, or column, or 3x3 square a given cell is in?
It turns out that those and many similar questions are fairly easy to answer, because I am using an array, and because of a happy linguistic convention: in most programming languages, the cells in an array are numbered starting from zero. For example, the first cell in a puzzle is cell 0, the last is cell 80, and the 15th is cell 14.
I can use that fact to compute the number of the row a given cell is in (again, starting from zero) by dividing the location of the cell by the number of cells in a row using grade-school, integer division (in other words, division returning only the integer part of the result and dropping any fractional part). Cell 0 is in row 0, as is cell 1 (1/9 = 0, in integers), and cell 80 is in row 80/9 = 8.
I can use that fact to compute the number of the column of a given cell by using the other half of integer division, modulus. This operator returns, in this case, a number from 0 to 8 which works out to be the column number of the cell.
Finally, using a bit more mathematical tap dancing as well as the row and column numbers of the cell, I can calculate the number of the square a cell is in.

    (* the location of a given cell *)
    let row_of_loc i = (i / upper)
    let col_of_loc i = (i mod upper)
    let sqr_of_loc i = (((row_of_loc i) / square_size) * square_size)
                       + ((col_of_loc i) / square_size)
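To make the arithmetic concrete, here is a small worked example that repeats the three location functions and prints the results for a handful of cells. The particular cells chosen (0, 14, 40, and 80) are only for illustration.

```ocaml
let square_size = 3
let upper = square_size * square_size

(* Row: integer division by the row length. *)
let row_of_loc i = i / upper

(* Column: the remainder of that same division. *)
let col_of_loc i = i mod upper

(* Square: which band of three rows, times three, plus which stack of
   three columns. *)
let sqr_of_loc i =
  ((row_of_loc i / square_size) * square_size) + (col_of_loc i / square_size)

let () =
  List.iter
    (fun i ->
      Printf.printf "cell %2d: row %d, column %d, square %d\n"
        i (row_of_loc i) (col_of_loc i) (sqr_of_loc i))
    [0; 14; 40; 80]
```

Cell 14, for instance, works out to row 14/9 = 1 and column 14 mod 9 = 5, which puts it in square (1/3)*3 + 5/3 = 1.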
With a way to calculate the row, column, or square of a cell, I can now come up with some predicates for my filter_cells function that can be used to get the list of all of the cells in a given row, column, or square.

    (* filters for cells in a given row, column, or square *)
    let rowFilter r i = (row_of_loc i) = r
    and colFilter c i = (col_of_loc i) = c
    and sqrFilter s i = (sqr_of_loc i) = s
Further, I can get slightly fancy and use filter_cells to get a list of all of the cells in a row, column, or square except some given cell. That turns out to be useful in some of my strategies.

    (* filters for cells in a given row, column, or square, excluding one *)
    let rowExcluding r e i = (rowFilter r i) && (e <> i)
    and colExcluding c e i = (colFilter c i) && (e <> i)
    and sqrExcluding s e i = (sqrFilter s i) && (e <> i)
Once I have a list of cells satisfying some predicate, I need to be able to filter it further, for example to find all of the determined or undetermined cells from the cells in a given row. The filter_cells function will not work; it is intended for arrays of cells, not lists (and that makes a difference). So, I came up with some quick new functions.

    (* list of values of determined cells *)
    let determined =
      List.fold_left
        (fun a cell -> if is_determined_cell cell then cell::a else a)
        []

    (* list of undetermined cells *)
    let undetermined =
      List.fold_left
        (fun a cell -> if is_determined_cell cell then a else cell::a)
        []
Now, say I have a list of cells, which is to say, a list of sets of numbers. One thing I need to be able to do (I promise, I'm going somewhere here) is to find the union of all the sets. For example, if I have a list of all of the determined cells in a row, I can use this to compute the set of all of the numbers that have been used in the row. That would be useful: those numbers cannot be used in any of the other, undetermined, cells in the row.

    (* construct the union of a list of IntSets *)
    let union_of_intsets =
      List.fold_left (fun set c -> IntSet.union set c) IntSet.empty

    (* construct the set of determined elements for each row, column, square *)
    let determined_row p r =
      union_of_intsets (determined (filter_cells (rowFilter r) p))
    and determined_col p c =
      union_of_intsets (determined (filter_cells (colFilter c) p))
    and determined_sqr p s =
      union_of_intsets (determined (filter_cells (sqrFilter s) p))
Finally, I have some functions that are not immediately useful (in fact, they're downright bizarre). However, they are fundamental to some of the strategies below.
- First, some functions to compute the union of the contents of the cells in a row, column, or square, excluding some particular cell in the row, column, or square.
- Second, using some additional helping functions, some functions to find a list of the values that appear more than n times in a row, column, or square, where n is a parameter.
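Neither group of functions is shown in the essay, so this is a sketch for rows only, built on guesses. The names row_union_excluding and multiples_in_row and their argument order come from the strategy code further down; cells_where is a helper invented here. The meaning of multiples_in_row is an assumption: it is taken to return the candidate sets of exactly n digits that are the entire content of at least n cells in the row, which is what the pairs strategy appears to need.

```ocaml
module IntSet = Set.Make (struct type t = int let compare = compare end)

let upper = 9
let row_of_loc i = i / upper

(* Hypothetical helper: the cells whose location satisfies pred, in order. *)
let cells_where pred p =
  let acc = ref [] in
  Array.iteri (fun i c -> if pred i then acc := c :: !acc) p;
  List.rev !acc

let union_of_intsets = List.fold_left IntSet.union IntSet.empty

(* Union of every cell in row r except the cell at location e. *)
let row_union_excluding p r e =
  union_of_intsets (cells_where (fun i -> row_of_loc i = r && i <> e) p)

(* Assumed meaning: sets of exactly n candidates that are the whole
   content of at least n cells in row r, each listed once. *)
let multiples_in_row p n r =
  let cells = cells_where (fun i -> row_of_loc i = r) p in
  let count s = List.length (List.filter (IntSet.equal s) cells) in
  let hits =
    List.filter (fun s -> IntSet.cardinal s = n && count s >= n) cells
  in
  List.sort_uniq IntSet.compare hits

let () =
  (* Toy puzzle: every cell is {3,5} except location 2, which is {1,3,5}. *)
  let p = Array.make 81 (IntSet.of_list [3; 5]) in
  p.(2) <- IntSet.of_list [1; 3; 5];
  let only_here = IntSet.diff p.(2) (row_union_excluding p 0 2) in
  Printf.printf "digits found only in cell 2: %s\n"
    (String.concat " " (List.map string_of_int (IntSet.elements only_here)))
```

The column and square versions would presumably differ only in the location function used in the predicate.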
The puzzle and its operations also form an abstract data type. (I should probably remove the word "abstract", for fear of offending other programmers. I have been rather sloppy in my use of the data types, mixing cells and operations on the set representations of cells.)
Given the building blocks above, I can start looking at strategies. Each strategy has a particular form: it is a function which accepts a puzzle and returns another puzzle. The returned puzzle is the original either unchanged, or with some of the possibilities in some of the cells reduced; that is, a step or two closer to a solution.
First, consider a basic strategy: when looking at a row of a puzzle, if some cell has a determined value then that value cannot be a possibility in any other cell.
If my original structure for the puzzle is equivalent to writing all of the digits from 1 to 9 in every empty cell when starting a puzzle, these first three strategies are equivalent to finding every determined cell, say one containing a 4, and ensuring that every other cell in that row, column, or square cannot be a 4 by erasing the 4 in any of those undetermined cells.
The code below may be a bit impenetrable, but I hope the intention is not.
- rs in the first function is the collection of sets of digits that are known to be used in each row: rs.0 (the first entry) is the set containing all of the digits that have determined locations in the top row of the puzzle. In the second function, cs, and in the third, ss, serve the same purpose.
- apply is a function that can be given both the location and the contents of a cell. If the cell is determined, then it returns the cell unchanged, otherwise it returns the set difference between the cell and the rs entry for the row (using the location). That difference represents the original possibilities for the cell, minus the digits that are already known to have places in the row.
- Finally, the strategy invokes apply for every cell in the puzzle.
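The code for this first strategy is not reproduced in the essay, so here is a sketch of the row version reconstructed from the three points above. The name row_constraints is the one used later in the solver; determined_row is rewritten here as a single loop so the block stands alone, and is an assumption about behavior rather than the original definition.

```ocaml
module IntSet = Set.Make (struct type t = int let compare = compare end)

let upper = 9
let row_of_loc i = i / upper
let is_determined_cell c = IntSet.cardinal c = 1

(* Union of the determined cells of row r: the digits already placed there.
   Standalone stand-in for the essay's determined_row. *)
let determined_row p r =
  let acc = ref IntSet.empty in
  Array.iteri
    (fun i c ->
      if row_of_loc i = r && is_determined_cell c then
        acc := IntSet.union !acc c)
    p;
  !acc

let row_constraints p =
  (* rs.(r) holds the digits with determined locations in row r *)
  let rs = Array.init upper (determined_row p) in
  (* leave determined cells alone; strip used digits from the others *)
  let apply i c =
    if is_determined_cell c then c
    else IntSet.diff c rs.(row_of_loc i)
  in
  Array.mapi apply p

let () =
  let full = IntSet.of_list [1; 2; 3; 4; 5; 6; 7; 8; 9] in
  let p = Array.make 81 full in
  p.(3) <- IntSet.singleton 8;
  let q = row_constraints p in
  Printf.printf "cell 0 can still be 8: %b\n" (IntSet.mem 8 q.(0));
  Printf.printf "cell 9 can still be 8: %b\n" (IntSet.mem 8 q.(9))
```

Cell 0 shares a row with the determined 8 and loses it as a possibility; cell 9 is in the next row and is untouched by the row version of the strategy.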
Now, consider a second strategy: every digit from 1 to 9 has to show up at least once in each row, column, and square. This is equivalent to asking, for each cell in the puzzle, if there is a possible value for that cell that does not appear elsewhere in the row.
Each of the following functions goes through all of the cells of a puzzle; if the set consisting of that cell minus the union of the other cells in the row is a singleton (that is, if exactly one of the possible values for that cell does not appear in any other cell in the row), that singleton must be the determined value of the cell. Otherwise, if the cell is determined, or if the computed set is not a singleton, the cell is left alone.

    let row_atleast_once p =
      let apply i c =
        if is_determined_cell c then c
        else
          let det = IntSet.diff c (row_union_excluding p (row_of_loc i) i) in
          if (IntSet.cardinal det) = 1 then det else c
      in
      Array.mapi apply p

    let col_atleast_once p =
      let apply i c =
        if is_determined_cell c then c
        else
          let det = IntSet.diff c (col_union_excluding p (col_of_loc i) i) in
          if (IntSet.cardinal det) = 1 then det else c
      in
      Array.mapi apply p

    let sqr_atleast_once p =
      let apply i c =
        if is_determined_cell c then c
        else
          let det = IntSet.diff c (sqr_union_excluding p (sqr_of_loc i) i) in
          if (IntSet.cardinal det) = 1 then det else c
      in
      Array.mapi apply p
Finally, consider a third strategy: in a given row, if there are two cells that both only contain, say, 3 and 5 as possibilities, then I know something special about that row: no other cell can be either 3 or 5. I do not know which of the two cells should contain the 3 and which should contain the 5, but none of the other seven can be either. This strategy may not determine any cell itself, but it will provide more information for subsequent applications of other strategies.

    let row_pairs p =
      let pairs = Array.init upper (multiples_in_row p 2) in
      let pair_sets = Array.init upper (fun i -> union_of_intsets pairs.(i)) in
      let apply i c =
        let j = row_of_loc i in
        if List.exists (IntSet.equal c) pairs.(j) then c
        else IntSet.diff c pair_sets.(j)
      in
      Array.mapi apply p

    let col_pairs p =
      let pairs = Array.init upper (multiples_in_col p 2) in
      let pair_sets = Array.init upper (fun i -> union_of_intsets pairs.(i)) in
      let apply i c =
        let j = col_of_loc i in
        if List.exists (IntSet.equal c) pairs.(j) then c
        else IntSet.diff c pair_sets.(j)
      in
      Array.mapi apply p

    let sqr_pairs p =
      let pairs = Array.init upper (multiples_in_sqr p 2) in
      let pair_sets = Array.init upper (fun i -> union_of_intsets pairs.(i)) in
      let apply i c =
        let j = sqr_of_loc i in
        if List.exists (IntSet.equal c) pairs.(j) then c
        else IntSet.diff c pair_sets.(j)
      in
      Array.mapi apply p
The way I have described this solution is certainly not the way I wrote it. I did start with the idea of using a set to describe each cell, and I did decide to see how far I could get by using a simple array for a puzzle. However, from that point on, I worked on the strategies, trying to figure out how to write them. Or, more precisely, how I wanted to write them. The numerous operations on cells and puzzles that I began with came out of this process, mostly by being factored out of strategies as I determined that they might be useful.
That is, in general, how I usually prefer to work: to start by deciding what raw materials I have to work with, then deciding how I ultimately want to solve the problem, and finally by bringing the two together by building tools from the former towards the latter.
This kind of programming is called "bottom up", as opposed to "top down", where I would start by writing the solver, leaving holes to be filled in as I got closer to the details of the solution. Either works, but I have found that bottom-up is more useful if I do not have a clear idea how to solve the problem or if my solution will need to be changed or extended later.
These three strategies that I have described are not the only possible strategies. In fact, Will Shortz in the book that started this described a strategy that was claimed to be necessary to solve the hardest puzzles in the book without guessing. I have not implemented that strategy here, and so far have found that claim to be true: these strategies do not solve the hardest puzzle.
Here are some questions for any budding mathematicians out there: How would you know if you have a group of strategies that would solve any puzzle? Is there such a group of strategies? If so, what would one be? What would the smallest set of strategies be? I do not know if you would solve any ground-breaking issues by answering any of those questions, but you would certainly get your name in the papers.
How do I solve the puzzles with the strategies? Simple: I loop through the 9 strategies, trying each in turn. When I have finished with all of them, I check to see if I have solved the puzzle, in which case I am done, or if I have reached an overconstrained situation, in which case I declare an error and give up, or if this version of the puzzle is the same as what I started with, in which case I am not going to be able to make any more progress and might as well give up again. If none of those conditions hold, I start over again, trying the strategies on the new puzzle state.

    exception Overconstrained
    exception Underconstrained of puzzle

    let rec constrained_solver p =
      let apply_constraints = List.fold_left (fun p cnst -> cnst p) in
      let q = apply_constraints p
          [row_constraints; col_constraints; sqr_constraints;
           row_atleast_once; col_atleast_once; sqr_atleast_once;
           row_pairs; col_pairs; sqr_pairs] in
      begin
        print_puzzle q; print_newline ();
        if solved q then q
        else if equal p q then raise (Underconstrained q)
        else if overconstrained q then raise Overconstrained
        else constrained_solver q
      end
I have called that the constrained_solver because it operates under the constraint that it cannot attempt to make progress by guessing. Instead, it must follow the strategies I have provided, and therefore it is limited in the problems it can solve.
How well does the solver do? The three variables easy, medium, and hard are three puzzles from the book, named for the difficulty associated with each.

    let puzzle_of_array a =
      Array.init nelements
        (fun i -> if a.(i) = 0 then empty_cell else filled_cell a.(i))

    let easy = puzzle_of_array [|
      1;4;0; 0;5;3; 6;0;0;
      0;0;0; 0;7;0; 0;0;0;
      0;5;6; 0;0;0; 7;0;0;
      8;0;9; 0;6;1; 0;2;0;
      7;0;0; 9;0;0; 0;0;0;
      0;0;0; 0;0;8; 0;1;0;
      0;0;0; 0;9;0; 0;0;0;
      0;6;3; 0;1;0; 4;5;0;
      0;7;1; 0;3;0; 0;6;0
    |]

    let medium = puzzle_of_array [|
      0;0;0; 6;0;8; 0;1;0;
      0;0;0; 0;0;0; 7;0;0;
      7;0;0; 5;0;0; 0;0;0;
      0;0;0; 7;0;0; 3;8;0;
      6;0;0; 0;0;3; 1;0;5;
      0;7;2; 0;0;0; 6;0;0;
      0;0;0; 3;0;0; 8;0;0;
      0;0;9; 0;0;0; 0;0;0;
      4;0;6; 0;2;0; 0;0;3
    |]

    let hard = puzzle_of_array [|
      0;3;0; 0;0;0; 4;0;5;
      0;0;0; 0;0;0; 0;0;0;
      0;0;7; 0;0;0; 0;1;0;
      0;0;0; 7;9;0; 5;0;0;
      0;0;6; 0;5;0; 8;0;0;
      2;0;0; 0;4;0; 0;7;3;
      0;0;0; 9;2;0; 0;0;7;
      0;4;0; 3;0;0; 9;0;0;
      0;0;0; 1;0;6; 0;0;0
    |]
The result of running constrained_solver on easy is:

    147|253|698
    398|176|245
    256|489|731
    -----------
    839|761|524
    714|925|386
    625|348|917
    -----------
    582|694|173
    963|817|452
    471|532|869
The result of running constrained_solver on medium is:

    243|678|519
    165|934|728
    798|512|436
    -----------
    951|746|382
    684|293|175
    372|185|694
    -----------
    527|369|841
    839|451|267
    416|827|953
In both of these cases, the solver has succeeded. All of the cells have been filled in and the puzzle is solved.
Finally, the result of running constrained_solver on hard is:

     3 |8 7|4 5
       |   |7  
      7|   | 1 
    -----------
       |79 |5  
      6|25 |8  
    2  |648|173
    -----------
       |924|  7
     4 |3 5|9 1
       |1 6|   
Clearly, the constrained solver is missing some strategy: it cannot solve the puzzle, although a comparison between the result and the original state of the puzzle shows that the solver has made some progress.
I could try to add some more strategies, but I was done fooling around. I intended to bring out the big gun: state space search.
Pure state space search, for a sudoku puzzle, would go something like this:
1. Find an undetermined cell. If none is to be found, the puzzle is solved.
2. Guess a value for the cell.
3. Test the result:
   a. If the puzzle is still legal, go back to step 1 and look for another undetermined cell.
   b. If the puzzle now violates one of the rules, give up. Go back to the last guess and try another value. (This is known as "backtracking", due to the similarity to wandering a maze leaving a breadcrumb trail.)
This method will eventually look at every possible solution for the puzzle, discarding those that violate the rules. It is guaranteed to find a solution if one exists.
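For comparison with what follows, here is a minimal sketch of pure state space search on its own, independent of the set-based framework. It assumes a plain array of 81 integers with 0 for an empty cell, and the names legal, find_empty, and solve are invented for this example. It checks each guess before writing it into the grid rather than after, which amounts to the same test.

```ocaml
(* The grid is a flat int array of 81 entries; 0 marks an empty cell. *)

(* Can digit v go at location i without clashing in its row, column,
   or square? Assumes g.(i) is currently 0. *)
let legal g i v =
  let r = i / 9 and c = i mod 9 in
  let ok = ref true in
  for k = 0 to 8 do
    let br = (r / 3) * 3 + (k / 3) and bc = (c / 3) * 3 + (k mod 3) in
    if g.(r * 9 + k) = v || g.(k * 9 + c) = v || g.(br * 9 + bc) = v then
      ok := false
  done;
  !ok

(* Step 1: location of the first empty cell at or after i, if any. *)
let rec find_empty g i =
  if i >= Array.length g then None
  else if g.(i) = 0 then Some i
  else find_empty g (i + 1)

(* Returns true with the solution left in g, or false with g restored. *)
let rec solve g =
  match find_empty g 0 with
  | None -> true                      (* nothing left to guess: solved *)
  | Some i ->
    let rec try_value v =
      if v > 9 then false             (* every guess failed: backtrack *)
      else if legal g i v && (g.(i) <- v; solve g) then true
      else begin
        g.(i) <- 0;                   (* undo the guess, try the next *)
        try_value (v + 1)
      end
    in
    try_value 1

(* The "easy" puzzle from earlier in the essay, as raw integers. *)
let easy = [|
  1;4;0; 0;5;3; 6;0;0;
  0;0;0; 0;7;0; 0;0;0;
  0;5;6; 0;0;0; 7;0;0;
  8;0;9; 0;6;1; 0;2;0;
  7;0;0; 9;0;0; 0;0;0;
  0;0;0; 0;0;8; 0;1;0;
  0;0;0; 0;9;0; 0;0;0;
  0;6;3; 0;1;0; 4;5;0;
  0;7;1; 0;3;0; 0;6;0
|]

let () =
  if solve easy then
    Array.iteri
      (fun i v ->
        print_int v;
        if i mod 9 = 8 then print_newline ())
      easy
  else print_endline "no solution"
```

Note that this version knows nothing about strategies: every empty cell is a guess.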
The difficulty is that there are probably 20-25 determined cells in the original puzzle, leaving 81-20=61 undetermined cells which can each take one of 9 different values. That is not a terribly large state space, as such things go, but guessing and backtracking can still waste an enormous amount of time. Plus, I have this lovely constrained_solver; have I wasted all the effort that went into it?
Well, no. Building an unconstrained solver is vastly simplified by using the constrained solver. The new method looks something like:
1. Attempt to solve the puzzle using the constrained_solver. If it succeeds, the result is a solution. (Yippie!)
2. If it does not, it fails in one of two ways:
   a. The resulting puzzle is underconstrained. There are still undetermined cells in it. Pick one, pick one of the possible values from the cell, and make a guess out of it. Go back to step 1 using the hypothetical puzzle.
   b. The resulting puzzle is overconstrained. Some guess made in the past is wrong. Go back to the last guess and try another.
The actual implementation of the unconstrained_solver below does not quite use that method. (Instead it keeps track of a list of puzzles in various stages of solvitude.) However, it is very similar and spiritually equivalent.

    let unconstrained_solver p =
      let rec make_guess p i =
        if i >= Array.length p
        then raise Overconstrained (* shouldn't happen; indicates solved puzzle *)
        else if is_determined_cell p.(i)
        then make_guess p (i+1)
        else
          let g = IntSet.choose p.(i) in (* the guess *)
          let cp = Array.copy p in
          begin
            cp.(i) <- (IntSet.singleton g);
            p.(i) <- (IntSet.remove g p.(i));
            (cp, p)
          end
      in
      let rec solve ps =
        match ps with
          [] -> raise Overconstrained
        | p::t ->
          try
            constrained_solver p
          with
            Overconstrained ->
              begin
                print_endline "backtracking"; print_newline ();
                solve t
              end
          | Underconstrained q ->
              let (p,q) = make_guess q 0 in
              begin
                print_endline "making guess"; print_newline ();
                solve (p::q::t)
              end
      in
      solve [p]
Unlike the constrained solver, the unconstrained_solver can solve the hard puzzle:

    132|867|495
    964|512|738
    857|439|216
    -----------
    481|793|562
    376|251|849
    295|648|173
    -----------
    513|924|687
    648|375|921
    729|186|354
In fact, it does a fairly good job. On the hard puzzles I tried, it only had to guess one or two times; all other determined cells were identified by the constrained solver's strategies. That is better than I do.