During this week I was working on one of Project Euler’s problems. The meat of the problem was implementing a Sudoku solver. After much work, I was finally able to program a solver that recursively tried filling the empty cases and tracked back when necessary. In this post we will see how to implement a simple version of the solver.
Note: This solver is a simplified implementation. It will take too long to solve difficult Sudoku puzzles. For a more efficient solution based on this one, see here.
Setting up the Sudoku grid
Sudoku: What is it?
If you’re not familiar with it, Sudoku is a mathematical puzzle where you have a 9×9 grid partially filled with digits from 1 to 9. The grid is also divided into nine 3×3 sub-grids. The goal is to fill the grid’s empty cases, having each digit between 1-9 appear only once in each row, column and 3×3 sub-grid.
Using Numpy for the grid
For the 9×9 grid, we will use Numpy arrays (the experience I got using Numpy to implement a tic-tac-toe board came in handy here). We will use as input a string of digits. Empty cases will be denoted by 0
and each line of the string will represent a row of the grid. That means we will have 9 lines of 9 digits each. As an example, we will work with:
puzzle = """043080250 600000000 000001094 900004070 000608000 010200003 820500000 000000005 034090710"""
We can populate the 9×9 Sudoku grid with the following function:
def create_grid(puzzle_str): # Deleting whitespaces and newlines (\n) lines = puzzle_str.replace(' ','').replace('\n','') # Converting it to a list of digits digits = list(map(int, lines)) # Turning it to a 9x9 numpy array grid = np.array(digits).reshape(9,9) return grid
Which will return the grid as expected:
>>> grid = create_grid(puzzle) array([[0, 4, 3, 0, 8, 0, 2, 5, 0], [6, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 1, 0, 9, 4], [9, 0, 0, 0, 0, 4, 0, 7, 0], [0, 0, 0, 6, 0, 8, 0, 0, 0], [0, 1, 0, 2, 0, 0, 0, 0, 3], [8, 2, 0, 5, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 5], [0, 3, 4, 0, 9, 0, 7, 1, 0]])
Dividing into sub-grids
Part of the analysis has to do with checking the values of each sub-grid. We can group the values of each 3×3 sub-grid as follows:
def get_subgrids(grid): subgrids = [] for box_i in range(3): for box_j in range(3): subgrid = [] for i in range(3): for j in range(3): subgrid.append(grid[3*box_i + i][3*box_j + j]) subgrids.append(subgrid) return np.array(subgrids)
This will return the following sub-grids for our original puzzle:
>>> subgrids = get_subgrids(grid) array([[0, 4, 3, 6, 0, 0, 0, 0, 0], [0, 8, 0, 0, 0, 0, 0, 0, 1], [2, 5, 0, 0, 0, 0, 0, 9, 4], [9, 0, 0, 0, 0, 0, 0, 1, 0], [0, 0, 4, 6, 0, 8, 2, 0, 0], [0, 7, 0, 0, 0, 0, 0, 0, 3], [8, 2, 0, 0, 0, 0, 0, 3, 4], [5, 0, 0, 0, 0, 0, 0, 9, 0], [0, 0, 0, 0, 0, 5, 7, 1, 0]])
See that we’re dividing the original grid into groups of 3×3 sub-grids from left to right, top to bottom.
Getting case candidates
We know that each digit can occur only once for each row, column and sub-grid. That means that each case can be filled with digits that don’t appear in the case’s row, column and sub-grid. We will call that set of options “candidates”.
The following function implements this idea:
def get_candidates(grid): # Helper function def subgrid_index(i, j): return (i//3) * 3 + j // 3 subgrids = get_subgrids(grid) candidates = [] for i in range(9): row_candidates = [] for j in range(9): # Row, column and subgrid digits row = set(grid[i]) col = set(grid[:, j]) sub = set(subgrids[subgrid_index(i, j)]) common = row | col | sub candidates = set(range(10)) - common # If the case is filled take its value as the only candidate if not grid[i][j]: row_candidates.append(list(candidates)) else: row_candidates.append([grid[i][j]]) candidates.append(row_candidates) return candidates
We have used Python set’s operations to handle the candidates. If you aren’t familiar with them, you can see a worked example here.
The candidates are implemented as nested lists. We made use of the helper function subgrid_index
that returns the corresponding sub-grid index for the i
and j
indices of the grid’s case.
This function returns the following candidates for our current puzzle:
>>> candidates = get_candidates(grid) [[[1, 7], [4], [3], [9, 7], [8], [9, 6, 7], [2], [5], [1, 6, 7]], [[6], [8, 9, 5, 7], [1, 2, 5, 7, 8, 9], [9, 3, 4, 7], [2, 3, 4, 5, 7], [2, 3, 5, 7, 9], [8, 1, 3], [8, 3], [8, 1, 7]], [[2, 5, 7], [8, 5, 7], [8, 2, 5, 7], [3, 7], [2, 3, 5, 6, 7], [1], [8, 3, 6], [9], [4]], [[9], [8, 5, 6], [8, 2, 5, 6], [1, 3], [1, 3, 5], [4], [8, 1, 5, 6], [7], [8, 1, 2, 6]], [[2, 3, 4, 5, 7], [5, 7], [2, 5, 7], [6], [1, 3, 5, 7], [8], [1, 4, 5, 9], [2, 4], [1, 2, 9]], [[4, 5, 7], [1], [8, 5, 6, 7], [2], [5, 7], [9, 5, 7], [4, 5, 6, 8, 9], [8, 4, 6], [3]], [[8], [2], [1, 9, 6, 7], [5], [1, 3, 4, 6, 7], [3, 6, 7], [9, 3, 4, 6], [3, 4, 6], [9, 6]], [[1, 7], [9, 6, 7], [1, 9, 6, 7], [1, 3, 4, 7, 8], [1, 2, 3, 4, 6, 7], [2, 3, 6, 7], [3, 4, 6, 8, 9], [2, 3, 4, 6, 8], [5]], [[5], [3], [4], [8], [9], [2, 6], [7], [1], [8, 2, 6]]]
For instance, the case at the 3th row (i
= 2) and the 1st column (j
= 0) could be filled with any of the digits candidates[i][j]
. That means any of the digits in the list [2, 5, 7]
.
Filling single candidates
Once we have generated the candidates for each case, we will see that some of them can be filled by only a single candidate. We will start filling the grid with those “single candidates”.
def fill_singles(grid): grid = grid.copy() candidates = get_candidates(grid) any_fill = True while any_fill: any_fill = False for i in range(9): for j in range(9): if len(candidates[i][j]) == 1 and grid[i][j] == 0: grid[i][j] = candidates[i][j][0] candidates = get_candidates(grid) any_fill = True return grid
This function is implemented in a loop. Every time we fill any case, we generate a new set of possible candidates until there aren’t any new filled cases.
When we apply this function to our grid, we get:
>>> fill_singles(grid) array([[ 0 , 4, 3, 0 , 8, 0, 2, 5, 0], [ 6 , 0, 0, 0 , 0, 0, 0, 0, 0], [ 0 , 0, 0, 0 , 0, 1, 0, 9, 4], [ 9 , 0, 0, 0 , 0, 4, 0, 7, 0], [ 0 , 0, 0, 6 , 0, 8, 0, 0, 0], [ 0 , 1, 0, 2 , 0, 0, 0, 0, 3], [ 8 , 2, 0, 5 , 0, 0, 0, 0, 0], [ 0 , 0, 0, 0 , 0, 0, 0, 0, 5], [*5*, 3, 4, *8*, 9, 0, 7, 1, 0]])
I have enclosed the new inputs with asterisks to remark them. We have only two new values filled in this case. But the fill_singles
function alone will be able to solve some of the easiest Sudoku puzzles!
Once we’re stuck solving Sudoku puzzles, like in this case, we will use a recursive approach.
Recursive Sudoku solver
To implement a Sudoku solver, we will follow four steps:
- Fill cases with only one candidate.
- If the grid is completely filled, return it.
- Check if the current grid is “valid”. If it isn’t erase the last guess and continue with the next guess.
- If the current grid is both incomplete and “valid”, fill one case with a guess and go back to 1.
The fill_singles
function of the last section takes care of the first step. Let’s see how to implement the other steps.
Verifying a solution
The solver will return a grid when it’s completely filled. A solution verifies that all rows, columns and sub-grids have only one occurrence of the digits 1 to 9:
def is_solution(grid): if np.all(np.sum(grid, axis=1) == 45) and \ np.all(np.sum(grid, axis=0) == 45) and \ np.all(np.sum(get_subgrids(grid), axis=1) == 45): return True return False
With this function, we indirectly verify the uniqueness of each digit by taking the sum of the grid values along each row, column and sub-grid. If we sum the digits 1 to 9, we get 45.
Invalid guess
We will know when to backtrack when we reach a Sudoku grid that has a cell that can’t be filled with any value. For example, this could happen if by checking the possible candidates in a cell’s row, we can fill it only with the digit 5 but that digit is already part of its corresponding column or sub-grid. This no-solution situations will arise as empty list candidates. We can easily check for them with:
def is_valid_grid(grid): candidates = get_candidates(grid) for i in range(9): for j in range(9): if len(candidates[i][j]) == 0: return False return True
Making a guess
Whenever we reach a point where we can’t further fill the Sudoku grid by filling single candidates, we will fill a case by guessing which one of its candidates is the correct one. In order to increase the chances of guessing correctly, the guess will be taken from a case that has the least number of possible candidates. Then we will try to find a solution for the grid with the “guessed” value.
def make_guess(grid): grid = grid.copy() candidates = get_candidates(grid) # Getting the shortest number of candidates > 1: min_len = sorted(list(set(map( len, np.array(candidates).reshape(1,81)[0]))))[1] for i in range(9): for j in range(9): if len(candidates[i][j]) == min_len: for guess in candidates[i][j]: grid[i][j] = guess solution = solve(grid) if solution is not None: return solution # Discarding incorrect guess grid[i][j] = 0
The recursive step takes place here, when we call the solve
function that restarts the process of attempting a solution on a new grid. The failing condition of the solve
function will return a None
object, at which point the grid will revert back the changes on the grid and continue with the next possible guesses.
Finally, let’s see how the solve function puts it all together.
Recursive solver
The solver follows the steps described at the beginning of this section:
def solve(grid): grid = fill_singles(grid) if is_solution(grid): return grid if not is_valid_grid(grid): return None return make_guess(grid)
The recursive step and the track back takes place inside the make_guess
function. We have seen this function will call the solve
function again, but on a new grid and revert the changes on failure.
Applying the solve
function on our puzzle, we get:
>>> solve(create_grid(puzzle)) array([[1, 4, 3, 9, 8, 6, 2, 5, 7], [6, 7, 9, 4, 2, 5, 3, 8, 1], [2, 8, 5, 7, 3, 1, 6, 9, 4], [9, 6, 2, 3, 5, 4, 1, 7, 8], [3, 5, 7, 6, 1, 8, 9, 4, 2], [4, 1, 8, 2, 7, 9, 5, 6, 3], [8, 2, 1, 5, 6, 7, 4, 3, 9], [7, 9, 6, 1, 4, 3, 8, 2, 5], [5, 3, 4, 8, 9, 2, 7, 1, 6]])
Which successfully returns a solution to the Sudoku puzzle!
Final analysis and conclusion
The solver implemented in this post, will perform a depth-first search of the viable options to solve the Sudoku puzzle. In the worst case scenario, it will traverse most possible combinations of candidates, discarding some as it fills the grid. This makes it an exponential algorithm. Due to the exponential increase of complexity, this solver will take too long to solve harder Sudoku puzzles. For those puzzles, the candidate combinations should be filtered more efficiently before attempting the recursive step.
You can use this sudoku solver to help you solve your puzzle in an easy way
They have other social media in here: Sudoku solver