A Sudoku solver using Python

During this week I was working on one of Project Euler’s problems. The meat of the problem was implementing a Sudoku solver. After much work, I was finally able to program a solver that recursively tried filling the empty cases and tracked back when necessary. In this post we will see how to implement a simple version of the solver.

Note: This solver is a simplified implementation. It will take too long to solve difficult Sudoku puzzles. For a more efficient solution based on this one, see here.

Setting up the Sudoku grid

Sudoku: What is it?

If you’re not familiar with it, Sudoku is a mathematical puzzle where you have a 9×9 grid partially filled with digits from 1 to 9. The grid is also divided into nine 3×3 sub-grids. The goal is to fill the grid’s empty cases, having each digit between 1-9 appear only once in each row, column and 3×3 sub-grid.

Using Numpy for the grid

For the 9×9 grid, we will use Numpy arrays (the experience I got using Numpy to implement a tic-tac-toe board came in handy here). We will use as input a string of digits. Empty cases will be denoted by 0 and each line of the string will represent a row of the grid. That means we will have 9 lines of 9 digits each. As an example, we will work with:

puzzle = """043080250
            600000000
            000001094
            900004070
            000608000
            010200003
            820500000
            000000005
            034090710"""

We can populate the 9×9 Sudoku grid with the following function:

def create_grid(puzzle_str):
    # Deleting whitespaces and newlines (\n)
    lines = puzzle_str.replace(' ','').replace('\n','')
    # Converting it to a list of digits
    digits = list(map(int, lines))
    # Turning it to a 9x9 numpy array
    grid = np.array(digits).reshape(9,9)
    return grid

Which will return the grid as expected:

>>> grid = create_grid(puzzle)
array([[0, 4, 3, 0, 8, 0, 2, 5, 0],
       [6, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 1, 0, 9, 4],
       [9, 0, 0, 0, 0, 4, 0, 7, 0],
       [0, 0, 0, 6, 0, 8, 0, 0, 0],
       [0, 1, 0, 2, 0, 0, 0, 0, 3],
       [8, 2, 0, 5, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 5],
       [0, 3, 4, 0, 9, 0, 7, 1, 0]])

Dividing into sub-grids

Part of the analysis has to do with checking the values of each sub-grid. We can group the values of each 3×3 sub-grid as follows:

def get_subgrids(grid):
    subgrids = []
    for box_i in range(3):
        for box_j in range(3):
            subgrid = []
            for i in range(3):
                for j in range(3):
                    subgrid.append(grid[3*box_i + i][3*box_j + j])
            subgrids.append(subgrid)
    return np.array(subgrids)

This will return the following sub-grids for our original puzzle:

>>> subgrids = get_subgrids(grid)
array([[0, 4, 3, 6, 0, 0, 0, 0, 0],
       [0, 8, 0, 0, 0, 0, 0, 0, 1],
       [2, 5, 0, 0, 0, 0, 0, 9, 4],
       [9, 0, 0, 0, 0, 0, 0, 1, 0],
       [0, 0, 4, 6, 0, 8, 2, 0, 0],
       [0, 7, 0, 0, 0, 0, 0, 0, 3],
       [8, 2, 0, 0, 0, 0, 0, 3, 4],
       [5, 0, 0, 0, 0, 0, 0, 9, 0],
       [0, 0, 0, 0, 0, 5, 7, 1, 0]])

See that we’re dividing the original grid into groups of 3×3 sub-grids from left to right, top to bottom.

Getting case candidates

We know that each digit can occur only once for each row, column and sub-grid. That means that each case can be filled with digits that don’t appear in the case’s row, column and sub-grid. We will call that set of options “candidates”.

The following function implements this idea:

def get_candidates(grid):
    # Helper function
    def subgrid_index(i, j):
        return (i//3) * 3 + j // 3
    subgrids = get_subgrids(grid)
    candidates = []
    for i in range(9):
        row_candidates = []
        for j in range(9):
            # Row, column and subgrid digits
            row = set(grid[i])
            col = set(grid[:, j])
            sub = set(subgrids[subgrid_index(i, j)])
            common = row | col | sub
            candidates = set(range(10)) - common
            # If the case is filled take its value as the only candidate
            if not grid[i][j]:
                row_candidates.append(list(candidates))
            else:
                row_candidates.append([grid[i][j]])
        candidates.append(row_candidates)
    return candidates

We have used Python set’s operations to handle the candidates. If you aren’t familiar with them, you can see a worked example here.

The candidates are implemented as nested lists. We made use of the helper function subgrid_index that returns the corresponding sub-grid index for the i and j indices of the grid’s case.

This function returns the following candidates for our current puzzle:

>>> candidates = get_candidates(grid)
[[[1, 7], [4], [3], [9, 7], [8], [9, 6, 7], [2], [5], [1, 6, 7]],
 [[6], [8, 9, 5, 7], [1, 2, 5, 7, 8, 9], [9, 3, 4, 7], [2, 3, 4, 5, 7], [2, 3, 5, 7, 9], [8, 1, 3], [8, 3], [8, 1, 7]],
 [[2, 5, 7], [8, 5, 7], [8, 2, 5, 7], [3, 7], [2, 3, 5, 6, 7], [1], [8, 3, 6], [9], [4]],
 [[9], [8, 5, 6], [8, 2, 5, 6], [1, 3], [1, 3, 5], [4], [8, 1, 5, 6], [7], [8, 1, 2, 6]],
 [[2, 3, 4, 5, 7], [5, 7], [2, 5, 7], [6], [1, 3, 5, 7], [8], [1, 4, 5, 9], [2, 4], [1, 2, 9]],
 [[4, 5, 7], [1], [8, 5, 6, 7], [2], [5, 7], [9, 5, 7], [4, 5, 6, 8, 9], [8, 4, 6], [3]],
 [[8], [2], [1, 9, 6, 7], [5], [1, 3, 4, 6, 7], [3, 6, 7], [9, 3, 4, 6], [3, 4, 6], [9, 6]],
 [[1, 7], [9, 6, 7], [1, 9, 6, 7], [1, 3, 4, 7, 8], [1, 2, 3, 4, 6, 7], [2, 3, 6, 7], [3, 4, 6, 8, 9], [2, 3, 4, 6, 8], [5]],
 [[5], [3], [4], [8], [9], [2, 6], [7], [1], [8, 2, 6]]]

For instance, the case at the 3th row (i = 2) and the 1st column (j = 0) could be filled with any of the digits candidates[i][j]. That means any of the digits in the list [2, 5, 7].

Filling single candidates

Once we have generated the candidates for each case, we will see that some of them can be filled by only a single candidate. We will start filling the grid with those “single candidates”.

def fill_singles(grid):
    grid = grid.copy()
    candidates = get_candidates(grid)
    any_fill = True
    while any_fill:
        any_fill = False
        for i in range(9):
            for j in range(9):
                if len(candidates[i][j]) == 1 and grid[i][j] == 0:
                    grid[i][j] = candidates[i][j][0]
                    candidates = get_candidates(grid)
                    any_fill = True

    return grid

This function is implemented in a loop. Every time we fill any case, we generate a new set of possible candidates until there aren’t any new filled cases.

When we apply this function to our grid, we get:

>>> fill_singles(grid)
array([[ 0 , 4, 3,  0 , 8, 0, 2, 5, 0],
       [ 6 , 0, 0,  0 , 0, 0, 0, 0, 0],
       [ 0 , 0, 0,  0 , 0, 1, 0, 9, 4],
       [ 9 , 0, 0,  0 , 0, 4, 0, 7, 0],
       [ 0 , 0, 0,  6 , 0, 8, 0, 0, 0],
       [ 0 , 1, 0,  2 , 0, 0, 0, 0, 3],
       [ 8 , 2, 0,  5 , 0, 0, 0, 0, 0],
       [ 0 , 0, 0,  0 , 0, 0, 0, 0, 5],
       [*5*, 3, 4, *8*, 9, 0, 7, 1, 0]])

I have enclosed the new inputs with asterisks to remark them. We have only two new values filled in this case. But the fill_singles function alone will be able to solve some of the easiest Sudoku puzzles!

Once we’re stuck solving Sudoku puzzles, like in this case, we will use a recursive approach.

Recursive Sudoku solver

To implement a Sudoku solver, we will follow four steps:

Fill cases with only one candidate.
If the grid is completely filled, return it.
Check if the current grid is “valid”. If it isn’t erase the last guess and continue with the next guess.
If the current grid is both incomplete and “valid”, fill one case with a guess and go back to 1.

The fill_singles function of the last section takes care of the first step. Let’s see how to implement the other steps.

Verifying a solution

The solver will return a grid when it’s completely filled. A solution verifies that all rows, columns and sub-grids have only one occurrence of the digits 1 to 9:

def is_solution(grid):
    if np.all(np.sum(grid, axis=1) == 45) and \
       np.all(np.sum(grid, axis=0) == 45) and \
       np.all(np.sum(get_subgrids(grid), axis=1) == 45):
        return True
    return False

With this function, we indirectly verify the uniqueness of each digit by taking the sum of the grid values along each row, column and sub-grid. If we sum the digits 1 to 9, we get 45.

Invalid guess

We will know when to backtrack when we reach a Sudoku grid that has a cell that can’t be filled with any value. For example, this could happen if by checking the possible candidates in a cell’s row, we can fill it only with the digit 5 but that digit is already part of its corresponding column or sub-grid. This no-solution situations will arise as empty list candidates. We can easily check for them with:

def is_valid_grid(grid):
    candidates = get_candidates(grid)
    for i in range(9):
        for j in range(9):
            if len(candidates[i][j]) == 0:
                return False
    return True

Making a guess

Whenever we reach a point where we can’t further fill the Sudoku grid by filling single candidates, we will fill a case by guessing which one of its candidates is the correct one. In order to increase the chances of guessing correctly, the guess will be taken from a case that has the least number of possible candidates. Then we will try to find a solution for the grid with the “guessed” value.

def make_guess(grid):
    grid = grid.copy()
    candidates = get_candidates(grid)
    # Getting the shortest number of candidates > 1:
    min_len = sorted(list(set(map(
        len, np.array(candidates).reshape(1,81)[0]))))[1]
    for i in range(9):
        for j in range(9):
            if len(candidates[i][j]) == min_len:
                for guess in candidates[i][j]:
                    grid[i][j] = guess
                    solution = solve(grid)
                    if solution is not None:
                        return solution
                    # Discarding incorrect guess
                    grid[i][j] = 0

The recursive step takes place here, when we call the solve function that restarts the process of attempting a solution on a new grid. The failing condition of the solve function will return a None object, at which point the grid will revert back the changes on the grid and continue with the next possible guesses.

Finally, let’s see how the solve function puts it all together.

Recursive solver

The solver follows the steps described at the beginning of this section:

def solve(grid):
    grid = fill_singles(grid)
    if is_solution(grid):
        return grid
    if not is_valid_grid(grid):
        return None
    return make_guess(grid)

The recursive step and the track back takes place inside the make_guess function. We have seen this function will call the solve function again, but on a new grid and revert the changes on failure.

Applying the solve function on our puzzle, we get:

>>> solve(create_grid(puzzle))
array([[1, 4, 3, 9, 8, 6, 2, 5, 7],
       [6, 7, 9, 4, 2, 5, 3, 8, 1],
       [2, 8, 5, 7, 3, 1, 6, 9, 4],
       [9, 6, 2, 3, 5, 4, 1, 7, 8],
       [3, 5, 7, 6, 1, 8, 9, 4, 2],
       [4, 1, 8, 2, 7, 9, 5, 6, 3],
       [8, 2, 1, 5, 6, 7, 4, 3, 9],
       [7, 9, 6, 1, 4, 3, 8, 2, 5],
       [5, 3, 4, 8, 9, 2, 7, 1, 6]])

Which successfully returns a solution to the Sudoku puzzle!

Final analysis and conclusion

The solver implemented in this post, will perform a depth-first search of the viable options to solve the Sudoku puzzle. In the worst case scenario, it will traverse most possible combinations of candidates, discarding some as it fills the grid. This makes it an exponential algorithm. Due to the exponential increase of complexity, this solver will take too long to solve harder Sudoku puzzles. For those puzzles, the candidate combinations should be filtered more efficiently before attempting the recursive step.