Last week we saw how to to implement a Sudoku solver using Python. The idea was to fill the most evident candidates for the empty cells and then make a guess for one empty cell and repeat the process recursively, backtracking when a guess led to a grid with no solutions.
While this process was simple, it had the disadvantage of producing too many combinations of possible solutions. In practice, it wasn’t efficient enough to solve harder Sudoku puzzles in a reasonable amount of time. This week we’ll see how to improve upon last week’s implementation to create a more efficient solver.
The original solver
We can see how the first solver worked by checking its code:
def solve(grid): grid = fill_singles(grid) if is_solution(grid): return grid if not is_valid_grid(grid): return None return make_guess(grid)
It starts by filling empty cells that have a single candidate according to Sudoku’s rules (not repeating any digit between 1 to 9 in any row, column or sub-grid). It returns a completely filled grid when it’s found. To signal failure to solve the grid, it returns None
. Failure to get a solution takes place when a cell with no more possible candidates is found. Finally, it makes a guess to fill the next empty cell with a plausible candidate to repeat the process (make_guess
calls solve
, creating a recursive process).
Exponential complexity
The issue with the last algorithm is that it will try many of the possible candidates combinations. As a simple estimate, let’s assume half of the cells (about 40) are empty and that they have on average 4 possible candidates. In that case, we’re talking about the algorithm having to test and backtrack trillions of trillions of possible combinations (9!^4). Evidently, it will not find a solution in any reasonable amount of time.
The issue here are the combinations of possible candidates on the grid. It increases exponentially. We will see of a way to cut down on the number of combinations to find a solution.
The new algorithm: filtering candidates
What we’ll do will be to “filter” the candidates before using them to “guess” the value of empty cells. By this filtering process, we’ll drastically decrease the amount of possible candidate combinations.
This new algorithm is implemented as follows:
def filtered_solve(grid): candidates = filter_candidates(grid) grid = fill_singles(grid, candidates) if is_solution(grid): return grid if not is_valid_grid(grid): return None return make_guess(grid, candidates)
There are two main differences. First, we have added a filtering of candidates with the filter_candidates
function. Next, we have modified the fill_singles
and the make_guess
functions to accept these filtered candidates
as a parameter. Let’s see in detail the necessary modifications.
Filter candidates
We will filter candidates by filling each empty cell with each one of its candidates, one at a time. Then we will check if this new grid, which includes the single “candidate value”, leads to an grid with no solutions (a grid with a cell with no possible candidates). If that’s the case, the “candidate value” will be deleted from the list of possible candidates for that cell.
from copy import deepcopy def filter_candidates(grid): test_grid = grid.copy() candidates = get_candidates(grid) filtered_candidates = deepcopy(candidates) for i in range(9): for j in range(9): # Check for empty cells if grid[i][j] == 0: for candidate in candidates[i][j]: # Use test candidate test_grid[i][j] = candidate # Remove candidate if it produces an invalid grid if not is_valid_grid(fill_singles(test_grid)): filtered_candidates[i][j].remove(candidate) # Revert changes test_grid[i][j] = 0 return filtered_candidates
Let’s see how this works with last week’s puzzle.
puzzle = """043080250 600000000 000001094 900004070 000608000 010200003 820500000 000000005 034090710"""
And it’s candidates:
>>> candidates = get_candidates(grid) [[[1, 7], [4], [3], [9, 7], [8], [9, 6, 7], [2], [5], [1, 6, 7]], [[6], [8, 9, 5, 7], [1, 2, 5, 7, 8, 9], [9, 3, 4, 7], [2, 3, 4, 5, 7], [2, 3, 5, 7, 9], [8, 1, 3], [8, 3], [8, 1, 7]], [[2, 5, 7], [8, 5, 7], [8, 2, 5, 7], [3, 7], [2, 3, 5, 6, 7], [1], [8, 3, 6], [9], [4]], [[9], [8, 5, 6], [8, 2, 5, 6], [1, 3], [1, 3, 5], [4], [8, 1, 5, 6], [7], [8, 1, 2, 6]], [[2, 3, 4, 5, 7], [5, 7], [2, 5, 7], [6], [1, 3, 5, 7], [8], [1, 4, 5, 9], [2, 4], [1, 2, 9]], [[4, 5, 7], [1], [8, 5, 6, 7], [2], [5, 7], [9, 5, 7], [4, 5, 6, 8, 9], [8, 4, 6], [3]], [[8], [2], [1, 9, 6, 7], [5], [1, 3, 4, 6, 7], [3, 6, 7], [9, 3, 4, 6], [3, 4, 6], [9, 6]], [[1, 7], [9, 6, 7], [1, 9, 6, 7], [1, 3, 4, 7, 8], [1, 2, 3, 4, 6, 7], [2, 3, 6, 7], [3, 4, 6, 8, 9], [2, 3, 4, 6, 8], [5]], [[5], [3], [4], [8], [9], [2, 6], [7], [1], [8, 2, 6]]]
Now let’s see how many candidates we have after filtering:
>>> filtered_candidates = filter_candidates(grid) [[1], [4], [3], [9], [8], [6], [2], [5], [6, 7]] [[6], [8, 9, 5, 7], [5, 7, 8, 9], [4], [2, 4, 5], [2, 5], [8, 1, 3], [8, 3], [1, 7]] [[2], [8, 5], [8, 5, 7], [3, 7], [3, 5, 6], [1], [8, 6], [9], [4]] [[9], [8, 5, 6], [8, 2, 5, 6], [1, 3], [1, 3, 5], [4], [8, 1, 5, 6], [7], [8]] [[3], [5, 7], [2, 5, 7], [6], [1, 3, 5, 7], [8], [1, 4, 9], [2, 4], [1, 2, 9]] [[4], [1], [8, 5, 6, 7], [2], [5, 7], [9, 5, 7], [5, 6, 8, 9], [8, 6], [3]] [[8], [2], [1, 9, 6], [5], [1, 3, 4, 6, 7], [3, 6, 7], [9, 3, 4, 6], [3, 4, 6], [9, 6]] [[7], [9, 6], [1, 9, 6], [1, 4], [1, 2, 3, 4, 6], [2, 3, 6], [3, 4, 6, 8, 9], [2, 3, 4, 6, 8], [5]] [[5], [3], [4], [8], [9], [2, 6], [7], [1], [2, 6]]
Not counting filled cells (those that initially had a single candidate), we have reduced the total number of candidates from 185 to 140. We pass from an average of 3.5 candidates per empty cell to about 2.5. For this particular case, by a single filter, we’re passing from about 10^28 possible combinations to 10^20. We can see the interest of filtering for our recursive algorithm.
Including the filtered candidates
Slight modification will be needed for the make_guess
and the fill_singles
functions to include the filtered candidates as a parameter.
def make_guess(grid, candidates=None): grid = grid.copy() if not candidates: candidates = get_candidates(grid) # Getting the shortest number of candidates > 1: min_len = sorted(list(set(map( len, np.array(candidates).reshape(1,81)[0]))))[1] for i in range(9): for j in range(9): if len(candidates[i][j]) == min_len: for guess in candidates[i][j]: grid[i][j] = guess solution = filtered_solve(grid) if solution is not None: return solution # Discarding a wrong guess grid[i][j] = 0
The changes on make_guess
are minor. We just include candidates
as a parameter and calculate it if it isn’t passed as a parameter. The second minor difference is that we call filtered_solve
, our new solver, recursively here.
The changes on fill_singles
are similar but we also require a helper function, merge
.
def fill_singles(grid, candidates=None): grid = grid.copy() if not candidates: candidates = get_candidates(grid) any_fill = True while any_fill: any_fill = False for i in range(9): for j in range(9): if len(candidates[i][j]) == 1 and grid[i][j] == 0: grid[i][j] = candidates[i][j][0] candidates = merge(get_candidates(grid), candidates) any_fill = True return grid
The merge
function is called to update the list of candidates after each cell is filled. We do so by comparing the regular candidates list with the filtered one and taking the shorter list of candidates for each cell:
def merge(candidates_1, candidates_2): candidates_min = [] for i in range(9): row = [] for j in range(9): if len(candidates_1[i][j]) < len(candidates_2[i][j]): row.append(candidates_1[i][j][:]) else: row.append(candidates_2[i][j][:]) candidates_min.append(row) return candidates_min
With this, we have finished with all of the necessary changes for the improved solver! The rest of the helper functions haven’t changed. Be sure to check the original solver for a detailed explanation.
Let’s see how the new solver works.
Solving a hard Sudoku puzzle
I’m not familiar enough with Sudoku to qualify this puzzle as a difficult one, but I verified that last week’s solver was incapable of solving it.
The puzzle (remember that each zero represents an empty cell):
puzzle = """100920000 524010000 000000070 050008102 000000000 402700090 060000000 000030945 000071006"""
And its solution using filtered_solve
:
>>> filtered_solve(create_grid(puzzle)) array([[1, 7, 6, 9, 2, 3, 5, 8, 4], [5, 2, 4, 8, 1, 7, 6, 3, 9], [8, 9, 3, 6, 5, 4, 2, 7, 1], [9, 5, 7, 3, 4, 8, 1, 6, 2], [6, 3, 8, 1, 9, 2, 4, 5, 7], [4, 1, 2, 7, 6, 5, 3, 9, 8], [2, 6, 5, 4, 8, 9, 7, 1, 3], [7, 8, 1, 2, 3, 6, 9, 4, 5], [3, 4, 9, 5, 7, 1, 8, 2, 6]])
The correct solution!
While it takes a few seconds to get a solution, this is a vast improvement compared to the original solver. You can try using it to solve this puzzle to compare. On my side, I verified that it couldn’t solve this puzzle even after several minutes.
Bonus: You can find the complete code of both solvers here.
Nice job!
Thanks!
This is very cool…I have been trying the code out myself with various combinations…!!!
If I apply, for example:
diff_puzzle =
“””852976243
679143285
031258769
314527896
768391450
925600371
543862917
197435028
I get an error…object of type int32 has no len()…(in the make guess function)any ideas what I have done wrong? Is there a certain type sudoku this will struggle with? Perhaps I have miscoded?
Thank you very much for any help.
Hello Toby!
Thank you for your comment. I tried solving the puzzle you shared and got a similar error message. I’m not 100% sure about what happens. Then I realized that
diff_puzzle
isn’t a valid sudoku. See that on the first row,2
appears two times. To avoid this problem, we should verify if the input sudoku is a valid one before attempting to solve it by checking that every number appears only once for each row, column and sub-grid.The method of solving does not work for some puzzles. For this puzzle:
[[2 6 0 0 7 4 0 0 0]
[4 0 9 0 8 0 0 3 0]
[0 0 0 0 2 0 0 0 0]
[3 0 0 7 0 0 0 4 0]
[0 0 5 0 6 0 0 0 0]
[0 0 6 5 0 9 0 2 0]
[0 0 0 0 0 0 3 8 0]
[0 0 7 0 9 0 2 0 1]
[0 0 0 0 0 0 0 6 0]]
In the make_guess function
min_len = sorted(list(set(map(len, np.array(candidates).reshape(1,81)[0]))))[1]
returns ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 2 dimensions. The detected shape was (9, 9) + inhomogeneous part.
And in case you’re wondering, it’s a valid puzzle with a solution.