"Model Sudoku as an optimization problem"
Sudoku is one of my favorite puzzles. In classic Sudoku, the goal is to fill a 9×9 grid with digits so that each column, each row, and each of the nine 3×3 sub-grids contains the digits 1-9 without repetition. At the start, we are provided a partially completed grid, which, for a well-posed puzzle, has a single solution.
Software Engineers may come across Sudoku in their LeetCode coding interview preparations through problems such as "Valid Sudoku" (medium; determine if a Sudoku board is valid) and "Sudoku Solver" (hard; solve a Sudoku puzzle).
Throughout the years that I interviewed Operations Research specialists, I liked to ask a variant of these two questions: model Sudoku as an optimization problem. Familiarity with algebraic modeling languages and the assumed access to a solver make this a relatively easy problem to tackle.
Most optimization problems have three components: (1) an objective, (2) a set of variables, and (3) a set of constraints. In the following, we'll model Sudoku as an optimization problem, defining these three components.
Objective
Sudoku doesn't have an objective in the sense that there is nothing to minimize or maximize — it is simply a feasibility problem. Some candidates would start to model the problem by defining the "objective" as or , but would become uncomfortable simply replacing the "" by a scalar like 1. While it may look odd, or are perfectly fine objectives!
Variables
As Robert Frost would say in his famous poem, "The Road Not Taken," this is where the roads diverge. Few candidates would formulate the problem as a Constraint Programming (CP) problem, allowing them to make later use of logical constraints such as AllDifferent. At this point, they would introduce 9 × 9 integer variables (ranging from 1 to 9). But most candidates would formulate the problem as a Mixed-Integer Linear Program (MIP), which is how we'll formulate it here. They would introduce 9×9×9 binary variables as follows:
where are the sets of rows, columns, and values, respectively. Here, means that cell takes value .
Why 9×9×9 = 729 binary variables instead of 81 integer variables? Because this encoding turns the problem into a linear program — the constraints become simple sums of binary variables, which solvers handle very efficiently.
Constraints
We need four families of constraints to encode the rules of Sudoku:
Cell constraint. Each cell must contain exactly one value:
Row constraint. Each value appears exactly once in each row:
Column constraint. Each value appears exactly once in each column:
Subgrid constraint. Each value appears exactly once in each subgrid (where for standard Sudoku):
Finally, we fix the values of the given clues:
That's the entire model. Note how clean and declarative this is compared to a backtracking solver — we simply state what the solution must satisfy, and let the solver figure out how.
Implementation
We'll use AMPL as our algebraic modeling language and SCIP as our solver. Here's the model, which generalizes to any Sudoku:
ampl.eval("""
param n integer > 0;
param N := n * n;
set R := 1 .. N; # Rows
set C := 1 .. N; # Columns
set V := 1 .. N; # Values
param clue {R, C} default 0;
var x {R, C, V} binary;
subject to CellConstraint {r in R, c in C}:
sum {v in V} x[r, c, v] == 1;
subject to RowConstraint {r in R, v in V}:
sum {c in C} x[r, c, v] == 1;
subject to ColumnConstraint {c in C, v in V}:
sum {r in R} x[r, c, v] == 1;
subject to SubgridConstraint {a in 0 .. n - 1, b in 0 .. n - 1, v in V}:
sum {c in 1 .. n, d in 1 .. n} x[a * n + c, b * n + d, v] == 1;
subject to Clues {r in R, c in C: clue[r, c] > 0}:
x[r, c, clue[r, c]] == 1;
""")The mapping from the mathematical formulation to the AMPL code is almost one-to-one — a hallmark of good algebraic modeling languages.
Validating: is the solution unique?
Solving a Sudoku is only half the story. A well-posed Sudoku puzzle must have a unique solution. How can we verify this?
The idea is simple: after finding a first solution, we add a constraint that excludes it, then solve again. If the solver reports the problem as infeasible, the original solution was unique. If it finds a second solution, the puzzle is ambiguous.
We exclude the first solution by requiring that at least one cell must differ:
In AMPL:
ampl.eval("""
param exclude {R, C} default 0;
subject to ExcludeSolution:
sum {r in R, c in C: exclude[r, c] > 0} x[r, c, exclude[r, c]] <= N * N - 1;
""")And the full solve-and-validate loop:
def solve_sudoku(clue, verbose=False):
"""
Solves a Sudoku given a clue.
Returns if the solution is unique, a first solution,
and (if non-unique) a second solution.
"""
# ... (set up model and parameters) ...
# 1st solve
ampl.solve(verbose=verbose)
solution_1 = extract_solution()
# Exclude this solution
ampl.param["exclude"] = solution_1
# 2nd solve
ampl.solve(verbose=verbose)
# If infeasible → unique; if solved → not unique
is_unique = (solve_result_1 == "solved"
and solve_result_2 == "infeasible")
return is_unique, solution_1, solution_2Solving hard Sudokus
Let's put our solver to the test with two famously difficult puzzles.
A 17-clue Sudoku
The following puzzle, from McGuire et al. (2014), has only 17 clues — the minimum number possible for a 9×9 Sudoku with a unique solution (more on this in the next post):
clue = np.array([
[0, 0, 0, 8, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 4, 3],
[5, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 7, 0, 8, 0, 0],
[0, 0, 0, 0, 0, 0, 1, 0, 0],
[0, 2, 0, 0, 3, 0, 0, 0, 0],
[6, 0, 0, 0, 0, 0, 0, 7, 5],
[0, 0, 3, 4, 0, 0, 0, 0, 0],
[0, 0, 0, 2, 0, 0, 6, 0, 0]
])
is_unique, solution_1, _ = solve_sudoku(clue)
print(f"Unique: {is_unique}") # TrueThe solver finds the unique solution in a fraction of a second:
2 3 7 | 8 4 1 | 5 6 9
1 8 6 | 7 9 5 | 2 4 3
5 9 4 | 3 2 6 | 7 1 8
------+-------+------
3 1 5 | 6 7 4 | 8 9 2
4 6 9 | 5 8 2 | 1 3 7
7 2 8 | 1 3 9 | 4 5 6
------+-------+------
6 4 2 | 9 1 8 | 3 7 5
8 5 3 | 4 6 7 | 9 2 1
9 7 1 | 2 5 3 | 6 8 4Arto Inkala's "world's hardest Sudoku"
In 2012, Finnish mathematician Arto Inkala designed what he claimed to be the world's hardest Sudoku. It has 21 clues and is notoriously difficult for human solvers and logic-based algorithms:
clue = np.array([
[8, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 3, 6, 0, 0, 0, 0, 0],
[0, 7, 0, 0, 9, 0, 2, 0, 0],
[0, 5, 0, 0, 0, 7, 0, 0, 0],
[0, 0, 0, 0, 4, 5, 7, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 3, 0],
[0, 0, 1, 0, 0, 0, 0, 6, 8],
[0, 0, 8, 5, 0, 0, 0, 1, 0],
[0, 9, 0, 0, 0, 0, 4, 0, 0]
])
is_unique, solution_1, _ = solve_sudoku(clue)
print(f"Unique: {is_unique}") # TrueAgain, our MIP solver handles it with ease:
8 1 2 | 7 5 3 | 6 4 9
9 4 3 | 6 8 2 | 1 7 5
6 7 5 | 4 9 1 | 2 8 3
------+-------+------
1 5 4 | 2 3 7 | 8 9 6
3 6 9 | 8 4 5 | 7 2 1
2 8 7 | 1 6 9 | 5 3 4
------+-------+------
5 2 1 | 9 7 4 | 3 6 8
4 3 8 | 5 2 6 | 9 1 7
7 9 6 | 3 1 8 | 4 5 2The notion of "difficulty" that challenges human solvers — requiring chains of hypothetical reasoning — is largely irrelevant to a solver that explores the entire search space systematically.
Closing thoughts
What started as an interview question — model Sudoku as an optimization problem — turns out to be a beautiful example of declarative problem-solving. Instead of writing a clever backtracking algorithm, we described the rules of Sudoku as linear constraints over binary variables and handed the problem to a solver.
The solver doesn't just find a solution — with one additional constraint, it can prove that the solution is unique, effectively validating the puzzle itself.
But this raises a deeper question: how few clues can a Sudoku puzzle have and still admit a unique solution? For the standard 9×9 grid, the answer is 17 — a result that took significant computational effort to prove. In the next post, we'll explore this question and extend it to Sudoku grids of other sizes.
The full code is available as a Colab notebook.