There is no 16-Clue Minimum Sudoku

The story is over. When you searched "Minumum Sudoku" in Google you could read again and again that "The 16-clue Minimum sudoku" question is still open, but the conjecture would be, that there is no 16-clue Sudoku.

As of 2012-01-01 we know, what we before only strongly suggested. McGuire and his team proved it by full enumeration of all possible grids.

For more than 5 years Gordon Royle was collecting 17er puzzle with a special sudoku generator for 17-clue minimum Sudoku. He was something like the last instance for such type of Sudoku. His famous and free collection of 17-clue minimum Sudoku included end of 2010 almost 50.000 Sudoku. Many and I have used his database to test their own sudoku solvers.

## A brief excursus on "How to generate 17- clue Minimum Sudoko?" and "Maximum Hardness"

With an unspecific Sudoku generator program, you have to wait long, very long, when you hope to create a 17 -Clue Sudoku. It may even happen, that you find none after several days. You have to find a more specific approach to attack the problem:

It's important to assure, that the generator is inserting the restrictions (clues) in a way, that you have least overlappings (!) with other clues or digits in the neighborhood of the same clue.

To illustrate the problem: when the first 3 rows are completely filled with digits, the rest of the Sudoku is in no means filled out and decided. On the other site when clues are widespread in a regular pattern over the whole grid it is no problem at all to create a Sudoku with a unique solution by 27 clues.

A specific generator to create 17-clue Sudoku therefore is sorting out templates with a high degree of overlapping for single digits and numbers in an early stage of the calaculation. This is in order to save computing time. Another approach to create minimum Sudoku is to take existing one and modify those slightly and hope to find new ones.

How does hardness and the property "minimal number of clues" relate? To express it in simple words: 99% of all known 17-clue sudoku have a rating under SE 3.0 and the very few hard one do not exceed the level of SE 9.0. This is not completely surprising as the above mentioned rule to have overlappings (!) is in contradiction to a heuristic for the creation of hard sudoku - the number of other clues each clue is seeing should be minimal.

My own solver likes 17ers - as he could easily solve them by logic. Unfortunately he is failling solving hard ones.

End of Excursus

"A good Sudoku solver can work a lot faster than a puzzle a second, so there’s no problem in checking the existing 17s to see if they contain a 16. The problem is that we have no way of knowing that the database is complete – because of the way they were constructed they are (in some sense) “close” to each other, and so they form a sort of large clump of puzzles in “puzzle space”. It is conceivable, though I believe unlikely, that there is a completely separate clump of puzzles that might contain 16s…

Let me give you three figures at this stage

Upper bound of 16-Clue Sudoku with multiple solution (5,47 * 10^9)*((81!/16!)/(81-16)!) = 2,24*10^26

Upper bound of 12-Clue Sudoku with multiple solution (5,47 * 10^9)*((81!/12!)/(81-12)!) = 4,71*10^23

Why the hell should there only be 50.000 unique 17ers in a puzzle space of something like 7,02*10^26 possible non isomorph 17ers? O.k. still there are a lot of duplicates in this figure because 2 identical puzzles could be created from different grids. Gordon writes: "It is conceivable, though I believe unlikely, that there is a completely separate clump of puzzles that might contain 16s…".

Well, when you use a certain method to create 17ers - as desrcibed above. But why not thinking about a completly other method? I leave it as it is. Gordon's idea, as we know today, was a dead end.

Before McGuire could start his calculations, there was a reasonable progress made in 2010 by Hung-Hsuan Lin and I-Chen Wu. They published their article "Solving the Minimum Sudoku Poblem" in November 2010.

"It is known that solving the minimum Sudoku problem can be done by checking 5,472,730,538 essentially different Sudoku grids, ... the program Checker, written by McGuire, requires about 311 thousand years on one-core CPU to check these grids completely, according to our experimental analysis. ", wrote Lin and Wu and continued: "This paper proposes a new algorithm, named a disjoint minimal unavoidable set (DMUS) algorithm, to help solve the minimum Sudoku problem. ... In our experiment, the performance was greatly improved by a factor of 128.67. Hence, the improved program by us requires about 2417.4 years only."

They started a repsective project on the BOINC platform in August 2010 to run all these 5,472,730,538 essentially different Sudoku grids. Currently. On 2012-01-21 the percentage of the checked grids to the total was 31.48%. Last year when I read about their approach first, I wondered how long it would take them to get that problem - 16-Clue Sudoku - solved. My assumption was that it would take another 5 -10 years. I think my guess was realistic, when you take in consideration that this problem is not one progressing mankind.

However one should never underestimate peoples motivation. Mallory was attacking Mount Everest 3 times in the early 20th of the last century. He died during the 3rd attack, but got famous for his answer, why he would try to climb this mountain. His answer was: “Because it is there.”

George Mallroy (1886 - 1924)

Having spent two years testing the algorithm, McGuire and his team used about 7
million CPU hours at the Irish Centre for High-End Computing in Dublin,
searching through possible grids with the hitting-set algorithm.

“The only
realistic way to do it was the brute force approach,” explained Gordon Royle, mathematician from University of Western Australian in Perth. He had been
working on the problem of counting 17 clue puzzles using different algorithms.

.

17er Sudoku sample

"I have only skimmed the paper so far", Royle congratulated and continued: "but the basic method is clear enough – to examine each of the possible completed grids in turn and exhaustively search the grid to demonstrate that no 16-clue subgrid is uniquely completable. One way of doing this that does not involve actually solving every possible 16-clue subpuzzle is to observe that certain subsets of the cells must necessarily be 'hit' by one of the 16 clues or they could simply be rearranged in the final solution. Finding so many subsets of this type that no collection of 16 cells can hit them all is then a (computer) proof that this particular grid contains no 16-clue uniquely completable subgrid."

Royle's comment sounds like disjoint minimal unavoidable set (DMUS) algorithm invented by Hung-Hsuan Lin and I-Chen Wu in 2010. Latest since their paper was published, it was only a question of time to decide about the 16er question and two teams seem to have started the race like Amundsen and Scott to the South pole 100 years ago.

I may amend that the energy consumed, could
be sufficient to supply a medium sized town close to the Sahel zone
with light and cooling for the same amount time. I think somehow people
must feel like Mallroy to attack a target with such an intensity (and
such an amount of engergy and money). The risk to loose their lives is
negligible however in our days.

Just to mention that we still have no confirmation that the prove is correct. Such kind of computer proof may have faults and miscalculations just by computer failliure and it's hard and costly to reproduce 7 mio core computer hours.