SUDOKU dataset
160 pictures of Sudoku ready for ML
Created by Baptiste Wicht during his PhD on the use of deep learning to automatically extract features from images, the Sudoku dataset contains 160 pictures of Sudoku puzzles taken from various newspapers using smartphone cameras.
The pictures are divided into two sets: 120 training images and 40 test images.
More than just images, the dataset also contains metadata. Each image has a corresponding .dat file containing:
- the brand and model of the phone used to take the picture
- the textual representation of the sudoku grid
For example, one image in the dataset is associated with the following .dat file:
sonyEricsson w810i
1632x1224:24 JPG
0 4 2 0 0 0 0 0 5
0 0 0 6 3 2 0 8 0
0 8 0 0 4 0 2 0 0
0 0 0 0 0 0 0 0 0
7 1 5 0 6 8 3 4 0
9 0 8 3 5 0 7 6 1
0 9 1 0 0 6 0 0 0
0 0 0 0 2 0 1 9 0
0 0 6 1 0 0 0 5 0
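As a rough illustration of how such a file could be read, here is a minimal Python sketch. It assumes that the last 81 whitespace-separated tokens are the grid cells in row-major order, that everything before them is header metadata (phone and image information), and that 0 marks an empty cell. The function name parse_dat is hypothetical and is not part of the dataset.

```python
from typing import List, Tuple

# Contents of the example .dat file shown above.
SAMPLE = """sonyEricsson w810i
1632x1224:24 JPG
0 4 2 0 0 0 0 0 5
0 0 0 6 3 2 0 8 0
0 8 0 0 4 0 2 0 0
0 0 0 0 0 0 0 0 0
7 1 5 0 6 8 3 4 0
9 0 8 3 5 0 7 6 1
0 9 1 0 0 6 0 0 0
0 0 0 0 2 0 1 9 0
0 0 6 1 0 0 0 5 0
"""


def parse_dat(text: str) -> Tuple[List[str], List[List[int]]]:
    """Split a .dat file into header tokens and a 9x9 grid.

    Assumption: the final 81 tokens are the grid cells, row by row,
    and any tokens before them are metadata.
    """
    tokens = text.split()
    if len(tokens) < 81:
        raise ValueError("expected at least 81 grid cells")
    header, cells = tokens[:-81], tokens[-81:]
    grid = [[int(c) for c in cells[row * 9:(row + 1) * 9]] for row in range(9)]
    return header, grid


header, grid = parse_dat(SAMPLE)
print(header)
for row in grid:
    # Show empty cells (assumed to be 0) as dots.
    print(" ".join(str(d) if d else "." for d in row))
```

Because the parser only relies on whitespace-separated tokens, it does not depend on where the line breaks fall in the file.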
Downloads and resources
You can find the dataset directly on GitHub or download it via the following links:
Some results have already been obtained on this dataset. To find out more:
- Camera-based Sudoku recognition with deep belief network: Baptiste Wicht and Jean Hennebert (EIA-FR, Switzerland), Hough transform and DBN: 12.5% error rate
- Sudoku Recognition with Deep Belief Network: blog post by Baptiste Wicht about the dataset and some results.