Points of Accumulation in Data Sets
Lab 11: Digit Lab

By NIST (National Institute of Standards and Technology)
The US Postal Service collected handwriting samples in the 1980s to create a computer mail-scanning system.
GOAL: Create a feature set to begin recognizing handwritten digits.
Part I: 1 Feature
Definition: A feature of a data set \(\Delta\) is a function \[f:\Delta\to \mathbb{R}\] That is, a feature assigns a real (decimal) number to each datum.
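To make the definition concrete, here is a minimal sketch of one possible feature. It assumes each datum is a small binary image stored as a list of rows (1 = ink, 0 = blank paper). The toy grids `ONE` and `ZERO` and the function name `ink_fraction` are made up for illustration and are not part of the lab materials.

```python
# Toy 5x3 binary images, invented for illustration (not MNIST data).
# 1 = ink, 0 = blank paper.
ONE = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]
ZERO = [
    [1, 1, 1],
    [1, 0, 1],
    [1, 0, 1],
    [1, 0, 1],
    [1, 1, 1],
]


def ink_fraction(image):
    """A feature f: image -> R, the fraction of pixels that contain ink."""
    total = sum(len(row) for row in image)
    inked = sum(sum(row) for row in image)
    return inked / total


# Each datum is sent to a single real number.
print(ink_fraction(ONE))
print(ink_fraction(ZERO))
```

Any rule that turns a digit image into one number would serve equally well as a feature; ink fraction is just an easy one to compute by hand.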
TASK I: List 5 features you can think of to use when looking at handwritten digits.
- Example: The number of "Pen Down" blots.
- Example: The number of "Strokes".

(Figure: three sample digits, labeled 2 strokes, 1 stroke, and 2 strokes.)
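The "Pen Down" blots example from TASK I can be sketched in code as counting connected groups of ink pixels. This is only one possible reading of that feature. It assumes a binary image (1 = ink) and treats diagonal neighbors as connected; the name `count_blots` and the toy images are hypothetical. It is not the same as the stroke count: two strokes that touch form a single blot.

```python
def count_blots(image):
    """Count connected groups of ink pixels (1s), using 8-neighbor connectivity."""
    if not image:
        return 0
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    blots = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == 1 and not seen[r][c]:
                # Found ink that belongs to no earlier blot: flood-fill it.
                blots += 1
                seen[r][c] = True
                stack = [(r, c)]
                while stack:
                    y, x = stack.pop()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and image[ny][nx] == 1
                                    and not seen[ny][nx]):
                                seen[ny][nx] = True
                                stack.append((ny, nx))
    return blots


# Toy images, invented for illustration (not MNIST data).
# A "5" whose top bar does not touch the body: two blots.
FIVE_DETACHED = [
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
    [1, 1, 1, 0],
]
# A "1" drawn as a single vertical line: one blot.
ONE_LINE = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]

print(count_blots(FIVE_DETACHED), count_blots(ONE_LINE))
```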
TASK II: Create all the digits on paper using your left hand and then your right hand (as a group).
TASK III:
- Choose 2 features from your group's list
- Choose 3 digits as a group.
- Compute the 2 features for each of the three digits in your handwriting sample and in the MNIST sample provided.
- Plot them as (feature1, feature2)
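The steps of TASK III can be sketched as follows, assuming the digits have been turned into small binary images. The two features chosen here (ink fraction and bounding-box aspect ratio), the function names, and the toy grids are all illustrative assumptions; your group's features will differ.

```python
# Toy 5x3 binary images, invented for illustration (not MNIST data).
ONE = [[0, 1, 0], [0, 1, 0], [0, 1, 0], [0, 1, 0], [0, 1, 0]]
ZERO = [[1, 1, 1], [1, 0, 1], [1, 0, 1], [1, 0, 1], [1, 1, 1]]
SEVEN = [[1, 1, 1], [0, 0, 1], [0, 1, 0], [0, 1, 0], [0, 1, 0]]


def ink_fraction(image):
    """Feature 1: fraction of pixels that contain ink."""
    total = sum(len(row) for row in image)
    return sum(sum(row) for row in image) / total


def aspect_ratio(image):
    """Feature 2: width / height of the ink's bounding box.

    Assumes the image contains at least one ink pixel.
    """
    inked_rows = [r for r, row in enumerate(image) if any(row)]
    inked_cols = [c for c in range(len(image[0]))
                  if any(row[c] for row in image)]
    height = inked_rows[-1] - inked_rows[0] + 1
    width = inked_cols[-1] - inked_cols[0] + 1
    return width / height


# Three chosen digits, each mapped to a point (feature1, feature2).
samples = [("1", ONE), ("0", ZERO), ("7", SEVEN)]
points = [(label, ink_fraction(img), aspect_ratio(img))
          for label, img in samples]

for label, f1, f2 in points:
    print(f"digit {label}: ({f1:.2f}, {f2:.2f})")
```

Each printed pair is one point for the plot. With many handwriting samples per digit, the same loop yields a cloud of points per digit rather than a single point.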
TASK IV: Identify the points of accumulation in your plots. Do the individual digits stand apart?
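One hedged way to make "stand apart" concrete is to read points of accumulation informally as the places where each digit's points pile up, summarize each pile by its centroid, and compare the gap between centroids to the spread within each pile. The measurements below are made up, and the rule "gap larger than twice the largest spread" is an assumption chosen for illustration, not a standard test.

```python
from collections import defaultdict
from math import dist

# Hypothetical (label, feature1, feature2) measurements, made up for illustration.
points = [
    ("0", 0.80, 0.60), ("0", 0.76, 0.64), ("0", 0.84, 0.56),
    ("1", 0.33, 0.20), ("1", 0.30, 0.24), ("1", 0.36, 0.16),
]


def centroids(labeled_points):
    """Average (feature1, feature2) position of the points for each digit."""
    groups = defaultdict(list)
    for label, f1, f2 in labeled_points:
        groups[label].append((f1, f2))
    return {
        label: (sum(p[0] for p in pts) / len(pts),
                sum(p[1] for p in pts) / len(pts))
        for label, pts in groups.items()
    }


def spread(labeled_points, label, center):
    """Largest distance from one digit's points to that digit's centroid."""
    return max(dist((f1, f2), center)
               for lab, f1, f2 in labeled_points if lab == label)


centers = centroids(points)
gap = dist(centers["0"], centers["1"])
radius = max(spread(points, lab, centers[lab]) for lab in centers)

# Illustrative rule: the digits stand apart if the piles cannot overlap.
stand_apart = gap > 2 * radius
print(centers, stand_apart)
```

If the gap is small compared with the spread, the two digits overlap in this pair of features, which suggests trying a different feature.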
Here is a sample of a computer's feature extraction from the 60,000 handwriting samples in the full MNIST training set.

Digit Lab: Points of Accumulation
By James Wilson