Handwritten Digit Recognition

A minimal fully-connected neural network that reaches ≈96.9% test accuracy on the classic MNIST digit-recognition task.

This project provides a simple application for recognizing handwritten digits using a machine learning model. The model is trained on the MNIST dataset, a large database of handwritten digits commonly used for training image processing systems. A pre-trained version of the model is available in the ./models/handwritten-digits-recognition directory, allowing you to use the prediction script immediately without retraining. You can also use the included Jupyter Notebook to modify and train the model yourself.

The project consists of three main parts:

  • A Python script (app.py) for predicting digits from an image.
  • A Jupyter Notebook (model-training.ipynb) for training the machine learning model.
  • A requirements file (requirements.txt) listing all the necessary libraries.

How to Use the Project

To use this project, you'll need to set up your environment and then either use the pre-trained model or train your own.

1. Setup

First, ensure you have Python and pip installed. Then, install the required libraries by running the following command in your terminal:

pip install -r requirements.txt
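
If pip on your system is tied to a different Python installation, invoking it through the interpreter avoids that ambiguity:

python -m pip install -r requirements.txt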

2. Model Training

The model-training.ipynb notebook trains the machine learning model: a sequential neural network with two dense layers that classifies the digits, trained for 20 epochs. You can run the notebook with a tool such as Jupyter Notebook or JupyterLab. After training, the model is saved to the ./models/handwritten-digits-recognition directory. A rough code sketch of the whole workflow follows the step list below.

The training process involves the following steps:

  1. Load the MNIST dataset: The dataset is divided into training and testing sets.
  2. Preprocess the data: Each 28x28 image is flattened to a 1D array of 784 pixels and normalized to the range 0 to 1.
  3. Compile the model: The model uses the Adam optimizer and SparseCategoricalCrossentropy for loss calculation.
  4. Train the model: The model is trained on the training data for 20 epochs.
  5. Evaluate the model: The model's performance is evaluated on the test data.
  6. Save the model: The trained model is saved in the SavedModel format, which is a standard format for TensorFlow models.
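
For reference, the overall workflow looks roughly like the sketch below. The hidden-layer size (128) and the exact preprocessing calls are illustrative assumptions; the notebook itself is the authoritative version.

import tensorflow as tf

# 1. Load the MNIST dataset; it comes pre-split into training and test sets.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# 2. Flatten each 28x28 image to 784 values and scale pixels to the range 0-1.
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

# Sequential network with two dense layers (hidden size 128 is an assumed value).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# 3. Compile with the Adam optimizer and SparseCategoricalCrossentropy loss.
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    metrics=["accuracy"],
)

# 4. Train for 20 epochs.
model.fit(x_train, y_train, epochs=20)

# 5. Evaluate on the held-out test data.
model.evaluate(x_test, y_test)

# 6. Save in the TensorFlow SavedModel format (directory-based, TF 2.x behaviour).
model.save("./models/handwritten-digits-recognition")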

3. Digit Prediction

The app.py script is used to predict a digit from an image. Before running it, you need to modify the file paths at the top of the script:

  • image_path: Set this to the path of the image you want to predict.
  • model_path: This should point to the directory where your trained model is saved (e.g., ./models/handwritten-digits-recognition).

Once the paths are set, run the script from your terminal:

python app.py

The script will load the image, preprocess it, and use the trained model to predict the digit. The predicted digit will be printed to the console.
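
As a rough sketch (assuming TensorFlow 2.x and a SavedModel directory; the exact preprocessing in app.py may differ), the prediction step amounts to something like:

import numpy as np
import tensorflow as tf

image_path = "path/to/your/digit.png"                    # set to the image you want to classify
model_path = "./models/handwritten-digits-recognition"   # directory of the trained model

# Load the image as 28x28 grayscale and flatten/scale it the same way as the training data.
img = tf.keras.utils.load_img(image_path, color_mode="grayscale", target_size=(28, 28))
pixels = tf.keras.utils.img_to_array(img).reshape(1, 784) / 255.0
# Note: MNIST digits are white on a black background, so dark-on-light images may need inverting.

# Load the trained model and print the most likely digit.
model = tf.keras.models.load_model(model_path)
prediction = model.predict(pixels)
print("Predicted digit:", int(np.argmax(prediction)))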


Potential Improvements

While the current project provides a solid foundation, several improvements could enhance its functionality and accuracy.

User Interface (UI)

The current script requires manually editing file paths. A graphical user interface (GUI) could be added that lets users select an image file and displays the predicted digit visually; the existing tkinter import in app.py suggests a GUI was already planned as a future feature.
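
As an illustration only, a minimal file picker along the following lines could be layered on top of the prediction logic; predict_digit here is a hypothetical wrapper, not a function that currently exists in app.py.

import tkinter as tk
from tkinter import filedialog, messagebox

import numpy as np
import tensorflow as tf

def predict_digit(path):
    # Hypothetical helper: same preprocessing and model loading as the prediction sketch above.
    img = tf.keras.utils.load_img(path, color_mode="grayscale", target_size=(28, 28))
    pixels = tf.keras.utils.img_to_array(img).reshape(1, 784) / 255.0
    model = tf.keras.models.load_model("./models/handwritten-digits-recognition")
    return int(np.argmax(model.predict(pixels)))

def pick_image_and_predict():
    # Let the user browse for an image instead of editing image_path by hand.
    path = filedialog.askopenfilename(filetypes=[("Images", "*.png *.jpg *.jpeg")])
    if path:
        messagebox.showinfo("Prediction", f"Predicted digit: {predict_digit(path)}")

root = tk.Tk()
root.title("Handwritten Digit Recognition")
tk.Button(root, text="Select image...", command=pick_image_and_predict).pack(padx=20, pady=20)
root.mainloop()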

Model Enhancement

The current model uses a simple neural network. You could explore more advanced architectures for potentially better performance, such as:

  • Convolutional Neural Networks (CNNs): CNNs are highly effective for image classification tasks and often outperform simple dense networks on datasets like MNIST.
  • Hyperparameter tuning: Experiment with different optimizers, learning rates, and numbers of layers or neurons to find a better configuration. Relatedly, the current model emits a warning about from_logits=True, which can be resolved either by removing the softmax activation from the final layer or by setting from_logits=False during compilation; both options are sketched below.
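
Concretely, the two self-consistent configurations look like this (a sketch; either one avoids the warning):

import tensorflow as tf

# Option A: output raw logits (no softmax on the last layer) and keep from_logits=True.
logits_layer = tf.keras.layers.Dense(10)  # no activation: raw logits
loss_a = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# Option B: keep the softmax output layer and switch the loss to from_logits=False.
softmax_layer = tf.keras.layers.Dense(10, activation="softmax")  # probabilities
loss_b = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)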
