This project provides a simple application for recognizing handwritten digits using a machine learning model. The model is trained on the MNIST dataset, a large database of handwritten digits commonly used for training image processing systems. A pre-trained version of the model is available in the ./models/handwritten-digits-recognition directory, allowing you to use the prediction script immediately without retraining. You can also use the included Jupyter Notebook to modify and train the model yourself.
The project consists of three main parts:
- A Python script (app.py) for predicting digits from an image.
- A Jupyter Notebook (model-training.ipynb) for training the machine learning model.
- A requirements file (requirements.txt) listing all the necessary libraries.
To use this project, you'll need to set up your environment and then either use the pre-trained model or train your own.
First, ensure you have Python and pip installed. Then, install the required libraries by running the following command in your terminal:
    pip install -r requirements.txt

The model-training.ipynb notebook is used to train the machine learning model. It uses a sequential neural network with two dense layers to classify the digits. The model is trained for 20 epochs. You can run this notebook using a tool like Jupyter Notebook or JupyterLab. After training, the model is saved to the ./models/handwritten-digits-recognition directory.
The training process involves the following steps:
- Load the MNIST dataset: The dataset is divided into training and testing sets.
- Preprocess the data: The images are reshaped to a 1D array of 784 pixels (28x28) and normalized to a range of 0 to 1.
- Compile the model: The model uses the Adam optimizer and SparseCategoricalCrossentropy for loss calculation.
- Train the model: The model is trained on the training data for 20 epochs.
- Evaluate the model: The model's performance is evaluated on the test data.
- Save the model: The trained model is saved in the SavedModel format, which is a standard format for TensorFlow models.
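The steps above can be sketched roughly as follows. This is not the notebook's actual code: the hidden-layer width of 128, the ReLU activation, and the helper names `preprocess`, `build_model`, and `train_and_save` are assumptions made for illustration. The sketch pairs a softmax output layer with `from_logits=False`, and it assumes a TensorFlow 2 release that still bundles Keras 2, where `model.save` on a directory path writes the SavedModel format; with Keras 3, a `.keras` filename or `model.export` would likely be needed instead.

```python
import numpy as np


def preprocess(images):
    """Flatten 28x28 images to 784-element vectors and scale pixels to [0, 1]."""
    flat = np.asarray(images, dtype="float32").reshape(len(images), 28 * 28)
    return flat / 255.0


def build_model():
    # Imported lazily so preprocess() can be used without TensorFlow installed.
    import tensorflow as tf

    # Two dense layers; the hidden width (128) and ReLU are assumed values.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(784,)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        # The output layer already applies softmax, so from_logits stays False.
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
        metrics=["accuracy"],
    )
    return model


def train_and_save(model_dir="./models/handwritten-digits-recognition", epochs=20):
    import tensorflow as tf

    # Load MNIST, which ships already split into training and test sets.
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    # Flatten and normalize both splits.
    x_train, x_test = preprocess(x_train), preprocess(x_test)
    # Compile and train.
    model = build_model()
    model.fit(x_train, y_train, epochs=epochs)
    # Evaluate on the held-out test data.
    loss, accuracy = model.evaluate(x_test, y_test)
    print(f"test loss={loss:.4f}, test accuracy={accuracy:.4f}")
    # Save to a directory (SavedModel under the Keras 2 assumption above).
    model.save(model_dir)
    return model
```

Calling `train_and_save()` runs the whole pipeline; the first call downloads MNIST, so it needs network access.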
The app.py script is used to predict a digit from an image. Before running it, you need to modify the file paths at the top of the script:
- image_path: Set this to the path of the image you want to predict.
- model_path: This should point to the directory where your trained model is saved (e.g., ./models/handwritten-digits-recognition).
Once the paths are set, run the script from your terminal:
    python app.py

The script will load the image, preprocess it, and use the trained model to predict the digit. The predicted digit will be printed to the console.
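The load, preprocess, and predict flow might look like the sketch below. The contents of app.py are not shown here, so everything in it is an assumption: the use of Pillow for image loading, the helper names `to_model_input` and `predict_digit`, and the optional `invert` flag (MNIST digits are light strokes on a dark background, so a dark-on-light scan may need inverting). Loading a SavedModel directory with `tf.keras.models.load_model` also assumes a TensorFlow 2 release with Keras 2.

```python
import numpy as np


def to_model_input(pixels, invert=False):
    """Turn a 28x28 grayscale array (0-255) into a (1, 784) batch scaled to [0, 1]."""
    arr = np.asarray(pixels, dtype="float32")
    if arr.shape != (28, 28):
        raise ValueError(f"expected a 28x28 image, got shape {arr.shape}")
    if invert:
        # Flip dark-on-light input to match MNIST's light-on-dark digits.
        arr = 255.0 - arr
    return (arr / 255.0).reshape(1, 784)


def predict_digit(image_path, model_path, invert=False):
    # Heavy dependencies are imported lazily so to_model_input() works on its own.
    import tensorflow as tf
    from PIL import Image

    # Convert to 8-bit grayscale and resize to the 28x28 input the model expects.
    image = Image.open(image_path).convert("L").resize((28, 28))
    model = tf.keras.models.load_model(model_path)
    scores = model.predict(to_model_input(image, invert=invert))
    # The predicted digit is the class with the highest score.
    return int(np.argmax(scores, axis=1)[0])
```

With the two paths set as described above, `print(predict_digit(image_path, model_path))` would print the predicted digit to the console.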
While the current project provides a solid foundation, several improvements could enhance its functionality and accuracy.
The current script requires manual path modification. A user-friendly graphical interface (GUI) could be added to allow users to select an image file and display the predicted digit visually. The existing tkinter import in app.py suggests this was a potential future feature.
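As a starting point, a file dialog alone would remove the need to edit paths by hand. The sketch below uses only the standard tkinter module; the function names are hypothetical, and `predict` stands for any callable that takes an image path and returns a digit (it is not an existing function in app.py).

```python
import os


def describe_prediction(image_path, digit):
    """Build the message shown to the user, e.g. 'seven.png: predicted digit 7'."""
    return f"{os.path.basename(image_path)}: predicted digit {digit}"


def choose_and_predict(predict):
    """Ask for an image via a file dialog, then show predict(path) in a message box."""
    # Imported here so describe_prediction() stays usable on headless machines.
    import tkinter as tk
    from tkinter import filedialog, messagebox

    root = tk.Tk()
    root.withdraw()  # No main window is needed, only the dialogs.
    try:
        path = filedialog.askopenfilename(
            title="Select a digit image",
            filetypes=[("Image files", "*.png *.jpg *.jpeg *.bmp"), ("All files", "*.*")],
        )
        if not path:  # The user cancelled the dialog.
            return None
        digit = predict(path)
        messagebox.showinfo("Prediction", describe_prediction(path, digit))
        return digit
    finally:
        root.destroy()
```

Passing the prediction function in as an argument keeps the GUI code independent of how the model is loaded.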
The current model uses a simple neural network. You could explore more advanced architectures and training configurations for potentially better performance, such as:
- Convolutional Neural Networks (CNNs): CNNs are highly effective for image classification tasks and often outperform simple dense networks on datasets like MNIST.
- Hyperparameter tuning: Experiment with different optimizers, learning rates, and the number of layers or neurons to find the optimal configuration for the model. Separately, the current model triggers a warning about from_logits=True: the final layer applies softmax, so its outputs are already probabilities rather than logits. This could be addressed by removing the softmax activation from the final layer or by setting from_logits=False during compilation.
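To see why the final activation and the from_logits setting should agree, the following framework-independent sketch computes the cross-entropy for one sample in plain Python. The numbers and function names are made up for illustration, and it shows only the arithmetic of a softmax applied twice; how Keras itself handles the mismatch may differ between versions.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(outputs, label, from_logits):
    """Sparse categorical cross-entropy for a single sample."""
    probs = softmax(outputs) if from_logits else outputs
    return -math.log(probs[label])


logits = [4.0, 1.0, 0.5]  # raw scores from a final layer without activation
probs = softmax(logits)   # what a softmax final layer would output instead
label = 0                 # index of the true class

# Consistent pairings: both give the same loss.
loss_logits = cross_entropy(logits, label, from_logits=True)
loss_probs = cross_entropy(probs, label, from_logits=False)

# Mismatch: probabilities treated as logits, so softmax is applied twice.
loss_double = cross_entropy(probs, label, from_logits=True)

print(loss_logits, loss_probs, loss_double)
```

In the mismatched case the second softmax flattens the distribution, so the reported loss is larger than the true cross-entropy and cannot approach zero even for a confident, correct prediction.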