Web Application with ML
Posted on January 16, 2025 • 4 minutes • 791 words
Here is a video example:
I created a basic application with two ways to detect numbers from 0-9:
- You can click on a number image, and the digit is detected using a deep learning model.
- You can draw a number, and the model detects which digit it is.
The dataset for training and testing numbers was assembled by the National Institute of Standards and Technology (the NIST in MNIST) in the 1980s. It consists of 60,000 training images and 10,000 test images. Instead of using test images, I am using my input from either the clicked number or the drawn number.
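If you want to sanity-check the dataset yourself, a small sketch like the following (assuming Keras is installed) prints the shapes of the training and test sets:
from keras.datasets import mnist
# Load MNIST: 60,000 training images and 10,000 test images, each 28x28 pixels
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
print(train_images.shape)  # (60000, 28, 28)
print(test_images.shape)   # (10000, 28, 28)
print(train_labels[:5])    # the first few labels, digits between 0 and 9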
Problem
The problem we are solving here is to classify grayscale images of handwritten digits (28 × 28 pixels) into 10 categories (0 through 9).
How to build this application?
Create a basic web application with:
- Backend: FastAPI (FastAPI Documentation)
- Frontend: Basic JavaScript, HTML, and CSS
- Dataset: MNIST (Learn about MNIST) - grayscale images of handwritten digits
- Deep Learning Framework: Keras (About Keras) for creating and training the neural network
- Canva: For creating different handwritten number images (Canva)
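To give a rough idea of how these pieces fit together before diving in, here is a minimal FastAPI skeleton. This is only a sketch under my own assumptions; the module name main and the directory name static are mine, not necessarily what the repository uses:
# main.py - a minimal backend sketch, not the exact code from the repository
from fastapi import FastAPI
from fastapi.staticfiles import StaticFiles
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# Allow the frontend to call the API from the browser
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_methods=["*"],
    allow_headers=["*"],
)

# Serve the HTML, CSS, JavaScript, and number images under /static
app.mount("/static", StaticFiles(directory="static"), name="static")

# Run with: uvicorn main:app --reload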
Step 1: Create the Neural Network
Follow these steps to build the neural network:
- Get the training and testing data from MNIST
from keras.datasets import mnist
from keras.models import load_model  # can be used later to reload a saved model

# Load the MNIST training and test sets
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
- Create deep learning layers and define activation functions
from keras import layers, Sequential
model = Sequential([
layers.Dense(512, activation="relu"),
layers.Dense(10, activation="softmax")
])
- Compile the model with an optimizer, loss function, and metrics
model.compile(optimizer="adam",
loss="sparse_categorical_crossentropy",
metrics=["accuracy"])
- Preprocess the training data and train the model
train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype("float32") / 255
model.fit(train_images, train_labels, epochs=5, batch_size=128)
- Test the trained model with sample test data
# Take one sample from the test set and preprocess it like the training data
test_image = test_images[0].astype("float32") / 255
test_digit = test_image.reshape(1, test_image.size)
predictions = model.predict(test_digit)
print("\nPredictions:", predictions, "\nPrediction MaxValue:", predictions.argmax(), "\n")
Step 2: Integrate Backend and Frontend
Now, the backend and frontend need to provide input to the trained neural network. The input must be in grayscale format, as the training dataset was grayscale.
For Image Input
- For the click-to-detect functionality, display the number images using basic HTML.
<img src="/static/images/1.png" alt="Image 1" data-image-name="/static/images/1.png">
- For drawing a number, use a canvas element and send its contents to the backend (for example, by converting the canvas to a blob with canvas.toBlob).
<canvas id="drawingArea" width="280" height="280"></canvas>
- Ensure the image is sent to the backend in blob format via a POST API. Here is the JavaScript code used:
images.forEach(image => {
    image.addEventListener('click', async () => {
        const imageName = image.getAttribute('data-image-name');

        // Fetch the image as a Blob
        const response = await fetch(imageName);
        const blob = await response.blob();

        // Create a FormData object to send the file
        const formData = new FormData();
        formData.append('file', blob, imageName);

        // Send the image to the backend
        try {
            const result = await fetch('http://127.0.0.1:8000/number_detection', {
                method: 'POST',
                body: formData
            });

            if (result.ok) {
                const jsonResult = await result.json();
                document.getElementById("output_number").value = jsonResult.result;
            } else {
                console.error('Error processing image:', result.statusText);
            }
        } catch (error) {
            console.error('Error uploading image:', error);
        }
    });
});
Image resize and reshape to 1D array
The backend receives the image, resizes it to 28x28 pixels, and flattens it into a 1D array, because our neural network only accepts input in that format.
# Here "image" is a PIL Image built from the uploaded file, and np is NumPy (import numpy as np)
# Resize to 28x28 pixels
image = image.resize((28, 28))
# Convert to NumPy array and normalize pixel values to [0, 1]
image_array = np.asarray(image, dtype=np.float32) / 255.0
# Flatten the array to 1D
flattened_array = image_array.flatten()
Predict the image
Now, pass this resized and flattened image to your neural network and check whether it predicts the digit in the image correctly. The result prints two parts:
Predictions: [[7.1005906e-07 9.0936519e-02 8.9278919e-01 5.7332212e-04 4.2719766e-05
7.1836407e-03 7.6220306e-03 5.4302714e-06 8.4631785e-04 1.0209130e-07]]
Prediction MaxValue: 2
Here:
- Predictions: Since the final dense layer has 10 neurons (layers.Dense(10, activation="softmax")), we get 10 values in the prediction. Each value is a probability describing how confident the model is that the input corresponds to a specific digit (index 0 is the digit 0, index 1 is the digit 1, and so on). For example, 8.9278919e-01 (or 0.8928) at index 2 is the probability that the input represents the digit 2.
- Prediction MaxValue: The index with the highest probability in the list above is the predicted digit. In our case, 8.9278919e-01 is the maximum probability and it sits at index 2, so the predicted number is 2. Display this number in the output block and you are good to go.
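Putting the backend side together, here is a sketch of how the /number_detection endpoint could look. This is my reconstruction under a few assumptions (Pillow for image handling, and the model saved as mnist_model.keras as in Step 1), not the exact code from the repository:
import io

import numpy as np
from fastapi import FastAPI, File, UploadFile
from keras.models import load_model
from PIL import Image

app = FastAPI()
model = load_model("mnist_model.keras")  # assumed file name from Step 1

@app.post("/number_detection")
async def number_detection(file: UploadFile = File(...)):
    # Read the uploaded blob and open it as a grayscale PIL image
    contents = await file.read()
    image = Image.open(io.BytesIO(contents)).convert("L")

    # Resize to 28x28, normalize to [0, 1], and flatten to match the training format
    image = image.resize((28, 28))
    image_array = np.asarray(image, dtype=np.float32) / 255.0
    flattened_array = image_array.flatten().reshape(1, 28 * 28)

    # Predict and return the most likely digit as JSON
    predictions = model.predict(flattened_array)
    return {"result": int(predictions.argmax())}
Depending on how the clicked or drawn images are rendered, you may also need to invert the pixel values so the digit is light on a dark background, since that is how the MNIST digits are stored.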
And this is how we create a basic web application and predict handwritten numbers using deep learning. This is my first app as well; I enjoyed understanding deep learning with an application in action, so I hope you enjoyed it too.
You can find the complete code here: https://github.com/neetaBirajdar/machine_learning_web_app
Okay then, see you in my next blog!