Building LeNet-5 From Scratch in TensorFlow

LeNet-5, developed by Yann LeCun et al. in 1998, is one of the pioneering convolutional neural network (CNN) architectures. Originally designed for handwritten digit recognition tasks, it set the foundation for modern deep learning approaches in image classification.

CodeMagnet is back with another article: this one walks you through building LeNet-5 from scratch in TensorFlow, complete with code examples and detailed explanations.

Overview of LeNet-5 Architecture

LeNet-5 consists of the following layers:

  1. Input Layer: Accepts grayscale images of size 32×32.
  2. Convolutional Layer 1: Applies 6 filters of size 5×5, followed by activation and subsampling.
  3. Subsampling Layer 1: A pooling layer that reduces spatial dimensions.
  4. Convolutional Layer 2: Applies 16 filters of size 5×5.
  5. Subsampling Layer 2: A second pooling layer.
  6. Fully Connected Layers:
    • FC1: 120 neurons.
    • FC2: 84 neurons.
    • Output layer: 10 neurons (for classification).

The original LeNet-5 used sigmoid and tanh activations; modern adaptations typically use ReLU for faster, more stable training, and we follow that convention here.
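If you want to stay closer to the 1998 paper, the activation can simply be swapped back. A minimal sketch of the first block with tanh (everywhere else in this article we keep the modern ReLU):

```python
from tensorflow.keras import layers, models

# First convolution + pooling block with tanh, closer to the original
# paper; the rest of this article uses ReLU instead.
classic_block = models.Sequential([
    layers.Input(shape=(32, 32, 1)),
    layers.Conv2D(6, kernel_size=(5, 5), activation='tanh', padding='same'),
    layers.AvgPool2D(pool_size=(2, 2)),
])
```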

Step-by-Step Implementation in TensorFlow

Importing Required Libraries

import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

Preparing the Dataset

LeNet-5 was originally designed for 32×32 inputs. Since MNIST images are 28×28, we resize them.

# Load MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Normalize pixel values and resize images
x_train = x_train / 255.0
x_test = x_test / 255.0

x_train = tf.image.resize(x_train[..., tf.newaxis], (32, 32))
x_test = tf.image.resize(x_test[..., tf.newaxis], (32, 32))

# One-hot encode labels
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
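Resizing interpolates pixel values; the original paper instead zero-padded the 28×28 digits out to 32×32. A sketch of that alternative, shown on a dummy batch (apply the same call to x_train and x_test):

```python
import numpy as np

# Zero-pad 28x28 images to 32x32 (2 pixels on each side), as in the
# original LeNet-5 setup; `batch` is a stand-in for MNIST images.
batch = np.zeros((4, 28, 28), dtype='float32')
padded = np.pad(batch, ((0, 0), (2, 2), (2, 2)))
padded = padded[..., np.newaxis]   # add the channel axis
print(padded.shape)                # (4, 32, 32, 1)
```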

Building the LeNet-5 Model

We use TensorFlow’s Keras API to define the architecture.

def build_lenet5():
    model = models.Sequential([
        # Layer 1: Convolutional + Activation + Subsampling
        layers.Conv2D(6, kernel_size=(5, 5), activation='relu', input_shape=(32, 32, 1), padding='same'),
        layers.AvgPool2D(pool_size=(2, 2)),

        # Layer 2: Convolutional + Activation + Subsampling
        layers.Conv2D(16, kernel_size=(5, 5), activation='relu'),
        layers.AvgPool2D(pool_size=(2, 2)),

        # Flatten the feature maps for fully connected layers
        layers.Flatten(),

        # Fully connected layers
        layers.Dense(120, activation='relu'),
        layers.Dense(84, activation='relu'),
        layers.Dense(10, activation='softmax')  # Output layer
    ])
    return model

# Instantiate the model
lenet5 = build_lenet5()
lenet5.summary()
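As a quick sanity check on the summary, the feature-map sizes can be traced by hand, assuming the settings above (padding='same' on the first convolution, the default 'valid' on the second, 2×2 average pooling):

```python
h = 32          # input height (and width)
# Conv2D 5x5 with padding='same' keeps the size at 32x32
h = h // 2      # AvgPool2D 2x2 -> 16x16
h = h - 5 + 1   # Conv2D 5x5 with padding='valid' -> 12x12
h = h // 2      # AvgPool2D 2x2 -> 6x6
flattened = h * h * 16   # 16 feature maps of 6x6
print(flattened)         # 576 inputs to the first Dense layer
```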

Compiling and Training the Model

Now, we will compile the model with a suitable loss function, optimizer, and metrics.

lenet5.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# Train the model
history = lenet5.fit(x_train, y_train, epochs=10, batch_size=32, validation_data=(x_test, y_test))

Once training is done, we evaluate the model’s performance on the unseen test set.

test_loss, test_acc = lenet5.evaluate(x_test, y_test)
print(f"Test accuracy: {test_acc:.2f}")

Visualizing Results

Plot training and validation accuracy and loss to monitor performance.

import matplotlib.pyplot as plt

# Plot accuracy
plt.plot(history.history['accuracy'], label='Train Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()

# Plot loss
plt.plot(history.history['loss'], label='Train Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()

Full Code:

import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

# Load MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Normalize pixel values and resize images
x_train = x_train / 255.0
x_test = x_test / 255.0

x_train = tf.image.resize(x_train[..., tf.newaxis], (32, 32))
x_test = tf.image.resize(x_test[..., tf.newaxis], (32, 32))

# One-hot encode labels
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

def build_lenet5():
    model = models.Sequential([
        # Layer 1: Convolutional + Activation + Subsampling
        layers.Conv2D(6, kernel_size=(5, 5), activation='relu', input_shape=(32, 32, 1), padding='same'),
        layers.AvgPool2D(pool_size=(2, 2)),

        # Layer 2: Convolutional + Activation + Subsampling
        layers.Conv2D(16, kernel_size=(5, 5), activation='relu'),
        layers.AvgPool2D(pool_size=(2, 2)),

        # Flatten the feature maps for fully connected layers
        layers.Flatten(),

        # Fully connected layers
        layers.Dense(120, activation='relu'),
        layers.Dense(84, activation='relu'),
        layers.Dense(10, activation='softmax')  # Output layer
    ])
    return model

# Instantiate the model
lenet5 = build_lenet5()
lenet5.summary()

lenet5.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# Train the model
history = lenet5.fit(x_train, y_train, epochs=10, batch_size=32, validation_data=(x_test, y_test))

test_loss, test_acc = lenet5.evaluate(x_test, y_test)
print(f"Test accuracy: {test_acc:.2f}")


import matplotlib.pyplot as plt

# Plot accuracy
plt.plot(history.history['accuracy'], label='Train Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()

# Plot loss
plt.plot(history.history['loss'], label='Train Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()


Explanation of Key Components

  1. Convolutional Layers: Extract spatial features using filters.
  2. Pooling Layers: Downsample feature maps to reduce computational complexity.
  3. Fully Connected Layers: Perform classification based on extracted features.
  4. Softmax Output: Converts logits to class probabilities for multi-class classification.
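Point 4 can be made concrete in a few lines of NumPy; the logit values here are invented purely for illustration:

```python
import numpy as np

# Softmax turns raw scores (logits) into a probability distribution.
logits = np.array([2.0, 1.0, 0.1])
probs = np.exp(logits) / np.sum(np.exp(logits))
print(probs.round(3))   # largest logit -> largest probability
print(probs.sum())      # probabilities sum to 1
```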

The process of building LeNet-5 from scratch using TensorFlow demonstrates a practical and in-depth understanding of fundamental convolutional neural network (CNN) architectures. Designed originally for digit recognition tasks by Yann LeCun and his collaborators, LeNet-5 laid the groundwork for modern deep learning models, showcasing how structured layers can extract spatial hierarchies from image data. Implementing it from scratch is not just a historical exercise but a vital learning step for anyone delving into computer vision and neural network design.

Key Takeaways

  1. Understanding Convolutional Neural Networks (CNNs):
    By implementing LeNet-5, we explored core CNN components such as convolutional layers, pooling layers, and fully connected layers. These elements are foundational to modern architectures and are used to extract and process features effectively.
  2. Data Preprocessing Importance:
    Preparing the MNIST dataset by normalizing pixel values, resizing images, and one-hot encoding labels ensured that the input data was optimized for training. This highlights the importance of thorough preprocessing in improving model performance.
  3. Building Layers Step-by-Step:
    Each layer in LeNet-5 serves a specific purpose:
    • Convolutional Layers: Extract spatial features like edges, corners, and patterns.
    • Pooling Layers: Downsample and reduce dimensionality to focus on prominent features.
    • Fully Connected Layers: Learn high-level abstractions to make predictions.
  4. Model Training and Evaluation:
    Training the model allowed us to observe how the network learns to minimize loss and improve accuracy through backpropagation and optimization techniques. Evaluating its performance on unseen test data provided insights into its generalization capability.
  5. Visualization of Training Metrics:
    Plotting training and validation accuracy/loss offered a clear picture of the model’s learning process, helping to identify overfitting, underfitting, or other training anomalies.

Why LeNet-5 Still Matters

While modern architectures like ResNet, VGG, and EfficientNet have taken center stage, LeNet-5 remains a cornerstone in understanding CNNs. Its simplicity makes it an excellent learning tool for beginners and a reference point for building more complex networks. The implementation also bridges the gap between theory and application, fostering a deeper appreciation of how these networks function.


Practical Implications

  • Scalability: The principles behind LeNet-5 can be extended to larger datasets and more complex tasks like object detection and segmentation.
  • Customization: Building the network from scratch allows researchers and developers to tweak hyperparameters, layers, or activation functions to fit specific problems.
  • Deployment: With TensorFlow, the trained model can be easily exported and integrated into real-world applications, such as digit recognition systems in banking or postal services.

Challenges and Future Directions

  1. Scaling to Complex Data: LeNet-5’s straightforward structure struggles with high-resolution images or diverse datasets. Exploring deeper architectures is necessary for tackling such challenges.
  2. Efficiency and Optimization: Experimenting with different optimizers, batch sizes, or learning rates can improve performance.
  3. Incorporating Advances: Adding modern features like dropout, batch normalization, or residual connections can enhance the basic LeNet-5 architecture for contemporary needs.
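As a sketch of point 3, one possible modernization adds batch normalization after each convolution and dropout before the classifier head; none of these additions are part of the original architecture, and the placements shown are just one reasonable choice:

```python
from tensorflow.keras import layers, models

# LeNet-5 with illustrative modern additions: BatchNormalization after
# each convolution and Dropout before the output layer.
def build_lenet5_modern():
    return models.Sequential([
        layers.Input(shape=(32, 32, 1)),
        layers.Conv2D(6, (5, 5), padding='same'),
        layers.BatchNormalization(),
        layers.Activation('relu'),
        layers.AvgPool2D((2, 2)),
        layers.Conv2D(16, (5, 5)),
        layers.BatchNormalization(),
        layers.Activation('relu'),
        layers.AvgPool2D((2, 2)),
        layers.Flatten(),
        layers.Dense(120, activation='relu'),
        layers.Dropout(0.5),
        layers.Dense(84, activation='relu'),
        layers.Dense(10, activation='softmax'),
    ])
```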

Final Thoughts

Building LeNet-5 from scratch in TensorFlow is not just an exercise in coding but a journey into the fundamental concepts that underpin modern computer vision systems. It equips learners with the skills to design, train, and evaluate neural networks while appreciating the historical significance of pioneering architectures. This project serves as a stepping stone for tackling advanced challenges in deep learning and contributes to a solid foundation in machine learning.
