Python AI: How to Build a Neural Network & Make Predictions

Artificial intelligence (AI) has become one of the most impactful technologies in recent years, enabling systems to learn from data, recognize patterns, and make decisions without human intervention. One of the key technologies driving AI is neural networks, inspired by the structure of the human brain. Neural networks are the foundation of deep learning, which powers everything from image recognition to speech processing and autonomous vehicles.

In this article, we’ll walk through how to build a neural network from scratch in Python using popular libraries such as TensorFlow and Keras, and we’ll make predictions using the model.

What is a Neural Network?

A neural network is a set of algorithms that attempt to recognize underlying relationships in a set of data through a process that mimics how the human brain operates. Neural networks consist of layers of interconnected nodes, or “neurons.” Each neuron processes input data and passes it on to the next layer, ultimately producing an output.

Neural networks typically have three types of layers:

  1. Input Layer: Where the input data is fed into the network.
  2. Hidden Layers: Where the actual computation happens as the data is transformed by neurons.
  3. Output Layer: Where the final prediction or classification result is produced.
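
To make the three layer types concrete, here is a minimal NumPy sketch of a single forward pass through a 4-8-3 network. The weights are random placeholders rather than trained values, and the helper names (relu, softmax, forward) are our own for illustration, not part of any library.

```python
import numpy as np

def relu(z):
    # ReLU keeps positive values and zeroes out negatives
    return np.maximum(0.0, z)

def softmax(z):
    # Subtract the row max for numerical stability before exponentiating
    exp_z = np.exp(z - z.max(axis=1, keepdims=True))
    return exp_z / exp_z.sum(axis=1, keepdims=True)

def forward(x, W1, b1, W2, b2):
    hidden = relu(x @ W1 + b1)        # hidden layer: weighted sum + activation
    return softmax(hidden @ W2 + b2)  # output layer: one probability per class

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4))           # input layer: 2 samples, 4 features each
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)

probs = forward(x, W1, b1, W2, b2)
print(probs.shape)        # (2, 3)
print(probs.sum(axis=1))  # each row sums to 1 (up to rounding)
```

Training is the process of adjusting W1, b1, W2, and b2 so that these probabilities match the true labels; Keras handles that part for us in the steps that follow.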

Prerequisites

Before we dive into coding, make sure you have the required libraries installed:

pip install tensorflow keras numpy pandas matplotlib scikit-learn

Building a Neural Network from Scratch

We’ll start by building a simple feedforward neural network using the Keras API, which is a high-level API built on top of TensorFlow.

Step 1: Importing Required Libraries

import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.datasets import load_iris

Explanation:

  • TensorFlow and Keras are used to create and train the neural network.
  • NumPy and Pandas are for data manipulation. (Only NumPy is actually called in the code below; the Pandas import is optional.)
  • scikit-learn is used for data preprocessing and splitting the dataset.

Step 2: Load and Preprocess the Data

For this example, we’ll use the Iris dataset, which contains 150 samples from three iris species, each described by four features (sepal length, sepal width, petal length, and petal width). Our goal is to classify the flower species based on these features.

# Load the dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize the features (important for neural networks)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

Explanation:

  • The dataset is split into training (80%) and testing (20%) sets.
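  • We standardize the features with StandardScaler, which rescales each feature to zero mean and unit variance so that all inputs are on a comparable scale. This usually helps the network converge during training. The scaler is fit on the training set only and then reused on the test set, so no test-set statistics leak into training.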
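
As a rough illustration of what the scaling step does under the hood, the sketch below standardizes a tiny array by hand with NumPy. The feature values are made up for the example, and the variable names are ours; the point is that the mean and standard deviation come from the training data and are then reused for the test data.

```python
import numpy as np

# Hypothetical feature values standing in for X_train and X_test
train = np.array([[5.1, 3.5], [6.2, 2.9], [4.7, 3.2], [6.9, 3.1]])
test = np.array([[5.8, 3.0]])

# Statistics come from the training data only
mean = train.mean(axis=0)
std = train.std(axis=0)  # population standard deviation (ddof=0)

train_scaled = (train - mean) / std
test_scaled = (test - mean) / std  # reuse the training statistics

print(train_scaled.mean(axis=0))  # approximately 0 for each feature
print(train_scaled.std(axis=0))   # approximately 1 for each feature
```

Note that the scaled test values will generally not have exactly zero mean or unit variance, and that is expected.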

Step 3: Building the Neural Network

Now, we’ll define our neural network architecture using Keras. We’ll create a simple feedforward neural network with one hidden layer.

# Initialize the neural network
model = Sequential()

# Add the input layer and first hidden layer with 8 neurons and ReLU activation
model.add(Dense(8, activation='relu', input_shape=(X_train.shape[1],)))

# Add the output layer with 3 neurons (since we have 3 classes) and softmax activation
model.add(Dense(3, activation='softmax'))

# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Print the model summary
model.summary()

Explanation:

  • Sequential is a Keras model type that allows us to build a model layer by layer.
  • The hidden layer has 8 neurons and uses the ReLU (Rectified Linear Unit) activation function, which is widely used in hidden layers.
  • The output layer has 3 neurons (since we have 3 classes in the Iris dataset) and uses the softmax activation function to produce probabilities for each class.
  • The model is compiled with the Adam optimizer and sparse_categorical_crossentropy as the loss function, since we are dealing with multi-class classification and our labels are integers (0, 1, 2) rather than one-hot vectors.
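
To see what the loss function measures, here is a simplified NumPy sketch of sparse categorical cross-entropy. The probabilities and labels are made-up values, and the function is our own illustration; it leaves out the numerical safeguards a library implementation would include.

```python
import numpy as np

def sparse_categorical_crossentropy(probs, labels):
    # Pick the predicted probability of the true class for each sample
    true_class_probs = probs[np.arange(len(labels)), labels]
    # Average negative log-likelihood over the batch
    return float(-np.mean(np.log(true_class_probs)))

# Hypothetical softmax outputs for three samples (rows) and three classes (columns)
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.3, 0.5],
])
labels = np.array([0, 1, 2])  # integer class labels, as in iris.target

loss = sparse_categorical_crossentropy(probs, labels)
print(f"Loss: {loss:.4f}")
```

The loss shrinks toward zero as the probability assigned to the correct class approaches 1, which is what the optimizer is pushing toward during training.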

Step 4: Training the Neural Network

Now that the model is defined, we can train it on the training data.

# Train the model
history = model.fit(X_train, y_train, epochs=50, validation_split=0.2)

Explanation:

  • We train the model for 50 epochs. Keras holds out the last 20% of the training data for validation, so each epoch is one full pass over the remaining 80%.
  • The history object stores the training and validation accuracy and loss over epochs.
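
It can help to work out how many samples end up in each split. The arithmetic below is a quick sketch that assumes the 150-sample Iris dataset and the Keras default batch size of 32, which applies here because we don't pass batch_size to fit().

```python
import math

n_samples = 150          # size of the Iris dataset
test_size = 0.2          # fraction passed to train_test_split
validation_split = 0.2   # fraction passed to model.fit
batch_size = 32          # Keras default when batch_size is not specified
epochs = 50

# For these sizes the splits are whole numbers, so rounding conventions don't matter
n_test = round(n_samples * test_size)
n_train = n_samples - n_test
n_val = round(n_train * validation_split)
n_fit = n_train - n_val

steps_per_epoch = math.ceil(n_fit / batch_size)

print(n_test, n_val, n_fit)     # 30 24 96
print(steps_per_epoch)          # 3
print(steps_per_epoch * epochs) # 150 weight updates in total
```

With only 24 validation samples, a single misclassified flower moves validation accuracy by about four percentage points, so some jitter in the validation curve is normal.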

Step 5: Evaluating the Model

After training the model, we evaluate its performance on the test data.

# Evaluate the model on test data
test_loss, test_acc = model.evaluate(X_test, y_test)
print(f"Test Accuracy: {test_acc * 100:.2f}%")

Explanation:

  • The evaluate() method tests the model on unseen data (the test set) and returns the loss followed by the accuracy metric we requested when compiling the model.

Step 6: Making Predictions

We can use the trained model to make predictions on new data. Here, we’ll make predictions on the test data.

# Make predictions
predictions = model.predict(X_test)

# Convert predictions to class labels
predicted_classes = np.argmax(predictions, axis=1)

# Compare predictions with actual labels
print("Predicted classes:", predicted_classes)
print("Actual classes:", y_test)

Explanation:

  • The model.predict() method generates predictions for the test data.
  • We use np.argmax() to convert the probability scores into class labels (i.e., the class with the highest probability).
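
Here is a small standalone sketch of that conversion, which also maps the class indices to species names and computes accuracy by hand. The probability rows are hypothetical stand-ins for what model.predict() would return, not real model output.

```python
import numpy as np

# Species names in the order used by the Iris dataset's integer labels
class_names = np.array(["setosa", "versicolor", "virginica"])

# Hypothetical softmax outputs for four samples
predictions = np.array([
    [0.90, 0.08, 0.02],
    [0.10, 0.70, 0.20],
    [0.05, 0.40, 0.55],
    [0.20, 0.45, 0.35],
])
y_true = np.array([0, 1, 2, 2])

# Index of the highest probability in each row is the predicted class
predicted_classes = np.argmax(predictions, axis=1)
accuracy = np.mean(predicted_classes == y_true)

print(class_names[predicted_classes])
print(f"Accuracy: {accuracy:.2f}")  # Accuracy: 0.75
```

In the real pipeline you could index iris.target_names the same way to print readable species names instead of integers.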

Visualizing the Training Process

You can visualize how the model’s accuracy and loss change over time to better understand its learning process.

import matplotlib.pyplot as plt

# Plot training and validation accuracy over epochs
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.title('Model Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()

# Plot training and validation loss over epochs
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Model Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()

Explanation:

  • These plots help visualize how the model’s accuracy and loss evolve during training and validation, which can help identify issues like overfitting or underfitting.
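
Besides reading the plots by eye, you can check the same signal numerically. The sketch below uses a hypothetical dictionary shaped like history.history (the numbers are invented, not real training output) and looks for the epoch where validation loss bottoms out while training loss keeps falling, which is a common sign of overfitting.

```python
import numpy as np

# Hypothetical values shaped like history.history; not real training output
history_dict = {
    "loss":     [1.10, 0.80, 0.60, 0.45, 0.35, 0.28, 0.22],
    "val_loss": [1.05, 0.82, 0.66, 0.58, 0.57, 0.61, 0.68],
}

val_loss = np.array(history_dict["val_loss"])
best_epoch = int(np.argmin(val_loss))  # zero-based index of the lowest validation loss

# Validation loss rising after its minimum while training loss keeps falling suggests overfitting
rising_after_best = bool(np.all(np.diff(val_loss[best_epoch:]) > 0))
gap = history_dict["val_loss"][-1] - history_dict["loss"][-1]

print(f"Lowest validation loss at epoch {best_epoch + 1}")
print(f"Validation loss rising afterwards: {rising_after_best}")
print(f"Final gap between validation and training loss: {gap:.2f}")
```

If both curves are still high and flat, that points to underfitting instead, and a larger model or more epochs may help.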

Advanced Neural Network: Adding More Layers

For more complex tasks, you can add more hidden layers to the network, which may improve its performance. On a small dataset like Iris, the gain is likely to be modest.

# Initialize a more complex neural network
model = Sequential()

# Input layer and first hidden layer
model.add(Dense(16, activation='relu', input_shape=(X_train.shape[1],)))

# Second hidden layer
model.add(Dense(12, activation='relu'))

# Output layer
model.add(Dense(3, activation='softmax'))

# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train the model
model.fit(X_train, y_train, epochs=50, validation_split=0.2)

Explanation:

  • We added a second hidden layer with 12 neurons to increase the capacity of the model.
  • The more layers and neurons you add, the more complex patterns the model can learn, but be cautious of overfitting.
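
One way to get a feel for the added capacity is to count trainable parameters. The helper below is our own illustrative function (not a Keras API) that counts weights and biases for a stack of Dense layers given the layer sizes; the totals should match what model.summary() reports for each architecture.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases for a stack of fully connected layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per neuron
    return total

small = dense_param_count([4, 8, 3])       # 4 inputs -> 8 hidden -> 3 outputs
large = dense_param_count([4, 16, 12, 3])  # 4 inputs -> 16 -> 12 -> 3 outputs

print(small)  # 67
print(large)  # 323
```

The deeper model has roughly five times as many parameters to fit from the same 96 training samples, which is why overfitting becomes a bigger concern.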

Full Code:

import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.datasets import load_iris
import matplotlib.pyplot as plt

# Load the dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize the features (important for neural networks)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Initialize the neural network
model = Sequential()

# Add the input layer and first hidden layer with 8 neurons and ReLU activation
model.add(Dense(8, activation='relu', input_shape=(X_train.shape[1],)))

# Add the output layer with 3 neurons (since we have 3 classes) and softmax activation
model.add(Dense(3, activation='softmax'))

# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Print the model summary
model.summary()

# Train the model
history = model.fit(X_train, y_train, epochs=50, validation_split=0.2)

# Evaluate the model on test data
test_loss, test_acc = model.evaluate(X_test, y_test)
print(f"Test Accuracy: {test_acc * 100:.2f}%")

# Make predictions
predictions = model.predict(X_test)

# Convert predictions to class labels
predicted_classes = np.argmax(predictions, axis=1)

# Compare predictions with actual labels
print("Predicted classes:", predicted_classes)
print("Actual classes:", y_test)

# Plot training and validation accuracy over epochs
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.title('Model Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()

# Plot training and validation loss over epochs
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Model Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()

# Initialize a more complex neural network
model = Sequential()

# Input layer and first hidden layer
model.add(Dense(16, activation='relu', input_shape=(X_train.shape[1],)))

# Second hidden layer
model.add(Dense(12, activation='relu'))

# Output layer
model.add(Dense(3, activation='softmax'))

# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train the model
model.fit(X_train, y_train, epochs=50, validation_split=0.2)

Conclusion

Building a neural network in Python using TensorFlow and Keras is a straightforward process. In this article, we’ve explored how to construct a simple neural network, train it on a dataset, and make predictions. We also discussed how to evaluate and visualize the model’s performance. By understanding the structure and workflow of a neural network, you can now apply this knowledge to solve various AI-related problems like classification, regression, and even more advanced tasks like image recognition and natural language processing.

Whether you’re working on small or large-scale projects, neural networks offer an exciting opportunity to bring intelligence into your Python applications, helping machines learn, adapt, and make predictions that can have a significant real-world impact.

Written by Sona

3 responses to “Python AI: How to Build a Neural Network & Make Predictions”

  1. This blog post is a fantastic guide to building a neural network from scratch using TensorFlow and Keras! The step-by-step explanation and code snippets make it easy to follow along and understand the process.

    A logical question that comes to mind is: How would you adapt this neural network architecture for a different dataset with more classes or features? I’m curious about the considerations for scaling up the complexity of the network. Looking forward to hearing your insights!

    1. Thank you! I’m glad you found the guide helpful. Adapting a neural network for a dataset with more classes or features involves several important considerations, and I’ll walk you through them.

      1. Adjusting the Output Layer:
      For a dataset with more classes, the output layer needs to reflect the number of classes you’re predicting. For example, in a classification problem with 10 classes, you would update the number of neurons in the output layer to 10 and use softmax as the activation function.

      model.add(Dense(10, activation='softmax')) # for 10 classes

      2. Increasing Input Features:
      If the dataset has more features, you need to adjust the input layer or the input shape of the network. For example, if you move from a dataset with 20 features to one with 100 features, you would modify the input dimension of the first layer to match the number of features:

      model.add(Dense(128, input_shape=(100,), activation='relu')) # 100 features

      3. Scaling the Number of Neurons:
      With more features or more classes, it’s often necessary to increase the complexity of the network by adding more neurons to hidden layers or adding additional hidden layers. This allows the model to capture more nuanced patterns. However, adding too many neurons can lead to overfitting, so techniques like dropout, batch normalization, and regularization may be necessary to prevent this.

      4. Regularization and Dropout:
      When scaling up a network, the risk of overfitting increases. You can counteract this by introducing dropout layers to randomly disable some neurons during training, or by applying L2 regularization to constrain the weights. For example:

      from tensorflow.keras.layers import Dropout
      model.add(Dropout(0.5)) # Randomly zero 50% of the previous layer's outputs during training

      1. OK, yet another extremely helpful and extensive response. Thank you.
