Face & Hand Landmarks Detection using Python with Mediapipe and OpenCV is a fascinating way to explore the world of computer vision. With this technology, we can detect and track key points on the face and hands in real-time, allowing for exciting applications like gesture recognition, augmented reality filters, and more. By combining the power of Python programming with advanced libraries like Mediapipe and OpenCV, we can create innovative solutions that enhance the way we interact with technology.
Required Libraries
- Mediapipe is a cross-platform library developed by Google that provides ready-to-use ML solutions for computer vision tasks.
- OpenCV is a computer vision library for Python that is widely used for image analysis, image processing, detection, recognition, and more.
To install the above libraries, run the command below in your terminal (the msvc-runtime package is only needed on Windows):
pip install opencv-python mediapipe msvc-runtime
Below is the step-wise approach for Face and Hand landmark detection.
STEP-1: First, import all the necessary libraries. We only need cv2 and mediapipe, plus the standard-library time module for the FPS calculation.
# Import Libraries
import cv2
import time
import mediapipe as mp
STEP-2: Initialize the Holistic model and the drawing utils for detecting and drawing landmarks on the image.
# Grabbing the Holistic Model from Mediapipe and
# Initializing the Model
mp_holistic = mp.solutions.holistic
holistic_model = mp_holistic.Holistic(
    min_detection_confidence=0.5,
    min_tracking_confidence=0.5
)

# Initializing the drawing utils for drawing the facial landmarks on image
mp_drawing = mp.solutions.drawing_utils
STEP-3: Detect Face and Hand landmarks from the image. The Holistic model processes the image and produces landmarks for the Face, Left Hand, and Right Hand, and also detects the Pose of the person. We need to:
- Capture the frames continuously from the camera using OpenCV.
- Convert the BGR image to an RGB image and make predictions using initialized holistic model.
- The predictions made by the holistic model are saved in the results variable from which we can access the landmarks using results.face_landmarks, results.right_hand_landmarks, results.left_hand_landmarks respectively.
- Draw the detected landmarks on the image using the draw_landmarks function from drawing utils.
- Display the resulting Image.
# (0) in VideoCapture is used to connect to your computer's default camera
capture = cv2.VideoCapture(0)

# Initializing current time and previous time for calculating the FPS
previousTime = 0
currentTime = 0

while capture.isOpened():
    # Capture frame by frame
    ret, frame = capture.read()

    # Stop if the frame could not be read
    if not ret:
        break

    # Resizing the frame for better view
    frame = cv2.resize(frame, (800, 600))

    # Converting the frame from BGR to RGB
    image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Making predictions using holistic model
    # To improve performance, optionally mark the image as not writeable to
    # pass by reference.
    image.flags.writeable = False
    results = holistic_model.process(image)
    image.flags.writeable = True

    # Converting back the RGB image to BGR
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)

    # Drawing the Facial Landmarks
    mp_drawing.draw_landmarks(
        image,
        results.face_landmarks,
        mp_holistic.FACEMESH_CONTOURS,
        mp_drawing.DrawingSpec(
            color=(255, 0, 255),
            thickness=1,
            circle_radius=1
        ),
        mp_drawing.DrawingSpec(
            color=(0, 255, 255),
            thickness=1,
            circle_radius=1
        )
    )

    # Drawing Right hand Landmarks
    mp_drawing.draw_landmarks(
        image,
        results.right_hand_landmarks,
        mp_holistic.HAND_CONNECTIONS
    )

    # Drawing Left hand Landmarks
    mp_drawing.draw_landmarks(
        image,
        results.left_hand_landmarks,
        mp_holistic.HAND_CONNECTIONS
    )

    # Calculating the FPS
    currentTime = time.time()
    fps = 1 / (currentTime - previousTime)
    previousTime = currentTime

    # Displaying FPS on the image
    cv2.putText(image, str(int(fps)) + " FPS", (10, 70),
                cv2.FONT_HERSHEY_COMPLEX, 1, (0, 255, 0), 2)

    # Display the resulting image
    cv2.imshow("Facial and Hand Landmarks", image)

    # Press 'q' to break the loop
    if cv2.waitKey(5) & 0xFF == ord('q'):
        break

# When all the processing is done
# Release the capture and destroy all windows
capture.release()
cv2.destroyAllWindows()
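The instantaneous FPS computed above (1 divided by the time between consecutive frames) can jitter noticeably from frame to frame. As an optional tweak that is not part of the original code, an exponential moving average gives a steadier readout:

```python
def smooth_fps(prev_avg, instant_fps, alpha=0.1):
    """Exponentially weighted moving average of FPS readings.

    alpha controls responsiveness: smaller values give a smoother
    but slower-reacting display value.
    """
    return (1 - alpha) * prev_avg + alpha * instant_fps

# Example: a steady 30 FPS average nudged by a 40 FPS spike
print(smooth_fps(30.0, 40.0))  # approximately 31.0
```

Inside the loop you would keep a running `avg_fps` variable, update it with `avg_fps = smooth_fps(avg_fps, fps)` each frame, and pass `avg_fps` to cv2.putText instead of the raw value.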
The holistic model produces 468 face landmarks, 21 left-hand landmarks, and 21 right-hand landmarks. Individual landmarks can be accessed by specifying the index of the required landmark, for example results.left_hand_landmarks.landmark[0]. You can list the index of every hand landmark using the code below:
# Code to access landmarks
for landmark in mp_holistic.HandLandmark:
    print(landmark, landmark.value)

print(mp_holistic.HandLandmark.WRIST.value)
HandLandmark.WRIST 0
HandLandmark.THUMB_CMC 1
HandLandmark.THUMB_MCP 2
HandLandmark.THUMB_IP 3
HandLandmark.THUMB_TIP 4
HandLandmark.INDEX_FINGER_MCP 5
HandLandmark.INDEX_FINGER_PIP 6
HandLandmark.INDEX_FINGER_DIP 7
HandLandmark.INDEX_FINGER_TIP 8
HandLandmark.MIDDLE_FINGER_MCP 9
HandLandmark.MIDDLE_FINGER_PIP 10
HandLandmark.MIDDLE_FINGER_DIP 11
HandLandmark.MIDDLE_FINGER_TIP 12
HandLandmark.RING_FINGER_MCP 13
HandLandmark.RING_FINGER_PIP 14
HandLandmark.RING_FINGER_DIP 15
HandLandmark.RING_FINGER_TIP 16
HandLandmark.PINKY_MCP 17
HandLandmark.PINKY_PIP 18
HandLandmark.PINKY_DIP 19
HandLandmark.PINKY_TIP 20
0
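Each landmark's x and y values are normalized to the range [0, 1] relative to the frame width and height, so converting a landmark to pixel coordinates is a simple scaling step. A minimal sketch (the landmark values here are illustrative, not real model output):

```python
def landmark_to_pixels(landmark_x, landmark_y, frame_width, frame_height):
    """Convert MediaPipe's normalized [0, 1] coordinates to pixel coordinates."""
    px = int(landmark_x * frame_width)
    py = int(landmark_y * frame_height)
    return px, py

# Example with an illustrative wrist landmark on the 800x600 frame used above
print(landmark_to_pixels(0.5, 0.25, 800, 600))  # (400, 150)
```

In the real loop you would pass `results.left_hand_landmarks.landmark[0].x` and `.y` along with the frame's actual dimensions.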
Full Code
# Import Libraries
import cv2
import time
import mediapipe as mp

# Grabbing the Holistic Model from Mediapipe and Initializing the Model
mp_holistic = mp.solutions.holistic
holistic_model = mp_holistic.Holistic(
    min_detection_confidence=0.5,
    min_tracking_confidence=0.5
)

# Initializing the drawing utils for drawing the facial landmarks on image
mp_drawing = mp.solutions.drawing_utils

# (0) in VideoCapture is used to connect to your computer's default camera
capture = cv2.VideoCapture(0)

# Initializing current time and previous time for calculating the FPS
previousTime = 0
currentTime = 0

while capture.isOpened():
    # Capture frame by frame
    ret, frame = capture.read()

    # Stop if the frame could not be read
    if not ret:
        break

    # Resize the frame for better view
    frame = cv2.resize(frame, (800, 600))

    # Converting from BGR to RGB
    image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Making predictions using holistic model
    # To improve performance, optionally mark the image as not writeable to
    # pass by reference.
    image.flags.writeable = False
    results = holistic_model.process(image)
    image.flags.writeable = True

    # Converting back the RGB image to BGR
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)

    # Drawing the Facial Landmarks
    mp_drawing.draw_landmarks(
        image,
        results.face_landmarks,
        mp_holistic.FACEMESH_CONTOURS,
        mp_drawing.DrawingSpec(
            color=(255, 0, 255),
            thickness=1,
            circle_radius=1
        ),
        mp_drawing.DrawingSpec(
            color=(0, 255, 255),
            thickness=1,
            circle_radius=1
        )
    )

    # Drawing Right hand Landmarks
    mp_drawing.draw_landmarks(
        image,
        results.right_hand_landmarks,
        mp_holistic.HAND_CONNECTIONS
    )

    # Drawing Left hand Landmarks
    mp_drawing.draw_landmarks(
        image,
        results.left_hand_landmarks,
        mp_holistic.HAND_CONNECTIONS
    )

    # Calculating the FPS
    currentTime = time.time()
    fps = 1 / (currentTime - previousTime)
    previousTime = currentTime

    # Displaying FPS on the image
    cv2.putText(image, str(int(fps)) + " FPS", (10, 70),
                cv2.FONT_HERSHEY_COMPLEX, 1, (0, 255, 0), 2)

    # Display the resulting image
    cv2.imshow("Facial and Hand Landmarks", image)

    # Press 'q' to break the loop
    if cv2.waitKey(5) & 0xFF == ord('q'):
        break

# When all the processing is done
# Release the capture and destroy all windows
capture.release()
cv2.destroyAllWindows()

# Code to access landmarks
for landmark in mp_holistic.HandLandmark:
    print(landmark, landmark.value)

print(mp_holistic.HandLandmark.WRIST.value)
Output: a window titled "Facial and Hand Landmarks" showing the webcam feed with the detected face and hand landmarks drawn on it and the current FPS in the top-left corner.
Explanation of the whole code:
- import cv2: Imports the OpenCV library, which is used for image and video processing.
- import time: Imports the time module, which is used for time-related functions.
- import mediapipe as mp: Imports the Mediapipe library and aliases it as mp for easier use in the code.
- mp_holistic = mp.solutions.holistic: Accesses the holistic model from the Mediapipe library and assigns it to the variable mp_holistic.
- holistic_model = mp_holistic.Holistic(...): Initializes the holistic_model with specific parameters (min_detection_confidence and min_tracking_confidence) for detecting and tracking body landmarks.
- mp_drawing = mp.solutions.drawing_utils: Initializes the drawing utilities for visualizing the detected landmarks on the image.
- capture = cv2.VideoCapture(0): Captures video from the default camera (index 0) and assigns it to the variable capture.
- previousTime = 0 and currentTime = 0: Initialize variables for calculating frames per second (FPS).
- while capture.isOpened():: Starts a loop that continues as long as the video capture is open.
- ret, frame = capture.read(): Reads a frame from the video capture and stores it in the variables ret and frame.
- frame = cv2.resize(frame, (800, 600)): Resizes the frame to a specific width and height for better visualization.
- image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB): Converts the frame from BGR to RGB color format, which is required by the Mediapipe model.
- image.flags.writeable = False and image.flags.writeable = True: Temporarily mark the image as not writeable and then writeable again, which can improve performance when passing the image to the model.
- results = holistic_model.process(image): Processes the image using the holistic_model to detect body landmarks.
- image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR): Converts the processed image back to BGR color format for display.
- mp_drawing.draw_landmarks(...): Draws the detected landmarks (facial and hand) on the image using the mp_drawing utilities.
- currentTime = time.time() and fps = 1 / (currentTime - previousTime): Calculate the frames per second (FPS) of the video stream.
- cv2.putText(...): Adds the FPS information to the image for display.
- cv2.imshow(...): Displays the image with the landmarks and FPS information.
- if cv2.waitKey(5) & 0xFF == ord('q'): break: Breaks the loop if the 'q' key is pressed.
- capture.release(): Releases the video capture.
- cv2.destroyAllWindows(): Closes all OpenCV windows.
- for landmark in mp_holistic.HandLandmark: print(landmark, landmark.value): Prints each hand landmark name along with its index.
- print(mp_holistic.HandLandmark.WRIST.value): Prints the index value of the WRIST landmark (0).
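As a taste of the gesture-recognition applications mentioned earlier, the hand landmark indices from the table above can drive simple checks. The sketch below, using made-up landmark points rather than real model output, tests whether the index finger is raised: in image coordinates y grows downward, so a raised fingertip has a smaller y than its PIP joint.

```python
# Hand landmark indices from the HandLandmark table above
INDEX_FINGER_PIP = 6
INDEX_FINGER_TIP = 8

def index_finger_raised(landmarks):
    """landmarks: a sequence of 21 (x, y) normalized points.

    Image y grows downward, so a raised fingertip sits above
    (has a smaller y than) its PIP joint.
    """
    return landmarks[INDEX_FINGER_TIP][1] < landmarks[INDEX_FINGER_PIP][1]

# Illustrative (made-up) points: fingertip above the PIP joint
points = [(0.5, 0.5)] * 21
points[INDEX_FINGER_PIP] = (0.5, 0.6)
points[INDEX_FINGER_TIP] = (0.5, 0.4)
print(index_finger_raised(points))  # True
```

In the real loop you would build the points list from results.right_hand_landmarks.landmark (each entry exposes .x and .y attributes) after checking that it is not None.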

Conclusion:
In conclusion, Face & Hand Landmarks Detection using Python with Mediapipe and OpenCV opens up a world of possibilities in computer vision. Whether it’s for creating fun effects in video chats, developing advanced user interfaces, or even assisting in medical diagnostics, this technology has the potential to revolutionize how we interact with computers. By harnessing the capabilities of Python and these powerful libraries, developers can create innovative and impactful applications that benefit society as a whole.




