Python Voice Playback and Text-to-Speech Application Using just_playback and pyttsx3

Python offers numerous libraries to work with media, including text-to-speech and audio playback. Two useful libraries for these purposes are just_playback, which allows you to control the playback of audio files, and pyttsx3, which converts text to speech.

This article will guide you through how to create a Python application that plays audio files and converts text to speech, using both libraries.

Introduction to just_playback and pyttsx3

  1. just_playback: A minimal audio playback library that plays audio files in formats such as MP3 and WAV. It includes basic controls for pausing, resuming, and seeking, which makes it a good fit for interactive media applications.
  2. pyttsx3: A text-to-speech library for Python. Unlike online services, pyttsx3 works offline by driving the speech engine installed on the operating system (for example SAPI5 on Windows, NSSpeechSynthesizer on macOS, and eSpeak on Linux), which makes it suitable for applications that must run without internet access.

Installing the Libraries

Before you can start, you need to install both libraries. You can install them using pip:

pip install just_playback pyttsx3
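
To confirm that both packages are importable before running the examples, a quick check along these lines can help. This is a minimal sketch; missing_packages is a hypothetical helper name and is not part of either library.

```python
import importlib.util

def missing_packages(names):
    """Return the names from `names` that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_packages(["just_playback", "pyttsx3"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("Both libraries are available.")
```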

Understanding the Libraries

just_playback Features:

  • Play/Pause/Stop: Control audio playback with intuitive commands.
  • Seek: Jump to specific points in the audio file.
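  • Volume Control: Adjust the volume during playback with set_volume(), which takes a value between 0 and 1.
  • Playback State: Check whether audio is active, playing, or paused, and read the current position and duration, by polling properties such as active, playing, paused, curr_pos, and duration. The library does not provide event callbacks, so state changes are detected by polling.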
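
The controls listed above can be sketched as follows. This is an illustration rather than a reference: format_position and demo_controls are hypothetical helper names, the 10-second seek target assumes a file longer than that, and the playback state is read by polling properties on the Playback object.

```python
def format_position(curr_pos, duration):
    """Format two positions given in seconds as 'm:ss / m:ss'."""
    def mmss(seconds):
        minutes, secs = divmod(int(seconds), 60)
        return f"{minutes}:{secs:02d}"
    return f"{mmss(curr_pos)} / {mmss(duration)}"

def demo_controls(path):
    # Imported here so format_position can be used without the library installed
    import time
    from just_playback import Playback

    playback = Playback()
    playback.load_file(path)
    playback.play()
    playback.set_volume(0.5)   # 50% of the file's original volume
    playback.seek(10)          # jump to 10 seconds from the start
    time.sleep(1)
    print(format_position(playback.curr_pos, playback.duration))
    print("playing:", playback.playing, "| active:", playback.active)
    playback.stop()
```

Calling demo_controls('sample_audio.mp3') plays about one second of audio starting from the 10-second mark and prints the position and state.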

pyttsx3 Features:

  • Offline Speech Generation: No need for an internet connection.
  • Voice Customization: Adjust the rate, volume, and voice.
  • Platform Independence: Works on Windows, macOS, and Linux.
  • Multiple Voices: Choose between male and female voices (depending on the installed speech engines).

Example 1: Audio Playback with just_playback

We’ll start with a simple example where we load and control an audio file.

import time
from just_playback import Playback

# Initialize Playback
playback = Playback()

# Load an audio file (a relative path is resolved against the current working directory)
playback.load_file('sample_audio.mp3')

# Start playing the audio
playback.play()

# Wait for a few seconds to demonstrate pausing and resuming
time.sleep(5)
print("Pausing audio...")
playback.pause()

time.sleep(2)
print("Resuming audio...")
playback.resume()

# Stop playback after some more time
time.sleep(5)
print("Stopping audio...")
playback.stop()

Explanation:

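  1. Loading and Playing: We load the file with load_file() and start it with play(). The play() call returns immediately and the audio continues in the background, which is why the script uses time.sleep() between steps.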
  2. Pausing and Resuming: The pause() and resume() methods allow us to control the audio dynamically during runtime.
  3. Stopping: The stop() method stops the audio completely.

You can replace 'sample_audio.mp3' with the path to any audio file you wish to play.
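
The fixed time.sleep() calls in this example are only there for demonstration. To wait until a track has actually finished, one option is to poll the active property, as in the sketch below. wait_until_done is a hypothetical helper, and the polling interval and timeout are arbitrary choices.

```python
import time

def wait_until_done(playback, poll_interval=0.1, timeout=None):
    """Block until playback.active is False.

    Returns True when playback ended, or False if `timeout` seconds
    passed first. `playback` can be any object with an `active` attribute.
    """
    start = time.monotonic()
    while playback.active:
        if timeout is not None and time.monotonic() - start >= timeout:
            return False
        time.sleep(poll_interval)
    return True
```

With a just_playback Playback object, calling wait_until_done(playback) after playback.play() keeps the script alive for the length of the track.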

Example 2: Text-to-Speech with pyttsx3

Now let’s move on to text-to-speech using the pyttsx3 library.

Code:

import pyttsx3

# Initialize the TTS engine
engine = pyttsx3.init()

# Set properties before adding the text you want to convert to speech
engine.setProperty('rate', 150)     # Speed of speech
engine.setProperty('volume', 1)     # Volume level (0.0 to 1.0)

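# Get the available voices and choose one
voices = engine.getProperty('voices')
# The order and gender of voices depend on the platform and installed engines,
# so fall back to the first voice if a second one is not available
voice = voices[1] if len(voices) > 1 else voices[0]
engine.setProperty('voice', voice.id)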

# Convert the text to speech
engine.say("Hello! This is an example of text-to-speech conversion using pyttsx3 by Codemagnet, enjoy using it.")
engine.runAndWait()  # Blocks while speaking

Explanation:

  1. Initialization: We initialize the text-to-speech engine with pyttsx3.init().
  2. Setting Properties: You can adjust the rate (speed) of the speech and the volume.
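  3. Selecting Voices: The voices list contains the available voice profiles. We use the engine.setProperty('voice', ...) method to select one by its id. Which voice sits at which index depends on the operating system and the installed speech engines: on many Windows installations voices[0] is male and voices[1] is female, but this is not guaranteed, and some systems expose only one voice.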
  4. Text-to-Speech: The engine.say() function accepts a string of text, which is then converted to speech. engine.runAndWait() ensures the program waits until the speech is finished.
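
Because voice order varies between systems, selecting a voice by name tends to be more reliable than selecting it by index. The sketch below assumes each voice object has name and id attributes, as pyttsx3 voices do. pick_voice and speak_with are hypothetical helpers, and the keyword to search for depends on the voices installed on your machine.

```python
def pick_voice(voices, keyword, default_index=0):
    """Return the first voice whose name contains `keyword` (case-insensitive),
    or voices[default_index] when nothing matches."""
    keyword = keyword.lower()
    for voice in voices:
        if keyword in (voice.name or "").lower():
            return voice
    return voices[default_index]

def speak_with(keyword, text):
    import pyttsx3  # imported lazily so pick_voice can be tested on its own

    engine = pyttsx3.init()
    voice = pick_voice(engine.getProperty('voices'), keyword)
    engine.setProperty('voice', voice.id)
    engine.say(text)
    engine.runAndWait()
```

Printing voice.name for each entry in engine.getProperty('voices') shows which keywords are available on a given system.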

Example 3: Combining just_playback and pyttsx3 in an Innovative Application

Let’s combine both libraries to create a Python application that plays background music while reading text aloud with pyttsx3.

Code:

import pyttsx3
import time
from just_playback import Playback
import threading

# Function to play background music
def play_music():
    playback = Playback()
    playback.load_file('sample_audio.mp3')
    playback.play()
    # Keep the thread alive while the music plays
    while playback.active:
        time.sleep(1)

# Function to convert text to speech
def text_to_speech(engine):
    text = "Hello! This is a Python-based application that plays background music while reading this text aloud."
    engine.say(text)
    engine.runAndWait()

# Main function to run both background music and text-to-speech
def main():
    # Initialize the text-to-speech engine in the main thread
    engine = pyttsx3.init()
    engine.setProperty('rate', 150)  # Speech speed
    engine.setProperty('volume', 0.9)  # Volume

    # Choose a voice
    voices = engine.getProperty('voices')
    engine.setProperty('voice', voices[0].id)  # First listed voice (which one this is depends on the platform)

    # Start the music in a separate thread
    music_thread = threading.Thread(target=play_music)
    music_thread.start()

    # Add a short delay before starting the speech
    time.sleep(2)

    # Run the text-to-speech function in the main thread
    text_to_speech(engine)

    # Wait for the music to finish
    music_thread.join()

if __name__ == "__main__":
    main()
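
Since play() returns immediately, as Example 1 shows, the music does not strictly need a thread of its own. The following sketch is an alternative arrangement, not a replacement for the code above: it starts the music, runs the narration in the calling thread, and stops the music as soon as the narration ends instead of waiting for the track to finish. narrate_over_music and its parameters are hypothetical names.

```python
import time

def narrate_over_music(playback, speak, lead_in=0.0):
    """Start the music, wait `lead_in` seconds, run `speak()`, then stop the music.

    `playback` is expected to be a loaded just_playback Playback object, and
    `speak` a function that blocks until the speech is finished, for example
    one that calls engine.say(...) followed by engine.runAndWait().
    """
    playback.play()          # returns immediately; audio keeps playing
    try:
        time.sleep(lead_in)
        speak()
    finally:
        playback.stop()      # stop the music even if speak() raises
```

For example, narrate_over_music(playback, lambda: text_to_speech(engine), lead_in=2) reproduces the two-second delay used above.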

To make the text-to-speech output easier to hear, you can lower the volume of the background music. just_playback can do this at runtime through the Playback object's set_volume() method, which takes a value between 0 and 1. Another option, shown here, is to use the pydub library to reduce the volume of the audio file itself before playing it.

Here’s how you can do it using pydub to reduce the volume of the background music:

Step 1: Install the necessary modules

Make sure you have both pydub and ffmpeg installed for manipulating audio files.

pip install pydub

You’ll also need to install ffmpeg:

  • For Windows: Download the ffmpeg binaries and add them to your system’s PATH.
  • For Linux/macOS: You can install ffmpeg through your package manager.
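
A quick way to check that ffmpeg is reachable from Python is to look for it on the PATH. This is a small sketch using only the standard library; ffmpeg_available is a hypothetical helper name.

```python
import shutil

def ffmpeg_available():
    """Return True if an `ffmpeg` executable can be found on the PATH."""
    return shutil.which("ffmpeg") is not None

if __name__ == "__main__":
    print("ffmpeg found" if ffmpeg_available() else "ffmpeg not found on PATH")
```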

Step 2: Modify the code to lower the volume

import pyttsx3
import time
from just_playback import Playback
import threading
from pydub import AudioSegment

# Function to reduce the background music volume
def reduce_music_volume(input_file, output_file, reduction_dB):
    song = AudioSegment.from_file(input_file)
    quieter_song = song - reduction_dB  # Reduce volume by the specified dB
    quieter_song.export(output_file, format="mp3")

# Function to play background music
def play_music():
    playback = Playback()
    playback.load_file('quieter_sample_audio.mp3')  # Use the modified quieter file
    playback.play()
    while playback.active:
        time.sleep(1)

# Function to convert text to speech
def text_to_speech(engine):
    text = "Hello! This is a Python-based application that plays background music while reading this text aloud."
    engine.say(text)
    engine.runAndWait()

# Main function to run both background music and text-to-speech
def main():
    # Reduce the background music volume
    reduce_music_volume('sample_audio.mp3', 'quieter_sample_audio.mp3', reduction_dB=10)  # Reduce by 10 dB

    # Initialize the text-to-speech engine in the main thread
    engine = pyttsx3.init()
    engine.setProperty('rate', 150)  # Speech speed
    engine.setProperty('volume', 0.9)  # Volume

    # Choose a voice
    voices = engine.getProperty('voices')
    engine.setProperty('voice', voices[0].id)  # First listed voice (which one this is depends on the platform)

    # Start the music in a separate thread
    music_thread = threading.Thread(target=play_music)
    music_thread.start()

    # Add a short delay before starting the speech
    time.sleep(2)

    # Run the text-to-speech function in the main thread
    text_to_speech(engine)

    # Wait for the music to finish
    music_thread.join()

if __name__ == "__main__":
    main()

Explanation:

  1. reduce_music_volume(): This function reduces the volume of the original audio file (sample_audio.mp3) by a specified decibel level (10 dB in this case). It creates a quieter version of the file (quieter_sample_audio.mp3).
  2. pydub: This library is used to manipulate the audio file. It loads the file, applies the volume reduction, and exports a new version with reduced volume.
  3. Modified Background Music: In the play_music() function, we’re now loading and playing the quieter audio file (quieter_sample_audio.mp3), so the background music volume is lower and the text-to-speech output is easier to hear.
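
Decibels are logarithmic, so it may help to see what a given reduction means for the signal. The sketch below converts a gain in dB to a linear amplitude ratio using the standard relation ratio = 10 ** (dB / 20). db_to_amplitude_ratio is a hypothetical helper written for illustration.

```python
def db_to_amplitude_ratio(db):
    """Convert a gain in decibels to a linear amplitude ratio (10 ** (dB / 20))."""
    return 10 ** (db / 20)

print(round(db_to_amplitude_ratio(-10), 3))  # 0.316
print(round(db_to_amplitude_ratio(-20), 3))  # 0.1
```

A 10 dB reduction therefore scales the sample amplitude to roughly 32% of the original, and a 20 dB reduction to 10%.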

Outcome:

With the reduced background music volume, the text-to-speech audio will be clearer and more distinct, allowing you to hear both the music and speech without one overpowering the other.
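
If writing a second audio file is not desirable, a similar effect can be achieved at runtime with just_playback's own set_volume() method and volume property. The sketch below lowers the music only while speech is running and restores the previous level afterwards. ducked is a hypothetical helper, the default level of 0.2 is an arbitrary choice, and the sketch assumes playback has already been started.

```python
from contextlib import contextmanager

@contextmanager
def ducked(playback, level=0.2):
    """Temporarily lower playback volume to `level` (0.0 to 1.0), then restore it."""
    previous = playback.volume
    playback.set_volume(level)
    try:
        yield playback
    finally:
        playback.set_volume(previous)

# Usage sketch, assuming `playback` is already playing and `engine` is a pyttsx3 engine:
#
# with ducked(playback, 0.2):
#     engine.say("This text is read over quieter music.")
#     engine.runAndWait()
```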

In conclusion, creating a Python application that integrates background music playback with text-to-speech functionality using just_playback and pyttsx3 demonstrates the power of combining audio processing tools in Python. pyttsx3 allows for seamless text-to-speech synthesis with customizable speed, volume, and voice selection, making it ideal for narrating various types of content. Meanwhile, just_playback handles background music playback efficiently, enhancing the user experience. By adjusting the music volume using additional libraries like pydub, we can create a well-balanced audio environment where both music and speech are clearly heard. This combination provides a versatile and interactive way to build engaging multimedia applications in Python.

Written by Sona
