Matplotlib - Audio Processing



Audio processing is the use of computers to manipulate, enhance, or analyze sound. It underlies everyday tasks such as producing music, cleaning up noisy recordings, and recognizing voice commands spoken to a phone or smart speaker.

It covers a range of techniques and methods for modifying, enhancing, or extracting information from audio signals.

For instance, we can perform time-frequency analysis on audio data. This analysis helps us understand how the frequency content of the audio signal changes over time. To achieve this, we need to create a spectrogram, which is a visual representation of the frequencies in the audio signal as it evolves over time.

The x-axis represents time, the y-axis represents frequency, and the color intensity represents the strength of each frequency component at a given time. By plotting the spectrogram, we can identify patterns, trends, and changes in the audio signal's frequency content, which is useful for tasks like detecting sound events or analyzing musical compositions −

Audio Processing
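To make the idea of a spectrogram concrete, here is a minimal sketch of how one can be computed by hand with NumPy: the signal is cut into short overlapping frames, and each frame is passed through an FFT. The test signal, the frame length of 256 samples, and the hop of 128 samples are assumptions chosen for illustration only.

```python
import numpy as np
import matplotlib.pyplot as plt

sample_rate = 8000   # samples per second (assumed for this sketch)
duration = 2.0       # seconds
t = np.arange(int(sample_rate * duration)) / sample_rate

# Test signal: 500 Hz during the first second, 1500 Hz during the second
audio = np.where(t < 1.0,
                 np.sin(2 * np.pi * 500 * t),
                 np.sin(2 * np.pi * 1500 * t))

frame_len = 256      # samples per analysis frame
hop = 128            # step between the starts of consecutive frames
window = np.hanning(frame_len)

# Slice the signal into overlapping, windowed frames and take the FFT of each
n_frames = 1 + (len(audio) - frame_len) // hop
frames = np.stack([audio[i * hop:i * hop + frame_len] * window
                   for i in range(n_frames)])
magnitudes = np.abs(np.fft.rfft(frames, axis=1))   # shape (n_frames, n_freqs)

# Axis values: frequency of each FFT bin and centre time of each frame
freqs = np.fft.rfftfreq(frame_len, d=1 / sample_rate)
times = (np.arange(n_frames) * hop + frame_len / 2) / sample_rate

# Strongest frequency in each frame
dominant = freqs[np.argmax(magnitudes, axis=1)]

# Time on x, frequency on y, strength in decibels as colour
fig, ax = plt.subplots()
mesh = ax.pcolormesh(times, freqs, 20 * np.log10(magnitudes.T + 1e-12),
                     shading="auto")
ax.set_xlabel("Time (s)")
ax.set_ylabel("Frequency (Hz)")
fig.colorbar(mesh, ax=ax, label="Magnitude (dB)")

if __name__ == "__main__":
    plt.show()
```

In the resulting plot the bright band should jump from the lower frequency to the higher one halfway through, which is the kind of change over time a spectrogram is meant to reveal.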

Audio Processing in Matplotlib

Matplotlib is a plotting library, so in this context audio processing mainly means visualizing audio data. Although Matplotlib is best known for creating charts and graphs, the same plotting functions work well for audio signals.

For example, you can use Matplotlib to generate spectrograms, which provide a visual representation of the frequency content of an audio signal over time. Matplotlib does not read, play, or edit audio itself. However, you can use it in conjunction with other libraries such as NumPy, SciPy, and librosa, which perform audio processing tasks including filtering, spectral analysis, and signal transformations.
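As an illustration of combining these libraries, the sketch below uses SciPy to low-pass filter a noisy tone and Matplotlib to plot the signal before and after. The 220 Hz tone, the noise level, and the 500 Hz cutoff are assumptions made up for this example, not values taken from a real recording.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal

sample_rate = 8000                          # Hz (assumed)
t = np.arange(sample_rate) / sample_rate    # one second of samples

# A 220 Hz tone buried in white noise (seeded so the result is repeatable)
rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 220 * t)
noisy = clean + 0.5 * rng.standard_normal(t.size)

# 4th-order Butterworth low-pass filter with a 500 Hz cutoff,
# applied forwards and backwards so the output is not delayed
sos = signal.butter(4, 500, btype="low", fs=sample_rate, output="sos")
filtered = signal.sosfiltfilt(sos, noisy)

# Plot the first 50 ms of both signals on a shared time axis
n_show = 400
fig, (ax_noisy, ax_filtered) = plt.subplots(2, 1, sharex=True)
ax_noisy.plot(t[:n_show], noisy[:n_show])
ax_noisy.set_title("Noisy signal")
ax_noisy.set_ylabel("Amplitude")
ax_filtered.plot(t[:n_show], filtered[:n_show])
ax_filtered.set_title("After low-pass filtering")
ax_filtered.set_xlabel("Time (s)")
ax_filtered.set_ylabel("Amplitude")
fig.tight_layout()

if __name__ == "__main__":
    plt.show()
```

SciPy does the signal processing here, while Matplotlib is used only to inspect the result.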

Below are different examples showing the use of Matplotlib with these libraries for basic audio processing and visualization tasks.

Audio Processing: Waveform Visualization

Waveform visualization means plotting the amplitude of an audio signal over time. Matplotlib does not read audio files itself; once the samples are available as an array (for example, a NumPy array), you can use its plotting functions to create a graphical representation of the waveform.

Each point on the plot represents the amplitude of the audio signal at a specific moment in time, and the waveform shows how the signal changes over the duration of the audio clip. This visualization helps users see characteristics of the audio such as loudness, silent passages, and clipping, and is commonly used in audio analysis and editing applications. Frequency content is hard to read from a waveform and is better shown by a spectrogram or frequency spectrum.

Example

In the following example, we visualize the waveform of an audio signal using the plot() function. The x-axis represents the sample index, and the y-axis represents the amplitude of the audio signal at each sample:

import matplotlib.pyplot as plt
import numpy as np

# Random noise stands in for real audio samples stored in a NumPy array
audio_data = np.random.randn(10000)

# Plotting the audio waveform
plt.plot(np.arange(len(audio_data)), audio_data)
plt.xlabel('Sample')
plt.ylabel('Amplitude')
plt.title('Audio Waveform')
# Displaying the plot
plt.show()

Output

Following is the output of the above code −

Waveform Visualization
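The example above labels the x-axis by sample index. To show time in seconds, divide the sample index by the sample rate. The sketch below illustrates this with samples read back from a WAV file using scipy.io.wavfile; the file name example_tone.wav, the 22050 Hz sample rate, and the 440 Hz tone are hypothetical choices for this illustration.

```python
import os
import tempfile

import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile

# Create a short 440 Hz tone so the example has a file to read
sample_rate = 22050                                    # Hz (assumed)
t = np.arange(int(0.05 * sample_rate)) / sample_rate   # about 50 ms
tone = 0.8 * np.sin(2 * np.pi * 440 * t)

# Write it as 16-bit PCM to a temporary WAV file (hypothetical file name)
wav_path = os.path.join(tempfile.gettempdir(), "example_tone.wav")
wavfile.write(wav_path, sample_rate, (tone * 32767).astype(np.int16))

# Read the file back: SciPy returns the sample rate and the raw samples
rate, samples = wavfile.read(wav_path)
os.remove(wav_path)

# Scale 16-bit integers to roughly [-1, 1] and build the time axis in seconds
amplitude = samples / 32768.0
time_s = np.arange(len(samples)) / rate

fig, ax = plt.subplots()
ax.plot(time_s, amplitude)
ax.set_xlabel("Time (s)")
ax.set_ylabel("Amplitude")
ax.set_title("Audio Waveform")

if __name__ == "__main__":
    plt.show()
```

For a stereo file the samples array would have one column per channel, so you would plot one column at a time.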

Audio Processing: Spectrogram

Spectrograms are used to understand the frequency content of audio signals, identify patterns or changes in sound over time, and analyze the characteristics of audio recordings.

A spectrogram is a visual representation of the frequencies and intensities present in an audio signal over time. It is built by splitting the signal into short segments and measuring the frequency content of each one, like taking a series of snapshots as the audio clip plays.

The spectrogram displays time on the x-axis, frequency on the y-axis, and the intensity of each frequency at a given moment as a color. Brighter colors typically represent higher intensities or amplitudes of frequencies, while darker colors represent lower intensities.

Example

In this example, we are using the specgram() function to generate and display the spectrogram of the audio signal. The x-axis represents time in seconds, the y-axis represents frequency in Hertz, and the color intensity represents the intensity of each frequency component at different points in time −

import matplotlib.pyplot as plt
import numpy as np

# Generating a test tone to use as audio data
# Sample rate in Hz
sample_rate = 44100
# Duration of the audio in seconds
duration = 5
# Creating the time axis; endpoint=False keeps the spacing at exactly 1 / sample_rate
t = np.linspace(0, duration, int(sample_rate * duration), endpoint=False)
# Generating audio data
# by creating a sine wave at 440 Hz frequency
audio_data = np.sin(2 * np.pi * 440 * t)
# Plotting the spectrogram of the audio data
plt.specgram(audio_data, Fs=sample_rate)
plt.xlabel('Time (s)')
plt.ylabel('Frequency (Hz)')
plt.title('Spectrogram')
# Adding a color bar to indicate intensity in decibels
plt.colorbar(label='Intensity (dB)')
# Displaying the plot
plt.show()

Output

Following is the output of the above code −

Spectrogram Visualization
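A constant tone produces a single horizontal line, so it does not show how a spectrogram tracks change over time. As a further sketch, the code below plots a tone whose frequency rises steadily, and uses the arrays returned by specgram() to find the strongest frequency in each time slice. The sweep from 200 Hz to 2000 Hz and the NFFT and noverlap values are assumptions chosen for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

sample_rate = 8000   # Hz (assumed)
duration = 2.0       # seconds
t = np.arange(int(sample_rate * duration)) / sample_rate

# Linear sweep: the instantaneous frequency rises from 200 Hz to 2000 Hz
f_start, f_end = 200.0, 2000.0
sweep_rate = (f_end - f_start) / duration            # Hz per second
phase = 2 * np.pi * (f_start * t + 0.5 * sweep_rate * t ** 2)
audio_data = np.sin(phase)

fig, ax = plt.subplots()
# specgram() returns the computed spectrum, the frequency and time axes,
# and the image that was drawn
spectrum, freqs, times, im = ax.specgram(audio_data, NFFT=512,
                                         Fs=sample_rate, noverlap=256)
ax.set_xlabel("Time (s)")
ax.set_ylabel("Frequency (Hz)")
ax.set_title("Spectrogram of a rising tone")
fig.colorbar(im, ax=ax, label="Intensity (dB)")

# Strongest frequency in each time slice (one column of the spectrum per slice)
ridge = freqs[np.argmax(spectrum, axis=0)]

if __name__ == "__main__":
    plt.show()
```

The plot should show a diagonal line climbing from left to right, and the ridge array follows the same path numerically.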

Audio Processing: Frequency Spectrum

A frequency spectrum represents the distribution of frequencies present in an audio signal. It is like breaking down the sound into its individual frequency components to see how much of each frequency is present.

The frequency spectrum plot shows frequency on the x-axis and the intensity or amplitude of each frequency on the y-axis. Peaks or spikes in the plot indicate frequencies that are particularly strong or dominant in the audio signal.

This visualization helps to understand the composition of the audio, such as the pitch of the sound, the presence of different musical notes or tones, and any background noise or interference.

Example

Here, we use the magnitude_spectrum() function to calculate and display the frequency spectrum of the audio signal. The x-axis represents frequency in Hertz, and the y-axis represents the magnitude of each frequency component:

import matplotlib.pyplot as plt
import numpy as np

# Random noise stands in for real audio samples stored in a NumPy array
audio_data = np.random.randn(10000)
# Sample rate in Hz
sample_rate = 44100

# Plotting the magnitude spectrum of the audio data
plt.magnitude_spectrum(audio_data, Fs=sample_rate)
plt.xlabel('Frequency (Hz)')
plt.ylabel('Magnitude')
plt.title('Frequency Spectrum')
# Displaying the plot
plt.show()

Output

Following is the output of the above code −

Frequency Spectrum
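Random noise spreads its energy across all frequencies, so its spectrum has no clear peaks. To see the peaks described earlier, the following sketch plots the spectrum of two mixed tones and reads the strongest frequency from the arrays that magnitude_spectrum() returns. The 440 Hz and 880 Hz tones and their relative levels are assumptions chosen for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

sample_rate = 8000                          # Hz (assumed)
t = np.arange(sample_rate) / sample_rate    # exactly one second, so bins are 1 Hz apart

# Two tones: 440 Hz at full level and 880 Hz at half level
audio_data = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)

fig, ax = plt.subplots()
# magnitude_spectrum() returns the magnitudes, their frequencies, and the plotted line
spectrum, freqs, line = ax.magnitude_spectrum(audio_data, Fs=sample_rate)
ax.set_xlabel("Frequency (Hz)")
ax.set_ylabel("Magnitude")
ax.set_title("Frequency Spectrum of Two Tones")

# Frequency of the tallest peak, and the relative height of the 880 Hz peak
peak_freq = freqs[np.argmax(spectrum)]
idx_880 = np.argmin(np.abs(freqs - 880))
ratio = spectrum[idx_880] / spectrum.max()
print(f"Strongest component: {peak_freq:.0f} Hz")

if __name__ == "__main__":
    plt.show()
```

The two spikes in the plot correspond to the two tones, and the second one should be about half as tall as the first.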