Lab A - Tonal Analysis
This lab gives you practice in analysing an audio signal in the frequency domain and extracting features from it. The skills learnt here are useful for any analysis system based on tonal features, for example F0/pitch detection, tuning, intonation and vibrato analysis.
Learning outcomes:
1) To be able to process audio signals to extract tonal parameters
2) To be able to use tonal parameters to calculate tonal attributes
Contents:
1) Creating test files
2) Extracting the fundamental frequency
3) Using spectrograms
4) Calculating fundamental frequency over time
5) Creating more complex test files
6) Analysing more complex tones with spectrogram
7) Creating a complex tone with vibrato
8) Analysing vibrato
9) Further work
1. Creating test files
The first section of this lab involves making and using test audio. Having ‘ground truth’ files enables us to know that our system is performing as expected - it is worth also doing this for your assignment in some way!
Make 4 sine waves of different fundamental frequencies (make sure you know what they are!) each with a length of 2 seconds. Check that they are correct using plot and soundsc. Use descriptive variable names.
Make a melody by combining the four sine waves in sequence.
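As a minimal sketch of the two steps above: the sampling frequency and the four note frequencies below are illustrative choices, not values prescribed by the lab.

```matlab
% Assumed values: Fs and the four frequencies are illustrative only.
Fs = 44100;                              % sampling frequency in Hz
noteDuration = 2;                        % length of each tone in seconds
t = (0:noteDuration*Fs - 1) / Fs;        % time vector for one note

% One sine wave per note, named after its known (ground truth) frequency
sine220Hz = sin(2*pi*220*t);
sine330Hz = sin(2*pi*330*t);
sine440Hz = sin(2*pi*440*t);
sine660Hz = sin(2*pi*660*t);

% Melody: the four tones joined end to end
sineMelody = [sine220Hz sine330Hz sine440Hz sine660Hz];
```

To check a tone, plot a few cycles with plot(t(1:500), sine220Hz(1:500)) and listen with soundsc(sine220Hz, Fs).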
2. Extracting the fundamental frequency
Write a function which uses fft to plot the frequency spectrum of the single tones you have created.
fft returns complex values, i.e. the real and imaginary parts of the frequency-domain representation of the time-domain signal. To get the magnitude spectrum, you will also need to use abs.
Reminder: the spectrum returned by fft contains N samples (where N is the number of points specified or, if no value is specified, the length of the input signal) spaced Fs/N apart, from 0 Hz up to just below the sampling frequency Fs. Notice that the lower half of the spectrum is mirrored in the upper half of the plot. This is a consequence of the waveform being real-valued and sampled. Only the N/2 frequencies up to just below Fs/2 carry useful information; the higher frequencies are redundant and can be ignored (usually they are not plotted).
Now, change the scale of the horizontal axis of the spectrum to kHz.
Also change the vertical axis to plot the magnitude spectrum in decibels (dB), i.e. 20*log10 of the magnitude.
Calculate the fundamental frequency from the data returned by fft.
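One possible sketch of this section, written as a script rather than a function so it runs on its own. The test tone, Fs and all variable names are assumptions; picking the largest bin is only one simple way to estimate the fundamental and is reliable here because the input is a single sine wave.

```matlab
% Assumed test tone: a 2 second, 440 Hz sine at Fs = 44100 Hz.
Fs = 44100;
t = (0:2*Fs - 1) / Fs;
testTone = sin(2*pi*440*t);

N = numel(testTone);
complexSpectrum = fft(testTone);               % N complex bins, spaced Fs/N apart

% Keep only the bins below Fs/2; the upper half mirrors them
halfBins = 1:floor(N/2);
magnitude = abs(complexSpectrum(halfBins)) / N; % magnitude spectrum, scaled by N
frequencyHz = (halfBins - 1) * Fs / N;          % bin 1 is 0 Hz
frequencyKHz = frequencyHz / 1000;              % horizontal axis in kHz
magnitudeDb = 20*log10(magnitude + eps);        % eps avoids log10(0)

% Simple F0 estimate: frequency of the strongest bin
[~, peakBin] = max(magnitude);
fundamentalHz = frequencyHz(peakBin);
```

The spectrum can then be drawn with plot(frequencyKHz, magnitudeDb), labelled using xlabel and ylabel. With these values the bin spacing is 0.5 Hz, so the estimate can only be accurate to within that spacing.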
3. Using spectrograms
Although using fft to perform a frequency analysis on a whole signal gives us some very general information about what is present, it does not tell us about where in time components at particular frequencies occur – we need to analyse this with a time-frequency analysis tool.
A very common example of a time-frequency analysis tool is the spectrogram, which describes how the spectrum of a signal changes over time. It does this by dividing the signal up into small (possibly overlapping and tapered) segments and performing a Fourier analysis (as is used for fft) for each one, which shows how energy is distributed in frequency for each of these segments.
Handily, there is a function called spectrogram within Matlab which carries out all of these steps for us and plots the distribution of a signal’s energy over time and frequency in a two-dimensional plot.
The first input to this function is a vector containing the single-channel (i.e. mono) signal that we wish to analyse.
The second input is a scalar (i.e. a single numeric value) that specifies how long each segment (or ‘window’) should be.
The third input is the number of samples by which adjacent segments overlap.
The fourth input is the number of frequency points used to calculate the discrete Fourier transforms.
The fifth input is the sampling frequency.
The sixth input is the location of the frequency axis: ‘xaxis’ or ‘yaxis’.
Plot the spectrogram of your melodic sequence. If you do not assign any outputs, spectrogram creates a plot of the data.
Experiment with changing the window length and overlap size. Increasing the window length gives us better frequency resolution in our spectrogram. This comes at a cost of poorer time-resolution - this is not a problem for a simple sinusoidal input such as this, but may be something you need to think about later.
Assigning outputs to spectrogram means we can access the data, rather than just producing a nice plot.
[s,f,t] = spectrogram(...) returns:
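s = complex short-time spectrum data, with one row per frequency in f and one column per time in t
f = frequencies at which the spectrum is evaluated (in Hz when Fs is given)
t = time instants of the segments (in seconds when Fs is given)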
See if you can recreate the output from spectrogram using the output data - don’t worry about the colorbar values for now. Remember that Matlab image plots such as imagesc place the first row of the data at the top left, whereas spectrograms start at the bottom left. A helpful plot function might be imagesc, as might be the call axis xy.
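The inputs and outputs described above might be used as in the following sketch. It assumes the Signal Processing Toolbox is available; the two-note melody, window length, overlap and number of frequency points are illustrative values only, and the time output is named frameTimes here to avoid clashing with a time vector t.

```matlab
% Requires the Signal Processing Toolbox (spectrogram).
% Assumed test signal: two 1 second sine tones at 220 Hz and 440 Hz.
Fs = 44100;
noteTime = (0:Fs - 1) / Fs;
melody = [sin(2*pi*220*noteTime) sin(2*pi*440*noteTime)];

windowLength = 2048;    % segment length in samples
overlapLength = 1024;   % samples shared by adjacent segments
nfft = 2048;            % number of frequency points per segment

% With outputs assigned, spectrogram returns the data instead of plotting
[s, f, frameTimes] = spectrogram(melody, windowLength, overlapLength, nfft, Fs);

% Recreate the picture from the data: magnitude in dB, low frequencies at the bottom
magnitudeDb = 20*log10(abs(s) + eps);
imagesc(frameTimes, f/1000, magnitudeDb);
axis xy;
xlabel('Time (s)');
ylabel('Frequency (kHz)');
```

Calling spectrogram(melody, windowLength, overlapLength, nfft, Fs, 'yaxis') with no outputs draws the built-in plot for comparison.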
It is also possible to build your own spectrogram by using fft multiple times whilst stepping through the signal.
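A minimal sketch of that idea, using only fft. The test signal, the hand-built Hann taper, the hop size and all variable names are assumptions; the values returned will not match spectrogram exactly, since that function uses its own default window.

```matlab
% Assumed test signal: two 1 second sine tones at 220 Hz and 440 Hz.
Fs = 44100;
noteTime = (0:Fs - 1) / Fs;
melody = [sin(2*pi*220*noteTime) sin(2*pi*440*noteTime)];

windowLength = 2048;                 % samples per segment
hopLength = 1024;                    % step between segment starts (50% overlap)
nfft = 2048;                         % number of frequency points

% Hann taper built by hand so that no toolbox is needed
taper = 0.5 - 0.5*cos(2*pi*(0:windowLength - 1) / windowLength);

numFrames = floor((numel(melody) - windowLength) / hopLength) + 1;
stft = zeros(nfft/2 + 1, numFrames); % one column per segment

for frame = 1:numFrames
    startSample = (frame - 1)*hopLength + 1;
    segment = melody(startSample:startSample + windowLength - 1) .* taper;
    frameSpectrum = fft(segment, nfft);
    stft(:, frame) = frameSpectrum(1:nfft/2 + 1).';  % keep 0 Hz up to Fs/2
end

frequencyHz = (0:nfft/2) * Fs / nfft;                            % row frequencies
frameTimes = ((0:numFrames - 1)*hopLength + windowLength/2) / Fs; % segment midpoints
```

The matrix abs(stft) can be displayed in the same way as the data returned by spectrogram.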
4. Calculating fundamental frequency over time using spectrogram
In a similar way to that used in Section 2, calculate the fundamental frequencies of the melodic sequence and plot against time. Hint: the output from spectrogram will help here.
Expand this to determine the onset time of each note.
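One way to approach both tasks is sketched below. It assumes the Signal Processing Toolbox, a two-note test melody, and a very simple onset rule (a note starts wherever the estimate jumps by more than two frequency bins); the threshold and variable names are assumptions rather than a recommended method.

```matlab
% Requires the Signal Processing Toolbox (spectrogram).
% Assumed test signal: two 1 second sine tones at 220 Hz and 440 Hz.
Fs = 44100;
noteTime = (0:Fs - 1) / Fs;
melody = [sin(2*pi*220*noteTime) sin(2*pi*440*noteTime)];

nfft = 2048;
[s, f, frameTimes] = spectrogram(melody, 2048, 1024, nfft, Fs);

% F0 estimate per segment: frequency of the strongest bin in each column
[~, peakBins] = max(abs(s), [], 1);
fundamental = f(peakBins);
fundamental = fundamental(:).';          % force a row vector

% Assumed onset rule: a jump of more than two bins marks a new note
binWidthHz = Fs / nfft;
changeFrames = find(abs(diff(fundamental)) > 2*binWidthHz) + 1;
onsetFrames = [1 changeFrames];          % the first note starts at the first segment
onsetTimes = frameTimes(onsetFrames);    % onset times in seconds
```

plot(frameTimes, fundamental) shows the estimate against time. The onset times can only be as precise as the spacing between segments, and a segment that straddles two notes contains both frequencies.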
Why might it not always be accurate to do this in the frequency domain?
5. Creating more complex test files
Now create some sawtooth tones with the same fundamental frequencies and lengths as the sine waves you made in Section 1. Build the sawtooth tones from sinusoidal components rather than using the Matlab function - include 10-20 harmonics. Create a melodic sequence in the same way as in Section 1. You should have the same sequence twice, using sine and sawtooth tones.
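A sketch of one such tone, built from the Fourier series of a sawtooth, in which harmonic k has amplitude 1/k. The fundamental, number of harmonics and final normalisation are illustrative choices.

```matlab
% Assumed values: Fs, the fundamental and the harmonic count are illustrative.
Fs = 44100;
noteDuration = 2;
t = (0:noteDuration*Fs - 1) / Fs;
fundamentalHz = 220;
numHarmonics = 15;                       % highest harmonic must stay below Fs/2

% Sum harmonics k = 1..numHarmonics with amplitude 1/k
sawtoothTone = zeros(size(t));
for k = 1:numHarmonics
    sawtoothTone = sawtoothTone + sin(2*pi*k*fundamentalHz*t) / k;
end

% Normalise so the peak amplitude is 1
sawtoothTone = sawtoothTone / max(abs(sawtoothTone));
```

Repeating this for each of the four fundamentals and joining the results gives the sawtooth version of the melody.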
6. Analysing more complex tones with spectrogram
Use the spectrogram function to analyse the sawtooth tone sequence you have created – are the results as you expect? What happens if you calculate the fundamental frequency of a more complex tone?
7. Creating a complex tone with vibrato
Vibrato is a frequency modulation of the fundamental and harmonics of a signal, usually around 4-8Hz. We are going to synthesise a complex tone with vibrato to act as an approximated singing signal.
Use the following code as a starting point to alter your sawtooth tone generation code to frequency-modulate each harmonic. We are modulating the frequency by changing the value passed to sine for each sample.
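lfo_f = 5; %modulating frequency (rate) in Hz
lfo_depth = 5; %modulation depth (extent): peak phase deviation in radians
phase_adjustment = lfo_depth.*sin(2*pi*lfo_f .*t); %t is the time vector in seconds
Harmonic = sin(2*pi*fc*t + phase_adjustment); %fc is the frequency of this harmonic in Hz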
Then create a sequence using the same fundamental frequencies as your sine sequence and complex tone sequence. You should now have 3 sequences: a sine sequence, a sawtooth sequence, and a vibrato sawtooth sequence. These need to match in both frequencies used and length of notes, for reasons that you will see later. Plot these sequences and make sure you can see the harmonics and vibrato.
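A sketch of a single vibrato sawtooth tone based on the starting code. Scaling the whole phase, including the phase adjustment, by the harmonic number k is an assumption made here so that every harmonic stays an exact multiple of the wobbling fundamental; other choices are possible.

```matlab
% Assumed values: Fs, the fundamental and the harmonic count are illustrative.
Fs = 44100;
noteDuration = 2;
t = (0:noteDuration*Fs - 1) / Fs;
fundamentalHz = 220;
numHarmonics = 15;

lfo_f = 5;                               % modulating frequency (rate) in Hz
lfo_depth = 5;                           % modulation depth: peak phase deviation in radians
phase_adjustment = lfo_depth .* sin(2*pi*lfo_f .* t);

% Sum harmonics with amplitude 1/k, each following the modulated phase
vibratoTone = zeros(size(t));
for k = 1:numHarmonics
    vibratoTone = vibratoTone + sin(k * (2*pi*fundamentalHz*t + phase_adjustment)) / k;
end
vibratoTone = vibratoTone / max(abs(vibratoTone));

% The fundamental swings by lfo_depth*lfo_f Hz either side of fundamentalHz
peakDeviationHz = lfo_depth * lfo_f;
```

With these values the fundamental moves between 195 Hz and 245 Hz five times per second, which should be visible in a spectrogram with a short enough window.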
8. Analysing vibrato
Calculate the fundamental frequency over time of your complex vibrato sequence. Do you capture the vibrato information? What might you need to adjust in your spectrogram calculation?
To analyse the vibrato for separate notes, we need to know where notes start and end. This is why it is important that your sequences all match in both fundamental frequencies used and length of notes in the melody. Determining where the notes are in a signal is a fairly complex step in vibrato analysis; here we fake it so that we can look at the final output.
We are going to use the note onset time information from the sine sequence (calculated in Section 4) to analyse the periodicity of the vibrato on each note in turn to determine the vibrato rate.
Use the following code as a start. findpeaks finds the local maxima in a signal - we can use this to calculate the period and therefore the frequency of the vibrato.
[peaks, locations] = findpeaks(fundamental(noteOnset1:noteOnset2-1)); %we are interested in a specific section of the signal
vibratoMaxLocations = t(noteOnset1 - 1 + locations); %locations are relative to noteOnset1; t is the output from spectrogram which gives the time increments
vibratoPeriod = mean(diff(vibratoMaxLocations)); %calculates the mean difference between each time value
vibratoFrequency = 1/vibratoPeriod;
Make sure you understand this code - ask if you are not sure. The frequency you calculate should match the LFO rate (lfo_f) that you specified when creating the complex tone with vibrato.
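To check the logic against a known answer, the code can be run on a made-up fundamental frequency track. Everything below the first comment is an assumption for illustration: a track sampled 100 times per second that wobbles at exactly 5 Hz, treated as one note. findpeaks requires the Signal Processing Toolbox.

```matlab
% Requires the Signal Processing Toolbox (findpeaks).
% Assumed ground truth: a 2 second F0 track with a 5 Hz vibrato.
frameRate = 100;                         % F0 values per second (hypothetical)
t = (0:2*frameRate - 1) / frameRate;     % time of each F0 value in seconds
fundamental = 220 + 25*cos(2*pi*5*t);    % F0 track in Hz

noteOnset1 = 1;                          % first index of the note
noteOnset2 = numel(t) + 1;               % first index after the note

[peaks, locations] = findpeaks(fundamental(noteOnset1:noteOnset2 - 1));
vibratoMaxLocations = t(noteOnset1 - 1 + locations);  % times of the maxima
vibratoPeriod = mean(diff(vibratoMaxLocations));      % mean spacing in seconds
vibratoFrequency = 1 / vibratoPeriod;                 % vibrato rate in Hz
```

Here the maxima are 0.2 seconds apart, so the estimated vibrato rate comes out at the 5 Hz that was put in.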
Download the example file (lc_vss_212_cv7.wav). This is a real-world vibrato singing example, for which we do not have ground truth fundamental frequency values. How does this differ from the synthesised example when you plot the spectrogram? How might you go about analysis of the vibrato in this example?
9. Further work
There are other methods of F0 estimation that are more sophisticated than just taking values from the FFT, and which incorporate the note onset requirement. Some of these can be found in toolboxes that other people have written - see if you can find any online or from literature and explore their inputs, usage, and outputs.
It is also possible to implement tonal analysis in Pure Data, which has various objects similar to the functions explored in this lab. In particular:
Explore [rfft~], [ifft~] and [fiddle~] by working through the examples in this tutorial (http://pd-tutorial.com/english/ch03s08.html), particularly 188.8.131.52 and 184.108.40.206
What other features could you extract from a spectral analysis of the audio input file?