6.808
Lab 4: Detecting Human Gestures via Sound Signals
Assigned: 2022-03-11
Due: 2022-03-30
The goal of this lab is to implement code in a Jupyter notebook to detect human gestures via sound signals. We have provided all of the code to display the processed results (e.g., spectrograms, plots, etc.), but you will need to implement some parts of the FMCW background-subtraction code in order to visualize human gestures.
This lab is based on a 6.808 project in Spring 2021 by Cooper Jones, Willie Zhu and Jan Wojcik.
Start by downloading the IPython notebook (.ipynb) for this lab.
Known problem: Do not use Google Colab to run the .ipynb; the PortAudio library cannot be used in Colab.
Start by downloading the Anaconda software. After installing, open the Anaconda Navigator and click “Launch” for Jupyter Notebook. It should open in Chrome. Create a new directory called “Lab_4_FMCW” and upload the “6.808 Lab 4 – Gesture Recognition via FMCW.ipynb” notebook file into that directory. Open the file and you should be able to run the different code blocks in it. Specifically, click on the box under “Initialization” and then click “Run”. Do the same for the box under “Filter Functions”. Now that you have initialized the functions, you are ready to implement the code for the lab.
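Note: the notebook relies on the sounddevice package (a Python wrapper around the PortAudio library, used as “sd” later in this lab). If importing it fails, you can install it from a notebook cell; this assumes a standard Anaconda environment:

```python
# Run this in a notebook cell only if `import sounddevice` fails
# (assumed dependency; it wraps the PortAudio library).
!pip install sounddevice
```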
For those unfamiliar with .ipynb files: the order in which you run the cells matters, so be careful when going back through different parts of this notebook (as in Section 2).
The goal of this section is to successfully obtain spectrograms from the recorded FMCW signals. In the next section you will extract gestures using these spectrograms.
The basic sequence is as follows:
Before you can start transmitting and receiving FMCW chirp signals, you need to know how to transmit a single frequency, what it sounds like, what it looks like in the time and frequency domains, and what its spectrogram looks like.
To do this, you will need to implement the “play_and_record” function. Given a frequency, a sampling rate, and a duration as input, this function should play the corresponding sound (using the speaker) and record it (using the microphone); it should also return the recorded sound.
You should write code to do the following:
Once you have successfully implemented this function, execute the code that comes after the function definition. You should hear a loud 10 kHz tone for 2 seconds. When you run the following code blocks, you should be able to see what the signal looks like in the time and frequency domains and what its spectrogram looks like.
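For reference, here is a minimal sketch of what “play_and_record” could look like, assuming the notebook imports sounddevice as sd (the exact signature in the notebook may differ):

```python
import numpy as np
import sounddevice as sd

def play_and_record(freq, fs, duration):
    """Play a pure tone at `freq` Hz for `duration` seconds and record it.
    Sketch only; the notebook's version may differ in signature."""
    t = np.arange(int(fs * duration)) / fs      # sample times in seconds
    tx = np.sin(2 * np.pi * freq * t)           # single-frequency tone
    # sd.playrec() plays `tx` through the speaker and records from the
    # microphone at the same time; blocking=True waits until it finishes.
    rx = sd.playrec(tx.astype(np.float32), samplerate=fs,
                    channels=1, blocking=True)
    return rx.flatten()                         # 1-D array of recorded samples
```

For example, play_and_record(10000, 44100, 2) should produce the 10 kHz, 2-second tone described above.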
Known issue for Mac laptops: When running the “sd.playrec()” command, Google Chrome might ask for permission to access the microphone. Grant access when you see this message; otherwise, the command will only play the sound and will not record it. If you denied access, go to the microphone settings in macOS and grant microphone access to Chrome there.
Once everything is working fine, you should run subsequent blocks to see the received signal in the time domain, frequency domain, and the spectrogram.
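As an illustration of what those blocks compute, here is a hedged sketch of the three views, using numpy's FFT and scipy.signal.spectrogram (the notebook's plotting code may differ):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal

def plot_views(rx, fs):
    """Plot a recorded signal in time, in frequency, and as a spectrogram.
    Sketch only; the notebook's plotting code may differ."""
    t = np.arange(len(rx)) / fs

    # Time domain: raw waveform
    plt.figure()
    plt.plot(t, rx)
    plt.xlabel('Time (s)')
    plt.ylabel('Amplitude')

    # Frequency domain: FFT magnitude over the positive frequencies
    freqs = np.fft.rfftfreq(len(rx), d=1 / fs)
    plt.figure()
    plt.plot(freqs, np.abs(np.fft.rfft(rx)))
    plt.xlabel('Frequency (Hz)')
    plt.ylabel('|FFT|')

    # Spectrogram: how the frequency content evolves over time
    f, tt, Sxx = signal.spectrogram(rx, fs)
    plt.figure()
    plt.pcolormesh(tt, f, 10 * np.log10(Sxx + 1e-12))
    plt.xlabel('Time (s)')
    plt.ylabel('Frequency (Hz)')
    plt.show()
```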
In this task, you will transmit and record an FMCW chirp signal. To do this, you will need to implement the "play_and_record_chirp" function. Specifically, you need to do the following:
Note: If the microphone is not able to record a good signal, try increasing the amplitude of your transmitted signal (tx) by scaling the entire signal by a constant factor (e.g., 100 or 500).
After you have successfully implemented this function, run the subsequent block. You should hear a periodic sweep from your speakers.
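As a starting point, here is a sketch of one possible “play_and_record_chirp”, built on scipy.signal.chirp; the parameter names (f0, f1, chirp_len, n_chirps) are assumptions, and the notebook's version may be structured differently:

```python
import numpy as np
import sounddevice as sd
from scipy.signal import chirp

def play_and_record_chirp(f0, f1, chirp_len, n_chirps, fs):
    """Play `n_chirps` back-to-back linear sweeps from f0 to f1 Hz and
    record the result. Sketch only; parameter names are assumptions."""
    t = np.arange(int(fs * chirp_len)) / fs
    one_chirp = chirp(t, f0=f0, t1=chirp_len, f1=f1, method='linear')
    tx = np.tile(one_chirp, n_chirps)           # repeat the sweep periodically
    # Per the note above, scale up (e.g., x100) if the recording is too weak;
    # values outside [-1, 1] will be clipped on playback.
    tx = tx * 100
    rx = sd.playrec(tx.astype(np.float32), samplerate=fs,
                    channels=1, blocking=True)
    return tx, rx.flatten()
```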
Subsequent code blocks do the following:
Notice that the spectrograms look relatively similar. In this section, we will see how performing background subtraction allows us to track movements in the environment.
background_subtract
In this task, you will perform background subtraction on the received FMCW chirp signals. To do this, you need to implement the background_subtract function, which takes a series of mixed chirp-segment FFTs (i.e., all_multiplied_ffts) as input. Specifically, you need to do the following:
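One common approach, and an assumption about what the notebook intends, is to difference consecutive FFTs: reflections from static objects appear identically in every chirp and cancel, while moving reflectors (e.g., your hand) survive. A minimal sketch:

```python
import numpy as np

def background_subtract(all_multiplied_ffts):
    """Suppress static reflections by differencing consecutive chirp FFTs.
    Sketch of one common approach; the intended method may differ."""
    ffts = np.asarray(all_multiplied_ffts)
    # Static objects produce the same peak in every chirp's FFT, so the
    # difference cancels them; moving reflectors (e.g., your hand) remain.
    # The output has one fewer row than the input.
    return np.abs(ffts[1:] - ffts[:-1])
```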
Subsequent code blocks do the following:
idx_to_distance
In this task, you will estimate the distance using the peak location. To do this, you need to implement the “idx_to_distance” function, which takes the peak location as input. Specifically, you need to do the following:
distance = (Δf · v) / (2 · slope)
where "v" is the speed of sound in air, "slope" is the slope of your FMCW chirp and " Δ𝑓 " corresponds to the peak location.
Hint: The index of the peak is not equal to Δf because it is not in Hz; how can you convert it to Hz?
Once you have successfully implemented this function, run the next code block. It should plot the distance variation as a function of time.
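For reference, a sketch of “idx_to_distance” under the formula above; the extra parameters (fs, n_fft, slope) are assumptions, and the notebook likely provides them as globals:

```python
def idx_to_distance(peak_idx, fs, n_fft, slope, v=343.0):
    """Convert an FFT peak index into a distance in meters.
    Sketch only: fs, n_fft, and slope are assumed parameters; the notebook
    likely provides them as globals. `slope` is the chirp slope in Hz/s and
    `v` is the speed of sound in air (m/s)."""
    delta_f = peak_idx * fs / n_fft        # FFT bin index -> frequency in Hz
    return delta_f * v / (2 * slope)       # round-trip delay -> one-way distance
```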
In this task, you will record different hand gestures. To do this, run the code for Task 1.2 again, but this time, while the chirp sound is playing, do each of the following separately:
For each case, your hand movement should be aligned with the direction of the sound.
Write up your answers to the following items in a single PDF file and name it lab4_kerberos.pdf or lab4_kerberos1+kerberos2.pdf (e.g. lab4_bnagda.pdf or lab4_bnagda+fadel.pdf). Upload the file to your private channel in Slack by Mar 30, 11:59 PM. If you work with a partner, you only have to submit once. You do not need to submit your code, but we may ask to look at your code during the checkoff.
Do this for both movement patterns (i.e., 4 plots in total)
Bonus questions (Optional for extra credit)
The checkoff will require successfully showing the two plots for the two different hand gestures, as well as explaining how each of the functions was implemented.