Face It 👀, Save All Your Happy Moments Via Google AIY
About the project
We could all use a reminder of the good times when we are down. For happy memory curation, we propose Face it 👀, an automatic happy-moment archiver built on the Google AIY Voice Kit. It begins recording automatically when you say a trigger word and saves the moment to play back later.
Estimated time: 1 week
Items used in this project
The main goal of this project is to voice-trigger a recording of a moment and play it back later at a time of your choosing. Using the original setup of the Google AIY Voice Kit, we were able to record a 3 s voice message and replay it by manually entering a command. However, we needed to write extra code to start the recording automatically based on voice recognition. Below is a video of our final product demo, in which you will see an example of how Face it 👀 could be used at a surprise birthday party.
Below are the steps you will need to follow to reproduce what we did in the video (there are different ways of programming it, and we are showing just one approach out of the millions out there 😊):
1. Build the Google AIY Voice Kit
We followed the official Google AIY Voice Kit guide to build our kit: https://aiyprojects.withgoogle.com/voice/#assembly-guide
The guide is easy to follow, and we took photos along the way assembling it:
2. Configure the Kit
After assembling the kit, you will need to set up SSH to get the terminal ready, connect to the Raspberry Pi, and get credentials from the Google Cloud Platform in order to use the Google Assistant APIs. One thing we had to fix beyond the tutorial was an error in one of the example files when executing src/examples/voice/assistant_grpc_demo.py. You will need to change one function call in one of the Python scripts and then continue following the Google tutorial as normal. For more details on how to fix the issue, refer to: https://github.com/google/aiyprojects-r ... issues/658
3. Use cases
We built three use cases for this project, all based on trigger-word recording with voice recognition. Imagine there is a Paddington bear, who is friends with Teddy bear and Pinky bear. One day Teddy and Pinky bear bought a Face it 👀, which uses Google AIY voice recognition technology. By setting different key words and functions in the code, Teddy and Pinky bear want to try different games with Paddington using the kit. Below is what they planned:
1. Prosocial teasing with friends – invite him/her to eat chicken feet
2. Birthday surprise party gift
3. Google Google, who is the fairest of us all?
In the first use case, Teddy and Pinky bear invite Paddington to try a traditional Chinese finger food (real fingers, by all means): chicken feet. They record Paddington’s first reaction to the invitation (without Paddington knowing it, because the automatic voice trigger word is “chicken feet”) and replay the recording after Paddington has tried the food and changed his mind. It is a friendly way of teasing Paddington, to encourage him never to say no to new things he has not tried before. It is just like the many parents who say they don’t want a dog or cat and then, once you finally get them one, change their minds and cherish the pet more than anything else. You can record such scenarios with Face it 👀 at any time:
In the second use case, Teddy and Pinky bear organize a surprise party for Paddington’s birthday. When Teddy bear says “Let’s sing a birthday song”, the key word “birthday song” triggers the automatic recording without Paddington knowing it. What’s more, Teddy wants to give Paddington this precious voice recording as a birthday gift later, so he can either use the trigger word “birthday gift” to have Google replay the song they sang, or save the recording as an audio file to send to Paddington.
Last but not least, Teddy bear hard-coded a fun question and answer into the kit: whenever someone asks “Google google, who is the fairest of us all?”, the kit detects the key phrase “fairest of us all” and Google replies “Teddy bear, is the fairest in the world”. The key is that after this question Teddy says “next question”, which switches the kit from the “special question” mode back to the normal Google Assistant Q&A mode without anyone noticing. So when Paddington then asks the kit any other question, such as “what day is it today”, Google replies as usual, based on the Google Assistant library.
4. Set up your IDE to connect to the microSD card
To test our code more quickly, we connected Visual Studio to the kit and pushed the command below to overwrite the code directly on the microSD card. This lets you see the outcome of what you write almost immediately. We modified the grpc.py file (~AIY-project-python/src/aiy/assistant/grpc.py) from one of the default examples of the Google Voice Kit, as well as another demo file (~AIY-project-python/src/examples/voice/assistant_grpc_demo_ex.py); email@example.com is our IP address:
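The original push command is not reproduced on this page. As a rough sketch only, assuming the files are copied over SSH with scp, a small Python helper could assemble such a command. The function name, the login string and the remote path are hypothetical placeholders, not the values we used:

```python
import subprocess
from typing import List


def build_push_command(local_file: str, pi_login: str, remote_path: str) -> List[str]:
    """Build an scp command that copies one local file onto the Pi.

    pi_login is a placeholder of the form "user@host"; replace it with
    the login and address of your own Raspberry Pi.
    """
    return ["scp", local_file, f"{pi_login}:{remote_path}"]


def push(local_file: str, pi_login: str, remote_path: str) -> None:
    """Run the copy and raise an error if scp reports a failure."""
    subprocess.run(build_push_command(local_file, pi_login, remote_path), check=True)
```

Calling push("grpc.py", "pi@raspberrypi.local", "/home/pi/grpc.py") would overwrite the remote file with the local one, where both the host name and the target path are illustrative.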
5. Write and run the code
Here comes the juicy part of this project: the real “behind the scenes”.
1. Modify the code so that the kit starts listening to the conversation as soon as it is powered on, instead of waiting for a button press as in the default Google Voice Kit setup. Listening has to stay on all the time in order to catch the trigger words later.
2. Once the kit is listening continuously, we call the function conversation2 to listen for the trigger words that either start a conversation with Google or ask Google to stop speaking (“shut up”):
The reason we need Google to “shut up” is that we want her to listen quietly for the trigger word while we talk, instead of interrupting the users’ conversation whenever she doesn’t understand something. For any conversation she doesn’t understand, she should stay silent until she captures a trigger word.
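The switch between answering and quiet listening can be illustrated with a small, hardware-free sketch. This is not our conversation2 function: the names run_listening_loop, QA_MODE and QUIET_MODE are hypothetical, the transcripts stand in for whatever the speech recognizer returns, and starting in quiet mode is an assumption.

```python
from typing import Callable, Iterable, List

QA_MODE = "qa"        # Google answers every utterance
QUIET_MODE = "quiet"  # Google stays silent and only watches for trigger words


def run_listening_loop(
    transcripts: Iterable[str],
    answer: Callable[[str], str],
    start_mode: str = QUIET_MODE,  # assumption: the kit starts out quiet
) -> List[str]:
    """Return the replies that would be spoken aloud for a stream of transcripts."""
    mode = start_mode
    spoken = []
    for text in transcripts:
        heard = text.lower()
        if "shut up" in heard:
            mode = QUIET_MODE   # stop answering but keep listening
        elif "hey google" in heard:
            mode = QA_MODE      # resume normal question answering
        elif mode == QA_MODE:
            spoken.append(answer(text))
        # In quiet mode, anything that is not a trigger is ignored silently.
    return spoken
```

Here answer is any function that turns a question into a reply, so the loop can be tried out on a laptop before wiring it to the Assistant.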
We call the function Listner to listen for the trigger words that replay the respective earlier recordings:
3. Below is the set-up for all the trigger words in the three use cases:
“hey google” – start the normal conversation with Google (Q&A mode)
“shut up” – ask Google to keep quiet while waiting for the trigger word (turn off the Q&A mode and change to “quiet listening” mode)
“everybody ready” – ask Google to keep quiet while waiting for the trigger word (in use case 2, turn off the Q&A mode and change to “quiet listening” mode)
“chicken feet” – start recording a voice message (in use case 1)
“birthday song” – start recording a voice message (in use case 2)
“play memory” – replay the recording from the voice message triggered by “chicken feet” (in use case 1)
“birthday gift” – replay the recording from the voice message triggered by “birthday song” (in use case 2)
“next question” – ask Google to keep quiet while waiting for the trigger word (in use case 3, turn off the Q&A mode and change to “quiet listening” mode)
“fairest of us all” – trigger to play the answer “teddy bear, is the fairest in the world” (in use case 3)
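The trigger words listed above can be kept in one lookup table, so that adding a new use case only means adding a row. The sketch below is a simplified illustration and not our actual code: the action labels and the function match_trigger are hypothetical names, and it assumes the recognized speech arrives as plain text.

```python
from typing import List, Optional, Tuple

# (trigger phrase, action label, use case number or None), following the list above.
# The action labels are hypothetical names chosen for this sketch.
TRIGGERS: List[Tuple[str, str, Optional[int]]] = [
    ("hey google", "qa_mode", None),
    ("shut up", "quiet_mode", None),
    ("everybody ready", "quiet_mode", 2),
    ("chicken feet", "record", 1),
    ("birthday song", "record", 2),
    ("play memory", "replay", 1),
    ("birthday gift", "replay", 2),
    ("next question", "quiet_mode", 3),
    ("fairest of us all", "canned_answer", 3),
]


def match_trigger(transcript: str) -> Optional[Tuple[str, Optional[int]]]:
    """Return (action, use_case) for the first trigger phrase found, else None."""
    heard = transcript.lower()
    for phrase, action, use_case in TRIGGERS:
        if phrase in heard:
            return action, use_case
    return None
```

Matching on a substring of the lower-cased transcript means the trigger can sit anywhere in a sentence, which is what lets “Let’s sing a birthday song” start a recording unnoticed.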
Function for recording and playing:
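Our original function is not reproduced on this page. As a minimal sketch, assuming the standard ALSA command-line tools arecord and aplay that ship with Raspberry Pi OS, recording and playback could look like the code below. The function names are hypothetical, and the 3-second default simply mirrors the message length mentioned earlier:

```python
import subprocess
from typing import Callable, List


def record_command(path: str, seconds: int = 3) -> List[str]:
    """Build an arecord call that captures `seconds` of CD-quality audio to a WAV file."""
    return ["arecord", "-d", str(seconds), "-f", "cd", "-t", "wav", path]


def play_command(path: str) -> List[str]:
    """Build an aplay call that plays back a WAV file."""
    return ["aplay", path]


def record(path: str, seconds: int = 3, run: Callable = subprocess.run) -> None:
    """Record a voice message; `run` can be swapped out for testing without hardware."""
    run(record_command(path, seconds), check=True)


def play(path: str, run: Callable = subprocess.run) -> None:
    """Replay a previously recorded voice message."""
    run(play_command(path), check=True)
```

With this shape, a “chicken feet” trigger would call record("chicken_feet.wav") and “play memory” would call play("chicken_feet.wav"), where the file name is only an example.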
Our team had a lot of fun doing this project while trying out and playing with the Google kit. We think it is a very good tool for entertaining your family and friends and for exploring the AI world using mostly basic tool setups. It is also a good way for Python beginners to dive in and learn coding through play. In our original proposal, we wanted to record memories by combining both vision and voice recordings; however, the camera in the Vision Kit we received was not working from the start. We have contacted the firstname.lastname@example.org team for a replacement part, which will take time to ship, so our development will continue once the camera arrives. We are happy with our current Face it 👀 result, especially the automatic voice listening and trigger-word-based replay functions. We also want to thank the electromaker platform for giving us the opportunity to test the Google AIY kits, and for all the help answering our questions on their website forum. We hope that you enjoy our team’s idea and solutions as much as we enjoyed this contest.
A sincere thank you from our team: