Speech Detection

This script detects faces in the webcam video feed with OpenCV and recognizes speech from the audio input with the Azure Speech SDK. It generates a reply to the recognized speech with OpenAI's GPT-3 API, converts the reply to audio with gTTS, and plays it back with VLC. A tkinter GUI shows a textbox and buttons for switching the speech-recognition language.

Requirements

OpenCV

pip install opencv-python

Azure Speech Recognition

pip install azure-cognitiveservices-speech

gTTS

pip install gTTS

VLC

Download and install the VLC media player: https://www.videolan.org/vlc/download-windows.en-GB.html

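tkinter

Included with most Python installations, so no pip install is needed (the tk package on PyPI is not tkinter). On Debian or Ubuntu, install it with: sudo apt install python3-tk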

OpenAI Python library (an OpenAI API key is also required, see Usage)

pip install openai
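
Before running the script, it can help to confirm that every requirement above is importable. The following is a minimal sketch, not part of main.py: the helper name missing_modules is hypothetical, and the import name vlc assumes the python-vlc bindings, which the list above does not mention explicitly.

```python
import importlib.util

# Import name -> how to install it. The "vlc" entry is an assumption:
# it presumes the python-vlc bindings on top of the VLC player itself.
REQUIRED_MODULES = {
    "cv2": "pip install opencv-python",
    "azure.cognitiveservices.speech": "pip install azure-cognitiveservices-speech",
    "gtts": "pip install gTTS",
    "vlc": "pip install python-vlc (assumed), plus the VLC player",
    "tkinter": "bundled with most Python installations",
    "openai": "pip install openai",
}


def missing_modules(modules=REQUIRED_MODULES):
    """Return the import names that cannot be found in this environment."""
    missing = []
    for name in modules:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec raises when a parent package (e.g. "azure") is absent.
            found = False
        if not found:
            missing.append(name)
    return missing
```

Calling missing_modules() returns a list of import names that are not installed; an empty list means all requirements were found. Each name can be looked up in REQUIRED_MODULES for an install hint.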

Usage

Set the environment variable azure_api_key to your Azure API key.
Set the environment variable openai_api_key to your OpenAI API key.
Run the script: python main.py
The script will start the webcam and display the video feed in a window.
When a face is detected in the video, the script draws a rectangle around it.
When speech is detected, it will transcribe the speech to text, generate a response, generate speech from the response text, and play the generated speech.
The transcribed speech and generated response will be displayed in the GUI textbox.
Click the "English" or "German" button in the GUI to change the language for speech recognition.
Press "Q" while the script is running to exit.
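
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in main.py: the function names load_api_keys and handle_utterance, the callables passed in, and the locale codes en-US and de-DE are all hypothetical. The real script would supply the Azure, OpenAI, gTTS, and VLC calls in place of the callables.

```python
import os
from typing import Callable, Mapping, Optional, Tuple

# Assumed locale codes for the two GUI buttons; the script may use others.
LANGUAGE_CODES = {"English": "en-US", "German": "de-DE"}


def load_api_keys(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (azure_key, openai_key), failing early if either is unset."""
    keys = []
    for name in ("azure_api_key", "openai_api_key"):
        value = env.get(name)
        if not value:
            raise RuntimeError(f"environment variable {name!r} is not set")
        keys.append(value)
    return keys[0], keys[1]


def handle_utterance(
    language_button: str,
    transcribe: Callable[[str], str],       # locale -> recognized text
    generate_reply: Callable[[str], str],   # recognized text -> reply text
    synthesize: Callable[[str, str], str],  # (reply, locale) -> audio file path
    play: Callable[[str], None],            # audio file path -> playback
    log: Callable[[str], None],             # line of text -> GUI textbox
) -> Optional[str]:
    """Run one speech turn: transcribe, reply, synthesize, display, play."""
    locale = LANGUAGE_CODES[language_button]
    text = transcribe(locale)
    if not text:
        return None  # nothing was recognized, so there is nothing to answer
    reply = generate_reply(text)
    audio_path = synthesize(reply, locale)
    log(f"You: {text}")
    log(f"Bot: {reply}")
    play(audio_path)
    return reply
```

Passing the stages in as callables keeps the sketch independent of any particular service, so each stage can be replaced with a stub when testing without a microphone or API keys.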

LeonBurghardtDev/ai-face-and-speech-detection
