pr2_listens

Interface that lets the PR2 listen to speech and audio commands and respond to them with the appropriate physical movement and action.

This package was developed as part of a research project at Carnegie Mellon University.

Dependencies:

SoX - Sound eXchange - http://sox.sourceforge.net/
PortAudio - For audio input collection and initial handling
Google Speech API (v1) - For speech-to-text translation
ROS (Robot Operating System) on PR2

Audio editing techniques: Audio editing is done with SoX, a command-line tool for resampling and editing audio tracks. For further documentation on the techniques and commands used, please refer to: http://sox.sourceforge.net/Docs/Documentation
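
As an illustration of the kind of SoX call involved, here is a minimal sketch that resamples a recording from Python. The 16 kHz mono target, the file names, and the helper names build_sox_command and resample are assumptions made for this example; they are not taken from the package's scripts.

```python
import subprocess


def build_sox_command(src, dst, rate_hz=16000, channels=1):
    """Build a SoX command line that resamples `src` into `dst`.

    The 16 kHz mono defaults are assumptions for this sketch only.
    """
    # Output format options (-r sample rate, -c channel count) go before the output file.
    return ["sox", src, "-r", str(rate_hz), "-c", str(channels), dst]


def resample(src, dst, **kwargs):
    """Run SoX and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_sox_command(src, dst, **kwargs), check=True)


if __name__ == "__main__":
    # Hypothetical file names; the output format is inferred from the extension.
    print(" ".join(build_sox_command("command.wav", "command.flac")))
```

Keeping the command construction separate from the subprocess call makes it possible to inspect or log the exact SoX invocation without having SoX installed.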

Architecture Design: Several architecture designs were tried and tested, and the following was chosen as the best option. One central server is responsible for "Listening & Speech to text translation"; the speech-to-text step itself is handled by the Google Speech API. Each client pings the server to indicate that it wants the server to listen for commands. As long as "KEEP ALIVE" messages keep arriving, the server continues to listen. When no clients are active, the server stays alive but temporarily stops listening and translating.
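
The keep-alive behaviour described above could be tracked roughly as follows. This is a sketch under stated assumptions, not the package's actual server code: the KeepAliveTracker class, the 5-second timeout, and the client name are hypothetical, and the ROS message plumbing is left out.

```python
import time


class KeepAliveTracker:
    """Tracks which clients have recently sent a "KEEP ALIVE" message."""

    def __init__(self, timeout_s=5.0):
        # Assumed timeout: a client counts as inactive after this many silent seconds.
        self.timeout_s = timeout_s
        self._last_seen = {}

    def keep_alive(self, client_id, now=None):
        """Record a keep-alive message from `client_id`."""
        self._last_seen[client_id] = time.monotonic() if now is None else now

    def active_clients(self, now=None):
        """Drop clients whose last message is too old and return the rest."""
        now = time.monotonic() if now is None else now
        self._last_seen = {
            client: seen
            for client, seen in self._last_seen.items()
            if now - seen <= self.timeout_s
        }
        return sorted(self._last_seen)

    def should_listen(self, now=None):
        """The server listens only while at least one client is active."""
        return bool(self.active_clients(now))


if __name__ == "__main__":
    tracker = KeepAliveTracker(timeout_s=5.0)
    tracker.keep_alive("arm_client", now=0.0)  # hypothetical client name
    print(tracker.should_listen(now=3.0))
    print(tracker.should_listen(now=6.0))
```

The server process itself never exits in this scheme; it only checks should_listen() before capturing and translating audio, which matches the "stays alive, but stops listening" behaviour.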
