vsai/pr2_listens

pr2_listens

An interface that lets the PR2 robot listen for speech and audio commands and respond to them by physically moving and taking the appropriate action.

This project was developed as part of a research project at Carnegie Mellon University.

Dependencies:

  • SoX - Sound eXchange - http://sox.sourceforge.net/
  • PortAudio - For audio input collection and initial handling
  • Google Speech API - (v1)
  • ROS (Robot Operating System) on PR2

Audio editing techniques: For audio editing, I used SoX, a command-line tool for resampling and editing audio tracks. For documentation on the techniques and commands used, see: http://sox.sourceforge.net/Docs/Documentation
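
As an illustration of the kind of resampling SoX handles, here is a minimal Python sketch that builds and runs a SoX command line. The file names, the 16 kHz sample rate, and the mono channel count are assumptions chosen for the example, not settings taken from this project; the helper function names are hypothetical.

```python
import subprocess
from typing import List


def build_sox_resample_command(
    src: str, dst: str, rate_hz: int = 16000, channels: int = 1
) -> List[str]:
    """Build a SoX command that resamples `src` and writes the result to `dst`.

    The default rate and channel count are illustrative assumptions.
    """
    # SoX applies format options to the file that follows them, so placing
    # -r (sample rate) and -c (channel count) before `dst` sets the output format.
    return ["sox", src, "-r", str(rate_hz), "-c", str(channels), dst]


def resample(src: str, dst: str, rate_hz: int = 16000, channels: int = 1) -> None:
    """Run SoX (must be installed and on PATH) to resample an audio file."""
    subprocess.run(
        build_sox_resample_command(src, dst, rate_hz, channels), check=True
    )
```

For example, `resample("command.wav", "command_16k.wav")` would run `sox command.wav -r 16000 -c 1 command_16k.wav`.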

Architecture Design: Several architecture designs were tried and tested, and the following was chosen as the best option. One central server is responsible for listening and for speech-to-text translation, with the speech-to-text step handled by the Google Speech API. Each client pings the server to indicate that it wants the server to listen for commands. As long as "KEEP ALIVE" messages keep arriving, the server continues to listen. When no clients are active, the server stays alive but temporarily stops listening and translating.
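
The keep-alive behaviour described above could be sketched roughly as follows. This is not the project's actual code: the class name `ListenGate`, the 5-second timeout, and the client identifiers are all hypothetical, and the network transport is left out so that only the "listen while any client is active" logic is shown.

```python
import time
from typing import Callable, Dict, List


class ListenGate:
    """Tracks client keep-alives and decides whether the server should listen.

    Hypothetical sketch: the timeout value and client IDs are assumptions.
    """

    def __init__(
        self,
        timeout_s: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._timeout_s = timeout_s
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._last_seen: Dict[str, float] = {}

    def keep_alive(self, client_id: str) -> None:
        """Record a KEEP ALIVE message (or initial ping) from a client."""
        self._last_seen[client_id] = self._clock()

    def active_clients(self) -> List[str]:
        """Return clients whose last keep-alive is within the timeout."""
        now = self._clock()
        # Forget clients that have gone quiet for longer than the timeout.
        self._last_seen = {
            cid: seen
            for cid, seen in self._last_seen.items()
            if now - seen <= self._timeout_s
        }
        return sorted(self._last_seen)

    def should_listen(self) -> bool:
        """The server listens only while at least one client is active."""
        return bool(self.active_clients())
```

A server loop would call `keep_alive()` whenever a client message arrives and check `should_listen()` before capturing audio and sending it for transcription; when it returns `False`, the process stays up but skips listening.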
