Rhasspy Voice Assistant

Rhasspy is an offline, multilingual voice assistant toolkit inspired by Jasper that works with Home Assistant and Hass.io.

To run Rhasspy using Docker:

docker run -d -p 12101:12101 \
      --restart unless-stopped \
      -e RHASSPY_PROFILES=/profiles \
      -v "$HOME/.rhasspy:/profiles" \
      --device /dev/snd:/dev/snd \
      synesthesiam/rhasspy-hassio-addon:latest

Then visit the web interface at http://localhost:12101

Purpose

A typical voice assistant (Alexa, Google Home, etc.) solves a number of important problems:

1. Deciding when to listen (wake word)
2. Listening for commands/questions (wait for silence)
3. Transcribing command/question (speech to text)
4. Interpreting the speaker's intent from the text (intent recognition)
5. Fulfilling the speaker's intent (e.g., playing a song, answering a question)

Rhasspy provides offline, private solutions to problems 1-4 using off-the-shelf tools. These tools are:

1. Pocketsphinx Keyphrase (wake word)
2. PyAudio (wait for silence)
3. Pocketsphinx (speech to text)
4. RasaNLU (intent recognition)

For problem 5 (fulfilling the speaker's intent), Rhasspy works with Home Assistant's built-in automation capability. For each intent you define, Rhasspy sends an event to Home Assistant that can be used to do anything Home Assistant can do (toggle switches, call REST services, etc.). This means that Rhasspy will do very little out of the box compared to other voice assistants, but there will also be no limits to what can be done.

How it Works

Rhasspy transforms speech commands into Home Assistant events that trigger automations. You define these commands in a Rhasspy profile using a specialized template syntax that lets you control how Rhasspy creates the events it sends to Home Assistant.
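To make the "speech command becomes an event" idea concrete, here is a minimal Python sketch of how an event could be fired through Home Assistant's REST API (the `/api/events/<event_type>` endpoint with a bearer token). This is an illustration, not necessarily how Rhasspy does it internally. The base URL, the token, the `ExampleIntent` name, and the `build_event_request` helper are all placeholders, and the `rhasspy_` prefix follows the event names used in the example further down.

```python
import json
import urllib.request


def build_event_request(base_url, token, intent_name, slots):
    """Build (but do not send) a request that fires a Home Assistant event."""
    # Assumption: one event type per intent, named rhasspy_<IntentName>.
    url = f"{base_url}/api/events/rhasspy_{intent_name}"
    return urllib.request.Request(
        url,
        data=json.dumps(slots).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values, for illustration only.
request = build_event_request(
    "http://localhost:8123", "YOUR_TOKEN", "ExampleIntent", {"name": "value"}
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

The JSON body becomes the event data, which is what an automation's event trigger can match against.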

Let's say you have an RGB light of some kind in your bedroom that's already hooked up to Home Assistant. You'd like to be able to say things like "set the bedroom light to red" to change its color. To start, you could write a Home Assistant automation to help you out:

automation:
  # Change the light in the bedroom to red.
  trigger:
    ...
  action:
    service: light.turn_on
    data:
      rgb_color: [255, 0, 0]
      entity_id: light.bedroom

Now you just need the trigger! Rhasspy will send events that can be caught with the event trigger platform. A different event will be sent for each intent that you define. On the Rhasspy side, define an intent called ChangeLightColor that can be said a number of ways:

[ChangeLightColor]
colors = (red | green | blue) {color}
set [the] (bedroom){name} [to] <colors>

This is a simplified JSGF grammar that will generate the following sentences:

set the bedroom to red
set the bedroom to green
set the bedroom to blue
set the bedroom red
set the bedroom green
set the bedroom blue
set bedroom to red
set bedroom to green
set bedroom to blue
set bedroom red
set bedroom green
set bedroom blue
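If you want to see how a template turns into that list, the expansion can be sketched in a few lines of Python. This is not Rhasspy's actual parser. It is a toy `expand` function (a name made up here) that handles only what the example uses: `[optional]` words, `(a | b)` alternatives, `<rule>` references, and `{tag}` annotations, with no nesting.

```python
import itertools
import re


def expand(template, rules):
    """Return every sentence a simplified (non-nested) template can produce."""
    # Inline rule references such as <colors>.
    template = re.sub(r"<(\w+)>", lambda m: rules[m.group(1)], template)
    # Tags only label words, so drop them before expanding.
    template = re.sub(r"\{\w+\}", "", template)
    # Split into [optional], (alternative), and plain-word tokens.
    tokens = re.findall(r"\[[^\]]*\]|\([^)]*\)|[^\s\[\]()]+", template)
    choices = []
    for token in tokens:
        if token.startswith("["):
            # Optional word: either present or absent.
            choices.append([token[1:-1].strip(), ""])
        elif token.startswith("("):
            choices.append([alt.strip() for alt in token[1:-1].split("|")])
        else:
            choices.append([token])
    return [
        " ".join(word for word in combo if word)
        for combo in itertools.product(*choices)
    ]


rules = {"colors": "(red | green | blue) {color}"}
sentences = expand("set [the] (bedroom){name} [to] <colors>", rules)
for sentence in sentences:
    print(sentence)
```

Two optional words and three colors give 2 x 2 x 3 = 12 sentences, which matches the list above.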

Rhasspy uses these sentences to generate an ARPA language model for speech recognition and to train an intent recognizer. The {color} tag in the colors rule will have Rhasspy put a color property in each event with the name of the recognized color. Likewise, the {name} tag on bedroom will add a name property.

If trained on these sentences, Rhasspy will now recognize commands like "set the bedroom to red" and send a rhasspy_ChangeLightColor event to Home Assistant with the following data:

{
  "name": "bedroom",
  "color": "red"
}

You can now fill in the rest of the Home Assistant automation:

automation:
  # Change the light in the bedroom to red.
  trigger:
    platform: event
    event_type: rhasspy_ChangeLightColor
    event_data:
      name: bedroom
      color: red
  action:
    service: light.turn_on
    data:
      rgb_color: [255, 0, 0]
      entity_id: light.bedroom

This will handle the specific case of setting the bedroom light to red, but not any other color. You can either add additional automations to handle these, or make use of automation templating to do it all at once.
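Whichever route you take, handling every color comes down to a lookup from the color name in the event to an RGB value. The Python sketch below only illustrates that mapping. It is not Home Assistant configuration, and `service_data_for` is a made-up helper. It also assumes the entity id can be derived from the name property as `light.<name>`, which happens to hold for light.bedroom in this example.

```python
# RGB values for the colors in the example grammar.
COLOR_TO_RGB = {
    "red": [255, 0, 0],
    "green": [0, 255, 0],
    "blue": [0, 0, 255],
}


def service_data_for(event_data):
    """Turn event data like {"name": "bedroom", "color": "red"} into light.turn_on data."""
    return {
        # Assumption: the entity id is "light." plus the name from the event.
        "entity_id": f"light.{event_data['name']}",
        "rgb_color": COLOR_TO_RGB[event_data["color"]],
    }


print(service_data_for({"name": "bedroom", "color": "blue"}))
# {'entity_id': 'light.bedroom', 'rgb_color': [0, 0, 255]}
```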

Intended Audience

Rhasspy is intended for advanced users who want a voice interface to Home Assistant, but value privacy and freedom above all else. There are many other voice assistants, but none (to my knowledge) that:

Can function completely disconnected from the Internet
Are entirely free/open source
Work well with Home Assistant and Hass.io

If you feel comfortable sending your voice commands through the Internet for someone else to process, or are not comfortable with rolling your own Home Assistant automations to handle intents, I recommend taking a look at Mycroft.

Customization

Rhasspy allows you to customize every stage of intent recognition, including:

Defining custom wake words
Providing example sentences that you want to be recognized, annotated with intent information
Specifying how you pronounce specific words, including words that Rhasspy doesn't know yet
Splitting speech recording, transcription, and intent recognition across multiple machines

Profiles

All of the files Rhasspy needs for wake word detection, speech transcription, and intent recognition are contained in a profile directory. Out of the box, Rhasspy contains profiles for English (en), Spanish (es), French (fr), German (de), Italian (it), Dutch (nl), and Russian (ru).

The important files in a profile are:

acoustic_model/
- Directory with CMU acoustic model (16 kHz)
base_dictionary.txt
- Large CMU dictionary file with general word pronunciations
custom_words.txt
- Small CMU dictionary file with your own custom word pronunciations
unknown_words.txt
- Small CMU dictionary file with guessed word pronunciations by phonetisaurus
g2p.fst
- Finite state transducer used by phonetisaurus to guess unknown word pronunciations
language_model.txt
- ARPA trigram model created from user sentences
phoneme_examples.txt
- Text file with example words/pronunciations for each phoneme
phonemes.txt
- Text file mapping from CMU to eSpeak phonemes
profile.json
- Overrides profile settings from defaults.json
- See profile documentation for details
sentences.ini
- Intents and sentences used to generate language model and train intent recognizer
rasa_config.yml
- YAML configuration for RasaNLU
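The relationship between profile.json and defaults.json can be pictured as a recursive overlay: anything set in the profile wins, and everything else falls back to the default. The sketch below shows that idea only. The `merge_settings` function and the setting names in it are hypothetical and are not taken from Rhasspy's real defaults.json.

```python
def merge_settings(defaults, overrides):
    """Recursively overlay `overrides` on `defaults` without mutating either."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are nested sections: merge them key by key.
            merged[key] = merge_settings(merged[key], value)
        else:
            # Otherwise the profile value replaces the default.
            merged[key] = value
    return merged


# Hypothetical settings, for illustration only.
defaults = {"language": "en", "wake": {"keyphrase": "okay rhasspy", "enabled": True}}
profile = {"wake": {"keyphrase": "hey computer"}}

settings = merge_settings(defaults, profile)
print(settings)
```

Here only the wake keyphrase changes. The language and the other wake setting are inherited from the defaults.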

Running

Rhasspy is designed to run on Raspberry Pis (armhf) and desktops/laptops (amd64). It can run as a Hass.IO add-on, within Docker, or inside a Python virtual environment.

Docker

Make sure you have Docker installed:

curl -sSL https://get.docker.com | sh

and that your user is part of the docker group:

sudo usermod -a -G docker $USER

Be sure to log out and back in (or reboot) after adding yourself to the docker group!

Next, start the Rhasspy Docker image in the background:

docker run -d -p 12101:12101 \
      --restart unless-stopped \
      -e RHASSPY_PROFILES=/profiles \
      -v "$HOME/.rhasspy:/profiles" \
      --device /dev/snd:/dev/snd \
      synesthesiam/rhasspy-hassio-addon:latest

The web interface should now be accessible at http://localhost:12101

If you're using Docker Compose, try this:

services:
    rhasspy:
        image: "synesthesiam/rhasspy-hassio-addon:latest"
        restart: unless-stopped
        environment:
            RHASSPY_PROFILES: "/profiles"
        volumes:
            # Profiles are stored next to the compose file
            - "./rhasspy_config:/profiles"
        ports:
            - "12101:12101"
        devices:
            # Give the container access to the sound devices
            - "/dev/snd:/dev/snd"

Hass.IO

Add my Hass.IO Add-On Repository in the Add-On Store, refresh, then install the "Rhasspy Assistant" under “Synesthesiam Hass.IO Add-Ons” (all the way at the bottom of the Add-On Store screen).

NOTE: Beware that on a Raspberry Pi 3, the add-on can take 10-15 minutes to build and around 1-2 minutes to start.

Watch the system log for a message like “Build 8e35c251/armhf-addon-rhasspy:1.1 done”. If the “Open Web UI” link on the add-on page doesn’t work, please check the log for errors, wait a minute, and try again.

Virtual Environment

This repository is designed to host a Python virtual environment for running Rhasspy outside of Docker. This may be desirable if you have trouble getting Rhasspy to access your microphone from within a Docker container. To start, clone the repo somewhere:

git clone https://github.com/synesthesiam/rhasspy-hassio-addon.git

Then run the create-venv.sh script (assumes a Debian-based distribution):

cd rhasspy-hassio-addon/
./create-venv.sh

Once the installation finishes (5-10 minutes on a Raspberry Pi 3), you can use the run-venv.sh script to start Rhasspy:

./run-venv.sh

If all is well, the web interface will be available at http://localhost:12101

Supporting Tools

The following tools/libraries help to support Rhasspy:

Flask (web server)
Pocketsphinx (speech to text)
Opengrm (language modeling)
Phonetisaurus (word pronunciations)
fuzzywuzzy (fuzzy string matching)
RasaNLU (intent recognition)
Python 3
Sox (WAV conversion)
Vue.js (web UI)
webrtcvad (voice activity detection)
