Espressif S3 Box 3, aka HA Voice Assist

Took the plunge and ordered one last night; it should be here in a few days. Once it arrives I’ll be diving into the HA voice assistant journey, and I plan to update you guys with info on ease of setup, functionality, how it compares to Google, etc.

So stay tuned :slight_smile:


So the box arrived earlier this evening and I’ve begun to mess around with it a bit. There are a couple of options for using these as a local voice assistant with HA; I am covering the pure HA/ESPHome offering here. I do plan to install Willow at some point soon and will cover it at that time.

Setting it up for use with Home Assistant and ESPHome is surprisingly easy, as is configuring HA to take advantage of it. Even creating and installing a custom wake word is almost trivial to do.

Custom generated wake words work rather well at the default settings, with the box catching the wake word about 80% of the time. I have not yet generated any using the settings recommended for increased accuracy, as I wanted to make sure they would work before investing the time in more in-depth training.

The mic on these little boxes is quite decent as well, with the box able to understand me from a couple of rooms away, and over TV or music, with about 95% accuracy.

You can currently customize the images displayed on screen while it is waiting for the wake word, listening, thinking, responding, and encountering an error.

Without any additional setup you can control any device installed in HA by voice at this point, and ask about some things like the weather. HA refers to these commands/actions as intents and sentences. You can create your own custom intents/sentences, allowing you to do anything you could already do with HA.
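If you want to check how HA handles a sentence without talking to the box, you can send the text straight to HA's conversation endpoint. This is only a rough sketch: the base URL and token below are placeholders I made up, and you would need to create a long-lived access token in your HA profile first.

```python
import json
import urllib.request


def build_conversation_request(base_url: str, token: str, text: str,
                               language: str = "en") -> urllib.request.Request:
    """Build (but do not send) a POST to HA's /api/conversation/process endpoint."""
    payload = json.dumps({"text": text, "language": language}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/conversation/process",
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def ask_assist(base_url: str, token: str, text: str) -> dict:
    """Send the sentence to HA and return the decoded JSON reply."""
    request = build_conversation_request(base_url, token, text)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


# Example usage (placeholder URL and token, replace with your own):
# reply = ask_assist("http://homeassistant.local:8123",
#                    "YOUR_LONG_LIVED_TOKEN",
#                    "turn on the kitchen light")
# print(reply)
```

This goes through the same sentence matching the voice pipeline uses after speech-to-text, so it is a quick way to try a custom sentence before testing it by voice.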

Unfortunately, support for these boxes has only just begun to land in ESPHome and, more importantly, in ESP-IDF, so a lot of the capabilities are currently unused/unavailable. There are of course also some bugs that are still being squashed.

The touchscreen is useless as anything other than a display to let you know what the assistant is doing. You can’t use it to control anything or adjust any settings.

The volume cannot be adjusted and is way too low to be of use unless it is right next to you.

None of the sensors are available, including the radar for presence detection.

ESP-IDF does not currently feature a media player, so TTS is all you can push to the device, and so far I cannot get it to work without it sounding rushed and cut off at the end.

The device randomly freezes after processing a command and needs to be reset. Based on a quick search this may be related to custom wake words, but looking at the logs I think it’s another issue with ESP-IDF, as support for several parts of these boxes was only recently added.

Lots of potential, could be a fun device if you enjoy tinkering or have experience with the esp/arduino platforms. Not ready for the average user quite yet though.

I will be looking into how the Willow platform behaves on it; it sounds more promising, as it has the touchscreen working and can also integrate with HA. There is also a firmware from Espressif that I need to look into, along with any community-based solutions.

Will update in the future as I progress.

Setup info/links for those interested in playing along (yes it really is this easy to get started)

Follow this to install Piper and Whisper (and openWakeWord if you want a custom wake word): Installing a local Assist pipeline - Home Assistant

Next, plug the box into your computer, open this web page, and follow the instructions to flash the latest ESPHome firmware for it and connect it to HA.

Creating a custom wake word model can be done here; just follow the instructions:

A decent video and write-up to help you along with the process if you have issues: Local Voice Assistance with Wake Word in Home Assistant! Bye Bye Google Home and Alexa! - Smart Home Junkie - Tutorials and Information for your Smart Home and Home Assistant

And of course feel free to ask and I’ll help where I can.


OK, trying this:

I then click this:

and I then get to this.

Is there a different, better way?


With the Core installation of HA lacking the official add-on store, I believe you would need to manually install Whisper, Piper, and openWakeWord in their own containers and then use the built-in Wyoming integration in HA to connect them to the Core install via IP address and port number.

Well, at least until Markus has time to finish the update images, as I believe they implement the full HA install, which includes the add-on store.

Sorry about the confusion; I forgot I had moved my HA install off of Core and onto an LXC.
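If you do go the separate-containers route, it can help to confirm each service is reachable before adding it in the Wyoming integration. Here is a minimal sketch; the port numbers are the defaults I believe the Wyoming containers commonly use (an assumption, so check your own container config), and the host address in the usage comment is just a placeholder.

```python
import socket

# Assumed default ports for the Wyoming services; adjust to match your containers.
WYOMING_SERVICES = {
    "whisper": 10300,
    "piper": 10200,
    "openwakeword": 10400,
}


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_services(host: str, services: dict = WYOMING_SERVICES) -> dict:
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in services.items()}


# Example usage (placeholder address, replace with your container host):
# for name, reachable in check_services("192.168.1.50").items():
#     print(name, "reachable" if reachable else "NOT reachable")
```

This only checks that something is listening on the port, not that it speaks the Wyoming protocol, but it quickly rules out wrong IPs, unmapped container ports, and firewall issues.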