Year of the Voice - Chapter 2: Let's talk


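This year is Home Assistant’s Year of the Voice. Our goal for 2023 is to let users control Home Assistant in their own language. Today we’re presenting Chapter 2, our second milestone towards that goal.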

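In Chapter 1, we focused on intents - what the user wants to do. Today, the Home Assistant community has translated common smart home commands and responses into 45 languages, closing in on the 62 languages that Home Assistant supports.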

For Chapter 2, we’ve expanded beyond text to now include audio; specifically, turning audio (speech) into text, and text back into speech. With this functionality, Home Assistant’s Assist feature can now provide a full voice interface for users to interact with.

A voice assistant also needs hardware, so today we’re launching ESPHome support for Assist and, to top it off, we’re launching the World’s Most Private Voice Assistant. Keep reading to see what that entails.

To watch the video presentation of this blog post, including live demos, check the recording of our live stream.

Composing voice assistants

The new Assist Pipeline integration allows you to configure all components that make up a voice assistant in a single place.

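For voice commands, pipelines start with audio. A speech-to-text system determines the words the user speaks, which are then forwarded to a conversation agent. The agent extracts the intent from the text, and Home Assistant executes it. At this point, “turn on the light” would cause your light to turn on. The last part of the pipeline is text-to-speech, where the agent’s response is spoken back to you. This may be a simple confirmation (“Turned on light”) or the answer to a question, such as “Which lights are on?”.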
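The flow described above can be sketched in a few lines of Python. This is only an illustration of the stage order, not Home Assistant’s implementation: the names `Pipeline`, `fake_stt`, `fake_agent`, and `fake_tts` are hypothetical, and the stages are stubs standing in for real speech-to-text, conversation, and text-to-speech services.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures, invented for this sketch.
SpeechToText = Callable[[bytes], str]     # audio -> recognized text
ConversationAgent = Callable[[str], str]  # text -> response text (after acting on the intent)
TextToSpeech = Callable[[str], bytes]     # response text -> audio


@dataclass
class Pipeline:
    """Chains the three stages in the order described in the post."""

    stt: SpeechToText
    agent: ConversationAgent
    tts: TextToSpeech

    def run(self, audio: bytes) -> bytes:
        text = self.stt(audio)        # 1. speech-to-text
        response = self.agent(text)   # 2. intent handling
        return self.tts(response)     # 3. text-to-speech


# Stub stages: "audio" is just UTF-8 text here so the example stays runnable.
lights = {"light": False}


def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_agent(text: str) -> str:
    if text == "turn on the light":
        lights["light"] = True  # stands in for executing the intent
        return "Turned on light"
    return "Sorry, I didn't understand that"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = Pipeline(stt=fake_stt, agent=fake_agent, tts=fake_tts)
reply = pipeline.run(b"turn on the light")
print(reply.decode("utf-8"))
```

Because each stage is just a callable, swapping one service for another (for example, a different text-to-speech voice) only means passing a different function when building the pipeline.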

Screenshot of the new Assist configuration in Home Assistant.

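With the new Voice Assistant settings page, users can create multiple assistants, mixing and matching voice services. Want a U.S. English assistant that responds with a British accent? No problem. What about a second assistant that listens for Dutch, German, or French voice commands? Or maybe you want to throw ChatGPT into the mix. Create as many assistants as you want, and use them from the Assist dialog as well as voice assistant hardware for Home Assistant.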

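Interacting with many different services means that many different things can go wrong. To help users figure out what went wrong, we’ve built extensive debug tooling for voice assistants into Home Assistant. You can always inspect the last 10 interactions per voice assistant.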

Screenshot of the new Assist debug tool.

Voice assistant powered by Home Assistant Cloud

The Home Assistant Cloud subscription, besides an end-to-end encrypted remote connection, includes state-of-the-art speech-to-text and text-to-speech services. This allows your voice assistant to speak 130+ languages (including dialects like Peruvian Spanish) and to respond extremely fast.

As a subscriber, you can start using voice in Home Assistant right away. You will not need any extra hardware or software to get started.

In addition to high quality speech-to-text and text-to-speech for your voice assistants, you will also be supporting the development of Home Assistant itself.

Join Home Assistant Cloud today

The fully local voice assistant

With Home Assistant you can be guaranteed two things: there will be options and one of those options will be local. With our voice assistant that’s no different.

Piper: our new model for high quality local text-to-speech

To make quality text-to-speech possible when running locally, we had to create our own text-to-speech system, optimized for the Raspberry Pi 4. It’s called Piper.

Piper logo

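Piper uses modern machine learning algorithms for realistic-sounding speech but can still generate audio quickly. On a Raspberry Pi 4, Piper can generate 2 seconds of audio with only 1 second of processing time. More powerful CPUs, such as the Intel Core i5, can generate 17 seconds of audio in the same amount of time.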
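These figures can be restated as a real-time factor (processing time divided by audio duration), which makes it easy to estimate how long a spoken response takes to synthesize. The short sketch below uses only the two numbers quoted in this post; the function names and the 5-second example response are assumptions made for illustration.

```python
def real_time_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Processing time needed per second of generated audio (lower is faster)."""
    return processing_seconds / audio_seconds


def synthesis_time(response_seconds: float, rtf: float) -> float:
    """Estimated processing time for a response of the given spoken length."""
    return response_seconds * rtf


# Figures quoted in the post: audio generated per 1 second of processing.
pi4_rtf = real_time_factor(audio_seconds=2.0, processing_seconds=1.0)
i5_rtf = real_time_factor(audio_seconds=17.0, processing_seconds=1.0)

response_seconds = 5.0  # hypothetical length of a spoken response

print(f"Raspberry Pi 4: {synthesis_time(response_seconds, pi4_rtf):.2f} s")
print(f"Intel Core i5:  {synthesis_time(response_seconds, i5_rtf):.2f} s")
```

For a 5-second response this works out to about 2.5 seconds of processing on the Raspberry Pi 4 and roughly 0.3 seconds on the Core i5, assuming throughput stays constant across response lengths.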

For more samples, see the Piper website

An add-on with Piper is available now for Home Assistant with over 40 voices across 18 languages, including: Catalan, Danish, German, English, Spanish, Finnish, French, Greek, Italian, Kazakh, Nepali, Dutch, Norwegian, Polish, Brazilian Portuguese, Ukrainian, Vietnamese, and Chinese. Voices for Piper are trained from open audio datasets, many of which come from free audiobooks read by volunteers. If you’re interested in contributing your voice, let us know!

You can also run Piper as a standalone Docker container

Local speech-to-text with OpenAI Whisper

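Whisper is an open source speech-to-text model created by OpenAI that runs locally. Since its release in 2022, Whisper has been improved by the open source community to run on less powerful hardware through projects such as whisper.cpp and faster-whisper. In less than a year of progress, Whisper is now capable of providing speech-to-text for dozens of languages on small servers and single-board computers!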

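An add-on using faster-whisper is available now for Home Assistant. On a Raspberry Pi 4, voice commands can take around 7 seconds to process, using about 200 MB of RAM. An Intel Core i5 CPU or better is capable of sub-second response times and can run larger (and more accurate) versions of Whisper.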

You can also run Whisper as a standalone Docker container

Wyoming: the voice assistant glue

Voice assistants share many common functions, such as speech-to-text, intent-recognition, and text-to-speech. We created the Wyoming protocol to provide a small set of standard messages for talking to voice assistant services, including the ability to stream audio.

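Wyoming allows developers to focus on the core of a voice service without having to commit to a specific networking stack like HTTP or MQTT. The protocol is compatible with the upcoming version 3.0 of Rhasspy, so both projects can share voice services.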

With Wyoming, we’re trying to kickstart a more interoperable open voice ecosystem that makes sharing components across projects and platforms easy. Developers and scientists wishing to experiment with new voice technologies need only implement a small set of messages to integrate with other voice assistant projects.
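To give a feel for what “a small set of messages” with audio streaming can look like, here is a toy framing in Python. This post does not describe the actual Wyoming wire format, so everything below is an assumption made for illustration: the JSON header line followed by raw payload bytes, the `write_message` and `read_message` helpers, and the message names are all invented placeholders. Consult the Wyoming protocol documentation for the real message definitions.

```python
import io
import json
from typing import BinaryIO, Optional, Tuple

# Toy framing invented for this sketch (NOT the documented Wyoming format):
# one JSON header line, then exactly `payload_length` raw bytes.


def write_message(stream: BinaryIO, msg_type: str, data: dict, payload: bytes = b"") -> None:
    header = {"type": msg_type, "data": data, "payload_length": len(payload)}
    stream.write(json.dumps(header).encode("utf-8") + b"\n")
    stream.write(payload)


def read_message(stream: BinaryIO) -> Optional[Tuple[str, dict, bytes]]:
    line = stream.readline()
    if not line:
        return None  # end of stream
    header = json.loads(line)
    payload = stream.read(header["payload_length"])
    return header["type"], header["data"], payload


# An in-memory buffer stands in for a network connection.
buffer = io.BytesIO()
write_message(buffer, "audio-chunk", {"rate": 16000}, b"\x00\x01" * 4)
write_message(buffer, "transcript", {"text": "turn on the light"})

buffer.seek(0)
messages = []
while (msg := read_message(buffer)) is not None:
    messages.append(msg)

for msg_type, data, payload in messages:
    print(msg_type, data, len(payload))
```

The point of the sketch is that the framing only needs a byte stream, so the same messages could travel over any transport without tying a voice service to HTTP or MQTT.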

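Both the Whisper and Piper add-ons mentioned above are integrated into Home Assistant via the new Wyoming integration. Wyoming services can also run on other machines and still integrate into Home Assistant.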

ESPHome powered voice assistants

ESPHome is our software for microcontrollers. Instead of programming, users define how their sensors are connected in a YAML file. ESPHome will read this file and generate and install software on your microcontroller to make this data accessible in Home Assistant.

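Today we’re launching support for building voice assistants using ESPHome. Connect a microphone to your ESPHome device, and you can control your smart home with your voice. Include a speaker and the smart home will speak back.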

We’ve been focusing on the M5STACK ATOM Echo for testing and development. For $13 it comes with a microphone and a speaker in a nice little box. We’ve created a tutorial to turn this device into a voice remote directly from your browser!

Tutorial: create a $13 voice remote for Home Assistant.

ESPHome Voice Assistant documentation.

World’s Most Private Voice Assistant

If you were designing the world’s most private voice assistant, what features would it have? To start, it should only listen when you’re ready to talk, rather than all the time. And when it responds, you should be the only one to hear it. This sounds strangely familiar…

A phone! No, not the featureless rectangle you have in your pocket; an analog phone. These great creatures once ruled the Earth with twisty cords and unique looks to match your style. Analog phones have a familiar interface that’s hard to beat: pick up the phone to listen/speak and put it down when done.

With Home Assistant’s new Voice-over-IP integration, you can now use an “old school” phone to control your smart home!

By configuring off-hook autodial, your phone will automatically call Home Assistant when you pick it up. Speak your voice command or question, and listen for the response. The conversation will continue as long as you please: speak more commands/questions, or simply hang up. Assign a unique voice assistant/pipeline to each VoIP adapter, enabling dedicated phones for specific languages.

We’ve focused our initial efforts on supporting the Grandstream HT801 Voice-over-IP box. It works with any phone with an RJ11 connector, and connects directly to Home Assistant. There is no need for an extra server.

Tutorial: create your own World’s Most Private Voice Assistant

Give your voice assistant personality using the OpenAI integration.

Some links on this page are affiliate links and purchases using these links support the Home Assistant project.
