javascript - What are the ways to implement speech recognition in Electron? - Stack Overflow

So I have an Electron app that uses the Web Speech API (SpeechRecognition) to capture the user's voice; however, it's not working. The code:

if ("webkitSpeechRecognition" in window) {
  let SpeechRecognition =
    window.SpeechRecognition || window.webkitSpeechRecognition;
  let recognition = new SpeechRecognition();

  recognition.onstart = () => {
    console.log("We are listening. Try speaking into the microphone.");
  };

  recognition.onspeechend = () => {
    recognition.stop();
  };

  recognition.onresult = (event) => {
    let transcript = event.results[0][0].transcript;
    console.log(transcript);
  };

  recognition.start();
} else {
  alert("Browser not supported.");
}

It logs We are listening... to the console, but no matter what you say, it never produces any output. Running the exact same code in Google Chrome works, and whatever I say gets logged by the console.log(transcript); line. After some more research, it turns out that Google recently dropped support for the Web Speech API in shell-based Chromium builds (to my knowledge, everything that is not Google Chrome or MS Edge), so that seems to be why it is not working in my Electron app.

See:
  • electron-speech library's end
  • Artyom.js issue
  • another Stack Overflow question regarding this

So is there any way I can get it to work in Electron?

asked Jan 18, 2023 at 18:08 by XYBOX; edited Mar 2, 2024 at 13:01
  • Hey, and if possible, maybe this question could gain enough traction to reach the companies managing these APIs, and perhaps they could do something about native support in shell-based browsers. I understand the reasons they might've disabled it, but I think those should be solved in a way other than completely removing support. – XYBOX, Mar 2, 2024 at 13:03

2 Answers


I ended up implementing this with the MediaDevices API: it captures the user's speech through the microphone and streams it over WebSockets to a Python server, which runs the audio through the SpeechRecognition pip package and returns the transcribed text to the client (the Electron app).

This is what I implemented. It is far too much machinery for something this simple, so if someone has a better suggestion, please let me know by writing an answer. A rough sketch of the renderer-side half follows.
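For reference, here is a minimal sketch of the renderer side of that setup. The server address, audio format, and reply protocol below are hypothetical placeholders; the real ones depend on how the Python server is written.

// Renderer-side sketch: capture the microphone with the MediaDevices API
// and stream audio chunks to a transcription server over a WebSocket.
async function streamMicToServer() {
  const socket = new WebSocket("ws://localhost:8765"); // hypothetical server address

  socket.onmessage = (event) => {
    // Assumes the server replies with plain transcript strings.
    console.log("transcript:", event.data);
  };

  // Ask for microphone access (Electron may also need OS-level permission).
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream, { mimeType: "audio/webm" });

  recorder.ondataavailable = (event) => {
    if (event.data.size > 0 && socket.readyState === WebSocket.OPEN) {
      socket.send(event.data); // forward each compressed audio chunk
    }
  };

  socket.onopen = () => recorder.start(250); // emit a chunk every 250 ms
}

streamMicToServer();

On the server side, the received chunks would be decoded and fed to the SpeechRecognition pip package, with the resulting text sent back over the same socket.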

I used Rust, Neon, cpal, and Vosk to build a Node.js module that can start and stop independent OS threads that listen to the mic and recognize text from it in real time. From Node you can select the input device, plug in different language recognizers, hand it trigger words to call back on, and so on. It works for what I built it for, but I can probably put up a repo for it and make it a little more flexible if anyone's interested.

const { app } = require('electron');
const voiceModule = require('./index.node');

// In this demo we stop after two rounds of recognizing target words.
let called = 0;
function onWordsFound(words) {
  console.log('words found:', words);
  called++;
  if (called > 1) {
    console.log('stopping listener');
    voiceModule.stopListener();
    return;
  }
  // setTimeout is used here because the Rust function calling this JS callback
  // must return before the next call to lookForWords; elsewhere in your JS
  // code you can call voiceModule.lookForWords directly.
  setTimeout(() => {
    console.log('calling lookForWords');
    voiceModule.lookForWords(['second', 'words'], true);
  }, 1000);
}

const f = async () => {
  voiceModule.setPathToModel('./models/large'); // the English large model; any Vosk-compatible model works
  const r = voiceModule.listDevices();
  // Just use the default microphone for now; listDevices and setMicName
  // could back a device-selection UI.
  voiceModule.setMicName(r[0]);
  // After selecting the mic, start the listener and pass the callback.
  voiceModule.startListener(onWordsFound);
  voiceModule.lookForWords(['hello', 'world'], false); // false: match ANY word; true: must match ALL words in the list
};

app.whenReady().then(f); // start listening once Electron is ready
