
Tutorial Using Speech-Recognition

flyth

is reticulating splines
Staff member
Developer
#1
This is a little tutorial on how to use speech recognition. The feature is still highly experimental and will cause increased CPU & RAM usage. I've tried to make it so it only activates if it is really necessary.

Some limitations upfront:
  • you need at least version 0.13.37
  • for now, you need to declare the commands that should be recognized in advance
  • there is no continuous recognition
  • only 3 speakers are recognized simultaneously; additional speakers are ignored until one of the initial 3 stops speaking. In most cases that is more than enough
  • make sure your config.ini contains
    Code:
    [SpeechRecognition]
    Enable = true
So - let's get started!
  1. Download https://www.sinusbot.com/pre/speech.tar.bz2
  2. Extract that file to the same directory as your SinusBot (so you will have a new folder speech inside the SinusBot directory)
  3. Create a script that uses the speech recognition engine
  4. Restart the bot

Code:
registerPlugin({
    name: 'Speech Recognition Demo',
    version: '2.0',
    description: 'This is a simple script that will recognize the command "stop"',
    author: 'Michael Friese <[email protected]>',
    vars: [],
    voiceCommands: ['stop']
}, function(sinusbot, config) {
    var audio = require('audio');
    var event = require('event');
    var engine = require('engine');
    audio.setAudioReturnChannel(2); // this activates speech recognition
    event.on('speech', function(ev) {
        engine.log(ev.client.nick() + ' just said ' + ev.text);
    });
});
So it's pretty simple: you register some voiceCommands in the plugin's manifest. Once one of the registered commands is recognized, the speech event is triggered with an object containing the client who spoke (ev.client) and the recognized command (ev.text).
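If you register more than one command, a small lookup table keeps the speech handler tidy. The following is only a sketch: the command 'next', the handler bodies and the helper name handleSpeech are made up for illustration, and every word still has to exist in the dictionary. The SinusBot wiring only uses the calls from the demo script above and is skipped when the file is run outside the bot.

```javascript
// Hypothetical command table: command names and handlers are illustrative only.
var handlers = {
    'stop': function (nick) { return nick + ' asked to stop'; },
    'next': function (nick) { return nick + ' asked for the next track'; }
};

// Map a recognized text to its handler; returns null for unknown commands.
function handleSpeech(text, nick) {
    var key = String(text).toLowerCase().trim();
    if (!Object.prototype.hasOwnProperty.call(handlers, key)) {
        return null;
    }
    return handlers[key](nick);
}

// Wiring inside SinusBot (skipped when run elsewhere, e.g. under Node).
if (typeof registerPlugin === 'function') {
    registerPlugin({
        name: 'Speech Dispatch Sketch',
        version: '0.1',
        description: 'Dispatches recognized voice commands via a lookup table',
        author: 'example',
        vars: [],
        voiceCommands: Object.keys(handlers) // register every key of the table
    }, function (sinusbot, config) {
        var audio = require('audio');
        var event = require('event');
        var engine = require('engine');
        audio.setAudioReturnChannel(2); // activates speech recognition, as in the demo
        event.on('speech', function (ev) {
            var result = handleSpeech(ev.text, ev.client.nick());
            if (result !== null) {
                engine.log(result);
            }
        });
    });
}
```

Keeping the lookup in a plain function means you can try it without the bot, for example handleSpeech('stop', 'Alice').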

Important: Only words that are inside the ./speech/dict file will be recognized. It already contains many English words and their phonetic "translation". If you need words that are not in there, you have to add them (and their translation) manually before you can use them.
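To find out which words of your commands are missing from the dictionary before you restart the bot, something like the following could help. It is a rough sketch and rests on an assumption the post does not confirm: that ./speech/dict has one entry per line, with the word first, followed by whitespace and its phonetic translation. The function names are made up as well.

```javascript
// Collect the words listed in a dictionary text.
// ASSUMPTION: one entry per line, word first, then whitespace, then phonemes.
function parseDictWords(dictText) {
    var known = {};
    var lines = String(dictText).split(/\r?\n/);
    for (var i = 0; i < lines.length; i++) {
        var line = lines[i].trim();
        if (line === '') {
            continue; // skip empty lines
        }
        var word = line.split(/\s+/)[0].toLowerCase();
        known[word] = true;
    }
    return known;
}

// Return every word used in the commands that is not in the dictionary.
function missingWords(commands, known) {
    var missing = [];
    for (var i = 0; i < commands.length; i++) {
        var words = commands[i].toLowerCase().split(/\s+/);
        for (var j = 0; j < words.length; j++) {
            var w = words[j];
            if (w === '') {
                continue;
            }
            var inDict = Object.prototype.hasOwnProperty.call(known, w);
            if (!inDict && missing.indexOf(w) === -1) {
                missing.push(w);
            }
        }
    }
    return missing;
}
```

Outside the bot you could run this with Node, read the file via require('fs').readFileSync('./speech/dict', 'utf8') and pass the result to parseDictWords. Any word reported by missingWords would then need to be added to the dict file by hand.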

I'd love to get some feedback on how it worked out for you. In most simple cases it works well enough, so don't try to recognize full sentences but focus on simple commands. Also, you might want to let the commands start with a word like "bot", so that you don't trigger things by accident ;)
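The idea of a leading word like "bot" could be sketched as follows. This assumes that multi-word phrases such as 'bot stop' are accepted as voiceCommands entries and come back unchanged in ev.text, which is implied above but not stated. The names WAKE_WORD, withWakeWord and stripWakeWord are made up for this example.

```javascript
// Hypothetical prefix that every command has to start with.
var WAKE_WORD = 'bot';

// Build the phrases to register, e.g. ['stop'] becomes ['bot stop'].
function withWakeWord(actions) {
    return actions.map(function (action) {
        return WAKE_WORD + ' ' + action;
    });
}

// Return the action part of a recognized phrase, or null if the
// phrase does not start with the prefix followed by an action.
function stripWakeWord(text) {
    var words = String(text).toLowerCase().trim().split(/\s+/);
    if (words.length < 2 || words[0] !== WAKE_WORD) {
        return null;
    }
    return words.slice(1).join(' ');
}
```

In a script you would pass withWakeWord(['stop']) as voiceCommands in the manifest and call stripWakeWord(ev.text) inside the speech handler, ignoring the event whenever it returns null.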
 

flyth

#9
Updated first post to be compatible with version 0.13+.
If you're trying this, please make sure you have read EVERYTHING in that post beforehand :)
 
