Motivation

My smartphone has an IR transmitter, which I have already used to build a remote control for my television. The aim now is to control the television by voice command. MIT App Inventor provides the SpeechRecognizer component. Unfortunately, it does not work continuously, and it emits an annoying tone at the start and at the end of each recognition. The extension shown here works continuously and lets you switch off the tones.

Version history

Version	Adjustments
1.0 (2021-03-13)	Initial version
1.1 (2022-02-16)	Problems with Android 11 solved. Property IsRecognitionAvailable added. Method Stop corrected.
1.2 (2022-01-09)	Property Language added.

Download

The UrsAI2ContinuousSpeech ZIP archive is available for download. The archive contains the source code, the compiled binary for uploading to App Inventor, and a sample application.

Usage

The tones can be switched on and off with the SoundEnabled property. Internally, the audio stream for system sounds is muted or, when the sound is switched back on, its volume is restored. If the screen containing the component is destroyed (onDestroy), speech recognition is switched off automatically and the sound is switched on again. The end tone of the Android SpeechRecognizer is delayed, so Android may switch the sound back on before the end tone is triggered. In that case the end tone is audible.

The Start and Stop functions start and stop speech recognition. Unfortunately, the Android SpeechRecognizer does not offer continuous operation. Internally, recognition is therefore restarted immediately after each session ends, until a call to Stop breaks this loop. Restarting may cause short recognition gaps of a few milliseconds, but these had no negative effect in the tests.
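The restart loop described above can be sketched in plain Java. This is a minimal model and not the extension's actual source: OneShotRecognizer, ContinuousListener and onSessionEnded are hypothetical names, and the real Android SpeechRecognizer callbacks are replaced by a single "session ended" callback.

```java
/** Hypothetical stand-in for a one-shot recognizer: every session ends on its own. */
interface OneShotRecognizer {
    /** Starts one recognition session and calls onSessionEnded when it is over. */
    void startSession(Runnable onSessionEnded);
}

/** Restarts a one-shot recognizer after every session until stop() is called. */
class ContinuousListener {
    private final OneShotRecognizer recognizer;
    private boolean running = false;

    ContinuousListener(OneShotRecognizer recognizer) {
        this.recognizer = recognizer;
    }

    /** Begins listening; has no effect if the loop is already active. */
    void start() {
        if (running) {
            return;
        }
        running = true;
        recognizer.startSession(this::onSessionEnded);
    }

    /** Breaks the restart loop; a session that is still open is not restarted. */
    void stop() {
        running = false;
    }

    boolean isRunning() {
        return running;
    }

    /** Called whenever a session ends (result, timeout or error in a real recognizer). */
    private void onSessionEnded() {
        if (running) {
            recognizer.startSession(this::onSessionEnded);
        }
    }
}
```

The short gap mentioned above corresponds to the time between the end of one session and the next startSession call.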

The OnResult event supplies the final result of the speech recognition. It is triggered when the speaker pauses. The OnPartialResult event supplies intermediate results. It is triggered after approximately every recognized word.

Internally, the process is roughly as follows:

Reference

Properties

IsRecognitionAvailable: Returns whether speech recognition is available.
Language: Suggests the language to use for recognizing speech. An empty string (the default) will use the system’s default language.

Language is specified using a language tag with an optional region suffix, such as en or es-MX. The set of supported languages will vary by device.

See also MIT AI2 SpeechRecognizer.Language.
SoundEnabled: Switches the system tones on and off.
Version: Gets the version of the extension.
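To illustrate the format of the Language property, the following sketch parses such a tag with the standard Java class java.util.Locale. It only shows how a tag like es-MX splits into language and region. Whether the extension handles the tag this way internally is an assumption, and LanguageTagDemo and resolve are names made up for this example.

```java
import java.util.Locale;

class LanguageTagDemo {
    /**
     * Returns the locale for a language tag such as "en" or "es-MX".
     * An empty tag falls back to the given default, mirroring the
     * "empty string means system default" rule of the Language property.
     */
    static Locale resolve(String tag, Locale systemDefault) {
        if (tag == null || tag.isEmpty()) {
            return systemDefault;
        }
        return Locale.forLanguageTag(tag);
    }

    public static void main(String[] args) {
        Locale mexicanSpanish = resolve("es-MX", Locale.getDefault());
        // Language code and optional region are available separately.
        System.out.println(mexicanSpanish.getLanguage()); // es
        System.out.println(mexicanSpanish.getCountry());  // MX
    }
}
```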

Functions

Start (): Starts speech recognition.
Stop (): Terminates speech recognition.

Events

OnPartialResult (Text): Intermediate result approximately after every spoken word.
OnResult (Text): Result of speech recognition after a pause in speech. Speech recognition is switched off after this event.

Example PopupTest

The example recognizes voice commands and converts them into IR codes to operate my TV.

Components used:

UrsAI2ContinuousSpeech for speech recognition.
UrsAI2IrXmitter for sending the IR codes.
UrsAI2KeepAlive to prevent Doze mode.

When initializing, KeepAlive is switched on, the sound is switched off and speech recognition is switched on:

The commands are evaluated in the OnPartialResult event, which provides the necessary data earlier than OnResult.

Only a few example commands are implemented, not the entire command set from the remote control example.

The CheckCommand procedure evaluates the results of the speech recognition:

Text is the result of speech recognition, Command is the command to be checked against, and Code is an index in a list of IR codes. If a hit is found, the speech recognition is switched off first. This is necessary because a voice output immediately follows that is not intended to serve as its own input. The CallAction procedure sends the IR code associated with Code. Finally, "ok" is output via the voice output.
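For readers who prefer text to blocks, the sequence in CheckCommand can be modelled in Java roughly as follows. This is a sketch under assumptions: a hit is taken to mean that the recognized text contains the command word (ignoring case), and the three actions are only recorded in a log instead of calling the real components. CommandChecker and its log are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class CommandChecker {
    /** Records the actions taken, in order, in place of real component calls. */
    final List<String> log = new ArrayList<>();

    /**
     * text:    result of the speech recognition
     * command: command word to check against
     * code:    index into the list of IR codes
     * Returns true if the command was found and handled.
     */
    boolean checkCommand(String text, String command, int code) {
        // Assumption: a "hit" means the command word occurs anywhere in the text.
        String haystack = text.toLowerCase(Locale.ROOT);
        String needle = command.toLowerCase(Locale.ROOT);
        if (!haystack.contains(needle)) {
            return false;
        }
        log.add("stop recognition");       // first, so that "ok" is not heard as input
        log.add("send IR code #" + code);  // corresponds to CallAction
        log.add("say ok");                 // voice output as confirmation
        return true;
    }
}
```

The order matters: recognition is stopped before anything is spoken.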

The global variable IsSpeaking controls the reactivation of the speech recognition. Usually the OnResult event, at which the speech recognition has to be restarted, is triggered before the end of the speech output. If a command is recognized, however, speech recognition should restart after the end of the speech output.
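One possible reading of this logic as Java code is shown below. It is a sketch, not a transcription of the blocks: RecognitionRestartPolicy, onCommandRecognized and afterSpeaking are hypothetical names, and the restart itself is only counted. The assumption is that IsSpeaking is set when a command triggers the voice output and cleared when the voice output has finished.

```java
class RecognitionRestartPolicy {
    private boolean isSpeaking = false; // models the global variable IsSpeaking
    private int restarts = 0;           // counts restarts instead of calling Start

    /** A command was recognized and the voice output "ok" is about to begin. */
    void onCommandRecognized() {
        isSpeaking = true;
    }

    /** OnResult: restart at once, unless a voice output is still pending. */
    void onResult() {
        if (!isSpeaking) {
            restart();
        }
    }

    /** The voice output has finished: now it is safe to listen again. */
    void afterSpeaking() {
        isSpeaking = false;
        restart();
    }

    int restartCount() {
        return restarts;
    }

    private void restart() {
        restarts++;
    }
}
```

With this arrangement the spoken "ok" never reaches the recognizer, because recognition only resumes after the voice output has ended.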

Tools

I have gathered some tips for developing your own extensions: AI2 FAQ: Develop Extensions.