SpeechToText Engine Configuration Parameters

A SpeechToText task transcribes audio into text. This section describes the parameters that you can set in the configuration section for a SpeechToText task.

Configuration Parameter   Description
CustomLM                  The path and interpolation weight of each custom language model to use for speech-to-text processing.
ErrorMessage              The message that appears in the transcript when Media Server cannot connect to an IDOL Speech Server.
FilterMusic               Specifies whether to include speech-to-text results for audio segments that Speech Server identifies as music or noise.
Input                     The audio track to process.
Language                  The language pack to use for speech-to-text processing.
MaxConsecutiveTries       The maximum number of attempts that Media Server makes to connect to the servers listed in the SpeechToTextServers parameter.
Mode                      The mode for speech-to-text analysis (you can prioritize accuracy or speed).
ModeValue                 The processing rate. The meaning of this parameter depends on the value of the Mode parameter.
SampleFrequency           The sample frequency of the audio to send to the IDOL Speech Server.
SpeechToTextServers       A list of IDOL Speech Servers to use for speech-to-text.
Type                      The analysis engine to use. Set this parameter to SpeechToText.
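
The following Python sketch shows one way to generate a configuration section that sets some of these parameters. It assumes that a task section is INI-style text with a bracketed section name followed by key=value lines. The section name, track name, language pack name, server address, and retry count are hypothetical placeholders, and the host:port form of the server address is also an assumption. Only Type=SpeechToText comes from the table above.

```python
import configparser
import io


def build_speech_to_text_section(section_name, settings):
    """Render one task section as INI-style text (key=value lines)."""
    parser = configparser.ConfigParser(interpolation=None)
    # Keep parameter names in their original case (the default lowercases them).
    parser.optionxform = str
    parser[section_name] = settings
    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


# Every value except Type is an illustrative placeholder, not a verified setting.
example_settings = {
    "Type": "SpeechToText",                    # required value, per the table
    "Input": "Default_Audio",                  # hypothetical audio track name
    "Language": "ENUK",                        # hypothetical language pack name
    "SpeechToTextServers": "localhost:13000",  # hypothetical host and port
    "MaxConsecutiveTries": "3",                # hypothetical retry limit
}

# "MySpeechToTextTask" is a hypothetical section name.
config_text = build_speech_to_text_section("MySpeechToTextTask", example_settings)
print(config_text)
```

Generating the section from a dictionary keeps the parameter names in one place, which can make it easier to vary values such as Language or SpeechToTextServers between deployments.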

Output Tracks

Output track   Type                 Description
Result         SpeechToTextResult   Contains a record for each word.

SpeechToTextResult

Field name   Type       Description
id           UUID       A universally unique identifier to identify the section of audio described by the record.
text         TextData   The spoken word converted to text.
confidence   Int        The confidence score for the speech-to-text process.
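
As an illustration of how these fields might be consumed, the following Python sketch models a record and joins the words that meet a confidence threshold. It is a minimal sketch, not the Media Server API: the TextData type is simplified to a plain string, the confidence scale and threshold are hypothetical, and it assumes that a higher score indicates a more reliable word.

```python
import uuid
from dataclasses import dataclass


@dataclass
class SpeechToTextResult:
    """One record per word, mirroring the fields listed above."""
    id: uuid.UUID
    text: str        # TextData simplified to a plain string (assumption)
    confidence: int  # assumed: higher means more reliable


def join_confident_words(records, min_confidence):
    """Join the text of records whose confidence meets the threshold."""
    return " ".join(r.text for r in records if r.confidence >= min_confidence)


# Hypothetical sample records; the scores are made up for illustration.
records = [
    SpeechToTextResult(uuid.uuid4(), "hello", 90),
    SpeechToTextResult(uuid.uuid4(), "um", 20),
    SpeechToTextResult(uuid.uuid4(), "world", 85),
]

transcript = join_confident_words(records, min_confidence=50)
print(transcript)  # hello world
```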
