azure-speech - How to get NBest alternatives using Azure Speech-to-Text

Question

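I want to use Azure Speech-to-Text to get multiple alternative transcriptions for a single spoken utterance.

I have already set the format=detailed parameter, and the response does contain a field named NBest. However, that field contains only one transcription.

Is there anything else I need to set on the input side?

Thanks.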

score 0 · Accepted Answer

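I believe you have already defined everything that needs to be defined on the input side.

That said, it would be easier to give a precise answer with more information about your setup. For example, I am not sure whether the service behaves the same way in ContinuousRecognition mode as in RecognizeOnce mode.

With the C# code below, I do get a result whose NBest array contains 5 entries. Note, however, that both in the code sample I found and in my own integration shown below, the NBest property is declared as a List. I am not sure whether, in the framework you are using, a different declaration could be the reason your NBest object holds a single result.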

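using System.Collections.Generic;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;
using Newtonsoft.Json;

// Request the detailed output format so the JSON result includes the NBest array.
SpeechConfig _speechConfig = SpeechConfig.FromSubscription(SUBSCRIPTION_KEY, SUBSCRIPTION_REGION);
_speechConfig.SpeechRecognitionLanguage = SPEECH_RECOGNITION_LANGUAGE;
_speechConfig.OutputFormat = OutputFormat.Detailed;

AudioConfig _audioConfig = AudioConfig.FromDefaultMicrophoneInput();
_recognizer = new SpeechRecognizer(_speechConfig, _audioConfig);

// Raised once for each final recognition result.
_recognizer.Recognized += (s, e) => OnRecognized(e);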

    private void OnRecognized(SpeechRecognitionEventArgs e)
    {
        if (e.Result.Reason == ResultReason.RecognizedSpeech)
        {
            SpeechRecognitionResult result = e.Result;
            PropertyCollection propertyCollection = result.Properties;
            string jsonResult = propertyCollection.GetProperty(PropertyId.SpeechServiceResponse_JsonResult);
            var structuredResult = JsonConvert.DeserializeObject<Result>(jsonResult);
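            // NBest can be missing or empty, so guard before indexing.
            var bestResult = structuredResult?.NBest?.Count > 0 ? structuredResult.NBest[0] : null; // <= pick your favorite NBest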
            // Do something with the bestResult of your choice
        }
    }

    public class Word
    {
        // Duration and Offset are reported in 100-nanosecond ticks, which can overflow an int.
        public long Duration { get; set; }
        public long Offset { get; set; }
        public string word { get; set; }
    }

    public class NBest
    {
        public double Confidence { get; set; }
        public string Display { get; set; }
        public string ITN { get; set; }
        public string Lexical { get; set; }
        public string MaskedITN { get; set; }
        public List<Word> Words { get; set; }
    }

    public class Result
    {
        public string DisplayText { get; set; }
        public long Duration { get; set; }
        public string Id { get; set; }
        public List<NBest> NBest { get; set; }
        public long Offset { get; set; }
        public string RecognitionStatus { get; set; }
    }
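
To check how many alternatives the service actually returned, it can help to list every NBest entry with its confidence instead of only reading index 0. The following is a minimal sketch, not part of my integration: the class name NBestReader and the sample JSON are made up for illustration, and it uses System.Text.Json rather than the Newtonsoft.Json deserialization shown above. The only field names it relies on are NBest, Display and Confidence, as declared in the classes above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

public static class NBestReader
{
    // Hand-written sample shaped like a detailed-format result (not real service output).
    public const string SampleJson = @"{
      ""RecognitionStatus"": ""Success"",
      ""DisplayText"": ""What is the weather like?"",
      ""NBest"": [
        { ""Confidence"": 0.87, ""Display"": ""What's the weather like?"" },
        { ""Confidence"": 0.91, ""Display"": ""What is the weather like?"" }
      ]
    }";

    // Returns every alternative as a (Display, Confidence) pair,
    // ordered from highest to lowest confidence.
    public static List<(string Display, double Confidence)> Alternatives(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        var alternatives = new List<(string Display, double Confidence)>();

        // A payload without an NBest array yields an empty list.
        if (!doc.RootElement.TryGetProperty("NBest", out JsonElement nbest)
            || nbest.ValueKind != JsonValueKind.Array)
        {
            return alternatives;
        }

        foreach (JsonElement item in nbest.EnumerateArray())
        {
            // Missing fields fall back to an empty string or 0.0.
            string display = item.TryGetProperty("Display", out JsonElement d)
                ? d.GetString() ?? ""
                : "";
            double confidence = item.TryGetProperty("Confidence", out JsonElement c)
                ? c.GetDouble()
                : 0.0;
            alternatives.Add((display, confidence));
        }

        return alternatives.OrderByDescending(a => a.Confidence).ToList();
    }
}
```

In OnRecognized you could pass jsonResult to NBestReader.Alternatives and log the Count of the returned list. If it is 1, the service really sent a single alternative and the deserialization classes are not the cause.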

score 0 · Accepted Answer

Adding to the comments, to make sure which mechanism you are using:

If you are using the Speech CLI, or want to try it out, do the following.

First, set:

spx config recognize @default.output --set @@output.all.detailed

Then:

spx recognize --file FILE --output all itn text --output all file type json

or

spx recognize --file FILE --output all lexical text --output all file type json
