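So I have a use case where I want to upload an audio file (.WAV) to blob storage, which triggers a function that extracts the text from the audio. At the moment, the only approach that seems possible is saving the audio file locally, because the audio config cannot take a URI for the audio file. This is the code I am using: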
import azure.cognitiveservices.speech as speechsdk
speech_key, service_region = "sub-key", "westeurope"
speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
audio_input = speechsdk.AudioConfig(filename="**BLOB URI**")
speech_recognizer = speechsdk.SpeechRecognizer(speech_config, audio_input)
result = speech_recognizer.recognize_once()
if result.reason == speechsdk.ResultReason.RecognizedSpeech:
    print("Recognized: {}".format(result.text))
elif result.reason == speechsdk.ResultReason.NoMatch:
    print("No speech could be recognized: {}".format(result.no_match_details))
elif result.reason == speechsdk.ResultReason.Canceled:
    cancellation_details = result.cancellation_details
    print("Speech Recognition canceled: {}".format(cancellation_details.reason))
    if cancellation_details.reason == speechsdk.CancellationReason.Error:
        print("Error details: {}".format(cancellation_details.error_details))
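Based on my research, we cannot pass a URI as the filename (the bold part of the code). Solutions such as downloading the file locally first will not work.
I tried reading the audio as a stream, but I could not find a way to convert it into an AudioInputStream.
Any help would be great. Thanks.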
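
For reference, here is a rough sketch of the stream approach I was trying to get working. It is not a confirmed solution. It assumes the blob content is already available in memory as bytes (for example from the function's blob trigger input) and that the file is uncompressed PCM WAV. The helper names `read_wav_pcm` and `recognize_from_bytes` are made up for illustration. The idea is to strip the WAV header with the standard `wave` module and feed the raw PCM frames into a `PushAudioInputStream`, which `AudioConfig` accepts through its `stream` argument.

```python
import io
import wave


def read_wav_pcm(wav_bytes):
    """Parse in-memory WAV bytes.

    Returns (pcm_frames, sample_rate, bits_per_sample, channels).
    Hypothetical helper, assumes uncompressed PCM WAV data.
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        frames = wav.readframes(wav.getnframes())
        return frames, wav.getframerate(), wav.getsampwidth() * 8, wav.getnchannels()


def recognize_from_bytes(wav_bytes, speech_key, service_region):
    """Run single-shot recognition on WAV bytes without touching the local disk.

    Hypothetical helper: wav_bytes is assumed to come from the blob itself.
    """
    # Imported here so read_wav_pcm can be used without the Speech SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    frames, rate, bits, channels = read_wav_pcm(wav_bytes)

    # Describe the raw PCM data so the service knows how to interpret it.
    stream_format = speechsdk.audio.AudioStreamFormat(
        samples_per_second=rate, bits_per_sample=bits, channels=channels
    )
    push_stream = speechsdk.audio.PushAudioInputStream(stream_format=stream_format)
    push_stream.write(frames)
    push_stream.close()  # signals end of audio to the recognizer

    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
    audio_config = speechsdk.audio.AudioConfig(stream=push_stream)
    recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config, audio_config=audio_config
    )
    return recognizer.recognize_once()
```

In a blob-triggered function this would presumably be called with the bytes read from the trigger input, and the returned result could be checked with the same `result.reason` branches as in the code above. Note that `recognize_once()` only returns the first utterance, so longer recordings would likely need continuous recognition instead.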