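I am simply trying to use Python SpeechRecognition to get a transcript from an audio file. No matter what I set for pause_threshold, duration, or anything else, it always gives me exactly the same output: roughly the first 30 seconds of about 80 seconds of audio, and then it cuts off.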

import speech_recognition as sr
import moviepy.editor as mp

clip = mp.VideoFileClip(r"recording2.webm")
clip.audio.write_audiofile(r"converted.wav")

r = sr.Recognizer()
r.pause_threshold = 10
# r.energy_threshold = 4000

audio = sr.AudioFile("converted.wav")
with audio as source:
    audio_file = r.record(source, duration=90)

result = r.recognize_azure(audio_file, key=AZUREKEY, language="en-US", show_all=False, location="westeurope")
print(result)

No matter how I set these, I still get the same result.

1 Answer

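I'm not sure whether this is the right approach, but for now it is an adequate workaround. I split the audio into 30-second chunks and built up the full transcript from them.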

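result = ""  # accumulates the full transcript
with audio as source:
    # Note: this call reads (and consumes) the first second of the file by default.
    r.adjust_for_ambient_noise(source)
    # no_of_chunks is the number of 30-second chunks in the file, computed beforehand.
    for chunk in range(no_of_chunks):
        # Each record() call continues from where the previous one stopped.
        audio_data = r.record(source, duration=30)
        transcript = r.recognize_azure(audio_data, key=AZURE_KEY, language="en-US", show_all=False,
                                       location="westeurope")
        result += transcript + " "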
answered 2021-11-12T14:37:16.247
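The snippet above leaves `no_of_chunks` undefined. Here is a minimal sketch of one way to derive the chunk lengths from the file itself and loop over them. The helper names `chunk_durations`, `wav_duration_seconds`, and `transcribe_in_chunks`, as well as the `recognize` callback, are hypothetical and are not part of SpeechRecognition. It assumes the WAV file is plain PCM that the standard `wave` module can read.

```python
import math
import wave


def chunk_durations(total_seconds, chunk_seconds=30.0):
    """Split a total length into consecutive chunk lengths (all in seconds)."""
    if total_seconds <= 0 or chunk_seconds <= 0:
        return []
    count = math.ceil(total_seconds / chunk_seconds)
    # Every chunk is chunk_seconds long except possibly the last one.
    return [min(chunk_seconds, total_seconds - i * chunk_seconds) for i in range(count)]


def wav_duration_seconds(path):
    """Length of a PCM WAV file in seconds, from its header."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def transcribe_in_chunks(path, recognize, chunk_seconds=30.0):
    """Read the file chunk by chunk and join the partial transcripts.

    `recognize` is a hypothetical callback taking (recognizer, audio_data)
    and returning text, so the recognition service stays swappable.
    """
    import speech_recognition as sr  # imported lazily; third-party dependency

    r = sr.Recognizer()
    pieces = []
    with sr.AudioFile(path) as source:
        for seconds in chunk_durations(wav_duration_seconds(path), chunk_seconds):
            # record() continues from the current position in the stream.
            audio_data = r.record(source, duration=seconds)
            if not audio_data.frame_data:
                break  # nothing left to read
            try:
                pieces.append(recognize(r, audio_data))
            except sr.UnknownValueError:
                continue  # this chunk contained no recognizable speech
    return " ".join(pieces)
```

With the question's setup, a call might look like `transcribe_in_chunks("converted.wav", lambda r, a: r.recognize_azure(a, key=AZURE_KEY, language="en-US", location="westeurope"))`. Cutting at fixed 30-second boundaries can split a word between two chunks, so the joined transcript may contain small errors at the seams.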