Jan-16-2023, 03:42 PM
Sound files generated by the program are coming out corrupted or not working properly
I am having an issue with a program I wrote in Python. The program converts the text files in a specific folder into speech audio files using the Azure Cognitive Services Text to Speech API and saves the generated audio files in a different folder. However, the audio files come out corrupted and do not play properly. I have not been able to find a solution and would greatly appreciate any suggestions or help.
Thank you in advance.
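To narrow down what "corrupted" means here, one option is to look at the first bytes of a generated file and see what kind of data was actually saved. The helper below is only a sketch and not part of the program: the name `sniff_audio_container` is hypothetical, and the checks cover just a few common signatures (RIFF/WAVE, MP3, Ogg, and text such as an XML or JSON error message).

```python
def sniff_audio_container(data: bytes) -> str:
    """Guess what kind of payload `data` is from its leading bytes.

    Hypothetical diagnostic helper: it only recognises a few common
    signatures and returns "unknown" for everything else.
    """
    # WAV files start with "RIFF", a 4-byte size, then "WAVE".
    if len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    # MP3 files often start with an ID3 tag ...
    if data[:3] == b"ID3":
        return "mp3"
    # ... or directly with an MPEG frame sync (11 set bits).
    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
        return "mp3"
    # Ogg containers start with the "OggS" capture pattern.
    if data[:4] == b"OggS":
        return "ogg"
    # An XML/HTML or JSON body suggests an error message was saved as audio.
    if data.lstrip()[:1] in (b"<", b"{"):
        return "text"
    return "unknown"
```

Reading the first 16 bytes of one output file in binary mode (`open(path, "rb")`) and passing them to this function should be enough. If a file named `.WAV` reports `mp3` or `text`, the problem is more likely in the request than in the code that writes the file.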
import os
import requests
from array import array

# Global constants
API_KEY = ""
ENDPOINT_URL = ""
TEXT_FOLDER = "C:/Users/user/Desktop/text"
AUDIO_FOLDER = "C:/Users/user/Desktop/audio"
VOICE_OPTION = "en-US-JessaNeural"


def file_list_to_array(folder, extension):
    """
    Returns an array of file names that match the specified extension in the given folder.
    """
    files = []
    for file in os.listdir(folder):
        if file.endswith(extension):
            files.append(file)
    return files


def text_to_speech(text_file, audio_file):
    """
    Converts the text in the given text file to speech and saves the generated audio as the given audio file.
    """
    headers = {
        "Ocp-Apim-Subscription-Key": API_KEY,
        "Content-Type": "application/ssml+xml",
        "X-Microsoft-OutputFormat": "audio-24khz-48kbitrate-mono-WAV",
        "User-Agent": "Edge"
    }
    with open(text_file, "r") as f:
        text = f.read()
    body = f"<speak version='1.0' xmlns='https://www.w3.org/2001/10/synthesis' xml:lang='en-US'><voice name='{VOICE_OPTION}'>{text}</voice></speak>"
    response = requests.post(ENDPOINT_URL, headers=headers, data=body)
    if response.status_code != 200:
        print(f"Error: {response.content}")
        return
    with open(audio_file, "wb") as f:
        f.write(response.content)
    print(f"Text in {text_file} converted to speech and saved as {audio_file}")


def main():
    text_files = file_list_to_array(TEXT_FOLDER, ".txt")
    for text_file in text_files:
        audio_file = text_file.replace(".txt", ".WAV")
        audio_file = os.path.join(AUDIO_FOLDER, audio_file)
        text_file = os.path.join(TEXT_FOLDER, text_file)
        text_to_speech(text_file, audio_file)


if __name__ == "__main__":
    main()

Hello everyone,
I posted a thread on this forum seeking assistance with a Python script that I am trying to fix. Unfortunately, I have not received any responses yet. I understand that everyone is busy and may not have the time to answer right away, but if anyone could take a look at my post and provide some help or guidance, it would be greatly appreciated.
Thank you
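A possible cause, offered as a guess rather than a confirmed fix: the `X-Microsoft-OutputFormat` value in the posted code, `audio-24khz-48kbitrate-mono-WAV`, does not appear among the format names in the Azure Text to Speech documentation. There, the `audio-24khz-48kbitrate-mono-mp3` name denotes MP3 output, while WAV output uses names such as `riff-24khz-16bit-mono-pcm`. Two smaller things may also matter. The SSML namespace is normally written with `http://`, not `https://`. And text containing `&` or `<` has to be XML-escaped before it goes into the SSML body. The sketch below only builds the headers and body and sends nothing. The helper names `build_headers` and `build_ssml` and the `User-Agent` string are hypothetical, and this has not been run against the service.

```python
from xml.sax.saxutils import escape

# Standard SSML namespace (note: http, not https).
SSML_NAMESPACE = "http://www.w3.org/2001/10/synthesis"


def build_headers(api_key, output_format="riff-24khz-16bit-mono-pcm"):
    """Build request headers (hypothetical helper).

    Assumption: the default is a RIFF/PCM format name, so the response
    body is WAV data that matches a ".WAV" file extension.
    """
    return {
        "Ocp-Apim-Subscription-Key": api_key,
        "Content-Type": "application/ssml+xml",
        "X-Microsoft-OutputFormat": output_format,
        "User-Agent": "text-folder-to-speech",  # placeholder application name
    }


def build_ssml(text, voice, lang="en-US"):
    """Wrap plain text in a minimal SSML document (hypothetical helper)."""
    # Escape &, < and > so the input text cannot break the XML.
    safe_text = escape(text)
    return (
        f"<speak version='1.0' xmlns='{SSML_NAMESPACE}' xml:lang='{lang}'>"
        f"<voice name='{voice}'>{safe_text}</voice></speak>"
    )
```

In the original `text_to_speech`, these would replace the inline `headers` and `body`, with the body sent as UTF-8 bytes: `requests.post(ENDPOINT_URL, headers=build_headers(API_KEY), data=build_ssml(text, VOICE_OPTION).encode("utf-8"))`. If MP3 output is preferred instead, an alternative is to keep an MP3 format name and save the files with a `.mp3` extension.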
