비교해봤는데, 영어 발화자로 한국어를 읽는게 한국 발화자보다 깔끔하다고 느껴진다. 물론 발음은 한국 발화자가 더 좋다.
음성 프리셋 + 합성
text_prompt = """
I have a silky smooth voice, and today I will tell you about
the exercise regimen of the common sloth.
"""
audio_array = generate_audio(text_prompt, history_prompt="en_speaker_1")
# install bark as well as pytorch nightly to get blazing fast flash-attention
!pip install git+https://github.com/suno-ai/bark.git && \
pip uninstall -y torch torchvision torchaudio && \
pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu118
from bark import SAMPLE_RATE, generate_audio, preload_models
from IPython.display import Audio
preload_models()
text_prompt = """
Hello, my name is Suno. And, uh — and I like pizza. [laughs]
But I also have other interests such as playing tic tac toe.
"""
audio_array = generate_audio(text_prompt)
Audio(audio_array, rate=SAMPLE_RATE)
음성 파일 저장
from scipy.io.wavfile import write as write_wav
write_wav("/path/to/audio.wav", SAMPLE_RATE, audio_array)