Text to speech #9

taehallm · 2023-11-20T03:57:56Z

taehallm
Nov 20, 2023
Maintainer

Features

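One of the Audio APIs provided by OpenAI. Uses the tts-1 or tts-1-hd model.
Converts text into spoken audio.
Six built-in voices are provided.
Can generate speech in multiple languages (including Korean), though quality is best in English.
Supports real-time audio output via streaming.
The default output file format is mp3; opus, aac, and flac are also supported.
- Opus: low latency, suited to internet streaming and communication
- AAC: digital audio compression; preferred by YouTube, Android, and iOS
- FLAC: lossless audio compression; preferred by audio enthusiasts for archiving
- Plan: test with mp3 first, then try the other formats
There is no feature for controlling emotion (tone).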

Parameters

model(str): tts-1 or tts-1-hd
input(str): the text to generate audio for (maximum length: 4096 English characters)
voice(str): the voice to use when generating the audio (one of alloy, echo, fable, onyx, nova, or shimmer)
response_format(str): the format of the output audio. Defaults to mp3; the supported formats are the same ones listed above.
speed(num): optional. The speed of the generated audio, a value between 0.25 and 4.0. Defaults to 1.0.
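To make the constraints above concrete, here is a minimal sketch of a helper that checks the arguments before a request is sent. The helper name `build_speech_request` and the constant names are hypothetical (not part of the OpenAI library); the allowed values are copied from the list above, and counting the input limit with `len()` is an assumption.

```python
# Allowed values copied from the parameter notes above.
# All names in this block are illustrative, not part of the openai package.
MODELS = {"tts-1", "tts-1-hd"}
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
FORMATS = {"mp3", "opus", "aac", "flac"}
MAX_INPUT_CHARS = 4096  # assumption: the limit is measured with len()


def build_speech_request(text, model="tts-1", voice="alloy",
                         response_format="mp3", speed=1.0):
    """Validate the arguments and return keyword arguments for a speech request."""
    if model not in MODELS:
        raise ValueError(f"unknown model: {model!r}")
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    if response_format not in FORMATS:
        raise ValueError(f"unsupported format: {response_format!r}")
    if not text or len(text) > MAX_INPUT_CHARS:
        raise ValueError("input must be 1 to 4096 characters long")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    return {
        "model": model,
        "voice": voice,
        "input": text,
        "response_format": response_format,
        "speed": speed,
    }
```

The returned dict could then be unpacked into the call, e.g. `openai.audio.speech.create(**build_speech_request("Hello"))`, so that invalid values fail locally instead of as an API error.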

Real-time streaming?

The Speech API supports real-time audio streaming using chunked transfer encoding, which lets audio start playing before the full file has been generated and made accessible.
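A minimal sketch of consuming the audio chunk by chunk, assuming the openai Python package v1.x and its `with_streaming_response` wrapper. The function name `stream_speech_to_file` and the `chunk_size` default are illustrative choices. The client is passed in as an argument, so the function itself does not create a connection or read an API key.

```python
def stream_speech_to_file(client, text, out_path, chunk_size=4096):
    """Write audio bytes to out_path as they arrive; return the total bytes written.

    `client` is assumed to be an openai.OpenAI() instance (openai>=1.x).
    """
    total = 0
    with client.audio.speech.with_streaming_response.create(
        model="tts-1",
        voice="alloy",
        input=text,
    ) as response, open(out_path, "wb") as f:
        # Each chunk is available before the whole file has been generated.
        for chunk in response.iter_bytes(chunk_size=chunk_size):
            f.write(chunk)
            total += len(chunk)
    return total
```

With a real client this would be called as `stream_speech_to_file(OpenAI(), "Hello", "speech.mp3")`. For actual real-time playback, the `f.write(chunk)` line would be replaced by feeding each chunk to an audio player; writing to a file here only keeps the sketch simple.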

How we plan to use it

Used to speak the overall feedback aloud, and to speak the responses to user commands that were received by voice.
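Overall feedback may be longer than the 4096-character input limit, so it would need to be split into several requests. The following is a rough sketch of one way to do that; `split_for_tts` is a hypothetical helper, cutting at sentence boundaries is a design assumption, and the limit is assumed to be counted with `len()`.

```python
MAX_INPUT_CHARS = 4096  # input limit from the parameter notes


def split_for_tts(text, limit=MAX_INPUT_CHARS):
    """Split text into pieces of at most `limit` characters, preferring sentence ends."""
    pieces = []
    rest = text.strip()
    while len(rest) > limit:
        window = rest[:limit]
        # Prefer to cut right after the last sentence-ending punctuation mark.
        cut = max(window.rfind(". "), window.rfind("? "), window.rfind("! "))
        if cut == -1:
            cut = window.rfind(" ")  # fall back to a word boundary
        if cut <= 0:
            cut = limit - 1  # no usable boundary: hard cut at the limit
        pieces.append(rest[:cut + 1].strip())
        rest = rest[cut + 1:].strip()
    if rest:
        pieces.append(rest)
    return pieces
```

Each piece can then be sent as its own speech request and the resulting audio played back in order.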

Usage

Python:

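from pathlib import Path
import openai  # requires openai>=1.0; reads OPENAI_API_KEY from the environment

# Save the generated audio next to this script.
speech_file_path = Path(__file__).parent / "speech.mp3"

# Request speech for the given text with the tts-1 model and the "alloy" voice.
response = openai.audio.speech.create(
  model="tts-1",
  voice="alloy",
  input="The quick brown fox jumped over the lazy dog."
)

# Write the returned audio (mp3 by default) to disk.
response.stream_to_file(speech_file_path)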

Node.js:

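import fs from "fs";
import path from "path";
import OpenAI from "openai";

// The client reads OPENAI_API_KEY from the environment.
const openai = new OpenAI();

// Output path for the generated audio.
const speechFile = path.resolve("./speech.mp3");

async function main() {
  // Request speech for the given text with the tts-1 model and the "alloy" voice.
  const mp3 = await openai.audio.speech.create({
    model: "tts-1",
    voice: "alloy",
    input: "Today is a wonderful day to build something people love!",
  });
  console.log(speechFile);
  // Read the whole response body into a buffer, then write it to disk.
  const buffer = Buffer.from(await mp3.arrayBuffer());
  await fs.promises.writeFile(speechFile, buffer);
}
main();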
