31 Commits

Author SHA1 Message Date
b179ed7944 update to newer sound package 2025-07-04 19:22:42 -07:00
c5609d07e9 update dep 2025-04-30 10:26:33 -07:00
Jose Ramirez
d79325ad71 add WithHTTPClient (#13) 2025-04-30 10:21:50 -07:00
fba03ed0be update error code for 400 2025-03-13 20:39:03 -07:00
aa701237ff add new error type 2025-03-13 14:04:06 -07:00
Jose Ramirez
29fa401714 s/speeh/speech (#11) 2025-03-04 00:20:55 -08:00
77d17e20fb update deps 2025-03-03 11:17:49 -08:00
Jose Ramirez
93af72dc7c add support for speech-to-text endpoint (#10) 2025-03-03 11:16:04 -08:00
Lachlan Laycock
db0a2e1760 Add sound generation api (#9)
* Add missing attributes for VoiceResponseModel

* Updating module to point to forked repo

* Tidying up go.mod

* Adding missing voice settings

* Adding support for request stitching

* Adding support for request stitching

* Fix dup SharingOptions struct from merge

* Add Sound Generation API

* Fix: revert user-agent/package url to original
2024-11-25 21:39:34 -08:00
samy kamkar
c585531fae Support command line text & new API attributes (#8)
* support `style` and `use_speaker_boost` API attrs

* support optional command line string as text

* print out time it took to run
2024-07-24 12:14:00 -07:00
Lachlan Laycock
41f142ec2c Add missing attributes for VoiceResponseModel (#5) 2023-10-17 13:49:20 -07:00
6ebcddb891 bump deps 2023-09-20 23:53:35 -07:00
ae598ecc4b upgrade some deps 2023-08-24 18:58:52 -07:00
kayos
85bc7b2007 Fix: check errors before checking pointer receiver (!) (#4) 2023-08-12 12:32:19 -07:00
fzqxzhang
261398509a feat: model_id tag add omitempty (#3) 2023-07-25 15:43:48 +00:00
Marcel Molina
b925ef1471 Fix compile error from variable typo (#2)
* Fix compile error from variable typo

* bytes.Buffer pointer
2023-07-09 16:00:45 +00:00
e095a7ec13 don't check resposne code before checcking err !=nil 2023-06-27 12:31:56 -07:00
32972d4ff2 Merge branch 'master' of github.com:taigrr/elevenlabs 2023-06-26 20:09:02 -07:00
84e59417d6 fix nil error pointed out by @Davincible , closes #1 2023-06-26 20:08:51 -07:00
3e3b7004b8 update to newer release of api 2023-05-12 23:42:59 -07:00
6fcc65115d add comments to the readme code 2023-04-19 14:02:09 -07:00
5451bcd0b1 fix readme and say 2023-04-19 14:00:03 -07:00
58581c3c46 add example binary to readme 2023-04-18 22:35:17 -07:00
544f24cab1 add example section 2023-04-18 22:32:17 -07:00
35f72a819b update copyright year 2023-04-18 22:31:02 -07:00
0d72be9d50 rename license 2023-04-18 22:23:48 -07:00
899a127e7a add badges and license 2023-04-18 22:23:03 -07:00
8a27f7df64 add example usage to readme 2023-04-18 22:19:04 -07:00
fb146435fd remove tts 2023-04-18 22:16:27 -07:00
e2bb856589 add example streamer 2023-04-18 22:16:04 -07:00
48f074cc9c add sample say program 2023-04-18 22:09:25 -07:00
18 changed files with 779 additions and 244 deletions

LICENSE Normal file

@@ -0,0 +1,12 @@
Copyright (C) 2023 by Tai Groot <tai@taigrr.com>
Permission to use, copy, modify, and/or distribute this software for any
purpose with or without fee is hereby granted.
THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

README.md

@@ -1,5 +1,11 @@
# elevenlabs
Unofficial [elevenlabs.io](https://beta.elevenlabs.io/) ([11.ai](11.ai)) voice synthesis client
[![License 0BSD](https://img.shields.io/badge/License-0BSD-pink.svg)](https://opensource.org/licenses/0BSD)
[![GoDoc](https://godoc.org/github.com/taigrr/elevenlabs?status.svg)](https://godoc.org/github.com/taigrr/elevenlabs)
[![Go Mod](https://img.shields.io/badge/go.mod-v1.20-blue)](go.mod)
[![Go Report Card](https://goreportcard.com/badge/github.com/taigrr/elevenlabs?branch=master)](https://goreportcard.com/report/github.com/taigrr/elevenlabs)
Unofficial [elevenlabs.io](https://beta.elevenlabs.io/) ([11.ai](http://11.ai)) voice synthesis client
This library is not affiliated with, nor associated with ElevenLabs in any way.
@@ -7,12 +13,129 @@ ElevenLabs' official api documentation, upon which this client has been
derived, [can be found here](https://api.elevenlabs.io/docs).
## Purpose
This Go client provides an easy interface to create synthesized voices and
make TTS (text-to-speech) requests to elevenlabs.io.
As a prerequisite, you must already have an account with elevenlabs.io.
After creating your account, you can get you API key [from here](https://help.elevenlabs.io/hc/en-us/articles/14599447207697-How-to-authorize-yourself-using-your-xi-api-key-).
After creating your account, you can get your API key [from here](https://help.elevenlabs.io/hc/en-us/articles/14599447207697-How-to-authorize-yourself-using-your-xi-api-key-).
## Test Program
To test out an example `say` program, run:
`go install github.com/taigrr/elevenlabs/cmd/say@latest`
Set the `XI_API_KEY` environment variable, and pipe it some text to give it a whirl!
## Example Code
### Text-to-Speech Example
To use this library, create a new client and send a TTS request to a voice.
The following code block illustrates how one might replicate the say/espeak
command, using the streaming endpoint.
I've opted to go with gopxl's beep package, but you can also save the file
to an mp3 on-disk.
```go
package main
import (
"bufio"
"context"
"io"
"log"
"os"
"time"
"github.com/gopxl/beep/v2"
"github.com/gopxl/beep/v2/mp3"
"github.com/gopxl/beep/v2/speaker"
"github.com/taigrr/elevenlabs/client"
"github.com/taigrr/elevenlabs/client/types"
)
func main() {
ctx := context.Background()
// load in an API key to create a client
client := client.New(os.Getenv("XI_API_KEY"))
// fetch a list of voice IDs from elevenlabs
ids, err := client.GetVoiceIDs(ctx)
if err != nil {
panic(err)
}
// prepare a pipe for streaming audio directly to beep
pipeReader, pipeWriter := io.Pipe()
reader := bufio.NewReader(os.Stdin)
text, _ := reader.ReadString('\n')
go func() {
// stream audio from elevenlabs using the first voice we found
// use a goroutine-local err; main reuses its own err below while this goroutine is still running
err := client.TTSStream(ctx, pipeWriter, text, ids[0], types.SynthesisOptions{Stability: 0.75, SimilarityBoost: 0.75, Style: 0.0, UseSpeakerBoost: true})
if err != nil {
panic(err)
}
pipeWriter.Close()
}()
// decode and prepare the streaming mp3 as it comes through
streamer, format, err := mp3.Decode(pipeReader)
if err != nil {
log.Fatal(err)
}
defer streamer.Close()
speaker.Init(format.SampleRate, format.SampleRate.N(time.Second/10))
done := make(chan bool)
// play the audio
speaker.Play(beep.Seq(streamer, beep.Callback(func() {
done <- true
})))
<-done
}
```
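The example above panics inside the goroutine when streaming fails. An alternative, sketched here with only the standard library, is to forward the failure to the reading side through the pipe. `produce` is a hypothetical stand-in for a streaming call such as `TTSStream`; it is not part of this library.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// produce is a hypothetical stand-in for a streaming call: it writes chunks
// to w and can be told to fail partway through.
func produce(w io.Writer, fail bool) error {
	if _, err := w.Write([]byte("chunk-1 ")); err != nil {
		return err
	}
	if fail {
		return errors.New("stream interrupted")
	}
	_, err := w.Write([]byte("chunk-2"))
	return err
}

// stream runs produce in a goroutine and returns everything the reader
// received, plus any error the producer reported.
func stream(fail bool) (string, error) {
	pr, pw := io.Pipe()
	go func() {
		// CloseWithError(nil) behaves like Close, so the reader sees io.EOF
		// on success and the producer's error on failure.
		pw.CloseWithError(produce(pw, fail))
	}()
	data, err := io.ReadAll(pr)
	return string(data), err
}

func main() {
	out, err := stream(false)
	fmt.Printf("ok:     %q, err=%v\n", out, err)

	out, err = stream(true)
	fmt.Printf("failed: %q, err=%v\n", out, err)
}
```

With this shape the consumer (a decoder, a file writer) decides how to react to a failed stream, instead of the whole process exiting.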
### Sound Generation Example
The following example demonstrates how to generate sound effects using the Sound Generation API:
```go
package main
import (
"context"
"os"
"github.com/taigrr/elevenlabs/client"
)
func main() {
ctx := context.Background()
// Create a new client with your API key
client := client.New(os.Getenv("XI_API_KEY"))
// Generate a sound effect and save it to a file
f, err := os.Create("footsteps.mp3")
if err != nil {
panic(err)
}
defer f.Close()
// Basic usage (using default duration and prompt influence)
err = client.SoundGenerationWriter(ctx, f, "footsteps on wooden floor", 0, 0)
if err != nil {
panic(err)
}
// Advanced usage with custom duration and prompt influence
audio, err := client.SoundGeneration(
ctx,
"heavy rain on a tin roof",
5.0, // Set duration to 5 seconds
0.5, // Set prompt influence to 0.5
)
if err != nil {
panic(err)
}
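// write the generated audio to disk, surfacing any write error
if err := os.WriteFile("rain.mp3", audio, 0644); err != nil {
panic(err)
}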
}
```

View File

@@ -2,6 +2,7 @@ package client
import (
"errors"
"net/http"
)
const apiEndpoint = "https://api.elevenlabs.io"
@@ -12,14 +13,16 @@ var (
)
type Client struct {
apiKey string
endpoint string
apiKey string
endpoint string
httpClient *http.Client
}
func New(apiKey string) Client {
return Client{
apiKey: apiKey,
endpoint: apiEndpoint,
apiKey: apiKey,
endpoint: apiEndpoint,
httpClient: &http.Client{},
}
}
@@ -27,3 +30,14 @@ func (c Client) WithEndpoint(endpoint string) Client {
c.endpoint = endpoint
return c
}
func (c Client) WithAPIKey(apiKey string) Client {
c.apiKey = apiKey
return c
}
// WithHTTPClient allows users to provide their own http.Client
func (c Client) WithHTTPClient(hc *http.Client) Client {
c.httpClient = hc
return c
}

View File

@@ -15,8 +15,6 @@ import (
func (c Client) HistoryDelete(ctx context.Context, historyItemID string) (bool, error) {
url := fmt.Sprintf(c.endpoint+"/v1/history/%s", historyItemID)
client := &http.Client{}
req, err := http.NewRequestWithContext(ctx, http.MethodDelete, url, nil)
if err != nil {
return false, err
@@ -24,7 +22,7 @@ func (c Client) HistoryDelete(ctx context.Context, historyItemID string) (bool,
req.Header.Set("accept", "application/json")
req.Header.Set("xi-api-key", c.apiKey)
req.Header.Set("User-Agent", "github.com/taigrr/elevenlabs")
res, err := client.Do(req)
res, err := c.httpClient.Do(req)
switch res.StatusCode {
case 401:
@@ -55,7 +53,6 @@ func (c Client) HistoryDownloadZipWriter(ctx context.Context, w io.Writer, id1,
toDownload := types.HistoryPost{
HistoryItemIds: downloads,
}
client := &http.Client{}
body, _ := json.Marshal(toDownload)
bodyReader := bytes.NewReader(body)
req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bodyReader)
@@ -66,7 +63,7 @@ func (c Client) HistoryDownloadZipWriter(ctx context.Context, w io.Writer, id1,
req.Header.Set("accept", "archive/zip")
req.Header.Set("xi-api-key", c.apiKey)
req.Header.Set("User-Agent", "github.com/taigrr/elevenlabs")
res, err := client.Do(req)
res, err := c.httpClient.Do(req)
switch res.StatusCode {
case 401:
@@ -99,7 +96,6 @@ func (c Client) HistoryDownloadZip(ctx context.Context, id1, id2 string, additio
toDownload := types.HistoryPost{
HistoryItemIds: downloads,
}
client := &http.Client{}
body, _ := json.Marshal(toDownload)
bodyReader := bytes.NewReader(body)
req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bodyReader)
@@ -110,7 +106,7 @@ func (c Client) HistoryDownloadZip(ctx context.Context, id1, id2 string, additio
req.Header.Set("accept", "archive/zip")
req.Header.Set("xi-api-key", c.apiKey)
req.Header.Set("User-Agent", "github.com/taigrr/elevenlabs")
res, err := client.Do(req)
res, err := c.httpClient.Do(req)
switch res.StatusCode {
case 401:
@@ -142,7 +138,6 @@ func (c Client) HistoryDownloadZip(ctx context.Context, id1, id2 string, additio
func (c Client) HistoryDownloadAudioWriter(ctx context.Context, w io.Writer, ID string) error {
url := fmt.Sprintf(c.endpoint+"/v1/history/%s/audio", ID)
client := &http.Client{}
req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
if err != nil {
return err
@@ -150,15 +145,15 @@ func (c Client) HistoryDownloadAudioWriter(ctx context.Context, w io.Writer, ID
req.Header.Set("xi-api-key", c.apiKey)
req.Header.Set("User-Agent", "github.com/taigrr/elevenlabs")
req.Header.Set("accept", "audio/mpeg")
res, err := client.Do(req)
res, err := c.httpClient.Do(req)
if err != nil {
return err
}
switch res.StatusCode {
case 401:
return ErrUnauthorized
case 200:
if err != nil {
return err
}
defer res.Body.Close()
io.Copy(w, res.Body)
return nil
@@ -179,7 +174,6 @@ func (c Client) HistoryDownloadAudioWriter(ctx context.Context, w io.Writer, ID
func (c Client) HistoryDownloadAudio(ctx context.Context, ID string) ([]byte, error) {
url := fmt.Sprintf(c.endpoint+"/v1/history/%s/audio", ID)
client := &http.Client{}
req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
if err != nil {
return []byte{}, err
@@ -187,15 +181,15 @@ func (c Client) HistoryDownloadAudio(ctx context.Context, ID string) ([]byte, er
req.Header.Set("xi-api-key", c.apiKey)
req.Header.Set("User-Agent", "github.com/taigrr/elevenlabs")
req.Header.Set("accept", "audio/mpeg")
res, err := client.Do(req)
res, err := c.httpClient.Do(req)
if err != nil {
return []byte{}, err
}
switch res.StatusCode {
case 401:
return []byte{}, ErrUnauthorized
case 200:
if err != nil {
return []byte{}, err
}
b := bytes.Buffer{}
w := bufio.NewWriter(&b)
@@ -219,7 +213,6 @@ func (c Client) HistoryDownloadAudio(ctx context.Context, ID string) ([]byte, er
func (c Client) GetHistoryItemList(ctx context.Context) ([]types.HistoryItemList, error) {
url := c.endpoint + "/v1/history"
client := &http.Client{}
req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
if err != nil {
return []types.HistoryItemList{}, err
@@ -227,16 +220,15 @@ func (c Client) GetHistoryItemList(ctx context.Context) ([]types.HistoryItemList
req.Header.Set("xi-api-key", c.apiKey)
req.Header.Set("User-Agent", "github.com/taigrr/elevenlabs")
req.Header.Set("accept", "application/json")
res, err := client.Do(req)
res, err := c.httpClient.Do(req)
if err != nil {
return []types.HistoryItemList{}, err
}
switch res.StatusCode {
case 401:
return []types.HistoryItemList{}, ErrUnauthorized
case 200:
if err != nil {
return []types.HistoryItemList{}, err
}
var history types.GetHistoryResponse
defer res.Body.Close()
jerr := json.NewDecoder(res.Body).Decode(&history)

View File

@@ -15,7 +15,6 @@ import (
func (c Client) DeleteVoiceSample(ctx context.Context, voiceID, sampleID string) (bool, error) {
url := fmt.Sprintf(c.endpoint+"/v1/voices/%s/samples/%s", voiceID, sampleID)
client := &http.Client{}
req, err := http.NewRequestWithContext(ctx, http.MethodDelete, url, nil)
if err != nil {
return false, err
@@ -24,15 +23,14 @@ func (c Client) DeleteVoiceSample(ctx context.Context, voiceID, sampleID string)
req.Header.Set("accept", "application/json")
req.Header.Set("xi-api-key", c.apiKey)
req.Header.Set("User-Agent", "github.com/taigrr/elevenlabs")
res, err := client.Do(req)
res, err := c.httpClient.Do(req)
if err != nil {
return false, err
}
switch res.StatusCode {
case 401:
return false, ErrUnauthorized
case 200:
if err != nil {
return false, err
}
return true, nil
case 422:
fallthrough
@@ -51,7 +49,6 @@ func (c Client) DeleteVoiceSample(ctx context.Context, voiceID, sampleID string)
func (c Client) DownloadVoiceSampleWriter(ctx context.Context, w io.Writer, voiceID, sampleID string) error {
url := fmt.Sprintf(c.endpoint+"/v1/voices/%s/samples/%s/audio", voiceID, sampleID)
client := &http.Client{}
req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
if err != nil {
return err
@@ -59,15 +56,15 @@ func (c Client) DownloadVoiceSampleWriter(ctx context.Context, w io.Writer, voic
req.Header.Set("xi-api-key", c.apiKey)
req.Header.Set("User-Agent", "github.com/taigrr/elevenlabs")
req.Header.Set("accept", "audio/mpeg")
res, err := client.Do(req)
res, err := c.httpClient.Do(req)
if err != nil {
return err
}
switch res.StatusCode {
case 401:
return ErrUnauthorized
case 200:
if err != nil {
return err
}
defer res.Body.Close()
io.Copy(w, res.Body)
return nil
@@ -88,7 +85,6 @@ func (c Client) DownloadVoiceSampleWriter(ctx context.Context, w io.Writer, voic
func (c Client) DownloadVoiceSample(ctx context.Context, voiceID, sampleID string) ([]byte, error) {
url := fmt.Sprintf(c.endpoint+"/v1/voices/%s/samples/%s/audio", voiceID, sampleID)
client := &http.Client{}
req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
if err != nil {
return []byte{}, err
@@ -96,15 +92,15 @@ func (c Client) DownloadVoiceSample(ctx context.Context, voiceID, sampleID strin
req.Header.Set("xi-api-key", c.apiKey)
req.Header.Set("User-Agent", "github.com/taigrr/elevenlabs")
req.Header.Set("accept", "audio/mpeg")
res, err := client.Do(req)
res, err := c.httpClient.Do(req)
if err != nil {
return []byte{}, err
}
switch res.StatusCode {
case 401:
return []byte{}, ErrUnauthorized
case 200:
if err != nil {
return []byte{}, err
}
b := bytes.Buffer{}
w := bufio.NewWriter(&b)

client/sound_gen.go Normal file

@@ -0,0 +1,96 @@
package client
import (
"bytes"
"context"
"encoding/json"
"fmt"
"io"
"net/http"
"github.com/taigrr/elevenlabs/client/types"
)
// SoundGenerationWriter generates a sound effect from text and writes it to the provided writer.
// If durationSeconds is 0, it will be omitted from the request and the API will determine the optimal duration.
// If promptInfluence is 0, it will default to 0.3.
func (c Client) SoundGenerationWriter(ctx context.Context, w io.Writer, text string, durationSeconds, promptInfluence float64) error {
params := types.SoundGeneration{
Text: text,
PromptInfluence: 0.3, // default value
}
if promptInfluence != 0 {
params.PromptInfluence = promptInfluence
}
if durationSeconds != 0 {
params.DurationSeconds = durationSeconds
}
body, err := c.requestSoundGeneration(ctx, params)
if err != nil {
return err
}
defer body.Close()
_, err = io.Copy(w, body)
return err
}
// SoundGeneration generates a sound effect from text and returns the audio as bytes.
// If durationSeconds is 0, it will be omitted from the request and the API will determine the optimal duration.
// If promptInfluence is 0, it will default to 0.3.
func (c Client) SoundGeneration(ctx context.Context, text string, durationSeconds, promptInfluence float64) ([]byte, error) {
params := types.SoundGeneration{
Text: text,
PromptInfluence: 0.3, // default value
}
if promptInfluence != 0 {
params.PromptInfluence = promptInfluence
}
if durationSeconds != 0 {
params.DurationSeconds = durationSeconds
}
body, err := c.requestSoundGeneration(ctx, params)
if err != nil {
return nil, err
}
defer body.Close()
var b bytes.Buffer
_, err = io.Copy(&b, body)
if err != nil {
return nil, err
}
return b.Bytes(), nil
}
func (c Client) requestSoundGeneration(ctx context.Context, params types.SoundGeneration) (io.ReadCloser, error) {
url := c.endpoint + "/v1/sound-generation"
b, err := json.Marshal(params)
if err != nil {
return nil, err
}
req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewBuffer(b))
if err != nil {
return nil, err
}
req.Header.Set("xi-api-key", c.apiKey)
req.Header.Set("User-Agent", "github.com/taigrr/elevenlabs")
req.Header.Set("accept", "audio/mpeg")
res, err := c.httpClient.Do(req)
if err != nil {
return nil, err
}
if res.StatusCode != http.StatusOK {
res.Body.Close()
return nil, fmt.Errorf("unexpected status code: %d", res.StatusCode)
}
return res.Body, nil
}
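Review note: the doc comments above say a zero `durationSeconds` is omitted from the request, but the `SoundGeneration` struct tags shown later in this diff have no `omitempty` on `duration_seconds`. With `encoding/json`, a zero value is then serialized as `0`. The sketch compares the tags as shown with a hypothetical `omitempty` variant. Both types are local copies for illustration, and how the API treats an explicit `0` is not something this diff shows.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// asInDiff copies the struct tags shown for types.SoundGeneration.
type asInDiff struct {
	Text            string  `json:"text"`
	DurationSeconds float64 `json:"duration_seconds"`
	PromptInfluence float64 `json:"prompt_influence"`
}

// withOmitEmpty is a hypothetical variant that drops a zero duration.
type withOmitEmpty struct {
	Text            string  `json:"text"`
	DurationSeconds float64 `json:"duration_seconds,omitempty"`
	PromptInfluence float64 `json:"prompt_influence"`
}

// encode marshals v to a JSON string and panics on failure.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(encode(asInDiff{Text: "rain", PromptInfluence: 0.3}))
	// {"text":"rain","duration_seconds":0,"prompt_influence":0.3}

	fmt.Println(encode(withOmitEmpty{Text: "rain", PromptInfluence: 0.3}))
	// {"text":"rain","prompt_influence":0.3}
}
```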

client/stt.go Normal file

@@ -0,0 +1,126 @@
package client
import (
"bytes"
"context"
"encoding/json"
"errors"
"fmt"
"io"
"mime/multipart"
"net/http"
"os"
"path/filepath"
"github.com/taigrr/elevenlabs/client/types"
)
// ConvertSpeechToText converts audio to text using the specified file path
func (c *Client) ConvertSpeechToText(ctx context.Context, audioFilePath string, request types.SpeechToTextRequest) (*types.SpeechToTextResponse, error) {
file, err := os.Open(audioFilePath)
if err != nil {
return nil, fmt.Errorf("failed to open audio file: %w", err)
}
defer file.Close()
return c.ConvertSpeechToTextFromReader(ctx, file, filepath.Base(audioFilePath), request)
}
// ConvertSpeechToTextFromReader converts audio to text using the provided reader
func (c *Client) ConvertSpeechToTextFromReader(ctx context.Context, reader io.Reader, filename string, request types.SpeechToTextRequest) (*types.SpeechToTextResponse, error) {
body := &bytes.Buffer{}
writer := multipart.NewWriter(body)
if err := writer.WriteField("model_id", string(request.ModelID)); err != nil {
return nil, fmt.Errorf("failed to write model_id field: %w", err)
}
part, err := writer.CreateFormFile("file", filename)
if err != nil {
return nil, fmt.Errorf("failed to create form file: %w", err)
}
if _, err = io.Copy(part, reader); err != nil {
return nil, fmt.Errorf("failed to copy audio data: %w", err)
}
if request.LanguageCode != "" {
if err := writer.WriteField("language_code", request.LanguageCode); err != nil {
return nil, fmt.Errorf("failed to write language_code field: %w", err)
}
}
if request.NumSpeakers != 0 {
if err := writer.WriteField("num_speakers", fmt.Sprintf("%d", request.NumSpeakers)); err != nil {
return nil, fmt.Errorf("failed to write num_speakers field: %w", err)
}
}
if request.TagAudioEvents {
if err := writer.WriteField("tag_audio_events", "true"); err != nil {
return nil, fmt.Errorf("failed to write tag_audio_events field: %w", err)
}
}
if request.TimestampsGranularity != "" {
if err := writer.WriteField("timestamps_granularity", string(request.TimestampsGranularity)); err != nil {
return nil, fmt.Errorf("failed to write timestamps_granularity field: %w", err)
}
}
if request.Diarize {
if err := writer.WriteField("diarize", "true"); err != nil {
return nil, fmt.Errorf("failed to write diarize field: %w", err)
}
}
if err = writer.Close(); err != nil {
return nil, fmt.Errorf("failed to close multipart writer: %w", err)
}
// plain concatenation: the endpoint must not be interpreted as a format string
url := c.endpoint + "/v1/speech-to-text"
req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, body)
if err != nil {
return nil, fmt.Errorf("failed to create request: %w", err)
}
req.Header.Set("Content-Type", writer.FormDataContentType())
req.Header.Set("User-Agent", "github.com/taigrr/elevenlabs")
req.Header.Set("xi-api-key", c.apiKey)
res, err := c.httpClient.Do(req)
if err != nil {
return nil, fmt.Errorf("failed to send request: %w", err)
}
switch res.StatusCode {
case 401:
// close the body on every return path so the connection can be reused
res.Body.Close()
return nil, ErrUnauthorized
case 200:
defer res.Body.Close()
var sttResponse types.SpeechToTextResponse
if err := json.NewDecoder(res.Body).Decode(&sttResponse); err != nil {
return nil, fmt.Errorf("failed to parse API response: %w", err)
}
return &sttResponse, nil
case 422:
ve := types.ValidationError{}
defer res.Body.Close()
jerr := json.NewDecoder(res.Body).Decode(&ve)
if jerr != nil {
err = errors.Join(err, jerr)
} else {
err = errors.Join(err, ve)
}
return nil, err
case 400:
fallthrough
default:
ve := types.ParamError{}
defer res.Body.Close()
jerr := json.NewDecoder(res.Body).Decode(&ve)
if jerr != nil {
err = errors.Join(err, jerr)
} else {
err = errors.Join(err, ve)
}
return nil, err
}
}

View File

@@ -1,7 +1,6 @@
package client
import (
"bufio"
"bytes"
"context"
"encoding/json"
@@ -13,123 +12,100 @@ import (
"github.com/taigrr/elevenlabs/client/types"
)
func (c Client) TTSWriter(ctx context.Context, w io.Writer, text, voiceID string, options types.SynthesisOptions) error {
options.Clamp()
url := fmt.Sprintf(c.endpoint+"/v1/text-to-speech/%s", voiceID)
opts := types.TTS{
Text: text,
VoiceSettings: options,
}
b, _ := json.Marshal(opts)
client := &http.Client{}
req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewBuffer(b))
if err != nil {
return err
}
req.Header.Set("xi-api-key", c.apiKey)
req.Header.Set("User-Agent", "github.com/taigrr/elevenlabs")
req.Header.Set("accept", "audio/mpeg")
res, err := client.Do(req)
switch res.StatusCode {
case 401:
return ErrUnauthorized
case 200:
if err != nil {
return err
}
defer res.Body.Close()
io.Copy(w, res.Body)
return nil
case 422:
fallthrough
default:
ve := types.ValidationError{}
defer res.Body.Close()
jerr := json.NewDecoder(res.Body).Decode(&ve)
if jerr != nil {
err = errors.Join(err, jerr)
} else {
err = errors.Join(err, ve)
}
return err
func WithPreviousText(previousText string) types.TTSParam {
return func(tts *types.TTS) {
tts.PreviousText = previousText
}
}
func (c Client) TTS(ctx context.Context, text, voiceID string, options types.SynthesisOptions) ([]byte, error) {
options.Clamp()
url := fmt.Sprintf(c.endpoint+"/v1/text-to-speech/%s", voiceID)
client := &http.Client{}
opts := types.TTS{
Text: text,
VoiceSettings: options,
func WithNextText(nextText string) types.TTSParam {
return func(tts *types.TTS) {
tts.NextText = nextText
}
b, _ := json.Marshal(opts)
req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewBuffer(b))
}
func (c Client) TTSWriter(ctx context.Context, w io.Writer, text, modelID, voiceID string, options types.SynthesisOptions, optionalParams ...types.TTSParam) error {
params := types.TTS{
Text: text,
VoiceID: voiceID,
ModelID: modelID,
}
for _, p := range optionalParams {
p(&params)
}
body, err := c.requestTTS(ctx, params, options)
if err != nil {
return err
}
defer body.Close()
io.Copy(w, body)
return nil
}
func (c Client) TTS(ctx context.Context, text, voiceID, modelID string, options types.SynthesisOptions, optionalParams ...types.TTSParam) ([]byte, error) {
params := types.TTS{
Text: text,
VoiceID: voiceID,
ModelID: modelID,
}
for _, p := range optionalParams {
p(&params)
}
body, err := c.requestTTS(ctx, params, options)
if err != nil {
return []byte{}, err
}
req.Header.Set("xi-api-key", c.apiKey)
req.Header.Set("User-Agent", "github.com/taigrr/elevenlabs")
req.Header.Set("accept", "audio/mpeg")
res, err := client.Do(req)
switch res.StatusCode {
case 401:
return []byte{}, ErrUnauthorized
case 200:
if err != nil {
return []byte{}, err
}
b := bytes.Buffer{}
w := bufio.NewWriter(&b)
defer res.Body.Close()
io.Copy(w, res.Body)
return b.Bytes(), nil
case 422:
fallthrough
default:
ve := types.ValidationError{}
defer res.Body.Close()
jerr := json.NewDecoder(res.Body).Decode(&ve)
if jerr != nil {
err = errors.Join(err, jerr)
} else {
err = errors.Join(err, ve)
}
return []byte{}, err
}
defer body.Close()
b := bytes.Buffer{}
io.Copy(&b, body)
return b.Bytes(), nil
}
func (c Client) TTSStream(ctx context.Context, w io.Writer, text, voiceID string, options types.SynthesisOptions) error {
options.Clamp()
url := fmt.Sprintf(c.endpoint+"/v1/text-to-speech/%s/stream", voiceID)
opts := types.TTS{
Text: text,
VoiceSettings: options,
func (c Client) TTSStream(ctx context.Context, w io.Writer, text, voiceID string, options types.SynthesisOptions, optionalParams ...types.TTSParam) error {
params := types.TTS{
Text: text,
VoiceID: voiceID,
Stream: true,
}
b, _ := json.Marshal(opts)
client := &http.Client{}
req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewBuffer(b))
for _, p := range optionalParams {
p(&params)
}
body, err := c.requestTTS(ctx, params, options)
if err != nil {
return err
}
defer body.Close()
io.Copy(w, body)
return nil
}
func (c Client) requestTTS(ctx context.Context, params types.TTS, options types.SynthesisOptions) (io.ReadCloser, error) {
options.Clamp()
url := fmt.Sprintf(c.endpoint+"/v1/text-to-speech/%s", params.VoiceID)
if params.Stream {
url += "/stream"
}
b, _ := json.Marshal(params)
req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewBuffer(b))
if err != nil {
return nil, err
}
req.Header.Set("xi-api-key", c.apiKey)
req.Header.Set("User-Agent", "github.com/taigrr/elevenlabs")
req.Header.Set("accept", "audio/mpeg")
res, err := client.Do(req)
res, err := c.httpClient.Do(req)
if err != nil {
return nil, err
}
switch res.StatusCode {
case 401:
return ErrUnauthorized
return nil, ErrUnauthorized
case 200:
if err != nil {
return err
}
defer res.Body.Close()
io.Copy(w, res.Body)
return nil
return res.Body, nil
case 422:
fallthrough
default:
@@ -141,6 +117,6 @@ func (c Client) TTSStream(ctx context.Context, w io.Writer, text, voiceID string
} else {
err = errors.Join(err, ve)
}
return err
return nil, err
}
}

View File

@@ -19,10 +19,17 @@ type Voice struct {
Labels string `json:"labels,omitempty"` // Serialized labels dictionary for the voice.
}
type TTS struct {
Text string `json:"text"` // The text that will get converted into speech. Currently only English text is supported.
VoiceID string `json:"voice_id"` // The ID of the voice that will be used to generate the speech.
ModelID string `json:"model_id,omitempty"`
Text string `json:"text"` // The text that will get converted into speech.
PreviousText string `json:"previous_text,omitempty"` // The text that was used to generate the previous audio file.
NextText string `json:"next_text,omitempty"` // The text that will be used to generate the next audio file.
VoiceSettings SynthesisOptions `json:"voice_settings,omitempty"` // Voice settings are applied only on the given TTS request.
Stream bool `json:"stream,omitempty"` // If true, the response will be a stream of audio data.
}
type TTSParam func(*TTS)
func (so *SynthesisOptions) Clamp() {
if so.Stability > 1 || so.Stability < 0 {
so.Stability = 0.75
@@ -30,11 +37,35 @@ func (so *SynthesisOptions) Clamp() {
if so.SimilarityBoost > 1 || so.SimilarityBoost < 0 {
so.SimilarityBoost = 0.75
}
if so.Style > 1 || so.Style < 0 {
so.Style = 0.0
}
if so.UseSpeakerBoost != true && so.UseSpeakerBoost != false {
so.UseSpeakerBoost = true
}
}
type SynthesisOptions struct {
Stability float64 `json:"stability"`
SimilarityBoost float64 `json:"similarity_boost"`
Style float64 `json:"style"`
UseSpeakerBoost bool `json:"use_speaker_boost"`
}
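Review note: despite its name, `Clamp` does not pull an out-of-range value to the nearest bound; it resets it to a default (0.75 for `Stability` and `SimilarityBoost`, 0.0 for `Style`). The `UseSpeakerBoost` condition can never be true for a `bool`, so that branch has no effect. The sketch uses a local copy of the type and method (JSON tags and the unreachable branch omitted) to show the behavior.

```go
package main

import "fmt"

// SynthesisOptions is a local copy of the type from the diff, without JSON tags.
type SynthesisOptions struct {
	Stability       float64
	SimilarityBoost float64
	Style           float64
	UseSpeakerBoost bool
}

// Clamp mirrors the method in the diff: out-of-range values are replaced by
// defaults rather than moved to the nearest bound.
func (so *SynthesisOptions) Clamp() {
	if so.Stability > 1 || so.Stability < 0 {
		so.Stability = 0.75
	}
	if so.SimilarityBoost > 1 || so.SimilarityBoost < 0 {
		so.SimilarityBoost = 0.75
	}
	if so.Style > 1 || so.Style < 0 {
		so.Style = 0.0
	}
}

func main() {
	opts := SynthesisOptions{Stability: 1.5, SimilarityBoost: 0.2, Style: -1}
	opts.Clamp()
	fmt.Println(opts.Stability, opts.SimilarityBoost, opts.Style)
	// 0.75 0.2 0
}
```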
type SharingOptions struct {
Status string `json:"status"`
HistoryItemSampleId string `json:"history_item_sample_id"`
OriginalVoiceId string `json:"original_voice_id"`
PublicOwnerId string `json:"public_owner_id"`
LikedByCount int32 `json:"liked_by_count"`
ClonedByCount int32 `json:"cloned_by_count"`
WhitelistedEmails []string `json:"whitelisted_emails"`
Name string `json:"name"`
Labels map[string]string `json:"labels"`
Description string `json:"description"`
ReviewStatus string `json:"review_status"`
ReviewMessage string `json:"review_message"`
EnabledInLibrary bool `json:"enabled_in_library"`
}
type ExtendedSubscriptionResponseModel struct {
@@ -103,6 +134,22 @@ type LanguageResponseModel struct {
IsoCode string `json:"iso_code"`
DisplayName string `json:"display_name"`
}
type Language struct {
LanguageID string `json:"language_id"`
Name string `json:"name"`
}
type ModelResponseModel struct {
ModelID string `json:"model_id"`
Name string `json:"name"`
Description string `json:"description"`
CanBeFinetuned bool `json:"can_be_finetuned"`
CanDoTextToSpeech bool `json:"can_do_text_to_speech"`
CanDoVoiceConversion bool `json:"can_do_voice_conversion"`
TokenCostFactor float64 `json:"token_cost_factor"`
Languages []Language `json:"languages"`
}
type RecordingResponseModel struct {
RecordingID string `json:"recording_id"`
MimeType string `json:"mime_type"`
@@ -155,6 +202,17 @@ func (ve ValidationError) Error() string {
return fmt.Sprintf("%s %s: ", ve.Type_, ve.Msg)
}
type ParamError struct {
Detail struct {
Status string `json:"status"`
Message string `json:"message"`
} `json:"detail"`
}
func (pe ParamError) Error() string {
return fmt.Sprintf("%s %s: ", pe.Detail.Status, pe.Detail.Message)
}
type VerificationAttemptResponseModel struct {
Text string `json:"text"`
DateUnix int32 `json:"date_unix"`
@@ -164,14 +222,95 @@ type VerificationAttemptResponseModel struct {
Recording *RecordingResponseModel `json:"recording"`
}
type VoiceResponseModel struct {
VoiceID string `json:"voice_id"`
Name string `json:"name"`
Samples []Sample `json:"samples"`
Category string `json:"category"`
FineTuning FineTuningResponseModel `json:"fine_tuning"`
Labels map[string]string `json:"labels"`
Description string `json:"description"`
PreviewURL string `json:"preview_url"`
AvailableForTiers []string `json:"available_for_tiers"`
Settings SynthesisOptions `json:"settings"`
VoiceID string `json:"voice_id"`
Name string `json:"name"`
Samples []Sample `json:"samples"`
Category string `json:"category"`
FineTuning FineTuningResponseModel `json:"fine_tuning"`
Labels map[string]string `json:"labels"`
Description string `json:"description"`
PreviewURL string `json:"preview_url"`
AvailableForTiers []string `json:"available_for_tiers"`
Settings SynthesisOptions `json:"settings"`
Sharing SharingOptions `json:"sharing"`
HighQualityBaseModelIds []string `json:"high_quality_base_model_ids"`
}
type SoundGeneration struct {
Text string `json:"text"` // The text that will get converted into a sound effect.
DurationSeconds float64 `json:"duration_seconds"` // The duration of the sound which will be generated in seconds.
PromptInfluence float64 `json:"prompt_influence"` // A higher prompt influence makes your generation follow the prompt more closely.
}
type TimestampsGranularity string
const (
// TimestampsGranularityNone represents no timestamps
TimestampsGranularityNone TimestampsGranularity = "none"
// TimestampsGranularityWord represents word-level timestamps
TimestampsGranularityWord TimestampsGranularity = "word"
// TimestampsGranularityCharacter represents character-level timestamps
TimestampsGranularityCharacter TimestampsGranularity = "character"
)
type SpeechToTextModel string
const (
SpeechToTextModelScribeV1 SpeechToTextModel = "scribe_v1"
)
// SpeechToTextRequest represents a request to the speech-to-text API
type SpeechToTextRequest struct {
// The ID of the model to use for transcription (currently only 'scribe_v1')
ModelID SpeechToTextModel `json:"model_id"`
// ISO-639-1 or ISO-639-3 language code. If not specified, language is auto-detected
LanguageCode string `json:"language_code,omitempty"`
// Whether to tag audio events like (laughter), (footsteps), etc.
TagAudioEvents bool `json:"tag_audio_events,omitempty"`
// Number of speakers (1-32). If not specified, uses model's maximum supported
NumSpeakers int `json:"num_speakers,omitempty"`
// Granularity of timestamps: "none", "word", or "character"
TimestampsGranularity TimestampsGranularity `json:"timestamps_granularity,omitempty"`
// Whether to annotate speaker changes (limits input to 8 minutes)
Diarize bool `json:"diarize,omitempty"`
}
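Note on `omitempty` in `SpeechToTextRequest`: the sketch below copies the request type from the hunk and shows which fields reach the wire. Zero values (`false`, `0`, `""`) are dropped, so an explicit `Diarize: false` is indistinguishable from leaving the field unset. The `encode` helper is hypothetical and exists only for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type TimestampsGranularity string

const TimestampsGranularityWord TimestampsGranularity = "word"

type SpeechToTextModel string

const SpeechToTextModelScribeV1 SpeechToTextModel = "scribe_v1"

// SpeechToTextRequest is copied from the diff above.
type SpeechToTextRequest struct {
	ModelID               SpeechToTextModel     `json:"model_id"`
	LanguageCode          string                `json:"language_code,omitempty"`
	TagAudioEvents        bool                  `json:"tag_audio_events,omitempty"`
	NumSpeakers           int                   `json:"num_speakers,omitempty"`
	TimestampsGranularity TimestampsGranularity `json:"timestamps_granularity,omitempty"`
	Diarize               bool                  `json:"diarize,omitempty"`
}

// encode is a hypothetical helper that returns the JSON form of a request.
func encode(r SpeechToTextRequest) string {
	b, err := json.Marshal(r)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// Only model_id is set; every omitempty field is dropped.
	fmt.Println(encode(SpeechToTextRequest{ModelID: SpeechToTextModelScribeV1}))
	// prints {"model_id":"scribe_v1"}

	// Non-zero fields appear in struct declaration order.
	fmt.Println(encode(SpeechToTextRequest{
		ModelID:               SpeechToTextModelScribeV1,
		TimestampsGranularity: TimestampsGranularityWord,
		Diarize:               true,
	}))
	// prints {"model_id":"scribe_v1","timestamps_granularity":"word","diarize":true}
}
```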
// SpeechToTextResponse represents the response from the speech-to-text API
type SpeechToTextResponse struct {
// ISO-639-1 language code
LanguageCode string `json:"language_code"`
// The probability of the detected language
LanguageProbability float64 `json:"language_probability"`
// The transcribed text
Text string `json:"text"`
// Detailed word-level information
Words []TranscriptionWord `json:"words"`
// Error message, if any
Error string `json:"error,omitempty"`
}
// TranscriptionWord represents a word or spacing in the transcription
type TranscriptionWord struct {
// The text content of the word/spacing
Text string `json:"text"`
// Type of segment ("word" or "spacing")
Type string `json:"type"`
// Start time in seconds
Start float64 `json:"start"`
// End time in seconds
End float64 `json:"end"`
// Speaker identifier for multi-speaker transcriptions
SpeakerID string `json:"speaker_id,omitempty"`
// Character-level information
Characters []TranscriptionCharacter `json:"characters,omitempty"`
}
// TranscriptionCharacter represents character-level information in the transcription
type TranscriptionCharacter struct {
// The text content of the character
Text string `json:"text"`
// Start time in seconds
Start float64 `json:"start"`
// End time in seconds
End float64 `json:"end"`
}
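Note on consuming `SpeechToTextResponse`: a rough sketch of walking `Words` from a diarized transcription and summing speaking time per speaker. The types are trimmed copies of the ones above, the sample JSON is invented, and `spokenDuration` is a hypothetical helper. It assumes that segments of type "word" carry a `speaker_id`, which the diff does not confirm.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Trimmed copies of the response types from the diff.
type TranscriptionWord struct {
	Text      string  `json:"text"`
	Type      string  `json:"type"`
	Start     float64 `json:"start"`
	End       float64 `json:"end"`
	SpeakerID string  `json:"speaker_id,omitempty"`
}

type SpeechToTextResponse struct {
	LanguageCode string              `json:"language_code"`
	Text         string              `json:"text"`
	Words        []TranscriptionWord `json:"words"`
}

// spokenDuration sums End-Start over segments of type "word", keyed by
// speaker ID. Spacing segments are skipped.
func spokenDuration(words []TranscriptionWord) map[string]float64 {
	out := make(map[string]float64)
	for _, w := range words {
		if w.Type != "word" {
			continue
		}
		out[w.SpeakerID] += w.End - w.Start
	}
	return out
}

func main() {
	// Invented payload shaped like the struct tags above.
	raw := `{"language_code":"en","text":"Hello there","words":[
		{"text":"Hello","type":"word","start":0,"end":0.5,"speaker_id":"speaker_0"},
		{"text":" ","type":"spacing","start":0.5,"end":0.75,"speaker_id":"speaker_0"},
		{"text":"there","type":"word","start":0.75,"end":1.25,"speaker_id":"speaker_0"}]}`
	var resp SpeechToTextResponse
	if err := json.Unmarshal([]byte(raw), &resp); err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	for speaker, seconds := range spokenDuration(resp.Words) {
		fmt.Printf("%s spoke for %.2fs\n", speaker, seconds)
	}
}
```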


@@ -11,7 +11,6 @@ import (
func (c Client) GetUserInfo(ctx context.Context) (types.UserResponseModel, error) {
url := c.endpoint + "/v1/user"
client := &http.Client{}
req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
if err != nil {
return types.UserResponseModel{}, err
@@ -19,16 +18,14 @@ func (c Client) GetUserInfo(ctx context.Context) (types.UserResponseModel, error
req.Header.Set("xi-api-key", c.apiKey)
req.Header.Set("User-Agent", "github.com/taigrr/elevenlabs")
req.Header.Set("accept", "application/json")
res, err := client.Do(req)
res, err := c.httpClient.Do(req)
if err != nil {
return types.UserResponseModel{}, err
}
switch res.StatusCode {
case 401:
return types.UserResponseModel{}, ErrUnauthorized
case 200:
if err != nil {
return types.UserResponseModel{}, err
}
var user types.UserResponseModel
defer res.Body.Close()
jerr := json.NewDecoder(res.Body).Decode(&user)
@@ -53,5 +50,8 @@ func (c Client) GetUserInfo(ctx context.Context) (types.UserResponseModel, error
func (c Client) GetSubscriptionInfo(ctx context.Context) (types.Subscription, error) {
info, err := c.GetUserInfo(ctx)
if err != nil {
return types.Subscription{}, err
}
return info.Subscription, err
}


@@ -17,8 +17,6 @@ import (
func (c Client) CreateVoice(ctx context.Context, name, description string, labels []string, files []*os.File) error {
url := c.endpoint + "/v1/voices/add"
client := &http.Client{}
var b bytes.Buffer
w := multipart.NewWriter(&b)
for _, r := range files {
@@ -43,14 +41,14 @@ func (c Client) CreateVoice(ctx context.Context, name, description string, label
req.Header.Set("xi-api-key", c.apiKey)
req.Header.Set("User-Agent", "github.com/taigrr/elevenlabs")
req.Header.Set("accept", "application/json")
res, err := client.Do(req)
res, err := c.httpClient.Do(req)
if err != nil {
return err
}
switch res.StatusCode {
case 401:
return ErrUnauthorized
case 200:
if err != nil {
return err
}
return nil
case 422:
fallthrough
@@ -69,7 +67,6 @@ func (c Client) CreateVoice(ctx context.Context, name, description string, label
func (c Client) DeleteVoice(ctx context.Context, voiceID string) error {
url := fmt.Sprintf(c.endpoint+"/v1/voices/%s", voiceID)
client := &http.Client{}
req, err := http.NewRequestWithContext(ctx, http.MethodDelete, url, nil)
if err != nil {
return err
@@ -77,14 +74,14 @@ func (c Client) DeleteVoice(ctx context.Context, voiceID string) error {
req.Header.Set("xi-api-key", c.apiKey)
req.Header.Set("User-Agent", "github.com/taigrr/elevenlabs")
req.Header.Set("accept", "application/json")
res, err := client.Do(req)
res, err := c.httpClient.Do(req)
if err != nil {
return err
}
switch res.StatusCode {
case 401:
return ErrUnauthorized
case 200:
if err != nil {
return err
}
return nil
case 422:
fallthrough
@@ -103,7 +100,7 @@ func (c Client) DeleteVoice(ctx context.Context, voiceID string) error {
func (c Client) EditVoiceSettings(ctx context.Context, voiceID string, settings types.SynthesisOptions) error {
url := fmt.Sprintf(c.endpoint+"/v1/voices/%s/settings/edit", voiceID)
client := &http.Client{}
b, _ := json.Marshal(settings)
req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(b))
@@ -113,14 +110,14 @@ func (c Client) EditVoiceSettings(ctx context.Context, voiceID string, settings
req.Header.Set("xi-api-key", c.apiKey)
req.Header.Set("User-Agent", "github.com/taigrr/elevenlabs")
req.Header.Set("accept", "application/json")
res, err := client.Do(req)
res, err := c.httpClient.Do(req)
if err != nil {
return err
}
switch res.StatusCode {
case 401:
return ErrUnauthorized
case 200:
if err != nil {
return err
}
so := types.SynthesisOptions{}
defer res.Body.Close()
jerr := json.NewDecoder(res.Body).Decode(&so)
@@ -145,7 +142,6 @@ func (c Client) EditVoiceSettings(ctx context.Context, voiceID string, settings
func (c Client) EditVoice(ctx context.Context, voiceID, name, description string, labels []string, files []*os.File) error {
url := fmt.Sprintf(c.endpoint+"/v1/voices/%s/edit", voiceID)
client := &http.Client{}
var b bytes.Buffer
w := multipart.NewWriter(&b)
@@ -171,14 +167,14 @@ func (c Client) EditVoice(ctx context.Context, voiceID, name, description string
req.Header.Set("xi-api-key", c.apiKey)
req.Header.Set("User-Agent", "github.com/taigrr/elevenlabs")
req.Header.Set("accept", "application/json")
res, err := client.Do(req)
res, err := c.httpClient.Do(req)
if err != nil {
return err
}
switch res.StatusCode {
case 401:
return ErrUnauthorized
case 200:
if err != nil {
return err
}
return nil
case 422:
fallthrough
@@ -195,49 +191,9 @@ func (c Client) EditVoice(ctx context.Context, voiceID, name, description string
}
}
func (c Client) defaultVoiceSettings(ctx context.Context) (types.SynthesisOptions, error) {
url := c.endpoint + "/v1/voices/settings/default"
client := &http.Client{}
req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
if err != nil {
return types.SynthesisOptions{}, err
}
req.Header.Set("xi-api-key", c.apiKey)
req.Header.Set("User-Agent", "github.com/taigrr/elevenlabs")
req.Header.Set("accept", "application/json")
res, err := client.Do(req)
switch res.StatusCode {
case 401:
return types.SynthesisOptions{}, ErrUnauthorized
case 200:
if err != nil {
return types.SynthesisOptions{}, err
}
so := types.SynthesisOptions{}
defer res.Body.Close()
jerr := json.NewDecoder(res.Body).Decode(&so)
if jerr != nil {
return types.SynthesisOptions{}, jerr
}
return so, nil
case 422:
fallthrough
default:
ve := types.ValidationError{}
defer res.Body.Close()
jerr := json.NewDecoder(res.Body).Decode(&ve)
if jerr != nil {
err = errors.Join(err, jerr)
} else {
err = errors.Join(err, ve)
}
return types.SynthesisOptions{}, err
}
}
func (c Client) GetVoiceSettings(ctx context.Context, voiceID string) (types.SynthesisOptions, error) {
url := fmt.Sprintf(c.endpoint+"/v1/voices/%s/settings", voiceID)
client := &http.Client{}
req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
if err != nil {
return types.SynthesisOptions{}, err
@@ -245,14 +201,14 @@ func (c Client) GetVoiceSettings(ctx context.Context, voiceID string) (types.Syn
req.Header.Set("xi-api-key", c.apiKey)
req.Header.Set("User-Agent", "github.com/taigrr/elevenlabs")
req.Header.Set("accept", "application/json")
res, err := client.Do(req)
res, err := c.httpClient.Do(req)
if err != nil {
return types.SynthesisOptions{}, err
}
switch res.StatusCode {
case 401:
return types.SynthesisOptions{}, ErrUnauthorized
case 200:
if err != nil {
return types.SynthesisOptions{}, err
}
so := types.SynthesisOptions{}
defer res.Body.Close()
jerr := json.NewDecoder(res.Body).Decode(&so)
@@ -277,7 +233,7 @@ func (c Client) GetVoiceSettings(ctx context.Context, voiceID string) (types.Syn
func (c Client) GetVoice(ctx context.Context, voiceID string) (types.VoiceResponseModel, error) {
url := fmt.Sprintf(c.endpoint+"/v1/voices/%s", voiceID)
client := &http.Client{}
req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
if err != nil {
return types.VoiceResponseModel{}, err
@@ -285,7 +241,7 @@ func (c Client) GetVoice(ctx context.Context, voiceID string) (types.VoiceRespon
req.Header.Set("xi-api-key", c.apiKey)
req.Header.Set("User-Agent", "github.com/taigrr/elevenlabs")
req.Header.Set("accept", "application/json")
res, err := client.Do(req)
res, err := c.httpClient.Do(req)
// res is nil when Do fails, so return before reading res.StatusCode.
if err != nil {
return types.VoiceResponseModel{}, err
}
switch res.StatusCode {
case 401:
return types.VoiceResponseModel{}, ErrUnauthorized
@@ -319,7 +275,7 @@ func (c Client) GetVoice(ctx context.Context, voiceID string) (types.VoiceRespon
func (c Client) GetVoices(ctx context.Context) ([]types.VoiceResponseModel, error) {
url := c.endpoint + "/v1/voices"
client := &http.Client{}
req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
if err != nil {
return []types.VoiceResponseModel{}, err
@@ -327,7 +283,7 @@ func (c Client) GetVoices(ctx context.Context) ([]types.VoiceResponseModel, erro
req.Header.Set("xi-api-key", c.apiKey)
req.Header.Set("User-Agent", "github.com/taigrr/elevenlabs")
req.Header.Set("accept", "application/json")
res, err := client.Do(req)
res, err := c.httpClient.Do(req)
// res is nil when Do fails, so return before reading res.StatusCode.
if err != nil {
return []types.VoiceResponseModel{}, err
}
switch res.StatusCode {
case 401:
return []types.VoiceResponseModel{}, ErrUnauthorized

cmd/say/.gitignore

@@ -1,2 +1,3 @@
*.mp3
main
say


@@ -1,8 +1,17 @@
package main
import (
"bufio"
"context"
"io"
"log"
"os"
"strings"
"time"
"github.com/gopxl/beep/v2"
"github.com/gopxl/beep/v2/mp3"
"github.com/gopxl/beep/v2/speaker"
"github.com/taigrr/elevenlabs/client"
"github.com/taigrr/elevenlabs/client/types"
@@ -15,13 +24,41 @@ func main() {
if err != nil {
panic(err)
}
saveFile, err := os.Create("sample.mp3")
if err != nil {
panic(err)
pipeReader, pipeWriter := io.Pipe()
// record how long it takes to run and print out on exit
start := time.Now()
defer func() {
log.Println(time.Since(start))
}()
var text string
if len(os.Args) > 1 {
text = strings.Join(os.Args[1:], " ")
} else {
reader := bufio.NewReader(os.Stdin)
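// Surface stdin read failures instead of silently sending empty text.
b, err := io.ReadAll(reader)
if err != nil {
log.Fatal(err)
}
text = string(b)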
}
defer saveFile.Close()
err = client.TTSWriter(ctx, saveFile, "hello, golang", ids[0], types.SynthesisOptions{Stability: 0.75, SimilarityBoost: 0.75})
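go func() {
// Keep the error local to this goroutine so it does not race with the outer err,
// and pass any failure to the reader side; CloseWithError(nil) acts like Close.
err := client.TTSStream(ctx, pipeWriter, text, ids[0], types.SynthesisOptions{Stability: 0.75, SimilarityBoost: 0.75, Style: 0.0, UseSpeakerBoost: false})
pipeWriter.CloseWithError(err)
}()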
streamer, format, err := mp3.Decode(pipeReader)
if err != nil {
panic(err)
log.Fatal(err)
}
defer streamer.Close()
speaker.Init(format.SampleRate, format.SampleRate.N(time.Second/10))
done := make(chan bool)
speaker.Play(beep.Seq(streamer, beep.Callback(func() {
done <- true
})))
<-done
}

cmd/transcribe/.gitignore (new file)

@@ -0,0 +1,3 @@
*.mp3
main
transcribe

cmd/transcribe/main.go (new file)

@@ -0,0 +1,34 @@
package main
import (
"context"
"encoding/json"
"fmt"
"os"
"github.com/taigrr/elevenlabs/client"
"github.com/taigrr/elevenlabs/client/types"
)
func main() {
ctx := context.Background()
client := client.New(os.Getenv("XI_API_KEY"))
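// Require the audio file path as the first argument instead of panicking on a missing index.
if len(os.Args) < 2 {
fmt.Fprintln(os.Stderr, "usage: transcribe <audio-file>")
os.Exit(1)
}
filePath := os.Args[1]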
resp, err := client.ConvertSpeechToText(ctx, filePath, types.SpeechToTextRequest{
ModelID: types.SpeechToTextModelScribeV1,
TimestampsGranularity: types.TimestampsGranularityWord,
Diarize: true,
})
if err != nil {
panic(err)
}
// Named out rather than bytes to avoid confusion with the standard bytes package.
out, err := json.Marshal(resp)
if err != nil {
panic(err)
}
fmt.Println(string(out))
}


@@ -1,4 +0,0 @@
package main
func main() {
}

go.mod

@@ -1,3 +1,15 @@
module github.com/taigrr/elevenlabs
go 1.20
go 1.23.0
toolchain go1.24.0
require github.com/gopxl/beep/v2 v2.1.1
require (
github.com/ebitengine/oto/v3 v3.3.2 // indirect
github.com/ebitengine/purego v0.8.0 // indirect
github.com/hajimehoshi/go-mp3 v0.3.4 // indirect
github.com/pkg/errors v0.9.1 // indirect
golang.org/x/sys v0.32.0 // indirect
)

go.sum

@@ -0,0 +1,22 @@
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/ebitengine/oto/v3 v3.3.2 h1:VTWBsKX9eb+dXzaF4jEwQbs4yWIdXukJ0K40KgkpYlg=
github.com/ebitengine/oto/v3 v3.3.2/go.mod h1:MZeb/lwoC4DCOdiTIxYezrURTw7EvK/yF863+tmBI+U=
github.com/ebitengine/purego v0.8.0 h1:JbqvnEzRvPpxhCJzJJ2y0RbiZ8nyjccVUrSM3q+GvvE=
github.com/ebitengine/purego v0.8.0/go.mod h1:iIjxzd6CiRiOG0UyXP+V1+jWqUXVjPKLAI0mRfJZTmQ=
github.com/gopxl/beep/v2 v2.1.1 h1:6FYIYMm2qPAdWkjX+7xwKrViS1x0Po5kDMdRkq8NVbU=
github.com/gopxl/beep/v2 v2.1.1/go.mod h1:ZAm9TGQ9lvpoiFLd4zf5B1IuyxZhgRACMId1XJbaW0E=
github.com/hajimehoshi/go-mp3 v0.3.4 h1:NUP7pBYH8OguP4diaTZ9wJbUbk3tC0KlfzsEpWmYj68=
github.com/hajimehoshi/go-mp3 v0.3.4/go.mod h1:fRtZraRFcWb0pu7ok0LqyFhCUrPeMsGRSVop0eemFmo=
github.com/hajimehoshi/oto/v2 v2.3.1/go.mod h1:seWLbgHH7AyUMYKfKYT9pg7PhUu9/SisyJvNTT+ASQo=
github.com/pkg/errors v0.9.1 h1:FEBLx1zS214owpjy7qsBeixbURkuhQAwrK5UwLGTwt4=
github.com/pkg/errors v0.9.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/stretchr/testify v1.10.0 h1:Xv5erBjTwe/5IxqUQTdXv5kgmIvbHo3QQyRwhJsOfJA=
github.com/stretchr/testify v1.10.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf8C91i36aY=
golang.org/x/sys v0.0.0-20220712014510-0a85c31ab51e/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.32.0 h1:s77OFDvIQeibCmezSnk/q6iAfkdiQaJi4VzroCFrN20=
golang.org/x/sys v0.32.0/go.mod h1:BJP2sWEmIv4KK5OTEluFJCKSidICx8ciO85XgH3Ak8k=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=