How to Clone any Voice using AudioStack API
There are multiple options available for cloning your voice using the AudioStack API, depending on your needs.
The AudioStack API offers a wide range of voice cloning options, so you can choose the quality that suits your use case.
When you clone a voice, you create a custom voice that is available in the AudioStack API only to the user who created it. To do this, you may use any files uploaded within your organisation.
How can I upload my files for cloning?
There are different options available for uploading your files, depending on whether you're integrating with the API, uploading the files as a developer, or offering voice cloning to people who would prefer to record using AudioStack's simple recording workflow in the Platform.
Which type of cloning should I use?
Standard Cloning
For better quality and more control over the voice, use voice_engine_2. This engine requires at least 20 minutes of data and an audio file in which the speaker agrees to have their voice cloned by AudioStack. It is not instant: it typically takes up to a few hours for your voice to become available in the library.
For non-English voices, we recommend using at least 45 minutes of data.
1200 credits will be charged upon successful voice creation.
Instant Cloning
If you want to clone a voice instantly or with only a little data, use voice_engine_3. Here, you can create a clone with just a few minutes of recordings (although the quality will improve with more data).
300 credits will be charged upon successful voice creation.
How to clone a voice
Create your voice in two steps.
First, make a POST request to https://v2.api.audio/speech/voice-cloning with the following payload:
{
  "fileIds": ["file_id1", "file_id2", "file_id3"],
  "alias": "my-instant-clone",
  "engine": "voice_engine_3",
  "metadata": {
    "gender": "female"
  }
}
If the files are correct and the alias is globally unique, you will receive a response with a status code of 202, meaning your voice is being created.
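The POST request above can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the `x-api-key` header name and the helper names `build_clone_request` and `submit_clone` are assumptions made for illustration, so check your account's authentication details before using it.

```python
import json
import urllib.request

API_URL = "https://v2.api.audio/speech/voice-cloning"


def build_clone_request(api_key, file_ids, alias,
                        engine="voice_engine_3", gender="female"):
    """Build (but do not send) the voice-cloning POST request."""
    payload = {
        "fileIds": list(file_ids),
        "alias": alias,
        "engine": engine,
        "metadata": {"gender": gender},
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: the API key is passed in an "x-api-key" header.
            "x-api-key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit_clone(request):
    """Send the request and report whether it was accepted (HTTP 202).

    urllib raises urllib.error.HTTPError for 4xx/5xx responses, for
    example when the alias is already taken or a file is rejected.
    """
    with urllib.request.urlopen(request) as response:
        return response.status == 202
```

Building the request separately from sending it makes the payload easy to inspect or log before any credits are at stake, for example `submit_clone(build_clone_request(key, ["file_id1"], "my-instant-clone"))`.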
Then, make a GET request to https://v2.api.audio/speech/voice-cloning?alias=my-instant-clone to check the status of your voice. If the status is succeeded, your voice is ready to be used in AudioStack!
You'll notice that the state contains a discardedFiles field. This field lists the IDs of the files that were not used in the voice cloning process, along with the reason each one was discarded.
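Because standard cloning can take hours, the status check is usually run in a polling loop. The sketch below is one way to do that in Python. It rests on several assumptions: the `x-api-key` header name, a top-level `status` key in the JSON response, and the helper names `status_url`, `fetch_state` and `wait_until_ready` are all hypothetical, so adjust them to match the response you actually receive.

```python
import json
import time
import urllib.parse
import urllib.request

API_URL = "https://v2.api.audio/speech/voice-cloning"


def status_url(alias):
    """Return the GET URL for checking a voice by its alias."""
    return API_URL + "?" + urllib.parse.urlencode({"alias": alias})


def fetch_state(api_key, alias):
    """Fetch the current state of the voice as a dict."""
    request = urllib.request.Request(
        status_url(alias),
        # Assumption: the API key is passed in an "x-api-key" header.
        headers={"x-api-key": api_key},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_until_ready(fetch, interval_seconds=30.0, max_attempts=20,
                     sleep=time.sleep):
    """Call fetch() until the state reports "succeeded".

    Assumption: the status is a top-level "status" key in the response.
    Raises TimeoutError if the voice is not ready after max_attempts.
    """
    for attempt in range(max_attempts):
        state = fetch()
        if state.get("status") == "succeeded":
            return state
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError("voice was not ready after polling")
```

A call such as `wait_until_ready(lambda: fetch_state(key, "my-instant-clone"))` returns the final state, from which `discardedFiles` can be read. The loop does not detect failure states, since their names are not given here; it simply times out.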
Engine Details
| | voice_engine_2 | voice_engine_3 |
|---|---|---|
| Max. number of files | 500 | 25 |
| Min. single file duration | 1.5 seconds | 1.5 seconds |
| Min. total audio duration | 20 minutes | 1.5 seconds |
| Max. total file size | 500 MB | 10 MB |
| SSML support | Yes | No |
| Multilingual | No | Yes |
| Avg. synthesis latency for 500 characters | 4 seconds | 20 seconds |
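The file limits in the table can be checked locally before submitting a cloning request. The following is a rough sketch of such a pre-flight check. The names `EngineLimits`, `LIMITS` and `check_files` are hypothetical, and treating 1 MB as 1,000,000 bytes is an assumption, since the table does not say which definition applies.

```python
from dataclasses import dataclass

MB = 1_000_000  # Assumption: decimal megabytes.


@dataclass(frozen=True)
class EngineLimits:
    max_files: int
    min_file_seconds: float
    min_total_seconds: float
    max_total_bytes: int


# Values copied from the Engine Details table.
LIMITS = {
    "voice_engine_2": EngineLimits(500, 1.5, 20 * 60, 500 * MB),
    "voice_engine_3": EngineLimits(25, 1.5, 1.5, 10 * MB),
}


def check_files(engine, files):
    """Return a list of problems for files given as (seconds, bytes) pairs.

    An empty list means the files fit the engine's documented limits.
    """
    limits = LIMITS[engine]
    problems = []
    if len(files) > limits.max_files:
        problems.append(f"too many files: {len(files)} > {limits.max_files}")
    for index, (seconds, _size) in enumerate(files):
        if seconds < limits.min_file_seconds:
            problems.append(
                f"file {index} is shorter than {limits.min_file_seconds} s")
    total_seconds = sum(seconds for seconds, _size in files)
    if total_seconds < limits.min_total_seconds:
        problems.append(
            f"total duration {total_seconds} s is below "
            f"{limits.min_total_seconds} s")
    total_bytes = sum(size for _seconds, size in files)
    if total_bytes > limits.max_total_bytes:
        problems.append(
            f"total size {total_bytes} bytes exceeds {limits.max_total_bytes}")
    return problems
```

For example, a single two-minute recording passes for voice_engine_3 but is reported as too short in total for voice_engine_2. The API may still discard individual files for other reasons, which it reports in discardedFiles.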