The only ethical way to source voice cloning data at scale.
Voice cloning is the highest-risk use case in speech AI, and the one most exposed to lawsuits, right-of-publicity claims, and platform bans. We license speaker-specific audio under separate, explicit, written voice-cloning consent from every speaker, and we license it only to customers who agree to the elevated terms below.
Read this before you buy anything else.
Standard catalog and custom licenses do not permit voice cloning. Cloning a speaker's voice raises distinct legal issues that the standard release does not address: right of publicity, biometric privacy under statutes such as the Illinois Biometric Information Privacy Act (BIPA), defamation risk if the cloned voice is misused, and broader litigation exposure overall. We refuse to bundle voice cloning rights into a generic dataset license, because doing so would be unfair to speakers and dangerous for buyers.
What an aipodcast voice cloning license looks like.
Per-speaker consent
We obtain a separate, explicit, written voice-cloning consent from every speaker in the licensed dataset. Each consent names the customer.
Per-customer disclosure
The speaker is told who is licensing their voice and for what purpose. No anonymous resale.
Compensation flow-through
A defined share of the elevated fee flows directly to the speaker, separate from base licensing payments.
Use restrictions
No non-consensual intimate content, defamation, election interference, deceptive impersonation, or any practice prohibited by EU AI Act Article 5.
Watermarking
Customer agrees to participate in a recognized voice provenance / watermarking scheme. We can recommend a stack.
Right of revocation
If a speaker revokes consent, the customer must stop producing new outputs derived from the cloned voice within 30 days. Outputs already distributed are not recalled retroactively.
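For engineering teams, the 30-day revocation window could be enforced as a gate on new generation requests. The following is a minimal sketch, not part of the license text. It assumes the window is counted in calendar days and that the 30th day is still permitted; the function names `cease_by` and `may_generate` are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional

# Assumption: "within 30 days" means 30 calendar days after the revocation
# date, with the 30th day still allowed. Confirm against the signed addendum.
REVOCATION_WINDOW_DAYS = 30


def cease_by(revocation_date: date) -> date:
    """Last day on which new outputs from the cloned voice may be produced."""
    return revocation_date + timedelta(days=REVOCATION_WINDOW_DAYS)


def may_generate(today: date, revocation_date: Optional[date]) -> bool:
    """Return True if producing a new output is still permitted on `today`.

    Only new generation is gated; outputs that were already distributed
    are outside the scope of this check.
    """
    if revocation_date is None:  # consent has not been revoked
        return True
    return today <= cease_by(revocation_date)
```

A product would typically call `may_generate` before each synthesis request and store the revocation date alongside the consent record ID for that speaker.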
What we deliver.
Audio
- Format: 48 kHz / 24-bit WAV
- Per speaker: Several hours to several dozen hours typical
- Tracks: Multi-track separation where available
Transcripts
- Alignment: Word-level standard, phoneme-level on request
- Formats: JSON, WebVTT, SRT, TextGrid
Metadata
- Speaker: Age range, gender, L1, accent, mic, environment
- Per file: Consent record IDs
Legal
- Consent: Signed voice-cloning consent record from the speaker
- Addendum: Dual-party MSA addendum
- Indemnification: IP & consent claims arising from source audio
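A buyer might want to sanity-check a delivery against the spec above before training. The sketch below uses Python's standard `wave` module to confirm a file is 48 kHz / 24-bit, and flags files with no consent record ID. The manifest layout (a list of dicts with `file` and `consent_record_ids` keys) is an assumption for illustration, not the actual delivery schema.

```python
import wave

EXPECTED_RATE_HZ = 48_000
EXPECTED_SAMPLE_WIDTH_BYTES = 3  # 24-bit PCM = 3 bytes per sample


def wav_spec_problems(path: str) -> list[str]:
    """Return a list of ways the WAV file deviates from 48 kHz / 24-bit."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH_BYTES:
            problems.append(f"sample width is {wav.getsampwidth() * 8}-bit")
    return problems


def files_missing_consent(manifest: list[dict]) -> list[str]:
    """Return file names whose entry lists no consent record IDs.

    Assumed (hypothetical) manifest entry shape:
    {"file": "spk01_001.wav", "consent_record_ids": ["..."]}
    """
    return [
        entry["file"]
        for entry in manifest
        if not entry.get("consent_record_ids")
    ]
```

The `wave` module reads only PCM WAV files, and some 24-bit files use an extensible header that older Python versions may reject. In that case a library such as `soundfile` would be a reasonable substitute.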
Who this is right for.
✓ Right fit
- Companies building branded voice products with the licensed speaker’s consent and participation
- Accessibility products that clone a user’s voice with their explicit consent (e.g., voice banking for ALS patients)
- Audiobook and dubbing platforms cloning licensed performers
- Voice agent products needing brand-consistent synthesized voices
✕ Not a fit
- Anyone wanting to clone a celebrity, public figure, or third party without their consent — we will not work with you
- Anyone wanting to build deepfake or impersonation tools — we will not work with you
- Anyone unwilling to participate in voice provenance / watermarking — we will not work with you
Building a voice cloning product the right way?
Voice cloning is priced significantly above catalog and custom rates. Talk to sales for a project-specific quote and a walkthrough of the consent flow.