VOICE CLONING · TRAINING DATA

The only ethical way to source voice cloning data at scale.

Voice cloning is the highest-risk use case in speech AI, and the one most exposed to lawsuits, right-of-publicity claims, and platform bans. We license speaker-specific audio under separate, explicit, written voice-cloning consent from every speaker, and we sell it only to customers who agree to the elevated terms below.

Per-speaker written consent · Compensation flow-through · Watermarking required · Right of revocation honored

Why this is a separate product

Read this before you buy anything else.

Standard catalog and custom licenses do not permit voice cloning. Cloning a speaker's voice raises legal issues that the standard release does not address: right of publicity, biometric privacy under laws such as the Illinois Biometric Information Privacy Act (BIPA), defamation risk if the cloned voice is misused, and broader exposure to litigation overall. We refuse to bundle voice cloning rights into a generic dataset license because doing so would be unfair to speakers and dangerous for buyers.

How it works

What an aipodcast voice cloning license looks like.

Per-speaker consent

We obtain separate, explicit, written voice-cloning consent from every speaker in the licensed dataset, and each consent names the customer.

Per-customer disclosure

The speaker is told who is licensing their voice and for what purpose. No anonymous resale.

Compensation flow-through

A defined share of the elevated fee flows directly to the speaker, separate from base licensing payments.

Use restrictions

No non-consensual intimate content, defamation, election interference, deceptive impersonation, or any practice prohibited by EU AI Act Article 5.

Watermarking

Customer agrees to participate in a recognized voice provenance / watermarking scheme. We can recommend a stack.

Right of revocation

If a speaker revokes consent, the customer must stop producing new outputs derived from the cloned voice within 30 days. Outputs that have already been distributed are not recalled retroactively.
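For engineering teams wiring this term into a generation pipeline, the 30-day window can be enforced with a simple date check. The following is a minimal sketch, not contract language: the names `cease_by` and `may_generate` are hypothetical, and it assumes the window is counted in calendar days from the revocation date with day 30 included. The signed license governs how the period is actually counted.

```python
from datetime import date, timedelta
from typing import Optional

# Mirrors the 30-day window stated in the terms (assumed: calendar days).
REVOCATION_WINDOW_DAYS = 30


def cease_by(revoked_on: date) -> date:
    """Last date on which new outputs may be produced after a revocation."""
    return revoked_on + timedelta(days=REVOCATION_WINDOW_DAYS)


def may_generate(revoked_on: Optional[date], today: date) -> bool:
    """True if consent was never revoked, or today still falls inside the window."""
    if revoked_on is None:
        return True
    return today <= cease_by(revoked_on)
```

A pipeline could call `may_generate` before each synthesis job and refuse to run once it returns `False`. Outputs distributed earlier are unaffected, which matches the no-recall rule above.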

Specs

What we deliver.

Audio

Format: 48 kHz / 24-bit WAV
Per speaker: typically several hours to several dozen hours
Tracks: multi-track separation where available

Transcripts

Alignment: word-level standard, phoneme-level on request
Formats: JSON, WebVTT, SRT, TextGrid

Metadata

Speaker: age range, gender, first language (L1), accent, microphone, recording environment
Per file: consent record IDs

Legal

Consent: signed voice-cloning consent record from the speaker
Addendum: dual-party MSA addendum
Indemnification: IP & consent claims arising from source audio
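
Buyers who want to sanity-check a delivery against these specs could do so with a few lines of code. The sketch below is illustrative only: it uses Python's standard `wave` module to confirm the 48 kHz / 24-bit format and checks that per-file metadata carries at least one consent record ID. The function names and the `consent_record_ids` key are hypothetical and do not describe our actual delivery schema.

```python
import io
import wave

EXPECTED_RATE_HZ = 48_000   # 48 kHz, per the audio spec
EXPECTED_SAMPLE_BYTES = 3   # 24-bit PCM is 3 bytes per sample


def check_delivery(wav_file, metadata):
    """Return a list of problems; an empty list means the file passes this sketch."""
    problems = []
    with wave.open(wav_file, "rb") as w:
        if w.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sample rate is {w.getframerate()} Hz")
        if w.getsampwidth() != EXPECTED_SAMPLE_BYTES:
            problems.append(f"sample width is {w.getsampwidth() * 8} bits")
    # Hypothetical metadata key; the real field name may differ.
    if not metadata.get("consent_record_ids"):
        problems.append("no consent record IDs attached")
    return problems


def make_test_wav(rate, sample_bytes):
    """Build a tiny in-memory mono WAV for trying out the check."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(sample_bytes)
        w.setframerate(rate)
        w.writeframes(b"\x00" * sample_bytes * 10)  # ten silent frames
    buf.seek(0)
    return buf
```

Calling `check_delivery(make_test_wav(48000, 3), {"consent_record_ids": ["example-id"]})` returns an empty list. The `wave` module reads plain PCM files; WAVs written with an extensible header may need a recent Python version or a third-party audio library.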

Fit

Who this is right for.

✓ Right fit

  • Companies building branded voice products with the licensed speaker’s consent and participation
  • Accessibility products that clone a user’s voice with their explicit consent (e.g., voice banking for ALS patients)
  • Audiobook and dubbing platforms cloning licensed performers
  • Voice agent products needing brand-consistent synthesized voices

✕ Not a fit

  • Anyone wanting to clone a celebrity, public figure, or third party without their consent — we will not work with you
  • Anyone wanting to build deepfake or impersonation tools — we will not work with you
  • Anyone unwilling to participate in voice provenance / watermarking — we will not work with you

Building a voice cloning product the right way?

Voice cloning is priced significantly above catalog and custom rates. Talk to sales for a project-specific quote and a walkthrough of the consent flow.