SOLUTIONS

Podcast audio licensing for AI labs that want the whole catalogue.

AI labs need real human conversation at scale. Podcasters have hundreds of thousands of hours of it — recorded well, in studios, with the people who can grant rights cleanly. We are the bridge: bulk back-catalogue licensing with retroactive guest re-clearance, exclusivity tiers, transparent creator revenue share, and the only consent stack a frontier lab can actually defend in court.

48 kHz / 24-bit WAV · Word-level aligned transcripts · Verified consent · Commercial training rights
Bulk: Back-catalogue licensing
100%: Guest re-clearance, every episode
3 tiers: Non-excl / category / full exclusive
Royalty: Transparent creator revenue share
§ 01 — What you get

Built for the work.

Direct from creators

Bulk back catalogues licensed directly from working podcasters and networks. Not scraped, not RSS-rinsed.

Rights documented

Signed creator release plus retroactive guest re-clearance on every legacy episode. SHA-256 chain of custody.

Studio-grade audio

Real podcast studios — Shure SM7B, Rode NT1, MKH 416 chains in treated rooms. EBU R128 broadcast loudness.
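
A buyer may want to sanity-check the delivered format on ingest. The sketch below uses Python's standard `wave` module to confirm that a file is 48 kHz / 24-bit PCM. The function name `check_wav_format` is illustrative and not part of any delivered tooling. It does not measure EBU R128 loudness, which needs a dedicated loudness meter. Files written with a WAVE_FORMAT_EXTENSIBLE header are only readable by `wave` on Python 3.12 or later.

```python
import wave

EXPECTED_RATE_HZ = 48_000   # 48 kHz, per the delivery spec
EXPECTED_SAMPLE_BYTES = 3   # 24-bit PCM = 3 bytes per sample


def check_wav_format(path: str) -> list[str]:
    """Return a list of format problems; an empty list means the file matches."""
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        if rate != EXPECTED_RATE_HZ:
            problems.append(f"sample rate is {rate} Hz, expected {EXPECTED_RATE_HZ}")
        if width != EXPECTED_SAMPLE_BYTES:
            problems.append(f"sample width is {width * 8}-bit, expected 24-bit")
    return problems
```

Running this over a sample shard before a full tranche is delivered catches resampled or bit-reduced files early.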

Transcripts included

Word-aligned transcripts with diarization in JSON, CTM, TextGrid, SRT, or VTT.
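
As a rough illustration of how word-level alignment and diarization fit together, the sketch below assumes a hypothetical JSON layout: a list of words, each with `word`, `start`, `end` (seconds), and `speaker`. The actual delivered schema is not specified here. It converts the words to CTM-style lines (recording, channel, start, duration, word) and merges consecutive words into speaker turns.

```python
def words_to_ctm(words, recording_id, channel="1"):
    """Render word timings as CTM-style lines: <recording> <channel> <start> <duration> <word>."""
    lines = []
    for w in words:
        duration = w["end"] - w["start"]
        lines.append(f"{recording_id} {channel} {w['start']:.2f} {duration:.2f} {w['word']}")
    return "\n".join(lines)


def speaker_turns(words):
    """Merge consecutive words from the same speaker into one turn."""
    turns = []
    for w in words:
        if turns and turns[-1]["speaker"] == w["speaker"]:
            # Same speaker is still talking: extend the current turn.
            turns[-1]["end"] = w["end"]
            turns[-1]["text"] += " " + w["word"]
        else:
            turns.append({"speaker": w["speaker"], "start": w["start"],
                          "end": w["end"], "text": w["word"]})
    return turns


# Hypothetical sample in the assumed layout.
sample = [
    {"word": "welcome", "start": 0.50, "end": 0.90, "speaker": "host"},
    {"word": "back", "start": 0.95, "end": 1.20, "speaker": "host"},
    {"word": "thanks", "start": 1.60, "end": 2.00, "speaker": "guest"},
]
print(words_to_ctm(sample, "ep001"))
print(speaker_turns(sample))
```

CTM lines carry no speaker label, so the diarization survives only in the turn view or in formats that have a speaker field.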

Custom acquisition

Need a specific show, network, or vertical? We can broker the licence end-to-end, including the legal lift on guest re-clearance.

Enterprise terms

Indemnification, three exclusivity tiers, named consent contact for life, written revocation SLA.

§ 02 — Licence tiers

What you can license.

| Asset | Available | Format | Notes |
| --- | --- | --- | --- |
| Released back catalogue | Yes | 48 kHz / 24-bit WAV + transcripts | Full guest re-clearance on every episode |
| Ad-free production masters | Yes | Pre-mix WAV, no ad inserts | Cleaner training signal than the published feed |
| Raw multitrack stems | Where available | Per-mic WAV stems | Speaker-isolated, ideal for source separation and diarization training |
| Exclusive segments | Yes | Curated extracts | Topic, guest, or domain-specific cuts |
| Future episodes | Rolling | Monthly delivery | Forward consent baked into creator MSA |
| Transcripts only | Yes | JSON / CTM / SRT / VTT | For text LLMs, RAG corpora, or editorial |
| Non-exclusive licence | Default | | Lowest tier, fastest to close |
| Category exclusive | Premium | | No other AI buyer in your sub-vertical |
| Full exclusive | Premium+ | | No other AI buyer at all |
§ 03 — What you can license

Inside the deal.

Back catalogue

The full released archive — every episode, every season — with retroactive guest re-clearance.

Exclusive segments

Curated extracts: deep-dive episodes, specific guests, domain-balanced cuts, or interview formats.

Future episodes

Rolling forward consent. New episodes flow into your shard monthly with the same provenance.

Ad-free masters

Production masters with ad inserts removed: cleaner signal, with no repeated host-read ads polluting the corpus.

Raw multitrack

Per-mic stems where available. Speaker isolation pre-baked, perfect for source separation and diarization training.

Transcripts

Word-aligned text in JSON/CTM/SRT/VTT. Standalone licence for text LLMs and RAG.

§ 04 — How engagement works

From email to first catalogue.

01

Sample request

Tell us the verticals, hours, and exclusivity tier you need. We return a 30-minute sample shard plus a candidate catalogue list within 48 hours.

02

Mutual NDA

Standard one-page mutual NDA.

03

MSA + bulk licence

Perpetual commercial training licence, exclusivity tier, indemnification, named contact, written revocation SLA.

04

First delivery

Pilot back-catalogue tranche with audio, transcripts, manifests, per-episode consent receipts, and guest clearance log.

05

Manifest & provenance

Per-episode lineage: show, host, every guest, recording date, jurisdiction, consent version, SHA-256.
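
One way to picture the per-episode lineage record and the SHA-256 check is sketched below. The `ManifestEntry` field names simply mirror the items listed above and are assumptions, not the actual manifest schema. The helper functions are likewise illustrative.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class ManifestEntry:
    # Illustrative field names mirroring the lineage items; not a published schema.
    show: str
    host: str
    guests: list[str]
    recording_date: str    # ISO 8601 date string, e.g. "2021-03-14"
    jurisdiction: str
    consent_version: str
    sha256: str            # hex digest of the delivered audio file


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large WAV masters do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_manifest(path: str, entry: ManifestEntry) -> bool:
    """True when the file on disk has the digest recorded in the manifest entry."""
    return sha256_of_file(path) == entry.sha256.lower()
```

A buyer could run `matches_manifest` on every file at ingest and reject any tranche where a digest does not match.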

06

Ongoing delivery

Monthly increments of new episodes, exclusivity enforcement, creator royalty reporting, written revocation SLA.

§ 05 — FAQ

Common questions.

Can you legally license podcast audio for AI?

Yes — when the rights are properly granted. Every creator signs an explicit AI training release before audio enters the catalogue, and every guest in every episode is re-cleared in writing. Scraped public feeds do not meet this bar.

Can I license a whole back catalogue?

Yes. We license back catalogues in bulk: entire shows, whole networks, or selected tranches, including retroactive guest re-clearance for legacy episodes. This is the fastest clean path to large volumes of consented conversational audio.

What about co-hosts and guests?

Every voice in every recording has a signed release. For new episodes, guests sign when they walk into the studio. For legacy episodes, we do retroactive guest re-clearance — and we exclude any episode where a guest cannot be reached or declines.
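
The exclusion rule for legacy episodes can be sketched as a simple filter. The episode layout and the status strings `"signed"`, `"declined"`, and `"unreachable"` are hypothetical labels chosen for illustration. Host and creator releases are assumed to be tracked separately.

```python
def cleared_episodes(episodes):
    """Keep only episodes where every guest has a signed release.

    An episode with no guests passes this filter, since there is
    nobody to re-clear (the host release is assumed handled elsewhere).
    """
    return [
        ep for ep in episodes
        if all(status == "signed" for status in ep["guests"].values())
    ]


# Hypothetical clearance log for three legacy episodes.
log = [
    {"id": "ep001", "guests": {"Guest A": "signed", "Guest B": "signed"}},
    {"id": "ep002", "guests": {"Guest C": "signed", "Guest D": "unreachable"}},
    {"id": "ep003", "guests": {"Guest E": "declined"}},
]
print([ep["id"] for ep in cleared_episodes(log)])
```

A single unreachable or declining guest removes the whole episode, which matches the all-or-nothing rule described above.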

What can I license?

Released back catalogue, exclusive segments, future episodes on a rolling basis, ad-free production masters, and raw multitrack stems where available. Aligned transcripts ship with every tier.

Is the audio exclusive?

Three tiers: non-exclusive (default), category-exclusive (no other AI buyer in your sub-vertical), and full-exclusive (no other AI buyer at all). Both exclusive tiers carry a premium.
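
To make the tier semantics concrete, here is a minimal sketch of a conflict check for one catalogue. It is one reading of the tier definitions above, not contract language. The `Licence` type, the `can_grant` function, and the sub-vertical labels are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    NON_EXCLUSIVE = "non-exclusive"
    CATEGORY_EXCLUSIVE = "category-exclusive"
    FULL_EXCLUSIVE = "full-exclusive"


@dataclass(frozen=True)
class Licence:
    buyer: str
    tier: Tier
    sub_vertical: str


def can_grant(existing: list[Licence], requested: Licence) -> bool:
    """Return True if the requested licence conflicts with no licence held by another buyer."""
    for held in existing:
        if held.buyer == requested.buyer:
            continue  # a buyer does not conflict with itself
        same_vertical = held.sub_vertical == requested.sub_vertical
        # An existing exclusive blocks newcomers within its scope.
        if held.tier is Tier.FULL_EXCLUSIVE:
            return False
        if held.tier is Tier.CATEGORY_EXCLUSIVE and same_vertical:
            return False
        # A requested exclusive cannot be granted over licences others already hold.
        if requested.tier is Tier.FULL_EXCLUSIVE:
            return False
        if requested.tier is Tier.CATEGORY_EXCLUSIVE and same_vertical:
            return False
    return True
```

Under this reading, a category-exclusive licence in one sub-vertical still leaves the catalogue open to non-exclusive buyers in other sub-verticals, while a full-exclusive licence closes it to everyone else.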

How is revenue shared with creators?

Creators receive a transparent revenue share on every licensed hour, paid quarterly, with per-episode reporting. This is what makes consent sustainable — speakers stay paid for the life of the model.

What does it cost?

Pricing depends on hours, language, exclusivity tier, and metadata depth. Contact partnerships@aipodcast.io for a quote.

How is this different from a public dataset?

Public datasets lack documented consent, commercial training rights, speaker metadata, and a revocation path. Ours has all four — plus indemnification, audit trail, and a named human contact for life.

I am a podcaster being acquired or considering selling — how does that work?

Talk to us. We can structure either a pure data licence (you keep the show) or work alongside an acquisition with the AI rights as a separate, ongoing royalty stream. Email partnerships@aipodcast.io.

Want a representative sample?

30 minutes of audio + transcripts + metadata, delivered within 48 hours of NDA.