We built the speech-data company we wished existed.
We're podcasters who turned into a speech-data company because the speech-data market was broken in a way only podcasters could fix. Here's the story.
The market was broken in a specific way.
For most of the past decade, the easy way to get conversational speech data for AI was to scrape it. Podcasts, audiobooks, YouTube, support call recordings: anything you could reach with an HTTP request. It worked because AI was a research curiosity and nobody was checking the licenses. That era is over.
The legal exposure went vertical
The EU AI Act now requires training-data transparency. Multiple foundation-model labs are in active litigation over the data they scraped. Voice actors are organizing. The legal exposure for any company training a speech model on un-cleared audio has gone from “theoretical” to “imminent.”
The alternatives weren’t much better
Generic crowd vendors, open datasets, and traditional data labelers all carry their own problems: noisy audio, vague consent language, no accent diversity, no contactable speakers, no provenance, six-month lead times.
The supply was hiding in plain sight
The world is full of professional podcasters sitting on hundreds or thousands of hours of studio-grade conversational audio that they own outright. They are the highest-quality untapped supply of speech data on earth, and almost nobody is paying them for it.
So we built the obvious thing
License that audio properly, on real contracts, with real consent, real revocation rights, and real money flowing to the speakers — and resell it to AI companies under terms their legal teams can actually approve.
Six things we will not compromise on.
Consent is not a checkbox
Every speaker in our catalog signs a written release that names them, describes the recording, and explicitly grants AI training rights. No click-through ToS amendments. No “implied consent.”
Speakers retain ownership
We license; we do not buy. Speakers can revoke at any time, and we honor the revocation within a defined SLA.
Professional quality is the floor
The catalog is studio-grade by default because we source from podcasters who already record in studios with broadcast-grade microphones. We do not sell anything below that grade.
Provenance is non-negotiable
Every file we deliver carries a per-file consent record ID, a SHA-256 checksum, and a signed manifest. Customers can audit the chain of consent for any file at any time.
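For teams that want to script that audit, here is a minimal sketch of the checksum half of it in Python. It assumes each manifest entry is a record with `sha256` and `consent_record_id` fields; those field names, and the `audit_file` helper itself, are hypothetical illustrations rather than our actual manifest schema. Verifying the manifest signature is left out because it depends on the signing scheme in use.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_file(path, entry):
    """Check one delivered file against its manifest entry.

    `entry` is a hypothetical manifest record (field names are assumptions):
    {"consent_record_id": "...", "sha256": "..."}
    Returns a list of problems; an empty list means the file passed.
    """
    problems = []
    if not entry.get("consent_record_id"):
        problems.append("missing consent record ID")
    if sha256_of(path) != entry.get("sha256", "").lower():
        problems.append("checksum mismatch")
    return problems
```

Running `audit_file` over every entry in a delivery and failing the import on any non-empty result is one way to make the provenance check part of an ingestion pipeline.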
We will say no to bad use cases
No non-consensual voice cloning, deepfakes, impersonation, election interference, or any practice prohibited by EU AI Act Article 5. The list of customers we have turned down is short but real.
Built for the legal review
Our pricing, contracts, security, and compliance docs are built to survive a Fortune 500 vendor review — not optimized for the sales pitch.
A real network of working podcasters.
We started with 350+ hours of original conversational English from our own podcast catalog, recorded in professional studios over the past several years. From there, we built outward.
Our supply network is a global set of partner podcast studios and podcast distribution companies who license their back catalogs to us and commission new recordings on our behalf. We pay creators directly, on a schedule defined in their contract, not in vague promises of exposure or “future opportunities.”
The catalog grows weekly through (a) new partner studios joining the network, (b) new commissioned recordings in under-represented languages, and (c) direct outreach to professional podcasters worldwide.
- Owned: 350+ hours of original conversational English in catalog today
- Network: A global network of partner studios and distribution companies
- Growing: Active outreach to professional podcasters worldwide
Who runs aipodcast.
Jaeden Schafer — Founder & CEO
Founder of Podcast Studio. Built and operates a podcast network producing hundreds of hours of professional audio per month. Saw the speech-data licensing problem from the supply side and built aipodcast to solve it from both sides. Host of the AI Chat podcast.
Headquartered in Phoenix, Arizona
Operating from the U.S. with a supply network spanning podcast studios on multiple continents.
Press & contact
press@aipodcast.io · partnerships@aipodcast.io · legal@aipodcast.io · security@aipodcast.io · privacy@aipodcast.io
Want to talk?
Whether you're a frontier lab evaluating a new supplier or a podcaster wondering whether your back catalog has value — we'd love to hear from you.