ABOUT US

We built the speech-data company we wished existed.

We're podcasters who started a speech-data company because the speech-data market was broken in a way only podcasters could fix. Here's the story.

Headquartered in Phoenix, Arizona · Supply network across multiple continents

Why we exist

The market was broken in a specific way.

For most of the past decade, the easy way to get conversational speech data for AI was to scrape it. Podcasts, audiobooks, YouTube, support call recordings: anything you could reach with an HTTP request. It worked because AI was a research curiosity and nobody was checking the licenses. That era is over.

The legal exposure went vertical

The EU AI Act now requires training-data transparency. Multiple foundation-model labs are in active litigation over the data they scraped. Voice actors are organizing. The legal exposure for any company training a speech model on un-cleared audio has gone from “theoretical” to “imminent.”

The alternatives weren’t much better

Generic crowd vendors, open datasets, and traditional data labelers all carry their own problems: noisy audio, vague consent language, no accent diversity, no contactable speakers, no provenance, six-month lead times.

The supply was hiding in plain sight

The world is full of professional podcasters sitting on hundreds or thousands of hours of studio-grade conversational audio that they own outright. They are the highest-quality untapped supply of speech data on earth, and almost nobody is paying them for it.

So we built the obvious thing

License that audio properly, on real contracts, with real consent, real revocation rights, and real money flowing to the speakers — and resell it to AI companies under terms their legal teams can actually approve.

Principles

Six things we will not compromise on.

Consent is not a checkbox

Every speaker in our catalog signs a written release that names them, describes the recording, and explicitly grants AI training rights. No click-through ToS amendments. No “implied consent.”

Speakers retain ownership

We license; we do not buy. Speakers can revoke at any time, and we honor the revocation on a defined SLA.

Professional quality is the floor

The catalog is studio-grade by default because we source from podcasters who already record in studios with broadcast-grade microphones. We do not sell anything below that grade.

Provenance is non-negotiable

Every file we deliver carries a per-file consent record ID, a SHA-256 checksum, and a signed manifest. Customers can audit the chain of consent for any file at any time.
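For teams that want to see what such an audit might look like in practice, here is a minimal sketch in Python. It assumes a hypothetical JSON manifest with a "files" list whose entries carry "filename", "sha256", and "consent_record_id" fields; those field names, and the function names below, are illustrative assumptions and not our published delivery format. Manifest signature verification is left out because it depends on the signing scheme in use.

```python
import hashlib
import json
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_delivery(manifest_path: Path, audio_dir: Path) -> dict[str, str]:
    """Check every manifest entry and return {filename: status}.

    The manifest layout is hypothetical: {"files": [{"filename": ...,
    "sha256": ..., "consent_record_id": ...}, ...]}.
    """
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    results: dict[str, str] = {}
    for entry in manifest["files"]:
        name = entry["filename"]
        target = audio_dir / name
        if not entry.get("consent_record_id"):
            # A file without a consent record cannot be audited.
            results[name] = "missing consent record ID"
        elif not target.is_file():
            results[name] = "file not found"
        elif sha256_of_file(target) != entry["sha256"].lower():
            # The bytes on disk differ from what the manifest describes.
            results[name] = "checksum mismatch"
        else:
            results[name] = "ok"
    return results
```

Calling verify_delivery(Path("manifest.json"), Path("audio/")) returns one status per file, so anything other than "ok" can be flagged before the audio reaches a training pipeline.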

We will say no to bad use cases

No non-consensual voice cloning, deepfakes, impersonation, election interference, or any practice prohibited by EU AI Act Article 5. The list of customers we have turned down is short but real.

Built for the legal review

Our pricing, contracts, security, and compliance docs are built to survive a Fortune 500 vendor review — not optimized for the sales pitch.

Where the audio comes from

A real network of working podcasters.

We started with 350+ hours of original conversational English from our own podcast catalog, recorded in professional studios over the past several years. From there, we built outward.

Our supply network is a global set of partner podcast studios and podcast distribution companies that license their back catalogs to us and commission new recordings on our behalf. We pay creators directly, on a schedule defined in their contract, not in vague exposure or “future opportunities.”

The catalog grows weekly through (a) new partner studios joining the network, (b) new commissioned recordings in under-represented languages, and (c) direct outreach to professional podcasters worldwide.

  • Owned: 350+ hours of original conversational English in catalog today
  • Network: A global network of partner studios and distribution companies
  • Growing: Active outreach to professional podcasters worldwide

  • Studio-grade: Recorded on Shure SM7B, Rode NT1, and Sennheiser MKH 416 microphones in treated rooms
  • Named & contactable: Every speaker on file with a signed model-training release
  • Direct contracts: No reseller chains, no orphan recordings, no guesswork

Team

Who runs aipodcast.

Jaeden Schafer — Founder & CEO

Founder of Podcast Studio. Built and operates a podcast network producing hundreds of hours of professional audio per month. Saw the speech-data licensing problem from the supply side and built aipodcast to solve it from both sides. Host of the AI Chat podcast.

Headquartered in Phoenix, Arizona

Operating from the U.S. with a supply network spanning podcast studios on multiple continents.

Press & contact

press@aipodcast.io · partnerships@aipodcast.io · legal@aipodcast.io · security@aipodcast.io · privacy@aipodcast.io

Want to talk?

Whether you're a frontier lab evaluating a new supplier or a podcaster wondering whether your back catalog has value — we'd love to hear from you.