Question 1

What languages does AIPodcast cover?

Accepted Answer

Multiple regions and locales across English (US/GB/AU), Spanish (LATAM/EU), Portuguese (BR/PT), French, German, Italian, Dutch, Polish, Japanese, Korean, Mandarin, Cantonese, Hindi, Tamil, Bengali, Arabic (MSA/Egyptian/Gulf/Levantine), Turkish, Vietnamese, Thai, Indonesian, Tagalog, Swahili, Yoruba, and more.

Question 2

How quickly can you ramp a new locale?

Accepted Answer

Tier-1 locales: same week from sample. Tier-2: 2–4 weeks. Tier-3 / low-resource: 6–10 weeks for the first 50 hours, with monthly increments after.

Question 3

Are speakers natively verified?

Accepted Answer

Yes. Every speaker is a native or near-native speaker of the locale. Native verification is performed by an in-locale reviewer, not a script.

Question 4

Do you support dialect-level coverage?

Accepted Answer

Yes. Spanish is split LATAM vs EU; Portuguese BR vs PT; Arabic MSA vs Egyptian/Gulf/Levantine; English US/GB/AU/IN. Speakers are tagged with sub-dialect metadata so you can filter or balance.
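As a rough illustration of filtering and balancing on sub-dialect tags, here is a minimal sketch in Python. The manifest rows, the field names (`speaker_id`, `language`, `dialect`, `hours`), and both helper functions are hypothetical and assumed for the example; they are not AIPodcast's actual delivery schema or tooling.

```python
import random
from collections import defaultdict

# Hypothetical manifest rows; the field names are assumptions for illustration.
speakers = [
    {"speaker_id": "s1", "language": "es", "dialect": "LATAM", "hours": 3.0},
    {"speaker_id": "s2", "language": "es", "dialect": "LATAM", "hours": 1.5},
    {"speaker_id": "s3", "language": "es", "dialect": "LATAM", "hours": 2.0},
    {"speaker_id": "s4", "language": "es", "dialect": "EU", "hours": 4.0},
    {"speaker_id": "s5", "language": "pt", "dialect": "BR", "hours": 2.5},
]


def filter_by_dialect(rows, language, dialects):
    """Keep only rows for one language whose dialect tag is in `dialects`."""
    return [
        r for r in rows
        if r["language"] == language and r["dialect"] in dialects
    ]


def balance_by_dialect(rows, seed=0):
    """Downsample so every dialect contributes the same number of speakers."""
    groups = defaultdict(list)
    for r in rows:
        groups[r["dialect"]].append(r)
    if not groups:
        return []
    # The smallest dialect group sets the per-dialect speaker count.
    n = min(len(g) for g in groups.values())
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    balanced = []
    for dialect in sorted(groups):
        balanced.extend(rng.sample(groups[dialect], n))
    return balanced


spanish = filter_by_dialect(speakers, "es", {"LATAM", "EU"})
balanced_spanish = balance_by_dialect(spanish)
```

This balances by speaker count. Balancing by recorded hours instead would mean summing the `hours` field per dialect and sampling until a target is reached.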

Question 5

Can you do parallel topic coverage across locales?

Accepted Answer

Yes. We run parallel topic shoots so the same conversational domain is covered across 5–15 locales — useful for multilingual evaluation and cross-lingual transfer.

Question 6

Can you provide code-switching data?

Accepted Answer

Yes. Bilingual and trilingual creators contribute natural code-switching recordings — especially Spanglish, Hinglish, Tagalog/EN, Arabic/FR, and Mandarin/EN.

Question 7

Do you support low-resource languages?

Accepted Answer

Yes — through custom collection. We recruit native speakers via our creator network and deliver targeted hour counts in weeks rather than quarters.

Question 8

What about jurisdictional consent?

Accepted Answer

Every release is jurisdiction-tagged and translated into the speaker's language. Consent requirements in the speaker's home jurisdiction are handled through the same provenance trail.

Question 9

How is multilingual data priced?

Accepted Answer

Per locale and per hour, with a premium for low-resource locales and exclusive custom collections.

Multilingual speech data across multiple regions and locales.

Built for the work.

Locale breadth

Dialect coverage

Code-switching

Native verification

Metadata in-language

Custom recruitment

Status by locale.

Where the speakers actually live.

Americas

Europe

MENA

Sub-Saharan Africa

South Asia

East & SE Asia

From email to first locale.

Sample request

Mutual NDA

MSA + data licence

First delivery

Manifest & provenance

Ongoing delivery
