Speech APIs for AI Agents: Compare Providers, APIs, and Routes | AgentRouter

Speech APIs for AI Agents

Use AgentRouter to transcribe audio, generate speech, and inspect speech model supply through one routed voice API for agents.

Live Providers: Deepgram, OpenAI

Live Routes: Transcribe, 10 credits ($0.01) · Speak, 10 credits ($0.01)

The live speech catalog currently exposes 3 capabilities for AI agents: transcribe, speak, and models list. Transcribe and speak each start at 10 credits ($0.01); models list costs 0 credits ($0).

Availability Note

Deepgram and OpenAI transcribe and speak routes are live. `speech.analyze.deepgram.mpp` and `speech.speak.grok.mpp` are still temporarily unavailable.
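The route IDs in the note above look like they follow a `domain.capability.provider.transport` pattern. The sketch below assumes that pattern, which is inferred from the IDs shown on this page rather than from a documented contract. `RouteId`, `parse_route_id`, `is_live`, and `UNAVAILABLE` are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class RouteId(NamedTuple):
    domain: str
    capability: str
    provider: str
    transport: str


# Routes listed as temporarily unavailable in the availability note.
UNAVAILABLE = {"speech.analyze.deepgram.mpp", "speech.speak.grok.mpp"}


def parse_route_id(route_id: str) -> RouteId:
    """Split a route ID, assuming domain.capability.provider.transport.

    The capability may itself contain dots (for example "models.list"),
    so provider and transport are read from the right-hand side.
    """
    parts = route_id.split(".")
    if len(parts) < 4:
        raise ValueError(f"unexpected route ID shape: {route_id!r}")
    return RouteId(parts[0], ".".join(parts[1:-2]), parts[-2], parts[-1])


def is_live(route_id: str) -> bool:
    """Return False for routes in the hand-maintained unavailable set."""
    return route_id not in UNAVAILABLE


print(parse_route_id("speech.models.list.deepgram.mpp"))
print(is_live("speech.speak.grok.mpp"))
```

A hard-coded unavailable set goes stale quickly; in practice the availability would be read from the live catalog instead.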

What Is Speech & Voice?

What is agent speech?

Agent speech is the operational layer that lets an AI agent turn audio into text, turn text into speech, and inspect the live speech model catalog programmatically.

Speech workflows often split across different providers for transcription, speech generation, and model listing. That fragmentation makes it harder to keep routing, pricing, and fallback logic stable inside a production agent.

AgentRouter keeps those tasks behind one routed speech domain so the agent can compare live routes before execution and switch providers without rewriting its speech workflow.

Transcribe one audio clip
Generate speech from text
Inspect live speech model supply
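Comparing routes before execution and switching providers can be pictured with a small local model. This is a sketch only: the catalog is a hand-copied snapshot of the live prices shown on this page, and `Route`, `CATALOG`, and `pick_route` are hypothetical names, not part of the AgentRouter API. In a real agent the comparison would come from the recommend endpoint.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Route:
    capability: str
    provider: str
    credits: int


# Snapshot of the live routes and prices listed on this page.
CATALOG = [
    Route("transcribe", "Deepgram", 10),
    Route("transcribe", "OpenAI", 10),
    Route("speak", "Deepgram", 10),
    Route("speak", "OpenAI", 10),
    Route("models.list", "Deepgram", 0),
]


def pick_route(
    capability: str, exclude: FrozenSet[str] = frozenset()
) -> Optional[Route]:
    """Return the cheapest route for a capability, skipping excluded providers.

    Ties keep catalog order; None means no route is available.
    """
    candidates = [
        r for r in CATALOG
        if r.capability == capability and r.provider not in exclude
    ]
    return min(candidates, key=lambda r: r.credits, default=None)


print(pick_route("transcribe"))
print(pick_route("transcribe", exclude=frozenset({"Deepgram"})))
```

Because the calling code only names a capability, excluding a provider changes the chosen route without touching the rest of the workflow.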

Top Scenarios

Top speech & voice scenarios for agents

Transcribe audio into text

Recover text from an audio clip when the agent needs speech-to-text before downstream reasoning.

Deepgram · MPP · 10 credits ($0.01)

OpenAI · MPP · 10 credits ($0.01)

Meeting notes, voice agents, and transcription workflows

Coverage: 1 capability · 2 live routes

Start with Speech Transcribe

Generate speech from text

Create spoken output when the workflow starts from text and needs a voice response.

Deepgram · MPP · 10 credits ($0.01)

OpenAI · MPP · 10 credits ($0.01)

Voice replies, assistants, and narration workflows

Coverage: 1 capability · 2 live routes

Start with Speech Synthesize

Inspect available speech models

List model or voice supply before locking the workflow to one route.

Speech Models List · Deepgram · 0 credits ($0)

Route selection, quality checks, and voice catalog inspection

Coverage: 1 capability · 1 live route

Start with Speech Models List

Provider Comparison

Compare speech & voice providers and APIs

The rows below are derived from the live route catalog currently exposed through AgentRouter.

Provider	Best For	Key Capabilities	Live Routes	Modes	Starting Price	Available Via
Deepgram	Speech Transcribe and Speech Synthesize	Speech Transcribe, Speech Synthesize, Speech Models List	3	Managed	0 credits ($0)	MPP
OpenAI	Speech Transcribe and Speech Synthesize	Speech Transcribe, Speech Synthesize	2	Managed	10 credits ($0.01)	MPP

Quick Start

How to integrate speech & voice with AgentRouter

Get an API key, ask AgentRouter to recommend the best route for the job, and execute through one shared wallet instead of wiring each provider separately.

Step 1: Enable the API once
Create one AgentRouter API key that can be reused across all live domains and capabilities.

Step 2: Recommend the route
Let AgentRouter compare the current routes, prices, and route availability before the first execute call.

Step 3: Execute the capability
Run the selected workflow through the chosen route while AgentRouter handles pricing, wallet debits, and upstream settlement.
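The three steps can be sketched as two HTTP requests built with Python's standard library. Only the endpoint paths come from this page. The host `agentrouter.example`, the bearer-token header, and every JSON field name (`input`, `audio_url`, `route`) are assumptions made for illustration and should be checked against the capability contract. The requests are built but not sent.

```python
import json
import urllib.request

BASE_URL = "https://agentrouter.example"  # placeholder host (assumption)
API_KEY = "YOUR_API_KEY"  # step 1: one key reused across capabilities
PREFIX = "/api/agentic-api/domains/speech/capabilities"


def build_request(capability: str, action: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST to a recommend or execute endpoint."""
    return urllib.request.Request(
        f"{BASE_URL}{PREFIX}/{capability}/{action}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical request body; the real field names are not shown on this page.
clip = {"input": {"audio_url": "https://example.com/clip.wav"}}

# Step 2: ask for a route recommendation before executing.
recommend = build_request("transcribe", "recommend", clip)

# Step 3: execute through the chosen route (route field is an assumption).
execute = build_request(
    "transcribe", "execute", {**clip, "route": "speech.transcribe.deepgram.mpp"}
)

for req in (recommend, execute):
    print(req.get_method(), req.full_url)
```

Passing either request to `urllib.request.urlopen` would send it; that call is left out here because the host is a placeholder.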

Pricing By Task

Pricing by common speech & voice tasks

These examples map the current live route prices to the tasks operators usually cost out before adding a speech domain to an agent.

Task	Route	Price	Notes
Transcribe one audio clip	Deepgram (speech.transcribe.deepgram.mpp)	10 credits ($0.01)	Return normalized text from speech input.
Generate one speech response	Deepgram (speech.speak.deepgram.mpp)	10 credits ($0.01)	Create audio output from a text prompt.
Inspect speech model supply	Deepgram (speech.models.list.deepgram.mpp)	0 credits ($0)	Check live provider catalog before hard-coding a route.

Pricing note: 1,000 credits = $1 USD. Raw API responses still return numeric credit fields such as `creditsCharged`.
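The conversion in the pricing note is simple enough to encode directly. The sketch below uses `Decimal` to avoid floating-point rounding on small amounts. The `creditsCharged` field name comes from the note; the surrounding response dictionary and the monthly volume are made-up illustrations, and `credits_to_usd` is a hypothetical helper.

```python
from decimal import Decimal

CREDITS_PER_USD = 1000  # from the pricing note: 1,000 credits = $1 USD


def credits_to_usd(credits: int) -> Decimal:
    """Convert a numeric credit amount into US dollars."""
    return Decimal(credits) / Decimal(CREDITS_PER_USD)


# Hypothetical response fragment; only the field name is taken from this page.
response = {"creditsCharged": 10}
print(credits_to_usd(response["creditsCharged"]))

# Rough budget for an assumed 500 transcriptions at 10 credits each.
monthly_credits = 500 * 10
print(credits_to_usd(monthly_credits))
```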

Why AgentRouter

Why use AgentRouter instead of direct speech & voice integrations?

Recommendation happens before execution, so the agent can compare current route supply and pricing instead of hard-coding one provider forever. The same routed contract stays stable even as provider coverage or transport paths change underneath.

One API surface across the current live speech & voice workflows.
Recommendation before execution, so route choice can change as provider supply changes.
One wallet and one billing layer across the routed provider catalog.
The lower section of this page keeps the full capability browser, route breakdowns, and example modals for implementation work.
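If route choice can change as supply changes, the calling code benefits from treating the route as a parameter. The following is a client-side sketch of that idea, not a description of how AgentRouter itself handles failover. `execute_with_fallback`, `AllRoutesFailed`, and `fake_run` are hypothetical; `fake_run` stands in for a real execute call and simulates the unavailable Grok route.

```python
from typing import Callable, List, Sequence, Tuple


class AllRoutesFailed(RuntimeError):
    """Raised when every candidate route has been tried without success."""


def execute_with_fallback(
    route_ids: Sequence[str], run: Callable[[str], bytes]
) -> Tuple[str, bytes]:
    """Try each route in order and return the first successful result."""
    errors: List[str] = []
    for route_id in route_ids:
        try:
            return route_id, run(route_id)
        except Exception as exc:  # broad on purpose: any failure moves on
            errors.append(f"{route_id}: {exc}")
    raise AllRoutesFailed("; ".join(errors))


def fake_run(route_id: str) -> bytes:
    """Stand-in for an execute call; fails for the unavailable route."""
    if route_id == "speech.speak.grok.mpp":
        raise ConnectionError("route temporarily unavailable")
    return b"audio-bytes"


used, audio = execute_with_fallback(
    ["speech.speak.grok.mpp", "speech.speak.deepgram.mpp"], fake_run
)
print(used)
```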

FAQ

Speech & Voice FAQ

What is agent speech?

It is the routed layer that lets an AI agent transcribe audio, generate speech, and inspect speech model supply through one voice domain.

When should I use transcribe instead of speak?

Use transcribe when the input is audio and the output should be text. Use speak when the input is text and the output should be generated audio.
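That rule reduces to a lookup on the input and output kinds. A minimal sketch, assuming the agent already knows both kinds; `choose_capability` is a hypothetical helper, while the capability names are the ones used on this page.

```python
def choose_capability(input_kind: str, output_kind: str) -> str:
    """Map an (input, output) pair to a speech capability name."""
    table = {
        ("audio", "text"): "transcribe",  # speech-to-text
        ("text", "audio"): "speak",       # text-to-speech
    }
    try:
        return table[(input_kind, output_kind)]
    except KeyError:
        raise ValueError(
            f"no speech capability maps {input_kind!r} to {output_kind!r}"
        ) from None


print(choose_capability("audio", "text"))
print(choose_capability("text", "audio"))
```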

Why does models-list matter for speech workflows?

Models-list lets the agent inspect the live provider catalog before choosing a transcription route or voice surface, which reduces hard-coded assumptions.

Do I need separate provider keys for Deepgram or OpenAI speech routes?

No. AgentRouter keeps the live speech routes behind one wallet and one routed API surface.

Reference

Detailed speech & voice capabilities and route pricing

Use the capability browser below when you want contract-level detail: endpoints, live example modals, route breakdowns, and exact route prices.

Capability	Endpoints	Price
transcribe Convert spoken audio into text.	`GET /api/agentic-api/domains/speech/capabilities/transcribe` Read the capability contract, current lifecycle status, and machine-readable metadata for this speech workflow.	Free
	`POST /api/agentic-api/domains/speech/capabilities/transcribe/recommend` Ask AgentRouter to compare eligible routes inside this capability and return the best recommendation before execution.	Free
	`POST /api/agentic-api/domains/speech/capabilities/transcribe/execute` Execute this capability through the selected route while AgentRouter handles pricing, wallet debits, and upstream settlement.
speak Convert text into generated speech.	`GET /api/agentic-api/domains/speech/capabilities/speak` Read the capability contract, current lifecycle status, and machine-readable metadata for this speech workflow.	Free
	`POST /api/agentic-api/domains/speech/capabilities/speak/recommend` Ask AgentRouter to compare eligible routes inside this capability and return the best recommendation before execution.	Free
	`POST /api/agentic-api/domains/speech/capabilities/speak/execute` Execute this capability through the selected route while AgentRouter handles pricing, wallet debits, and upstream settlement. 1 route in this breakdown is temporarily unavailable.
analyze Analyze spoken or transcribed content for higher-level signals.	`GET /api/agentic-api/domains/speech/capabilities/analyze` Read the capability contract, current lifecycle status, and machine-readable metadata for this speech workflow.	Free
	`POST /api/agentic-api/domains/speech/capabilities/analyze/recommend` Ask AgentRouter to compare eligible routes inside this capability and return the best recommendation before execution.	Free
	`POST /api/agentic-api/domains/speech/capabilities/analyze/execute` Execute this capability through the selected route while AgentRouter handles pricing, wallet debits, and upstream settlement. 1 route in this breakdown is temporarily unavailable.
models.list List provider speech models or voices when the upstream route explicitly supports it.	`GET /api/agentic-api/domains/speech/capabilities/models-list` Read the capability contract, current lifecycle status, and machine-readable metadata for this speech workflow.	Free
	`POST /api/agentic-api/domains/speech/capabilities/models-list/recommend` Ask AgentRouter to compare eligible routes inside this capability and return the best recommendation before execution.	Free
	`POST /api/agentic-api/domains/speech/capabilities/models-list/execute` Execute this capability through the selected route while AgentRouter handles pricing, wallet debits, and upstream settlement.
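Every capability in the table shares the same three-endpoint shape, so the paths can be generated rather than hard-coded. The sketch below reproduces the paths exactly as listed, including the `models.list` to `models-list` slug seen in the table. The action label `contract` for the GET endpoint and the helper names are hypothetical.

```python
from typing import Tuple

PREFIX = "/api/agentic-api/domains/speech/capabilities"

# Action label -> (HTTP method, path suffix), as listed in the table.
ACTIONS = {
    "contract": ("GET", ""),  # label is an assumption; the page leaves it unnamed
    "recommend": ("POST", "/recommend"),
    "execute": ("POST", "/execute"),
}


def capability_slug(name: str) -> str:
    """Turn a capability name such as "models.list" into its URL slug."""
    return name.replace(".", "-")


def endpoint(capability: str, action: str = "contract") -> Tuple[str, str]:
    """Return (method, path) for a speech capability endpoint."""
    method, suffix = ACTIONS[action]
    return method, f"{PREFIX}/{capability_slug(capability)}{suffix}"


for capability in ("transcribe", "speak", "analyze", "models.list"):
    for action in ACTIONS:
        print(*endpoint(capability, action))
```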

Related AgentRouter pages

AgentRouter landing page

Browse all live domains, top API cards, and route coverage across the platform.

Install and auth

Enable AgentRouter, create the API key, and connect the shared wallet flow.

Phone, SMS & Voice domain

Create managed phone agents, provision numbers, inspect conversations and calls, and keep telephony workflows behind one supplier-neutral phone domain.

Web Crawl & Extraction domain

Fetch, scrape, extract, crawl, map, browse, and screenshot public pages through one routed surface.