Outbound voice-call API designed for AI agents (Claude, GPT, Cursor, MCP, Zapier, custom). Two entrypoints:
a structured Skill API (manifest + idempotent run + SSE events) and the legacy free-form
POST /calls. Every endpoint below is live in production.
No login required to read these docs.
Hand this off to another AI
The button above copies a single self-contained Markdown spec (every endpoint, auth, request/response shape, idempotency contract, SSE event stream, request_id ask_user, error codes, agent recipes) designed to be pasted into Claude.ai / ChatGPT / Cursor as context. After pasting, your AI can call the API correctly on the first try.
Every authenticated request takes the API key in the X-API-Key header. Get a free key at the home page (100 free credits). One key works for everything; no per-endpoint scopes today.
Public (no auth): GET /skills, GET /skills/{skill_id}/manifest, GET /docs, GET /openapi.json.
Single exception: GET /calls/{id}/recording also accepts ?api_key= query param so HTML5 <audio> can fetch it.
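As a minimal sketch of the two auth styles described above: the header form for normal requests and the query-parameter form for recordings. The host in BASE_URL and both helper names are hypothetical; only the X-API-Key header, the ?api_key= parameter, and the /calls/{id}/recording path come from this page.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.example.com"  # hypothetical host, not taken from these docs


def auth_headers(api_key: str) -> dict:
    """Headers for any authenticated endpoint (one key, no scopes)."""
    if not api_key:
        raise ValueError("api_key must be non-empty")
    return {"X-API-Key": api_key}


def recording_src(call_id: str, api_key: str, base_url: str = BASE_URL) -> str:
    """URL for an HTML5 <audio> element, which cannot send custom headers."""
    query = urlencode({"api_key": api_key})
    return f"{base_url}/calls/{call_id}/recording?{query}"
```

The query-parameter form puts the key in the URL, so it is probably best kept to the recording endpoint that needs it.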
Strongly-typed wrapper around the voice pipeline. The agent fetches the manifest (input schema + capabilities), POSTs structured data to /skills/{skill_id}/run, and the server builds the brief + dispatches the call. Idempotency, SSE events, durable ask_user: all part of this path.
Response (200):
Public. Phase-level: does NOT inspect any specific customer's webhook config. Per-customer validation happens at /run time.
Headers: X-API-Key (required), Idempotency-Key (optional, 1-255 chars; strongly recommended).
ask_user_mode:
| Value | Behavior |
|---|---|
| "webhook" | Customer MUST have metadata.ask_user_webhook_url configured, else 422 ask_user_channel_required. |
| "any" | Use webhook if configured, else SSE-stream. Recommended default. |
| "stream" | No webhook preflight. Agent keeps SSE open and answers via POST /answer. Phase D. |
Defense-in-depth: If a webhook is configured on the customer's API key, it fires for ask_user events even when ask_user_mode="stream". Both channels deliver the same event with the same request_id. Agents listening to both must dedupe by request_id. To suppress webhook delivery entirely, remove the webhook via DELETE /users/me/webhook.
Response (201):
Call queue / 202 Accepted: when the call queue is enabled server-side (CALL_QUEUE_ENABLED), this endpoint queues the call instead of dialing immediately and returns 202 with the same SkillRunResponse shape as the 201, just status:"queued" and call_sid:null. The call is dialed shortly after at the account-wide CPS limit. Treat any 2xx as success, capture call_id, then follow the SSE stream or poll /calls/{call_id}; the status moves queued → dialing → in_progress → ….
SSE URL is not in the run response; derive it from manifest.events_url_template (today: /calls/{call_id}/events).
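The two rules above (treat any 2xx as success, derive the SSE URL from the manifest) could be sketched as follows. The function names are hypothetical, and the manifest and response are assumed to be already-parsed JSON dicts carrying the events_url_template and call_id fields named on this page.

```python
def accept_run(http_status: int, body: dict) -> str:
    """Return call_id for any 2xx run response (201 dialed, 202 queued)."""
    if not 200 <= http_status < 300:
        raise RuntimeError(f"run failed with HTTP {http_status}")
    return body["call_id"]


def events_url(manifest: dict, call_id: str) -> str:
    """Fill the manifest's template instead of hard-coding the path."""
    return manifest["events_url_template"].replace("{call_id}", call_id)
```

Reading the template from the manifest on each integration start keeps the client working if the path changes later.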
Same Idempotency-Key + same body from same customer → server replays original response, no duplicate dial. Use a unique key per logical agent retry attempt (e.g. outer-loop UUID).
Storage: table skill_run_idempotency keyed on (customer_id, idempotency_key). Body fingerprinted with SHA-256 of canonical JSON. TTL 24 h; lock window 120 s.
| Server state | HTTP | detail.error | Notes |
|---|---|---|---|
| First time, success | 201 | (none) | replayed: false |
| Same key + same body, prior call completed | 201 | (none) | replayed: true, identical payload |
| Same key, different body | 409 | idempotency_conflict | Don't reuse keys across distinct intents |
| Prior request still in-flight (lock fresh) | 409 | idempotency_in_progress | Has retry_after_seconds + Retry-After header. Wait, retry SAME body. |
| Prior worker crashed (lock expired ≥120 s) | 409 | idempotency_state_unknown | We don't auto-rerun (would risk 2nd Twilio dial). Use GET /calls?since=... to verify. |
| Idempotency-Key length not in [1, 255] | 400 | idempotency_key_invalid | Header validation |
No Idempotency-Key header → endpoint runs without replay guarantee. Network retries will produce duplicate calls. Don't ship that.
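One way to honor the contract above is to generate the key once per logical run and reuse it across every network attempt. This is a sketch, not the service's client: send is a hypothetical transport hook that performs POST /skills/{skill_id}/run with the given Idempotency-Key and returns (http_status, parsed_json), and the attempt limit is an arbitrary assumption.

```python
import time
import uuid
from typing import Callable, Tuple

# send(idempotency_key, body) -> (http_status, parsed_json); hypothetical hook.
Send = Callable[[str, dict], Tuple[int, dict]]


def run_once(send: Send, body: dict, max_attempts: int = 5,
             sleep: Callable[[float], None] = time.sleep) -> dict:
    key = str(uuid.uuid4())  # one key for the whole logical run
    for _ in range(max_attempts):
        try:
            status, payload = send(key, body)
        except ConnectionError:
            continue  # network blip: retry with the SAME key and SAME body
        if 200 <= status < 300:
            return payload  # payload["replayed"] tells first run from replay
        detail = payload.get("detail")
        error = detail.get("error") if isinstance(detail, dict) else None
        if status == 409 and error == "idempotency_in_progress":
            sleep(detail.get("retry_after_seconds", 1))
            continue
        # idempotency_state_unknown, idempotency_conflict and all other
        # errors are not retried blindly; surface them to the caller.
        raise RuntimeError(f"HTTP {status}: {error}")
    raise TimeoutError("gave up after max_attempts")
```

On idempotency_state_unknown the caller would then check GET /calls?since=... before deciding whether a new run is needed.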
Content-Type: text/event-stream. Auth: X-API-Key. Use curl -N or any standard SSE client.
- Resume: send Last-Event-ID: <event_id>; the server backfills events with event_id > last from the call_events table before going live.
- Cold start (no Last-Event-ID): full event history for this call_id is replayed in order.
- Heartbeat: event: heartbeat\ndata: {} every 15 s.
- Close: the stream ends 5 s after the outcome event, OR after the 30 min wall-clock cap.
- Cap: 200 streams per pod; beyond that the server returns 503 too_many_sse_streams + Retry-After: 30.

Every event is framed as event: <type>\ndata: <json>\n\n. Each event includes id: <event_id> so SSE clients populate Last-Event-ID automatically.
| Event | Payload | Source |
|---|---|---|
| status_change | {"status":"<queued\|scheduled\|dialing\|in_progress\|completed\|failed\|cancelled>","at":"<ISO>"} | call status transitions |
| ask_user | {"request_id":"req_<hex>","message":"<question>","at":"<ISO>","answer_url":"/calls/.../answer"} | bot needs info |
| answered | {"request_id":"req_<hex>","at":"<ISO>"} | echo of accepted POST /answer |
| outcome | {"outcome_type":"<...>","billable":true,"summary":"<text>","at":"<ISO>"} | call analyzer result |
| recording_ready | {"url":"/calls/.../recording","at":"<ISO>"} | Twilio recording posted |
| heartbeat | {} | keepalive (not persisted) |
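For agents without an SSE library, the framing above is simple enough to parse by hand. The sketch below assumes the stream has already been split into text lines (for example by an HTTP client's line iterator); parse_sse is a hypothetical name, and the parser covers only the id, event, and data fields this page describes.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per SSE event; heartbeats are skipped."""
    event_id, event_type, data = None, "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line terminates the current event
            if data and event_type != "heartbeat":
                yield {"id": event_id, "type": event_type,
                       "data": json.loads("\n".join(data))}
            event_type, data = "message", []
            continue
        if line.startswith(":"):
            continue  # SSE comment line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]
        if field == "id":
            event_id = value  # keep it to send as Last-Event-ID on reconnect
        elif field == "event":
            event_type = value
        elif field == "data":
            data.append(value)
```

A caller would remember the id of the last yielded event and send it as Last-Event-ID when reconnecting, then stop reading once an outcome event arrives.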
When the bot needs info during the call it emits ask_user to both the configured webhook and the SSE stream, with the same request_id in both; dedupe by request_id.
Body:
| Field | Type | Notes |
|---|---|---|
| answer | string | required, 1-4000 chars |
| request_id | string? | optional, ≤64 chars. Recommended: atomic UPDATE on durable row by PK; correctly disambiguates multiple in-flight ask_user. |
Without request_id: FIFO fallback, i.e. the oldest pending row for this call_id (FOR UPDATE SKIP LOCKED). Safe when only one ask_user is open at a time. Compatible with legacy webhook clients.
Response: {"delivered": <bool>}. delivered=false means no pending row matched (already answered, expired, call ended, unknown id). Idempotent: same request_id twice → second returns delivered=false.
The pending_skill_ask_user row is written before the SSE/webhook event fires. POST /answer can land on any pod (Istio path-hash routes to call's owner pod when possible, but works without it): the broker UPDATEs the row and pg_notify('ask_user_answer', request_id) wakes the owner pod.
Timeout: server-side asyncio.wait_for per ask_user. If timeout hits before answer, row goes pending → timeout. Late answers return delivered=false.
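A client can enforce the documented field limits before sending and treat delivered=false as a normal outcome rather than an error. The helper names below are hypothetical; the limits and the {"delivered": <bool>} shape are the ones stated above.

```python
from typing import Optional


def answer_body(answer: str, request_id: Optional[str] = None) -> dict:
    """Build the JSON body for POST /calls/{id}/answer."""
    if not 1 <= len(answer) <= 4000:
        raise ValueError("answer must be 1-4000 chars")
    body = {"answer": answer}
    if request_id is not None:
        if len(request_id) > 64:
            raise ValueError("request_id must be at most 64 chars")
        body["request_id"] = request_id  # recommended: avoids the FIFO fallback
    return body


def was_delivered(response: dict) -> bool:
    """False covers already answered, expired, call ended, or unknown id."""
    return response.get("delivered") is True
```

Because a repeated request_id returns delivered=false, a retry after a lost response may report false even though the first attempt succeeded.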
Original entrypoint, unchanged. Use only when the structured Skill API doesn't fit your case (e.g. one-off cancellation, unusual ad-hoc task). For new agent integrations, prefer Skill API.
Body: target_phone (E.164), brief (free text), language (ru/en/auto), kind (default OTHER), parsed_slots (optional).
Response (201): {call: {call_id, call_sid, status, ...}, task_id, credits_reserved, owner_pod}. Same call lifecycle, same SSE/answer endpoints work.
Call queue / 202 Accepted: when the call queue is enabled server-side (CALL_QUEUE_ENABLED), POST /calls queues the call instead of dialing immediately and returns 202 with the slim body {call_id, queue_id, position, status:"queued"}, with no status_url and no call_sid; build the status URL yourself as /calls/{call_id}. The call is dialed shortly after at the account-wide CPS limit. Treat any 2xx as success, capture call_id, then follow the SSE stream or poll /calls/{call_id}; the status moves queued → dialing → in_progress → ….
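Since the 201 and 202 bodies of POST /calls differ in shape, a small normalizer keeps the rest of the agent uniform. This sketch assumes the 201 body nests the call under "call" and the 202 body is flat, as described above; normalize_call_response is a hypothetical name.

```python
def normalize_call_response(http_status: int, body: dict) -> dict:
    """Reduce either POST /calls response shape to the fields used for tracking."""
    if not 200 <= http_status < 300:
        raise RuntimeError(f"POST /calls failed with HTTP {http_status}")
    call = body.get("call", body)  # 201 nests under "call"; 202 is flat
    call_id = call["call_id"]
    return {
        "call_id": call_id,
        "status": call["status"],
        "status_url": f"/calls/{call_id}",  # never present in the 202 body
    }
```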
Returns array of CallDTO. transcript_full and supervisor_decisions are null in list responses (kept on single-call GET only).
Full CallDTO:
Status enum: queued / scheduled / dialing / in_progress / completed / failed / cancelled. (queued appears only when the call queue is enabled: the call is accepted but not yet dialed.)
Updates DB status to cancelled and releases credit reservation. Does NOT terminate an already-active Twilio leg (Phase 3 limitation). Idempotent: returns {cancelled: false} on already-final calls, never errors.
audio/mpeg. Both X-API-Key header AND ?api_key=KEY query param accepted (latter for HTML5 <audio>). 404 until Twilio posts the recording-status callback (typically a few seconds after outcome).
Default is 1 in-flight call per account. Raise as needed.
Returns the HMAC secret once โ store it. Standard Webhooks v1, SHA-256.
After removal, mid-call ask_user falls back to SSE-only delivery. If neither is configured, ask_user times out → outcome failed_technical.
If your agent runs an HTTP server, configure a webhook to receive ask_user synchronously. Both webhook and SSE deliver the same event for the same request_id, so your agent must dedupe by request_id if it listens to both.
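Deduplication by request_id can be as small as a set of seen ids shared by the webhook handler and the SSE reader. A minimal sketch, with a hypothetical class name and an in-memory set (a multi-process agent would need shared storage instead):

```python
class AskUserDeduper:
    """Tracks request_ids so each ask_user is handled once across channels."""

    def __init__(self) -> None:
        self._seen = set()

    def first_delivery(self, event: dict) -> bool:
        """True the first time a request_id is seen, False on the duplicate."""
        request_id = event["request_id"]
        if request_id in self._seen:
            return False
        self._seen.add(request_id)
        return True
```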
Webhook payload:
Headers: webhook-signature: ... (Standard Webhooks v1, HMAC-SHA256). Verify before trusting body.
Sync reply (≤30 s): 200 with body {"answer": "..."}.
Async reply: 202 with empty body, then POST /calls/{id}/answer with {answer, request_id}.
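Signature verification might look like the sketch below. It assumes the service follows the public Standard Webhooks scheme in full: a whsec_-prefixed base64 secret, webhook-id and webhook-timestamp headers alongside webhook-signature, and a signed string of id.timestamp.body. Only the webhook-signature header and HMAC-SHA256 are confirmed on this page, so check the rest against /openapi.json. Header names are assumed to be lowercased by the caller.

```python
import base64
import hashlib
import hmac


def verify_webhook(secret: str, headers: dict, body: bytes) -> bool:
    """Check a Standard Webhooks v1 signature over the raw request body."""
    key = base64.b64decode(secret.removeprefix("whsec_"))
    signed = (
        f"{headers['webhook-id']}.{headers['webhook-timestamp']}.".encode() + body
    )
    expected = base64.b64encode(hmac.new(key, signed, hashlib.sha256).digest())
    # The header may carry several space-separated "v1,<base64>" entries.
    for candidate in headers["webhook-signature"].split(" "):
        version, _, signature = candidate.partition(",")
        if version == "v1" and hmac.compare_digest(signature.encode(), expected):
            return True
    return False
```

A production handler would also reject stale webhook-timestamp values to limit replay; that check is omitted here for brevity.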
If your AI agent has neither a webhook server nor an SSE client, you can still operate (without ask_user mid-call):
1. POST /skills/{id}/run with Idempotency-Key.
2. Poll GET /calls/{call_id} every 2-3 s.
3. Stop when status ∈ completed/failed/cancelled.
4. Read outcome_type, outcome_summary, transcript_full; optionally fetch recording_url.

ask_user mid-call without webhook OR SSE is not supported. Either configure one or include all needed info upfront in the input.
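The polling recipe above can be sketched with the HTTP call injected, so the loop itself stays testable. get_call is a hypothetical hook that performs GET /calls/{call_id} and returns the parsed CallDTO; the 2.5 s interval sits inside the documented 2-3 s range, and the poll limit is an arbitrary assumption.

```python
import time
from typing import Callable

FINAL_STATUSES = {"completed", "failed", "cancelled"}


def poll_until_final(get_call: Callable[[], dict], interval_s: float = 2.5,
                     max_polls: int = 720,
                     sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the call reaches a final status, then return the CallDTO."""
    for _ in range(max_polls):
        call = get_call()
        if call["status"] in FINAL_STATUSES:
            return call  # read outcome_type, outcome_summary, transcript_full
        sleep(interval_s)
    raise TimeoutError("call did not reach a final status in time")
```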
When a call reaches status=completed, outcome_type is one of:
- success_booked: booking confirmed by operator
- success_no_booking: operator answered, no firm booking (info call)
- success_other: task completed, neither booking nor refusal
- failed_no_answer: no one picked up
- failed_voicemail: answering machine detected
- failed_unsupported_language: operator spoke a language we don't handle
- failed_technical: pipeline error (Twilio / Gemini / network)
- refused: operator refused / no slot

Structured errors use detail = {"error": "<code>", ...}. Codes are stable across versions.
| HTTP | detail.error | When | Retry? |
|---|---|---|---|
| 400 | idempotency_key_invalid | Header length not in [1, 255] | Fix key |
| 400 | invalid_input | Skill-side derivation failed | Fix body |
| 401 | (none) | Missing or invalid API key | Fix key |
| 402 | quota_exceeded | Out of credits | Top up /checkout |
| 404 | skill_not_found | Unknown skill_id; payload includes available_skills | Fix id |
| 404 | Call not found | call_id not yours or doesn't exist | Fix id |
| 404 | Recording not available | Recording not posted yet | Wait + retry |
| 409 | concurrent_call_not_allowed | Per-customer cap; payload has active_call_ids + max_concurrent | Wait or raise cap |
| 409 | idempotency_conflict | Same key, different body | Fix body or new key |
| 409 | idempotency_in_progress | Prior request running, lock fresh; payload has retry_after_seconds | Wait Retry-After, retry SAME body |
| 409 | idempotency_state_unknown | Prior worker crashed | GET /calls?since=... to verify; do NOT retry blindly |
| 422 | Pydantic errors list | Body fails input_schema | Fix body |
| 422 | ask_user_channel_required | ask_user_mode="webhook" but no webhook URL on key | Configure webhook |
| 500 | (none) | Server config issue | Backoff + retry |
| 502 | (none) | Twilio dial failed; reservation released | Safe to retry with NEW key |
| 503 | maintenance | Admin-set maintenance window; payload has resume_at | Wait until resume |
| 503 | too_many_sse_streams | Per-pod SSE cap reached | Wait Retry-After |
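The Retry? column above can be turned into a lookup so an agent decides its next step mechanically. This is a sketch: the action labels and function name are hypothetical, and it assumes detail is either a dict with an "error" code, a plain string (the 404 rows), or a list (Pydantic 422 errors), which this page does not spell out for every row.

```python
_BY_CODE = {
    "idempotency_key_invalid": "fix_request",
    "invalid_input": "fix_request",
    "quota_exceeded": "top_up",
    "skill_not_found": "fix_request",
    "concurrent_call_not_allowed": "wait_and_retry",
    "idempotency_conflict": "fix_request",
    "idempotency_in_progress": "wait_and_retry",
    "idempotency_state_unknown": "verify_before_retry",
    "ask_user_channel_required": "configure_webhook",
    "maintenance": "wait_and_retry",
    "too_many_sse_streams": "wait_and_retry",
}
_BY_STATUS = {
    401: "fix_request",
    404: "fix_request",
    422: "fix_request",
    500: "backoff_and_retry",
    502: "retry_with_new_key",
}


def classify_error(http_status: int, detail) -> str:
    """Map an error response to a coarse next action."""
    code = detail.get("error") if isinstance(detail, dict) else detail
    if isinstance(code, str) and code in _BY_CODE:
        return _BY_CODE[code]
    if http_status == 404 and code == "Recording not available":
        return "wait_and_retry"  # recording callback has not arrived yet
    return _BY_STATUS.get(http_status, "unknown")
```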
| Spec | Value |
|---|---|
| Idempotency replay TTL | 24 h |
| Idempotency lock window | 120 s |
| Idempotency-Key length | 1-255 chars |
| SSE heartbeat interval | 15 s |
| SSE max stream duration | 30 min |
| SSE post-outcome grace | 5 s |
| SSE per-pod stream cap | 200 |
| SSE per-subscriber queue size | 100 |
| request_id format | req_<16-hex> |
| ask_user expiration | timeout + 5 s |
| Credits reserved at start | 200 |
| Default per-customer concurrency | 1 in-flight call |
| Webhook signature scheme | Standard Webhooks v1, SHA-256 |
| Sync webhook deadline | ~30 s |
- Prefer the Skill API over POST /calls for new integrations: schema validation, idempotency, structured errors, dynamic manifest.
- Always send Idempotency-Key on POST /skills/{id}/run. One key per logical retry, NOT per network attempt. Without it, a connection blip will dial twice.
- Consume events with a standard SSE client or curl -N. Resume with Last-Event-ID after disconnect.
- Always include request_id in POST /answer. Disambiguates multiple in-flight ask_user. The bot supports parallel ask_user since Phase C.
- Dedupe by request_id if you listen to BOTH webhook and SSE: the same event is delivered twice.
- […] phone_to_dictate.
- Trust /openapi.json over this page if anything disagrees. It's the byte-exact contract.