Text-to-Speech API

Pricing

10 credits per 1,000 characters

The cost is calculated based on the total character count of the text you submit.
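
As a rough illustration of the rate above, the following sketch estimates the cost of a request before sending it. The function name estimate_credits is hypothetical, and the sketch assumes proportional billing with no rounding and one character per Python code point; the API's actual counting and rounding rules are not stated here and may differ.

```python
CREDITS_PER_1000_CHARS = 10  # rate stated in the Pricing section


def estimate_credits(text: str) -> float:
    """Estimate the credit cost of a request.

    Assumptions (not confirmed by the API docs): characters are counted
    as Python code points via len(), and the cost is strictly
    proportional with no rounding or minimum charge.
    """
    return len(text) * CREDITS_PER_1000_CHARS / 1000
```

For example, estimate_credits("a" * 2500) gives 25.0 under these assumptions.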

Overview

The Text-to-Speech API converts text into natural-sounding speech using advanced neural voices. This endpoint supports multiple TTS engines including Edge, CapCut, and Google.

Endpoint

  • URL: https://api.revidapi.com/paid/text-to-speech
  • Method: POST

Request

Headers

  • x-api-key: Required. Your API key for authentication.
  • Content-Type: Required. Must be application/json.

Body Parameters

Required Parameters

Parameter     Type           Description
text          string         The text to convert to speech

Optional Parameters

Parameter     Type           Description
engine        string         TTS engine to use: edge, capcut, google. Default: edge
voice_id      integer        Voice ID for the selected engine
voice         string         Voice name (alternative to voice_id)
speed         number         Speech speed multiplier (0.5 to 2.0). Default: 1.0
pitch         number         Voice pitch adjustment (-20 to 20). Default: 0
webhook_url   string (URI)   URL to receive the result when processing is complete
id            string         Custom identifier for tracking the request
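
The parameter ranges above can be checked on the client before a request is sent. The sketch below is one possible way to do that; build_tts_payload and its request_id argument are hypothetical names, and it assumes the documented ranges are inclusive. The server remains the authority on what is valid.

```python
from typing import Any, Dict, Optional

ENGINES = {"edge", "capcut", "google"}  # engines listed in the parameter table


def build_tts_payload(
    text: str,
    engine: Optional[str] = None,
    voice_id: Optional[int] = None,
    voice: Optional[str] = None,
    speed: Optional[float] = None,
    pitch: Optional[float] = None,
    webhook_url: Optional[str] = None,
    request_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a request body, omitting optional fields that were not set."""
    if not text:
        raise ValueError("'text' is required")
    if engine is not None and engine not in ENGINES:
        raise ValueError(f"engine must be one of {sorted(ENGINES)}")
    # Assumption: the documented ranges are inclusive at both ends.
    if speed is not None and not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    if pitch is not None and not -20 <= pitch <= 20:
        raise ValueError("pitch must be between -20 and 20")

    payload: Dict[str, Any] = {"text": text}
    optional = {
        "engine": engine,
        "voice_id": voice_id,
        "voice": voice,
        "speed": speed,
        "pitch": pitch,
        "webhook_url": webhook_url,
        "id": request_id,  # sent as "id" in the JSON body
    }
    # Leave unset fields out so the API applies its own defaults.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

Omitting unset fields, rather than sending defaults explicitly, keeps the request body small and lets the server-side defaults apply.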

Voice Options

Option 1: Use voice_id

{ "text": "Hello", "voice_id": 1001 }

Option 2: Use engine + voice

{ "text": "Hello", "engine": "edge", "voice": "vi-VN-HoaiMyNeural" }

Speed Guide

Speed    Effect
0.5      50% slower
1.0      Normal (default)
1.5      50% faster
2.0      100% faster

Example Request

{
  "text": "Hello, welcome to RevidAPI Text-to-Speech service.",
  "engine": "edge",
  "voice_id": 1001,
  "speed": 1.0,
  "pitch": 0,
  "webhook_url": "https://example.com/webhook",
  "id": "tts-request-123"
}

curl -X POST "https://api.revidapi.com/paid/text-to-speech" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello, welcome to RevidAPI Text-to-Speech service.",
    "engine": "edge",
    "voice_id": 1001,
    "speed": 1.0,
    "webhook_url": "https://example.com/webhook",
    "id": "tts-request-123"
  }'
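
The same call can be made from Python. This is a minimal sketch using only the standard library; the helper names make_tts_request and send_tts_request are hypothetical, and the 30-second timeout is an arbitrary choice. Note that urllib raises urllib.error.HTTPError for 4xx and 5xx responses, so error bodies have to be read from the exception.

```python
import json
import urllib.request

API_URL = "https://api.revidapi.com/paid/text-to-speech"


def make_tts_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request with the required headers."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_tts_request(api_key: str, payload: dict, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = make_tts_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Usage would look like send_tts_request("YOUR_API_KEY", {"text": "Hello", "voice_id": 1001}).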

Response

Immediate Response (202 Accepted)

When a webhook URL is provided, the API returns an immediate acknowledgment with a task_id:

{
  "code": 202,
  "id": "tts-request-123",
  "task_id": "550e8400-e29b-41d4-a716-446655440000",
  "message": "processing"
}

Success Response (via Webhook or Direct)

{
  "code": 200,
  "id": "tts-request-123",
  "task_id": "550e8400-e29b-41d4-a716-446655440000",
  "response": {
    "audio_url": "https://storage.example.com/audio/output.mp3",
    "duration": 3.5
  },
  "message": "success"
}

Error Responses

Invalid Request (400)

{
  "code": 400,
  "id": "tts-request-123",
  "message": "Invalid request: 'text' is a required property"
}

Authentication Error (401)

{
  "code": 401,
  "message": "Invalid API key"
}
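
The response shapes above can be told apart by their code field. The sketch below assumes every response body carries code as shown in these examples; interpret_response is a hypothetical helper, and any code other than 200 or 202 is treated as an error.

```python
from typing import Any, Dict, Tuple


def interpret_response(body: Dict[str, Any]) -> Tuple[str, Any]:
    """Map a decoded response body to a (state, detail) pair.

    - 202: still processing; detail is the task_id
    - 200: finished; detail is the audio_url
    - anything else: error; detail is the message
    """
    code = body.get("code")
    if code == 202:
        return "processing", body.get("task_id")
    if code == 200:
        response = body.get("response") or {}
        return "success", response.get("audio_url")
    return "error", body.get("message")
```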

Workflow Recommendation

For asynchronous processing:

  1. Create Task: Send the POST request to create the task and keep the task_id from the response
  2. Wait: Add a wait node (30-45 seconds) to allow for server processing
  3. Check Status: Poll the status endpoint: GET https://tts.revidapi.com/api/get/{task_id}
  4. Retrieve Result: Once the status is "completed", read the audio URL from the response
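
The wait-then-poll steps above can be sketched as a small loop. Several details here are assumptions: the status response is taken to be JSON with a top-level status field (inferred from step 4), the status endpoint is called without authentication headers, and the retry interval and attempt limit are arbitrary. The names fetch_status and wait_for_result are hypothetical.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict

STATUS_URL = "https://tts.revidapi.com/api/get/{task_id}"


def fetch_status(task_id: str) -> Dict[str, Any]:
    """GET the status endpoint from step 3 and decode the JSON body."""
    url = STATUS_URL.format(task_id=task_id)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_result(
    task_id: str,
    fetch: Callable[[str], Dict[str, Any]] = fetch_status,
    initial_wait: float = 30.0,
    interval: float = 10.0,
    max_attempts: int = 6,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Wait, then poll until the task reports "completed" or attempts run out."""
    sleep(initial_wait)  # step 2: give the server time to process
    for attempt in range(max_attempts):
        result = fetch(task_id)  # step 3: check status
        # Assumption: the status body has a top-level "status" field.
        if result.get("status") == "completed":
            return result  # step 4: caller reads the audio URL from this
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(f"task {task_id} not completed after {max_attempts} checks")
```

Passing fetch and sleep as arguments keeps the loop testable without network access or real delays.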

Usage Notes

  1. Character Counting: Credits are calculated based on the total character count of the submitted text.
  2. Webhook Processing: When a webhook_url is provided, the request is processed asynchronously and results are sent to the webhook when complete.
  3. Voice Selection: You can use either voice_id (numeric) or voice (string name) to select a voice.
  4. Engine Support: Different engines support different voices. Check available voices for each engine.
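
For the webhook processing described in note 2, a receiver could look like the sketch below. It assumes the result is delivered as an HTTP POST whose JSON body matches the Success Response shown earlier; the delivery method, the handler name TTSWebhookHandler, the helper extract_audio_url, and port 8000 are all assumptions for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional


def extract_audio_url(raw_body: bytes) -> Optional[str]:
    """Return audio_url from a success payload, or None for anything else."""
    try:
        payload = json.loads(raw_body)
    except ValueError:  # covers invalid JSON and invalid UTF-8
        return None
    if not isinstance(payload, dict) or payload.get("code") != 200:
        return None
    response = payload.get("response")
    if not isinstance(response, dict):
        return None
    return response.get("audio_url")


class TTSWebhookHandler(BaseHTTPRequestHandler):
    """Accepts webhook deliveries and logs the audio URL when present."""

    def do_POST(self) -> None:
        # Assumption: the webhook is delivered as a POST with a JSON body.
        length = int(self.headers.get("Content-Length", 0))
        audio_url = extract_audio_url(self.rfile.read(length))
        if audio_url:
            print("audio ready:", audio_url)
        # Acknowledge receipt regardless of payload contents.
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the receiver until interrupted."""
    HTTPServer(("0.0.0.0", port), TTSWebhookHandler).serve_forever()
```

Calling serve() starts the listener; the address it runs on must be publicly reachable for deliveries to arrive.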

Common Issues

  1. Invalid Voice: Ensure the voice_id or voice name is valid for the selected engine
  2. Text Length: Very long texts may take longer to process
  3. Webhook Delivery: Ensure webhook_url is publicly accessible

Best Practices

  1. Use Webhooks: Provide a webhook_url so results are delivered when processing completes, instead of relying only on fixed waits and polling
  2. Unique IDs: Provide unique id values for tracking
  3. Voice Testing: Test different voices to find the best match for your use case
  4. Speed Adjustment: Experiment with speed values to find the optimal speech rate