Video Captioning Endpoint (v1)

1. Overview

The https://api.revidapi.com/paid/video/caption endpoint is part of the Video API and adds captions to a video file. It accepts a video URL, an optional caption source (raw text or a URL to a subtitle file), and optional styling settings for the captions.

2. Endpoint

URL: https://api.revidapi.com/paid/video/caption
Method: POST

3. Request

Headers

  • x-api-key: Required. The API key for authentication.

Body Parameters

The request body must be a JSON object with the following properties:

  • video_url (string, required): The URL of the video file to be captioned.
  • captions (string, optional): The caption source. If not provided, the system automatically generates captions by transcribing the audio from the video. If provided, it can be one of the following:
    • Raw caption text to be added to the video
    • URL to an SRT subtitle file
    • URL to an ASS subtitle file
  • settings (object, optional): An object containing various styling options for the captions. See the schema below for available options.
  • replace (array, optional): An array of objects with find and replace properties, specifying text replacements to be made in the captions.
  • webhook_url (string, optional): A URL to receive a webhook notification when the captioning process is complete.
  • id (string, optional): An identifier for the request.
  • language (string, optional): The language code for the captions (e.g., "en", "fr"). Defaults to "auto".
  • exclude_time_ranges (array, optional): A list of time ranges to skip when adding captions. Each item must be an object with:
    • start (string, required): The start time of the excluded range, as a timecode string in hh:mm:ss.ms format (e.g., 00:01:23.456).
    • end (string, required): The end time of the excluded range, in the same hh:mm:ss.ms format. It must be strictly greater than start.
    If either value is not a valid timecode string, or if end is not greater than start, the request returns an error.
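
The rules for exclude_time_ranges can be checked on the client before sending a request. The following is a minimal sketch, not the server's validation logic: the helper names timecode_to_ms and validate_exclude_time_ranges are hypothetical, and the exact pattern (two-digit hours, minutes, and seconds, one to three millisecond digits) is an assumption based on the 00:01:23.456 example.

```python
import re

# Assumed shape of hh:mm:ss.ms, based on the documented example 00:01:23.456.
_TIMECODE = re.compile(r"(\d{2}):([0-5]\d):([0-5]\d)\.(\d{1,3})")


def timecode_to_ms(timecode):
    """Convert an hh:mm:ss.ms string to integer milliseconds."""
    if not isinstance(timecode, str):
        raise ValueError(f"timecode must be a string, got {type(timecode).__name__}")
    match = _TIMECODE.fullmatch(timecode)
    if match is None:
        raise ValueError(f"invalid timecode: {timecode!r}")
    hours, minutes, seconds, fraction = match.groups()
    # Pad the fraction so that ".5" is read as 500 ms rather than 5 ms.
    millis = int(fraction.ljust(3, "0"))
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + millis


def validate_exclude_time_ranges(ranges):
    """Return a list of problems; an empty list means every range looks valid."""
    problems = []
    for index, item in enumerate(ranges):
        try:
            start = timecode_to_ms(item["start"])
            end = timecode_to_ms(item["end"])
        except (KeyError, TypeError, ValueError) as exc:
            problems.append(f"range {index}: {exc}")
            continue
        if end <= start:
            problems.append(f"range {index}: end must be strictly greater than start")
    return problems
```

Working in integer milliseconds avoids floating-point comparisons when checking that end is strictly greater than start.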

Settings Schema

{
    "type": "object",
    "properties": {
        "line_color": {"type": "string"},
        "word_color": {"type": "string"},
        "outline_color": {"type": "string"},
        "all_caps": {"type": "boolean"},
        "max_words_per_line": {"type": "integer"},
        "x": {"type": "integer"},
        "y": {"type": "integer"},
        "position": {
            "type": "string",
            "enum": [
                "bottom_left", "bottom_center", "bottom_right",
                "middle_left", "middle_center", "middle_right",
                "top_left", "top_center", "top_right"
            ]
        },
        "alignment": {
            "type": "string",
            "enum": ["left", "center", "right"]
        },
        "font_family": {"type": "string"},
        "font_size": {"type": "integer"},
        "bold": {"type": "boolean"},
        "italic": {"type": "boolean"},
        "underline": {"type": "boolean"},
        "strikeout": {"type": "boolean"},
        "style": {
            "type": "string",
            "enum": [
                "classic",     // Regular captioning with all text displayed at once
                "karaoke",     // Highlights words sequentially in a karaoke style
                "highlight",   // Shows full text but highlights the current word
                "underline",   // Shows full text but underlines the current word
                "word_by_word" // Shows one word at a time
            ]
        },
        "outline_width": {"type": "integer"},
        "spacing": {"type": "integer"},
        "angle": {"type": "integer"},
        "shadow_offset": {"type": "integer"}
    },
    "additionalProperties": false
}
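
Because the schema sets additionalProperties to false, a misspelled key in settings would be rejected. A client-side check along the following lines can catch that early. This is only a sketch that mirrors the types and enums listed in the schema above; validate_settings is a hypothetical helper, not part of the API.

```python
# Types and enums copied from the settings schema above.
SETTINGS_TYPES = {
    "line_color": str, "word_color": str, "outline_color": str,
    "all_caps": bool, "max_words_per_line": int, "x": int, "y": int,
    "position": str, "alignment": str, "font_family": str, "font_size": int,
    "bold": bool, "italic": bool, "underline": bool, "strikeout": bool,
    "style": str, "outline_width": int, "spacing": int, "angle": int,
    "shadow_offset": int,
}

SETTINGS_ENUMS = {
    "position": {
        "bottom_left", "bottom_center", "bottom_right",
        "middle_left", "middle_center", "middle_right",
        "top_left", "top_center", "top_right",
    },
    "alignment": {"left", "center", "right"},
    "style": {"classic", "karaoke", "highlight", "underline", "word_by_word"},
}


def validate_settings(settings):
    """Return a list of problems found in a settings object (empty if none)."""
    problems = []
    for key, value in settings.items():
        expected = SETTINGS_TYPES.get(key)
        if expected is None:
            # additionalProperties is false, so unknown keys are not allowed.
            problems.append(f"unknown setting: {key}")
            continue
        # bool is a subclass of int in Python, so exclude it for integer fields.
        is_bool_for_int = expected is int and isinstance(value, bool)
        if not isinstance(value, expected) or is_bool_for_int:
            problems.append(f"{key}: expected {expected.__name__}")
            continue
        allowed = SETTINGS_ENUMS.get(key)
        if allowed is not None and value not in allowed:
            problems.append(f"{key}: {value!r} is not one of {sorted(allowed)}")
    return problems
```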

Example Requests

Example 1: Basic Automatic Captioning

{
    "video_url": "https://example.com/video.mp4"
}

This minimal request automatically transcribes the video and adds white captions at the bottom center.

Example 2: Custom Text with Styling

{
    "video_url": "https://example.com/video.mp4",
    "captions": "This is a sample caption text.",
    "settings": {
        "style": "classic",
        "line_color": "#FFFFFF",
        "outline_color": "#000000",
        "position": "bottom_center",
        "alignment": "center",
        "font_family": "Arial",
        "font_size": 24,
        "bold": true
    }
}

Example 3: Karaoke-Style Captions with Advanced Options

{
    "video_url": "https://example.com/video.mp4",
    "settings": {
        "line_color": "#FFFFFF",
        "word_color": "#FFFF00",
        "outline_color": "#000000",
        "all_caps": false,
        "max_words_per_line": 10,
        "position": "bottom_center",
        "alignment": "center",
        "font_family": "Arial",
        "font_size": 24,
        "bold": false,
        "italic": false,
        "style": "karaoke",
        "outline_width": 2,
        "shadow_offset": 2
    },
    "replace": [
        {
            "find": "um",
            "replace": ""
        },
        {
            "find": "like",
            "replace": ""
        }
    ],
    "webhook_url": "https://example.com/webhook",
    "id": "request-123",
    "language": "en"
}

Example 4: Using an External Subtitle File

{
    "video_url": "https://example.com/video.mp4",
    "captions": "https://example.com/subtitles.srt",
    "settings": {
        "line_color": "#FFFFFF",
        "outline_color": "#000000",
        "position": "bottom_center",
        "font_family": "Arial",
        "font_size": 24
    }
}

Example 5: Excluding Time Ranges from Captioning

{
    "video_url": "https://example.com/video.mp4",
    "settings": {
        "style": "classic",
        "line_color": "#FFFFFF",
        "outline_color": "#000000",
        "position": "bottom_center",
        "font_family": "Arial",
        "font_size": 24
    },
    "exclude_time_ranges": [
        { "start": "00:00:10.000", "end": "00:00:20.000" },
        { "start": "00:00:30.000", "end": "00:00:40.000" }
    ]
}

Example cURL Command

curl -X POST \
     -H "x-api-key: YOUR_API_KEY" \
     -H "Content-Type: application/json" \
     -d '{
        "video_url": "https://example.com/video.mp4",
        "settings": {
            "line_color": "#FFFFFF",
            "word_color": "#FFFF00",
            "outline_color": "#000000",
            "all_caps": false,
            "max_words_per_line": 10,
            "position": "bottom_center",
            "alignment": "center",
            "font_family": "Arial",
            "font_size": 24,
            "style": "karaoke",
            "outline_width": 2
        },
        "replace": [
            {
                "find": "um",
                "replace": ""
            }
        ],
        "id": "custom-request-id"
    }' \
    https://api.revidapi.com/paid/video/caption
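
The same request can be made from Python using only the standard library. This is a sketch under stated assumptions: build_caption_request and send_caption_request are hypothetical helper names, and the 300-second timeout is an arbitrary choice, not a documented limit.

```python
import json
import urllib.request

CAPTION_URL = "https://api.revidapi.com/paid/video/caption"


def build_caption_request(api_key, payload):
    """Build (but do not send) the POST request for the captioning endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        CAPTION_URL,
        data=body,
        method="POST",
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
    )


def send_caption_request(api_key, payload, timeout=300.0):
    """Send the request and return the parsed JSON response body.

    urllib raises urllib.error.HTTPError for 4xx and 5xx status codes,
    so callers should be prepared to catch it.
    """
    request = build_caption_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Usage would look like send_caption_request("YOUR_API_KEY", {"video_url": "https://example.com/video.mp4"}).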

4. Response

Success Response

The response will be a JSON object with the following properties:

  • code (integer): The HTTP status code (200 for success).
  • id (string): The request identifier, if provided in the request.
  • job_id (string): A unique identifier for the job.
  • response (string): The cloud URL of the captioned video file.
  • message (string): A success message.
  • pid (integer): The process ID of the worker that processed the request.
  • queue_id (integer): The ID of the queue used for processing the request.
  • run_time (float): The time taken to process the request (in seconds).
  • queue_time (float): The time the request spent in the queue (in seconds).
  • total_time (float): The total time taken for the request (in seconds).
  • queue_length (integer): The current length of the processing queue.
  • build_number (string): The build number of the application.

Example:

{
    "code": 200,
    "id": "request-123",
    "job_id": "d290f1ee-6c54-4b01-90e6-d701748f0851",
    "response": "https://cloud.example.com/captioned-video.mp4",
    "message": "success",
    "pid": 12345,
    "queue_id": 140682639937472,
    "run_time": 5.234,
    "queue_time": 0.012,
    "total_time": 5.246,
    "queue_length": 0,
    "build_number": "1.0.0"
}
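
A client will usually only need a few of these fields. The sketch below extracts them into a small record; CaptionResult and parse_success are hypothetical names, and treating id as optional follows the note that it is returned only if provided in the request.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class CaptionResult:
    job_id: str
    video_url: str             # taken from the "response" field
    request_id: Optional[str]  # the "id" field, present only if it was sent
    run_time: float
    queue_time: float
    total_time: float


def parse_success(raw):
    """Parse a success response body (a JSON string) into a CaptionResult."""
    data = json.loads(raw)
    if data.get("code") != 200:
        raise ValueError(f"not a success response: code={data.get('code')!r}")
    return CaptionResult(
        job_id=data["job_id"],
        video_url=data["response"],
        request_id=data.get("id"),
        run_time=data["run_time"],
        queue_time=data["queue_time"],
        total_time=data["total_time"],
    )
```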

Error Responses

Missing or Invalid Parameters

Status Code: 400 Bad Request

{
    "code": 400,
    "id": "request-123",
    "job_id": "d290f1ee-6c54-4b01-90e6-d701748f0851",
    "message": "Missing or invalid parameters",
    "pid": 12345,
    "queue_id": 140682639937472,
    "queue_length": 0,
    "build_number": "1.0.0"
}

Font Error

Status Code: 400 Bad Request

{
    "code": 400,
    "error": "The requested font 'InvalidFont' is not available. Please choose from the available fonts.",
    "available_fonts": ["Arial", "Times New Roman", "Courier New", ...],
    "pid": 12345,
    "queue_id": 140682639937472,
    "queue_length": 0,
    "build_number": "1.0.0"
}

Internal Server Error

Status Code: 500 Internal Server Error

{
    "code": 500,
    "id": "request-123",
    "job_id": "d290f1ee-6c54-4b01-90e6-d701748f0851",
    "error": "An unexpected error occurred during the captioning process.",
    "pid": 12345,
    "queue_id": 140682639937472,
    "queue_length": 0,
    "build_number": "1.0.0"
}

5. Error Handling

The endpoint handles the following common errors:

  • Missing or Invalid Parameters: If any required parameters are missing or invalid, a 400 Bad Request error is returned with a descriptive error message.
  • Font Error: If the requested font is not available, a 400 Bad Request error is returned with a list of available fonts.
  • Internal Server Error: If an unexpected error occurs during the captioning process, a 500 Internal Server Error is returned with an error message.

Additionally, the main application context (app.py) includes error handling for queue overload. If the maximum queue length (MAX_QUEUE_LENGTH) is set and the queue size reaches that limit, a 429 Too Many Requests error is returned with a descriptive message.
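
On the client side, these cases can be handled with one error type plus a retry loop for queue overload. The following sketch makes several assumptions: the 429 body is assumed to carry a code field like the other responses (its shape is not shown above), the exponential backoff schedule is an arbitrary choice, and CaptionError, raise_for_error, and call_with_retry are hypothetical names. The send argument is any callable that performs the request and returns the parsed JSON body.

```python
import time


class CaptionError(Exception):
    """Raised for any response whose code is not 200."""

    def __init__(self, code, detail, available_fonts=None):
        super().__init__(f"{code}: {detail}")
        self.code = code
        self.detail = detail
        self.available_fonts = available_fonts  # set only for font errors


def raise_for_error(body):
    """Return the body unchanged on success, otherwise raise CaptionError."""
    code = body.get("code")
    if code == 200:
        return body
    # The documented examples use either "error" or "message" for the text.
    detail = body.get("error") or body.get("message") or "unknown error"
    raise CaptionError(code, detail, body.get("available_fonts"))


def call_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send(), retrying with exponential backoff only on 429 responses."""
    for attempt in range(max_attempts):
        try:
            return raise_for_error(send())
        except CaptionError as exc:
            is_last_attempt = attempt == max_attempts - 1
            if exc.code != 429 or is_last_attempt:
                raise
            sleep(base_delay * 2 ** attempt)
```

Only 429 is retried here, since a 400 indicates a request that will fail again unchanged.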

6. Usage Notes

  • The video_url parameter must be a valid URL pointing to a video file (MP4, MOV, etc.).
  • The captions parameter is optional and can be used in multiple ways:
    • If not provided, the endpoint will automatically transcribe the audio and generate captions
    • If provided as plain text, the text will be used as captions for the entire video
    • If provided as a URL to an SRT or ASS subtitle file, the system will use that file for captioning
    • For SRT files, only 'classic' style is supported
    • For ASS files, the original styling will be preserved
  • The settings parameter allows for customization of the caption appearance and behavior:
    • style determines how captions are displayed, with options including:
      • classic: Regular captioning with all text displayed at once
      • karaoke: Highlights words sequentially in a karaoke style as they're spoken
      • highlight: Shows the full caption text but highlights each word as it's spoken
      • underline: Shows the full caption text but underlines each word as it's spoken
      • word_by_word: Shows only one word at a time
    • position can be used to place captions in one of nine positions on the screen
    • alignment determines text alignment within the position (left, center, right)
    • font_family can be any available system font
    • Color options can be set using hex codes (e.g., "#FFFFFF" for white)
  • The replace parameter can be used to perform text replacements in the captions (useful for correcting words or censoring content).
  • The webhook_url parameter is optional and can be used to receive a notification when the captioning process is complete.
  • The id parameter is optional and can be used to identify the request in webhook responses.
  • The language parameter is optional and can be used to specify the language of the captions for transcription. If not provided, the language will be automatically detected.
  • The exclude_time_ranges parameter can be used to specify time ranges to be excluded from captioning.
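
When webhook_url and id are used together, the receiving side can match each notification to the request that triggered it. The webhook payload format is not specified here, so the sketch below assumes it is a JSON object carrying the same id, code, response, and error or message fields as the responses in section 4. handle_webhook and the jobs dictionary are hypothetical.

```python
import json


def handle_webhook(raw_body, jobs):
    """Update the matching entry in jobs from a webhook body and return its status.

    jobs maps the id sent with each request to a dict describing that job.
    Assumes the payload mirrors the documented response fields.
    """
    payload = json.loads(raw_body)
    request_id = payload.get("id")
    if request_id not in jobs:
        return "ignored"  # notification for a request this client did not track
    job = jobs[request_id]
    if payload.get("code") == 200:
        job["status"] = "done"
        job["video_url"] = payload.get("response")
    else:
        job["status"] = "failed"
        job["error"] = payload.get("error") or payload.get("message")
    return job["status"]
```

In a real service this function would be called from the HTTP handler mounted at the webhook_url path, with raw_body taken from the POST request.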

7. Common Issues

  • Providing an invalid or inaccessible video_url.
  • Requesting an unavailable font in the settings object.
  • Exceeding the maximum queue length, resulting in a 429 Too Many Requests error.

8. Best Practices

  • Validate the video_url parameter before sending the request to ensure it points to a valid and accessible video file.
  • Use the webhook_url parameter to receive notifications about the captioning process, rather than polling the API for updates.
  • Provide descriptive and meaningful id values to easily identify requests in logs and responses.
  • Use the replace parameter judiciously to avoid unintended text replacements in the captions.
  • Consider caching the captioned video files for frequently requested videos to improve performance and reduce processing time.
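
One way to implement the caching suggestion is to key the cache on the request fields that affect the output. This is a sketch only: cache_key and captioned_url are hypothetical helpers, the cache is a plain dictionary, and excluding webhook_url and id from the key is an assumption that they do not change the rendered video.

```python
import hashlib
import json

# Fields assumed not to influence the captioned video itself.
_IGNORED_FIELDS = {"webhook_url", "id"}


def cache_key(payload):
    """Derive a stable key from the request fields that affect the output."""
    relevant = {k: v for k, v in payload.items() if k not in _IGNORED_FIELDS}
    # sort_keys makes the key independent of dictionary ordering.
    canonical = json.dumps(relevant, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def captioned_url(payload, cache, submit):
    """Return the cached result URL, calling submit(payload) only on a miss."""
    key = cache_key(payload)
    if key not in cache:
        cache[key] = submit(payload)
    return cache[key]
```

This approach only makes sense if the file behind video_url does not change; if it can, the cache entry should be invalidated when the source video is replaced.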