TwinTone’s Video Agent API empowers developers to seamlessly integrate AI-powered video agents into their applications, unlocking a new dimension of interactive and dynamic digital experiences. This REST API enables organizations to programmatically create and manage AI personas, facilitate real-time streaming conversations, and generate lifelike scripted speech.
Our documentation is optimized for efficiency, designed to fit within the context limits of leading large language models like Claude-3.5-Sonnet and ChatGPT-4o. Developers can use this guide to access a robust set of endpoints covering AI Agent videos, conversations, personas, and speech synthesis.
For best results, we recommend passing this documentation to an LLM for quick interpretation, but it's also structured for easy reading and navigation. Our API supports secure authentication via API keys, follows standard RESTful conventions, and includes detailed error handling for troubleshooting.
For API Keys, questions or support, feel free to reach out:
Recording Your Training Footage
Your journey to creating a personal AI Agent begins with a simple requirement:
A 2-minute video of you engaging with the camera. There is no predefined script beyond the consent statement; you can discuss anything that showcases your natural speaking style and expertise.
Tips for Success
Our platform simplifies the first step. Use your webcam to capture the essence of your persona. Achieving the best possible AI Agent involves attention to detail.
Here’s how:
✅ Do: Utilize high-definition recording equipment, ensure proper lighting, and maintain focus on your face and upper body. Aim for a quiet, well-lit setting, and speak naturally. See more in Best Practices & Examples.
🚫 Don’t: Wear clothes that blend with the background, bulky accessories, or any headwear that obscures your face. Keep your gaze steady, minimize background distractions, and avoid excessive movement.
Consent
An integral part of the process involves reading a specific authorization phrase. This step confirms your consent and kicks off the AI Agent creation process.
“I, [FULL NAME], am currently speaking and give consent to TwinTone to create an AI clone of me by using the audio and video samples I provide. I understand that this AI clone can be used to create videos that look and sound like me.”
Best Practices & Examples:
Set Up: Environment
🌞 Lighting
Ensure your face is evenly lit with no shadows.
Example: If a window casts shadows on your face, change your orientation or use a ring light to even it out.
A large diffuse light will work best, providing consistent even and neutral lighting for the entire face.
This helps Phoenix to properly map your face, resulting in a better-looking video overall.
🔊 Noise
Your space should be silent or almost silent.
Avoid noise from air conditioning, construction, traffic, refrigerators, and conversations.
Choose rooms with minimal reverb to prevent sound amplification.
Clean audio, free from background noises, will produce the best audio output for your AI Agent.
🌆 Background
Keep your background clear.
Remove moving objects.
Ensure no other people are visible in the video.
Set Up: Equipment
📷 Camera & Placement
Use a high-quality camera with at least 2K resolution.
Examples: DSLR, newer laptops, iPhones, Samsung Galaxy, or Google Pixel.
Frames per second: Optimal FPS is 30, but 24-60 FPS is acceptable.
Distance: Maintain a distance of 3ft-6ft (or 0.9m-1.8m) from the camera.
Level: The camera should be at eye level.
Lens: Ensure the camera lens is clean of smudges.
🎙️ Microphone
Start with your phone or computer’s microphone.
A high-quality mic can mitigate background noise/echo.
For external USB or XLR mics:
Place the mic 1ft (0.3m) from your mouth, not exceeding 2-3ft (0.6m-0.9m).
Position the mic at least 1 inch below your chin to avoid blocking your mouth.
Wireless earbuds, like Apple AirPods or Samsung Galaxy Buds, are not recommended due to poor mic quality.
👾 Software
Disable any software-based audio enhancements.
Turn off compressors, equalizers, noise suppression, etc., as we perform our own sound processing post-recording.
Set Up: Yourself
👀 Gaze
Maintain eye level with the camera and act naturally.
🗣️ Speaking Vibe & Pace
Be yourself and relax.
Pace: Take your time, don’t rush.
Pausing: Close your lips during pauses (the script will remind you).
Tone: Aim for an upbeat tone to keep content positive and engaging. Keep continuous eye contact with the camera. Be animated in your mouth, eyes, and cheeks.
Gestures: Keep hand gestures to a minimum and avoid blocking your face.
Mistakes: If you stumble, continue speaking. Perfection isn’t necessary.
🎅 Accessories & Beards
If possible, avoid beards, glasses, and accessories.
Our model is still being refined to better process these elements.
This comprehensive guide ensures you capture the highest quality footage for your AI Agent, leading to a more authentic and engaging digital representation.
Base URL
API Base URL: https://api.twintone.ai/v1
Authentication
All API requests require an API key for authentication. Include the key in the x-api-key header for every request.
Example: x-api-key: your-api-key
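As a sketch, attaching the header with Python's standard library might look like the following (the base URL here is taken from the code example later in this document, and the key is a placeholder):

```python
import urllib.request

BASE_URL = "https://api.twintone.ai/v1"  # assumed base URL; see the Base URL section

def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Create a GET request with the x-api-key header attached."""
    req = urllib.request.Request(f"{BASE_URL}{path}")
    req.add_header("x-api-key", api_key)
    return req

# Every request carries the same header, so a helper like this avoids repetition.
req = build_request("/replicas", "your-api-key")
```

The same pattern applies to any HTTP client: set the header once on a shared client or session object rather than on each individual call.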
Endpoints
Replicas
GET /replicas - List all replicas
POST /replicas - Create a new replica
GET /replicas/:id - Retrieve details of a specific replica
DELETE /replicas/:id - Delete a replica
PATCH /replicas/:id - Update a replica's name
Videos
GET /videos - List all generated videos
POST /videos - Generate a new video
GET /videos/:id - Get a specific video
DELETE /videos/:id - Delete a video
PATCH /videos/:id - Update a video's name
Conversations
GET /conversations - List all conversations
POST /conversations - Create a new conversation
GET /conversations/:id - Get a specific conversation
POST /conversations/:id - End a conversation
DELETE /conversations/:id - Delete a conversation
Personas
GET /personas - List all personas
POST /personas - Create a new persona
GET /personas/:id - Get a specific persona
PATCH /personas/:id - Update a persona
DELETE /personas/:id - Delete a persona
Speech
GET /speech - List all speech generations
POST /speech - Generate new speech
GET /speech/:id - Get a specific speech generation
DELETE /speech/:id - Delete speech
PATCH /speech/:id - Update speech name
Error Handling
The API uses standard HTTP status codes to indicate the success or failure of requests. Error responses include an error code and a human-readable message for troubleshooting.
All errors return semantic HTTP status codes.
A 502 status code likely indicates a proxy error.
Example error response:
{
"error": "InvalidRequest",
"message": "The provided parameters are invalid.",
"status": 400
}
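Client code can surface these fields directly in logs or user-facing messages. A small Python sketch that formats the example payload above:

```python
import json

# Sample error body, taken verbatim from the example response above.
body = '{"error": "InvalidRequest", "message": "The provided parameters are invalid.", "status": 400}'

def describe_error(raw: str) -> str:
    """Turn an API error payload into a single readable log line."""
    err = json.loads(raw)
    return f"{err['status']} {err['error']}: {err['message']}"

line = describe_error(body)
print(line)  # 400 InvalidRequest: The provided parameters are invalid.
```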
Rate Limiting
Keep-alive connections are limited to one socket; contact us if you have a use case that requires more.
Requests time out after 60 seconds.
Requests that exceed the rate limits return status code 429.
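When a request comes back with 429, clients should back off and retry rather than fail immediately. A minimal Python sketch of exponential backoff (the retry parameters and the stubbed call are illustrative, not part of the API):

```python
import time

def with_retries(call, max_attempts=3, base_delay=0.01):
    """Retry a request function while it reports HTTP 429, with exponential backoff."""
    for attempt in range(max_attempts):
        status, body = call()
        if status != 429:
            return status, body
        time.sleep(base_delay * (2 ** attempt))  # wait longer after each 429
    return status, body

# Stub standing in for a real HTTP call: rate-limited twice, then succeeds.
responses = iter([(429, ""), (429, ""), (200, "ok")])
status, body = with_retries(lambda: next(responses))
```

In production the delay would typically start around one second and honor any Retry-After header the server sends.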
Endpoint details
Videos
POST /videos Generate a new video using a replica.
GET /videos/:id Retrieve details of a specific video generation.
Response:
{
"id": "string",
"name": "string",
"status": "processing|ready|failed", // Indicates the current processing state
"url": "string", // Present only if status is "ready"
"created_at": "string",
"error": "string" // Present only if status is "failed"
}
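Because `status` moves from processing to ready (or failed), clients typically poll GET /videos/:id until a terminal state appears. A Python sketch of that loop, with a stubbed fetch function standing in for the real HTTP call (the interval and limit are illustrative):

```python
import time

def wait_for_video(fetch, poll_interval=0.01, max_polls=50):
    """Poll a video-fetching function until its status is 'ready' or 'failed'."""
    for _ in range(max_polls):
        video = fetch()
        if video["status"] in ("ready", "failed"):
            return video  # "url" is present on ready, "error" on failed
        time.sleep(poll_interval)
    raise TimeoutError("video did not finish processing")

# Stub standing in for GET /videos/:id: processing twice, then ready.
states = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "ready", "url": "https://example.com/video.mp4"},
])
video = wait_for_video(lambda: next(states))
```

A real client would use a longer interval (several seconds) to stay within the rate limits described above.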
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

class TwintoneClient
{
    private readonly HttpClient _client;
    private const string BaseUrl = "https://api.twintone.ai/v1";

    public TwintoneClient(string apiKey)
    {
        _client = new HttpClient();
        _client.DefaultRequestHeaders.Add("x-api-key", apiKey);
    }

    // GET /replicas
    public async Task<string> GetReplicas()
    {
        var response = await _client.GetAsync($"{BaseUrl}/replicas");
        return await response.Content.ReadAsStringAsync();
    }

    // POST /videos
    public async Task<string> CreateVideo()
    {
        var data = new
        {
            replica_id = "rep_123",
            script = "Hello world!",
            name = "My First Video"
        };
        var content = new StringContent(
            JsonSerializer.Serialize(data),
            Encoding.UTF8,
            "application/json"
        );
        var response = await _client.PostAsync($"{BaseUrl}/videos", content);
        return await response.Content.ReadAsStringAsync();
    }

    // PATCH /personas/:id
    public async Task<string> UpdatePersona(string personaId)
    {
        var data = new
        {
            name = "Updated Name",
            description = "New description"
        };
        var content = new StringContent(
            JsonSerializer.Serialize(data),
            Encoding.UTF8,
            "application/json"
        );
        var response = await _client.PatchAsync(
            $"{BaseUrl}/personas/{personaId}",
            content
        );
        return await response.Content.ReadAsStringAsync();
    }

    // DELETE /speech/:id
    public async Task DeleteSpeech(string speechId)
    {
        await _client.DeleteAsync($"{BaseUrl}/speech/{speechId}");
    }
}
Contact
Disclaimer
The TwinTone API is provided "as is" without warranty of any kind. TwinTone disclaims all warranties, whether express, implied, or statutory, including without limitation any implied warranties of merchantability, fitness for a particular purpose, or non-infringement. TwinTone does not guarantee that the API will be available uninterrupted or error-free, and TwinTone will not be responsible for any damages or losses arising from the use of the API.
This documentation is subject to change. Please refer to the online version for the most up-to-date information.
Should you have any further questions, suggestions or feedback regarding this documentation, please contact: