interlocute.ai beta
Coming Soon

Video Intelligence

Submit a video and receive a structured, searchable index. Three composable profiles — Speech, Visual, and Insights — let you pay only for the signals you need. Combine them freely for full-spectrum analysis.

Speech profile — what was said

The Speech profile extracts a time-aligned transcript with speaker diarization, confidence scores, and automatic language detection. Export captions in VTT, SRT, TTML, or plain text. Each segment includes a speaker ID, so your node knows who said what and when.
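For a sense of what caption export looks like downstream, here is a minimal sketch that renders diarized segments as WebVTT. The segment fields (`start`, `end`, `speaker_id`, `text`) are illustrative stand-ins, not the actual Interlocute response schema.

```python
# Sketch: formatting Speech-profile segments as WebVTT captions.
# Field names below are assumptions for illustration only.

def vtt_timestamp(seconds: float) -> str:
    """Format seconds as an HH:MM:SS.mmm WebVTT timestamp."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"

def to_vtt(segments: list[dict]) -> str:
    """Render diarized segments as a WebVTT document with speaker voice tags."""
    lines = ["WEBVTT", ""]
    for seg in segments:
        lines.append(f"{vtt_timestamp(seg['start'])} --> {vtt_timestamp(seg['end'])}")
        lines.append(f"<v {seg['speaker_id']}>{seg['text']}")
        lines.append("")
    return "\n".join(lines)

captions = to_vtt([
    {"start": 0.0, "end": 2.5, "speaker_id": "Speaker 1", "text": "Welcome back."},
    {"start": 2.5, "end": 6.0, "speaker_id": "Speaker 2", "text": "Thanks for having me."},
])
```

Because each cue carries a speaker voice tag, a downstream player can style or filter captions per speaker.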

Visual profile — what was seen

The Visual profile detects shots and scenes with keyframe extraction, reads on-screen text via OCR, and identifies labels and objects in the video frames. The result is a structured timeline your node can query, cite, and reason over.
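The kind of point-in-time query a node might run over that timeline can be sketched as follows. The event shape (`kind`, `start`, `end`, `value`) is an assumption for illustration, not the platform's actual index schema.

```python
# Sketch: a queryable timeline built from Visual-profile output.
# The TimelineEvent shape is an illustrative assumption.
from dataclasses import dataclass

@dataclass
class TimelineEvent:
    kind: str     # e.g. "shot", "ocr", "label"
    start: float  # seconds
    end: float    # seconds
    value: str    # keyframe ID, OCR text, or label name

def events_at(timeline: list[TimelineEvent], t: float) -> list[TimelineEvent]:
    """Return every event whose time span covers t."""
    return [e for e in timeline if e.start <= t < e.end]

timeline = [
    TimelineEvent("shot", 0.0, 4.2, "keyframe_001"),
    TimelineEvent("ocr", 1.0, 3.0, "Q3 Revenue"),
    TimelineEvent("label", 0.0, 4.2, "whiteboard"),
]
hits = events_at(timeline, 2.0)
```

At t = 2.0 seconds the shot, the on-screen text, and the label all overlap, so a node can cite all three as evidence for the same moment.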

Insights profile — what it means

The Insights profile extracts named entities, topics, keywords, and people. It adds time-bounded sentiment and emotion segments, flags audio events (applause, silence, music), and runs content-safety analysis — giving your node a semantic understanding of the entire video.
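One simple use of the time-bounded sentiment segments is totaling screen time per sentiment label, sketched below. The segment fields are illustrative, not the documented response format.

```python
# Sketch: aggregating time-bounded Insights sentiment segments into
# per-sentiment screen time. Field names are illustrative assumptions.
from collections import defaultdict

def sentiment_durations(segments: list[dict]) -> dict[str, float]:
    """Sum total seconds per sentiment label across all segments."""
    totals: dict[str, float] = defaultdict(float)
    for seg in segments:
        totals[seg["sentiment"]] += seg["end"] - seg["start"]
    return dict(totals)

totals = sentiment_durations([
    {"sentiment": "Positive", "start": 0.0, "end": 12.0},
    {"sentiment": "Neutral", "start": 12.0, "end": 20.0},
    {"sentiment": "Positive", "start": 20.0, "end": 25.0},
])
dominant = max(totals, key=totals.get)  # sentiment with the most screen time
```

The same aggregation works for emotion segments or audio events, since all three share the time-bounded segment shape.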

Composable profiles

Profiles combine freely. Request Speech + Insights for a meeting transcription with sentiment. Request Visual + Insights for a marketing video analysis. Or combine all three for full-spectrum indexing. The platform merges signal sets automatically — you never pay for duplicate work.
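The merge behavior described above is a set union over signal sets, sketched here. The exact signal names per profile are illustrative, not the official list.

```python
# Sketch: merging profile signal sets as a union, mirroring the
# "never pay for duplicate work" behavior. Signal names are
# illustrative assumptions, not the official signal list.
PROFILE_SIGNALS = {
    "Speech": {"transcript", "captions", "diarization", "language"},
    "Visual": {"shots", "scenes", "keyframes", "ocr", "labels", "objects"},
    "Insights": {"entities", "topics", "sentiment", "emotion", "audio_events", "safety"},
}

def merged_signals(profiles: list[str]) -> set[str]:
    """Union of signal sets: any overlapping signal is computed once."""
    out: set[str] = set()
    for p in profiles:
        out |= PROFILE_SIGNALS[p]
    return out

combo = merged_signals(["Speech", "Insights"])
```

Because the result is a union, requesting a signal through two profiles at once never doubles the work billed.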

AI-generated summaries

Request a textual summary with configurable length (Short, Medium, Long), style (Neutral, Casual, Formal), and custom instructions. Summaries can incorporate keyframe images when a vision-capable model is connected. Submit, poll, retrieve — fully async.

Streaming & playback

Indexed videos produce embeddable player URLs, streaming endpoints, and thumbnail base URLs. Serve preview players to end-users or extract keyframe thumbnails for UI cards — all from a single API call.
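As a sketch of how a UI might consume the thumbnail base URL, the helper below joins keyframe IDs onto it. The URL layout (base URL plus `/{thumbnail_id}`) is an assumed convention for illustration, not the documented endpoint shape.

```python
# Sketch: deriving keyframe thumbnail URLs from a thumbnail base URL.
# The path convention here is an assumption for illustration only.
def thumbnail_urls(base_url: str, thumbnail_ids: list[str]) -> list[str]:
    """Join each keyframe thumbnail ID onto the base URL."""
    return [f"{base_url.rstrip('/')}/{tid}" for tid in thumbnail_ids]

urls = thumbnail_urls(
    "https://cdn.example.com/videos/vid-123/thumbnails/",
    ["kf_001", "kf_002"],
)
```

Each resulting URL can back a UI card or a scrubber preview without a separate API round trip per thumbnail.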

Frequently Asked Questions

Video Intelligence

What video formats are supported?
Interlocute supports common video formats including MP4, WebM, and MOV. Videos can be submitted via URL fetch, blob storage upload, or direct upload depending on your use case.
What are the three indexing profiles?
Speech extracts transcripts and captions. Visual detects shots, scenes, OCR text, labels, and objects. Insights extracts entities, topics, sentiment, emotion, audio events, and safety signals. You can combine any or all of them in a single request.
How does profile combination work?
When you select multiple profiles, the platform takes the union of their signal sets and selects the appropriate provider preset automatically. You are billed for the combined analysis, not per-profile — so combining Speech + Insights costs less than running them separately.
Can I get sentiment or emotion analysis?
Yes, via the Insights profile. The index includes time-bounded sentiment segments (Positive, Negative, Neutral) and emotion segments (Joy, Sadness, Anger, Fear, Surprise, Disgust).
How do AI-generated video summaries work?
You request a textual summary specifying length, style, and optional custom instructions. The summary is generated asynchronously — you submit the request, poll for completion, and retrieve the result. Keyframe images can be included for visual context.
Is video processing metered?
Yes. Video indexing operations are metered by duration and complexity. Every submission and result retrieval is logged in your usage ledger with full attribution to the node and API key.
Can I re-index a video with a different profile?
Yes. The ReIndex operation lets you re-process an existing video with a different profile combination — for example, adding Insights to a video that was originally indexed with Speech only — without re-uploading the source file.
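The decision of when a ReIndex call is actually needed reduces to a subset check, sketched here. The function and its inputs are illustrative, not the actual API.

```python
# Sketch: deciding whether a ReIndex call is needed for a request.
# Names are illustrative assumptions, not the real API surface.
def needs_reindex(indexed: set[str], requested: set[str]) -> bool:
    """ReIndex only when the request includes profiles not yet indexed."""
    return not requested <= indexed

# A Speech-only video needs a ReIndex to add Insights...
add_insights = needs_reindex({"Speech"}, {"Speech", "Insights"})
# ...but requesting Speech against a Speech + Insights index does not.
already_covered = needs_reindex({"Speech", "Insights"}, {"Speech"})
```

Skipping the call when the requested profiles are already covered avoids paying for analysis the index already contains.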

Ready to build with Video Intelligence?

Deploy your node in seconds and start using Video Intelligence today.