Summary
Prompt schemas define what gets extracted, what becomes searchable, and what can be reused in video-level synthesis. This page covers the public schema rules and field-path conventions.
VideoVector uses JSON Schema with a root object to define the output contract for prompt extraction.
Schema-aware metadata extraction means the prompt output is shaped by your JSON schema before analysis runs. Video,
audio, and image workflows can return nested schema outputs, repeated object paths, and metadata_text that later
supports search, filters, exports, and video-level synthesis.
Segment-level schema
The segment-level schema lives in json_schema.
- The root type must be object.
- Fields can be primitive values, objects, or arrays.
- Nested objects and repeated objects are supported and surfaced in search and filter tooling.
- The schema is part of the prompt definition, not part of the prompt run request.
{
  "type": "object",
  "properties": {
    "summary": { "type": "string" },
    "scene": {
      "type": "object",
      "properties": {
        "location": { "type": "string" },
        "people": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "name": { "type": "string" },
              "emotion": { "type": "string" }
            }
          }
        }
      }
    }
  }
}
Nested fields and repeated object paths
Nested data is addressable in public search and filter tooling through canonical field paths:
- scene.location
- scene.people[].name
- scene.people[].emotion
Repeated objects use [] in field paths. That same path form appears in condition-style filtering and in search field selection.
Use stable, explicit property names. Search, filter, and semantic indexing configuration are easier to manage when field paths remain predictable across prompt iterations.
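The path convention above can be sketched as a small schema walker. This is an illustration only, not a VideoVector SDK call: the function name field_paths is hypothetical, and the handling of arrays of primitive values (emitted here as name[]) is an assumption the text does not confirm.

```python
from typing import Any, Dict, List

SEGMENT_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "summary": {"type": "string"},
        "scene": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "people": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "name": {"type": "string"},
                            "emotion": {"type": "string"},
                        },
                    },
                },
            },
        },
    },
}


def field_paths(schema: Dict[str, Any], prefix: str = "") -> List[str]:
    """List leaf field paths, joining names with '.' and marking arrays with []."""
    schema_type = schema.get("type")
    if schema_type == "object":
        paths: List[str] = []
        for name, sub in schema.get("properties", {}).items():
            child = f"{prefix}.{name}" if prefix else name
            paths.extend(field_paths(sub, child))
        # An object without declared properties is reported under its own path.
        return paths or ([prefix] if prefix else [])
    if schema_type == "array":
        # Assumption: primitive arrays are also written with a trailing [].
        return field_paths(schema.get("items", {}), prefix + "[]")
    return [prefix] if prefix else []


if __name__ == "__main__":
    for path in field_paths(SEGMENT_SCHEMA):
        print(path)
```

For the example schema, the walker yields summary followed by the three scene paths listed above. Regenerating this list after each prompt iteration is a cheap way to notice when a path has changed.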
Reserved field-name rules
Prompt field names are part of the public query contract. Avoid special characters that break field-path parsing.
- Do not use ., [ or ] inside field names.
- Do not use reserved internal field names such as __pydantic_extra__.
- Dictionary-like object fields that disallow additional properties should still define at least one nested field.
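The rules above can be checked locally before a prompt is saved. The following is a minimal sketch under stated assumptions: check_field_names is a hypothetical helper, the reserved-name set contains only the one name this page mentions, and the third rule is read as "an object with additionalProperties set to false must declare at least one property".

```python
from typing import Any, Dict, List

FORBIDDEN_CHARS = (".", "[", "]")
# Only __pydantic_extra__ is named in the text; other reserved names may exist.
RESERVED_NAMES = {"__pydantic_extra__"}


def check_field_names(schema: Dict[str, Any], path: str = "") -> List[str]:
    """Return human-readable problems found in a schema's field names."""
    schema_type = schema.get("type")
    if schema_type == "array":
        return check_field_names(schema.get("items", {}), path + "[]")
    if schema_type != "object":
        return []
    problems: List[str] = []
    properties = schema.get("properties", {})
    # Assumed reading of the third rule: closed objects need a declared field.
    if schema.get("additionalProperties") is False and not properties:
        problems.append(f"{path or '<root>'}: closed object declares no fields")
    for name, sub in properties.items():
        child = f"{path}.{name}" if path else name
        if any(ch in name for ch in FORBIDDEN_CHARS):
            problems.append(f"{child}: name contains '.', '[' or ']'")
        if name in RESERVED_NAMES:
            problems.append(f"{child}: reserved field name")
        problems.extend(check_field_names(sub, child))
    return problems


if __name__ == "__main__":
    bad_schema = {
        "type": "object",
        "properties": {
            "speaker.name": {"type": "string"},
            "__pydantic_extra__": {"type": "string"},
            "tags": {"type": "object", "additionalProperties": False},
        },
    }
    for problem in check_field_names(bad_schema):
        print(problem)
```

An empty result means none of the three rules, as interpreted here, was violated; it does not guarantee that the service will accept the schema.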
Semantic indexing controls
Semantic indexing is configured at the prompt level:
- disabled_segment_fields
- disabled_video_level_fields
Use those lists when you want fields to remain in structured output but stay out of semantic embedding.
Typical reasons to disable a field:
- the field is high-volume but low-value for semantic retrieval
- the field contains internal bookkeeping
- the field is dynamic or sparse enough that embedding it would add noise
Video-level schema
The optional video_level block adds a second schema for media-wide synthesis:
- instructions_text: the video-wide instruction
- included_segment_fields: segment fields supplied to the synthesis step
- json_schema: the video-level output contract
included_segment_fields can reference declared segment fields and certain system-provided fields such as transcription and metadata_text. Timing context such as start_time and end_time is automatically included where applicable.
{
  "instructions_text": "Summarize the full program and identify the primary incident timeline.",
  "included_segment_fields": ["summary", "scene.people[].emotion", "transcription"],
  "json_schema": {
    "type": "object",
    "properties": {
      "program_summary": { "type": "string" },
      "incident_timeline": {
        "type": "array",
        "items": { "type": "string" }
      }
    }
  }
}
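Because included_segment_fields refers to segment fields by path, a typo there is easy to miss. The sketch below flags entries that match neither a declared segment path nor a system-provided field. It is illustrative only: unknown_included_fields is a hypothetical helper, the system-field set holds just the two names this page mentions, and it assumes entries are leaf paths rather than parent objects such as scene.

```python
from typing import Iterable, List, Set

# System-provided fields named in the text; the full list may be longer.
SYSTEM_FIELDS: Set[str] = {"transcription", "metadata_text"}


def unknown_included_fields(
    included: Iterable[str], declared: Iterable[str]
) -> List[str]:
    """Return included entries that are neither declared nor system-provided."""
    allowed = set(declared) | SYSTEM_FIELDS
    return [path for path in included if path not in allowed]


if __name__ == "__main__":
    declared_paths = [
        "summary",
        "scene.location",
        "scene.people[].name",
        "scene.people[].emotion",
    ]
    included = ["summary", "scene.people[].emotion", "transcription"]
    # An empty list means every entry resolved to a known field.
    print(unknown_included_fields(included, declared_paths))
```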
Schema testing
The public surface includes schema validation endpoints and matching SDK/MCP methods so you can validate sample data against a schema before storing the prompt. Use that validation step whenever the schema includes nested objects, repeated arrays, or constrained types.
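Before calling the validation endpoint, a rough local pre-check can catch obvious mismatches between sample data and a schema. The sketch below is not the VideoVector validator and covers only a small subset of JSON Schema (single-string type, properties, required, items); validate_sample is a hypothetical name.

```python
from typing import Any, Dict, List

_TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
    "object": lambda v: isinstance(v, dict),
    "array": lambda v: isinstance(v, list),
    "null": lambda v: v is None,
}


def validate_sample(value: Any, schema: Dict[str, Any], path: str = "$") -> List[str]:
    """Return type and required-field errors for value against schema."""
    expected = schema.get("type")
    # Only a single type name is handled; type lists are skipped, not checked.
    check = _TYPE_CHECKS.get(expected) if isinstance(expected, str) else None
    if check is not None and not check(value):
        return [f"{path}: expected {expected}, got {type(value).__name__}"]
    errors: List[str] = []
    if expected == "object":
        for name in schema.get("required", []):
            if name not in value:
                errors.append(f"{path}.{name}: required field is missing")
        for name, sub in schema.get("properties", {}).items():
            if name in value:
                errors.extend(validate_sample(value[name], sub, f"{path}.{name}"))
    elif expected == "array":
        item_schema = schema.get("items", {})
        for index, item in enumerate(value):
            errors.extend(validate_sample(item, item_schema, f"{path}[{index}]"))
    return errors


if __name__ == "__main__":
    person = {"type": "object", "properties": {"emotion": {"type": "string"}}}
    schema = {
        "type": "object",
        "required": ["summary"],
        "properties": {
            "summary": {"type": "string"},
            "people": {"type": "array", "items": person},
        },
    }
    sample = {"people": [{"emotion": "calm"}, {"emotion": 3}]}
    for error in validate_sample(sample, schema):
        print(error)
```

A check like this is useful for fast feedback while drafting; the service's own validation remains the authority on whether a schema and sample are accepted.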
Design recommendations
- Put evidence-level facts in segment output.
- Put rollups, totals, and cross-segment conclusions in video_level.
- Keep field names operationally useful because they appear later in search, filters, and exports.
- Treat schema changes as contract changes. Existing prompt runs keep their own historical output shape.
Related documentation
This guide shows how to define a prompt with nested and repeated fields, validate the schema, and keep the output shape usable for search and filtering.
Add a second prompt layer that rolls segment evidence into one result per media item without replacing the segment-level output.
Prompts define the extraction contract. The public API supports prompt CRUD, schema testing, usage inspection, and prompt-definition draft generation.
