Three prompt shapes that work
1. Targeted scene analysis
When you have the specific scene UUIDs you care about, send only those scenes plus the character index. Small context, high signal.
const context = {
title: doc.title,
characters: doc.characters.map(({ id, name, traits }) => ({ id, name, traits })),
scenes: doc.document.scenes
.filter((s) => targetSceneIds.includes(s.id))
.map((s) => ({
id: s.id,
heading: s.heading,
body: s.body.map((el) => ({
type: el.type,
character: el.character,
text: el.text?.en,
})),
})),
};
2. Retrieval-augmented (RAG)
Use pre-computed embeddings to retrieve top-k passages, then send the passages plus their scene headings and the document character index. See generate embeddings.
3. Whole-document with summaries
For questions that need the full shape (“what is this script about”),
send the cover, logline, characters, scene headings only (no bodies),
and — if present — scene-scoped summaries from analysis.summaries.
Structured output, always
Always ask the model to return JSON with citations:
const schema = {
type: 'object',
properties: {
answer: { type: 'string' },
citations: {
type: 'array',
items: {
type: 'object',
properties: {
scene: { type: 'string' },
element: { type: 'string' },
excerpt: { type: 'string' },
},
required: ['scene', 'element'],
},
},
},
required: ['answer', 'citations'],
};
Validate every returned scene/element UUID against the source
document. Reject on mismatch — that’s a hallucination check.
System prompt template
You are analysing a screenplay in ScreenJSON format.
Every scene has an `id`; every element inside a scene body has an `id`.
When you cite text, return the scene id AND the element id.
Never invent ids that aren't in the input.