Generation¶
Generation is the core of Storywright — the AI writes your story one scene at a time, drawing on characters, lorebooks, memories, and continuity to produce coherent, context-aware fiction.
How Generation Works¶
For each scene, Storywright:
- Assembles a rich context prompt from your characters, lorebook entries, memories, continuity state, and scene plan.
- Sends it to the AI and streams the response back in real time, so you watch the scene appear as it is written.
- Extracts structured data (summary, memories, continuity changes) so future scenes stay consistent.
This pipeline runs automatically every time you generate or revise a scene.
The Generation Pipeline¶
1. Context Assembly (PromptBuilder)¶
Before calling the AI, Storywright builds a prompt from multiple layers. Each layer adds context the AI needs to write a great scene:
| Layer | What It Contains |
|---|---|
| System prompt | The writing style you selected (e.g., literary-fiction, horror) plus content boundaries. |
| Character cards | Full card text for every character assigned to this story. |
| Lorebook entries | Keyword-matched entries from World, Story, and Character lorebooks. Only entries whose keywords match the current context are included — the rest are left out to save tokens. |
| Memory recall | Semantically similar memories from previous scenes, recalled via vector similarity (cosine similarity against the current scene's outline/context). |
| Continuity ledger | The current state of all tracked entities — location, appearance, possessions, relationships, emotional state, and more. |
| Scene summaries | 2–3 sentence recaps of previous scenes (rolling window). Keeps the AI aware of what happened without using the full text. |
| Previous scene text | The full text of the immediately preceding scene, for tight continuity. |
| Scene plan | The current scene's title, outline, and word count target. |
| Variety seeds | Optional randomization to prevent repetitive patterns. |
The layers are assembled in the order listed above. Together they give the AI a comprehensive view of the story so far and clear instructions for what to write next.
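The retrieval steps behind these layers (keyword matching for lorebook entries, cosine-similarity recall for memories) and the ordered assembly can be sketched roughly as follows. This is an illustrative sketch only; the function names are hypothetical, not Storywright's actual API:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def match_lorebook(entries, context_text):
    """Keep only entries whose keywords appear in the current context."""
    text = context_text.lower()
    return [e for e in entries if any(k.lower() in text for k in e["keywords"])]

def recall_memories(memories, outline_vec, top_k=3):
    """Rank stored memories by similarity to the scene outline embedding."""
    ranked = sorted(memories, key=lambda m: cosine_similarity(m["vec"], outline_vec),
                    reverse=True)
    return ranked[:top_k]

def build_prompt(layers):
    """Join the layers in the documented order, skipping any empty ones."""
    return "\n\n".join(part for part in layers if part)
```

The key design point is that lorebook matching and memory recall are filters: only what is relevant to the current scene makes it into the prompt, which keeps token usage down.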
2. Streaming Generation¶
- The assembled prompt is sent to the AI via an OpenAI-compatible API.
- Text streams back in real time and appears in the editor as it is written.
- A live word count updates as text arrives.
- Click Stop at any time to cancel mid-generation. Partial text is kept.
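The consume-and-cancel behavior above can be sketched as a simple loop over streamed chunks, assuming text arrives in pieces from an OpenAI-compatible streaming API. `should_stop` is a stand-in for the Stop button:

```python
def consume_stream(chunks, should_stop):
    """Accumulate streamed text chunks; stop early if the user cancels,
    keeping whatever partial text has arrived."""
    text = ""
    for chunk in chunks:
        if should_stop():
            break  # Stop clicked: keep partial text
        text += chunk
    # The word count shown in the UI is just a split over the accumulated text.
    return text, len(text.split())
```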
3. Post-Processing (4 Parallel AI Calls)¶
After generation completes, four extraction calls run simultaneously:
- Summarization — Extracts a 2–3 sentence recap of the scene. This summary is stored and used in future scenes' rolling context window.
- Memory extraction — Identifies topic-tagged facts from the scene (e.g., "Character discovery: Protagonist learned about the treasure map"). Each memory is embedded as a vector for semantic recall in later scenes.
- Continuity update — Extracts entity state changes as deltas and updates the continuity ledger (e.g., "Elena: location changed from 'library' to 'rooftop'").
- Thread extraction — Identifies narrative threads (ongoing storylines, mysteries, relationships) and discovers new characters mentioned in the scene text. Threads are tracked across scenes to help the AI maintain story arcs.
Post-processing also generates character card suggestions — if the AI notices that story events reveal new information about a character (personality shift, new relationship, physical change), it creates a suggestion to update the card. When suggestions are available, a snackbar notification appears: "N new suggestions ready". Click Review to open the Suggestion Inbox.
All four extraction calls run in parallel to minimize wait time — post-processing usually takes 5–10 seconds total.
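The parallel fan-out can be sketched with `asyncio.gather`; the extractor coroutines here are placeholders for the real AI calls, and the names are illustrative:

```python
import asyncio

async def run_post_processing(scene_text, extractors):
    """Run all extraction coroutines concurrently and collect results by name.

    `extractors` maps a name (e.g. "summary", "memories") to an async
    callable that takes the scene text and returns extracted data.
    """
    names = list(extractors)
    results = await asyncio.gather(*(extractors[n](scene_text) for n in names))
    return dict(zip(names, results))
```

Because the four calls are independent, total wait time is bounded by the slowest call rather than the sum of all four.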
4. Save¶
- Scene text, summary, memories, and continuity updates are all saved.
- The next scene's generation will automatically read this updated context.
- You don't need to do anything — saving is automatic.
Model Roles¶
Storywright uses three AI model roles. You can configure each one independently to balance quality and cost:
| Role | Used For | Recommendation |
|---|---|---|
| Writing model | Scene generation and revision. | Use the best model you can afford (e.g., GPT-4.1, Claude Sonnet 4, Gemini 2.5 Pro). Quality matters most here. |
| Planning model | Scene planning and plan refinement. | Needs good reasoning but doesn't need to be as creative. |
| Extraction model | Summaries, memories, and continuity extraction. | Can be a cheaper/faster model (e.g., GPT-4.1-nano, Claude Haiku 3.5, Gemini Flash). These are structured extraction tasks. |
Configure model roles in Settings → Models, or override them per-story in the story workspace.
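Conceptually, per-story overrides take precedence over global settings. A minimal sketch, with hypothetical role names and example model IDs (not defaults shipped with Storywright):

```python
# Global role → model mapping, as set in Settings → Models (example values).
DEFAULT_MODELS = {
    "writing": "gpt-4.1",
    "planning": "gpt-4.1-mini",
    "extraction": "gpt-4.1-nano",
}

def resolve_model(role, story_overrides=None):
    """Per-story overrides win over global settings."""
    overrides = story_overrides or {}
    return overrides.get(role, DEFAULT_MODELS[role])
```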
Batch Generation¶
Want a complete first draft fast? Use batch generation.
- Click Batch Generate to generate all remaining scenes from the current scene onward.
- Progress shows as each scene completes: "Scene 1/5", "Scene 2/5", etc.
- Each scene streams in real time, with post-processing running after each one.
- Auto-saves after every scene — if something goes wrong, you don't lose work.
- You can cancel between scenes (but not mid-scene).
Batch generation is great for producing a rough first draft that you can then go back and revise scene by scene.
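The batch loop described above amounts to: generate, save, check for cancellation, repeat. A rough sketch, with `generate`, `save`, and `cancelled` as stand-ins for the real app's machinery:

```python
def batch_generate(scene_plans, generate, save, cancelled):
    """Generate remaining scenes in order, auto-saving after each one.
    Cancellation takes effect between scenes, not mid-scene."""
    progress = []
    total = len(scene_plans)
    for i, plan in enumerate(scene_plans, start=1):
        if cancelled():
            break                 # completed scenes are already saved
        text = generate(plan)     # streams + post-processes in the real app
        save(plan, text)          # auto-save so nothing is lost
        progress.append(f"Scene {i}/{total}")
    return progress
```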
Scene Revision¶
After generating a scene, you can revise it as many times as you like.
- Type freeform instructions describing what you want changed:
- "Add more tension"
- "Rewrite from her perspective"
- "Make the dialogue sharper"
- "Cut this scene to 500 words"
- The AI rewrites the scene using the same context assembly plus your revision instructions.
- Full undo/redo: every revision is stored as a separate version. You can go back to any previous version at any time.
- You can also manually edit text in the editor whenever you want — no need to use AI revision for small tweaks.
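The version history behaves like a standard undo stack: each revision appends a new version, and stepping back then revising discards the "redo" branch. A minimal sketch (class and method names are illustrative):

```python
class SceneVersions:
    """Per-scene version history: every revision is stored as a new version,
    and you can step back to any earlier one."""

    def __init__(self, initial_text):
        self.versions = [initial_text]
        self.current = 0

    def revise(self, new_text):
        # Revising after an undo truncates the redo branch, like a typical undo stack.
        self.versions = self.versions[: self.current + 1] + [new_text]
        self.current += 1

    def undo(self):
        if self.current > 0:
            self.current -= 1
        return self.versions[self.current]

    def redo(self):
        if self.current < len(self.versions) - 1:
            self.current += 1
        return self.versions[self.current]
```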
Cost and Tokens¶
Storywright uses your API key — you pay per token directly to your API provider.
- Context assembly is designed to be efficient. Only relevant lorebook entries are included (keyword-matched), and previous scenes are compressed into short summaries rather than included in full.
- Use a cheaper extraction model. Since the extraction model handles structured tasks (not creative writing), you can save money by assigning it a faster, less expensive model.
- Token usage is displayed in the app. Check your API provider's dashboard for detailed cost breakdowns.
- Typical cost: a 10-scene story runs roughly $0.50–$2.00 depending on models chosen and scene length.
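As a back-of-envelope check, cost is just tokens times the per-million-token price for each direction. The prices and token counts below are placeholders, not your provider's actual rates:

```python
def estimate_cost(prompt_tokens, completion_tokens, price_in_per_m, price_out_per_m):
    """Rough cost estimate: tokens x per-million-token price, input + output."""
    return (prompt_tokens * price_in_per_m
            + completion_tokens * price_out_per_m) / 1_000_000

# e.g. a 10-scene story at ~8k prompt + ~1.5k completion tokens per scene,
# with hypothetical $2/M input and $8/M output pricing:
# estimate_cost(10 * 8_000, 10 * 1_500, 2.00, 8.00)
```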
Stopping Generation¶
| Situation | What to Do | What Happens |
|---|---|---|
| During streaming | Click Stop. | Generation cancels. Partial text is kept. |
| During batch generation | Cancel between scenes. | Completed scenes are saved. The current scene may be partially written. |
| During post-processing | Wait for it to finish. | Usually takes 5–10 seconds. |
Status Indicators¶
The toolbar shows the current generation state:
- Planning... — AI is generating a scene plan.
- Writing scene X... — AI is generating scene text.
- Post-processing... — Extracting summaries, memories, and continuity.
- Generating... — Batch generation in progress.
Tips¶
- Start with a good plan. Generation quality depends heavily on the scene outline — a detailed plan produces better scenes.
- Match the writing model to your budget and quality needs. A top-tier model produces noticeably better prose.
- Set the extraction model to something fast and cheap. It doesn't need to be creative — it just needs to extract structured data reliably.
- Review generated scenes before moving on. The AI builds on previous scenes, so catching issues early prevents them from compounding.
- Use revision for targeted fixes rather than regenerating from scratch.
- Batch generate for drafts, then go back and revise individual scenes for polish.
Related Guides¶
- Stories — the story workflow
- Context Systems — how memory, continuity, and summaries work in detail
- Lorebooks — how lorebook entries are keyword-matched
- Settings — configuring API and models