
Generation

Generation is the core of Storywright — the AI writes your story one scene at a time, drawing on characters, lorebooks, memories, and continuity to produce coherent, context-aware fiction.


How Generation Works

For each scene, Storywright:

  1. Assembles a rich context prompt from your characters, lorebook entries, memories, continuity state, and scene plan.
  2. Sends it to the AI and streams the response back in real time — you see words appear as the AI writes.
  3. Extracts structured data (summary, memories, continuity changes) so future scenes stay consistent.

This pipeline runs automatically every time you generate or revise a scene.
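The three steps above can be sketched as a single pipeline function. All names here are illustrative stand-ins, not Storywright's actual API:

```python
# Illustrative sketch of the per-scene pipeline (hypothetical names,
# not Storywright's real code).

def build_prompt(context: dict) -> str:
    # Step 1: assemble characters, lorebook, memories, continuity, plan.
    parts = [context.get(k, "") for k in
             ("characters", "lorebook", "memories", "continuity", "plan")]
    return "\n\n".join(p for p in parts if p)

def generate_scene(context: dict, call_model) -> dict:
    prompt = build_prompt(context)       # 1. context assembly
    scene_text = call_model(prompt)      # 2. generation (streamed in the app)
    return {                             # 3. structured extraction
        "text": scene_text,
        "summary": scene_text[:80],      # stand-in for the summarization call
    }
```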


The Generation Pipeline

1. Context Assembly (PromptBuilder)

Before calling the AI, Storywright builds a prompt from multiple layers. Each layer adds context the AI needs to write a great scene:

| Layer | What It Contains |
| --- | --- |
| System prompt | The writing style you selected (e.g., literary-fiction, horror) plus content boundaries. |
| Character cards | Full card text for every character assigned to this story. |
| Lorebook entries | Keyword-matched entries from World, Story, and Character lorebooks. Only entries whose keywords match the current context are included — the rest are left out to save tokens. |
| Memory recall | Semantically similar memories from previous scenes, recalled via vector similarity (cosine similarity against the current scene's outline/context). |
| Continuity ledger | The current state of all tracked entities — location, appearance, possessions, relationships, emotional state, and more. |
| Scene summaries | 2–3 sentence recaps of previous scenes (rolling window). Keeps the AI aware of what happened without using the full text. |
| Previous scene text | The full text of the immediately preceding scene, for tight continuity. |
| Scene plan | The current scene's title, outline, and word count target. |
| Variety seeds | Optional randomization to prevent repetitive patterns. |

The layers are assembled in the order listed above. Together they give the AI a comprehensive view of the story so far and clear instructions for what to write next.
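Ordered layer assembly can be sketched in a few lines. The layer names come from the table above; the function itself is a hypothetical illustration:

```python
# Hypothetical sketch: assemble prompt layers in the fixed order listed above,
# skipping any layer that is empty (e.g., no variety seeds this scene).
LAYER_ORDER = [
    "system_prompt", "character_cards", "lorebook_entries", "memory_recall",
    "continuity_ledger", "scene_summaries", "previous_scene", "scene_plan",
    "variety_seeds",
]

def assemble_prompt(layers: dict) -> str:
    return "\n\n".join(layers[name] for name in LAYER_ORDER if layers.get(name))
```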

2. Streaming Generation

  • The assembled prompt is sent to the AI via an OpenAI-compatible API.
  • Text streams back in real time and appears in the editor as it's generated.
  • A live word count updates as text arrives.
  • Click Stop at any time to cancel mid-generation. Partial text is kept.
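The live word count can be maintained incrementally as chunks arrive. Here is a stubbed sketch (the token stream is a plain list here; in the app it comes from the API):

```python
# Sketch of a streaming consumer with a live word count and a stop check.
# Hypothetical code, not Storywright's real streaming client.

def consume_stream(chunks, should_stop=lambda: False):
    text = ""
    word_count = 0
    for chunk in chunks:
        if should_stop():
            break                          # Stop clicked: keep the partial text
        text += chunk
        word_count = len(text.split())     # live count shown in the toolbar
    return text, word_count

partial, words = consume_stream(["The rooftop ", "was cold ", "and silent."])
```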

3. Post-Processing (4 Parallel AI Calls)

After generation completes, four extraction calls run simultaneously:

  1. Summarization — Extracts a 2–3 sentence recap of the scene. This summary is stored and used in future scenes' rolling context window.
  2. Memory extraction — Identifies topic-tagged facts from the scene (e.g., "Character discovery: Protagonist learned about the treasure map"). Each memory is embedded as a vector for semantic recall in later scenes.
  3. Continuity update — Extracts entity state changes as deltas and updates the continuity ledger (e.g., "Elena: location changed from 'library' to 'rooftop'").
  4. Thread extraction — Identifies narrative threads (ongoing storylines, mysteries, relationships) and discovers new characters mentioned in the scene text. Threads are tracked across scenes to help the AI maintain story arcs.
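The memories extracted in step 2 feed the next scene's context assembly via cosine-similarity recall. A toy sketch, using 3-D stand-ins for real embedding vectors:

```python
# Toy cosine-similarity recall over (text, embedding) pairs.
# Real embeddings are high-dimensional; these vectors are illustrative.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recall(query_vec, memories, top_k=2):
    # memories: list of (text, embedding) pairs from earlier scenes
    ranked = sorted(memories, key=lambda m: cosine(query_vec, m[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]
```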

Post-processing also generates character card suggestions — if the AI notices that story events reveal new information about a character (personality shift, new relationship, physical change), it creates a suggestion to update the card. When suggestions are available, a snackbar notification appears: "N new suggestions ready". Click Review to open the Suggestion Inbox.

All four extraction calls run in parallel to minimize wait time — post-processing usually takes 5–10 seconds total.
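Running the four extraction calls concurrently can be sketched with a thread pool. The extractors below are stubs standing in for the real AI calls:

```python
# Sketch of parallel post-processing with stubbed extractors.
from concurrent.futures import ThreadPoolExecutor

def summarize(text):          return text[:40]
def extract_memories(text):   return [w for w in text.split() if w.istitle()]
def update_continuity(text):  return {"last_scene_len": len(text)}
def extract_threads(text):    return ["thread:" + text.split()[0]] if text else []

def post_process(scene_text):
    tasks = [summarize, extract_memories, update_continuity, extract_threads]
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(fn, scene_text) for fn in tasks]
        summary, memories, continuity, threads = (f.result() for f in futures)
    return {"summary": summary, "memories": memories,
            "continuity": continuity, "threads": threads}
```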

4. Save

  • Scene text, summary, memories, and continuity updates are all saved.
  • The next scene's generation will automatically read this updated context.
  • You don't need to do anything — saving is automatic.

Model Roles

Storywright uses three AI model roles. You can configure each one independently to balance quality and cost:

| Role | Used For | Recommendation |
| --- | --- | --- |
| Writing model | Scene generation and revision. | Use the best model you can afford (e.g., GPT-4.1, Claude Sonnet 4, Gemini 2.5 Pro). Quality matters most here. |
| Planning model | Scene planning and plan refinement. | Needs good reasoning but doesn't need to be as creative. |
| Extraction model | Summaries, memories, and continuity extraction. | Can be a cheaper/faster model (e.g., GPT-4.1-nano, Claude Haiku 3.5, Gemini Flash). These are structured extraction tasks. |

Configure model roles in Settings → Models, or override them per-story in the story workspace.


Batch Generation

Want a complete first draft fast? Use batch generation.

  • Click Batch Generate to generate all remaining scenes from the current scene onward.
  • Progress shows as each scene completes: "Scene 1/5", "Scene 2/5", etc.
  • Each scene streams in real-time with post-processing running after each one.
  • Auto-saves after every scene — if something goes wrong, you don't lose work.
  • You can cancel between scenes (but not mid-scene).

Batch generation is great for producing a rough first draft that you can then go back and revise scene by scene.
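The batch loop, with cancellation honored only between scenes, might look like this (hypothetical sketch):

```python
# Sketch of batch generation: cancel is checked between scenes, not mid-scene.
def batch_generate(scene_plans, generate, cancelled=lambda: False):
    completed = []
    for i, plan in enumerate(scene_plans, start=1):
        if cancelled():
            break                       # already-completed scenes stay saved
        scene = generate(plan)
        completed.append(scene)         # auto-save after every scene
        print(f"Scene {i}/{len(scene_plans)}")
    return completed
```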


Scene Revision

After generating a scene, you can revise it as many times as you like.

  • Type freeform instructions describing what you want changed:
      • "Add more tension"
      • "Rewrite from her perspective"
      • "Make the dialogue sharper"
      • "Cut this scene to 500 words"
  • The AI rewrites the scene using the same context assembly plus your revision instructions.
  • Full undo/redo: every revision is stored as a separate version. You can go back to any previous version at any time.
  • You can also manually edit text in the editor whenever you want — no need to use AI revision for small tweaks.
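Per-scene version history (every revision stored, any version restorable) can be modeled as a list with a pointer. This is a toy model, not Storywright's actual storage:

```python
class VersionHistory:
    """Toy model of per-scene revision versions (illustrative only)."""

    def __init__(self, initial_text):
        self.versions = [initial_text]
        self.current = 0

    def revise(self, new_text):
        # A new revision truncates any 'redo' tail, as in most editors.
        self.versions = self.versions[: self.current + 1] + [new_text]
        self.current += 1

    def undo(self):
        self.current = max(0, self.current - 1)
        return self.versions[self.current]

    def redo(self):
        self.current = min(len(self.versions) - 1, self.current + 1)
        return self.versions[self.current]
```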

Cost and Tokens

Storywright uses your API key — you pay per token directly to your API provider.

  • Context assembly is designed to be efficient. Only relevant lorebook entries are included (keyword-matched), and previous scenes are compressed into short summaries rather than included in full.
  • Use a cheaper extraction model. Since the extraction model handles structured tasks (not creative writing), you can save money by assigning it a faster, less expensive model.
  • Token usage is displayed in the app. Check your API provider's dashboard for detailed cost breakdowns.
  • Typical cost: a 10-scene story runs roughly $0.50–$2.00 depending on models chosen and scene length.
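A back-of-the-envelope estimate shows how per-token pricing adds up. Every number below is hypothetical; check your provider's actual rates:

```python
# Hypothetical cost estimate: prices and token counts are made-up examples.
def estimate_cost(scenes, tokens_in_per_scene, tokens_out_per_scene,
                  price_in_per_m, price_out_per_m):
    total_in = scenes * tokens_in_per_scene
    total_out = scenes * tokens_out_per_scene
    return (total_in * price_in_per_m + total_out * price_out_per_m) / 1_000_000

# e.g. 10 scenes, ~8k prompt tokens and ~1.5k output tokens each,
# at $2/M input and $8/M output (illustrative rates only):
cost = estimate_cost(10, 8_000, 1_500, 2.0, 8.0)
```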

Stopping Generation

| Situation | What to Do | What Happens |
| --- | --- | --- |
| During streaming | Click Stop. | Generation cancels. Partial text is kept. |
| During batch generation | Cancel between scenes. | Completed scenes are saved. The current scene may be partially written. |
| During post-processing | Wait for it to finish. | Usually takes 5–10 seconds. |

Status Indicators

The toolbar shows the current generation state:

  • Planning... — AI is generating a scene plan.
  • Writing scene X... — AI is generating scene text.
  • Post-processing... — Extracting summaries, memories, and continuity.
  • Generating... — Batch generation in progress.

Tips

  • Start with a good plan. Generation quality depends heavily on the scene outline — a detailed plan produces better scenes.
  • Match the writing model to your budget and quality needs. A top-tier model produces noticeably better prose.
  • Set the extraction model to something fast and cheap. It doesn't need to be creative — it just needs to extract structured data reliably.
  • Review generated scenes before moving on. The AI builds on previous scenes, so catching issues early prevents them from compounding.
  • Use revision for targeted fixes rather than regenerating from scratch.
  • Batch generate for drafts, then go back and revise individual scenes for polish.

See Also

  • Stories — the story workflow
  • Context Systems — how memory, continuity, and summaries work in detail
  • Lorebooks — how lorebook entries are keyword-matched
  • Settings — configuring API and models