Choosing a Model

Model choice is the single biggest lever for quality, speed, and cost.


Let the App Choose

Click Auto-select in the model picker and Storywright picks the best model for each task from what's available. Models marked ✦ Recommended are particularly well-suited for their role.

Use auto-select when: setting up for the first time, after adding a new provider, or when you're not sure what to pick.


Three Roles

Storywright uses different models for different tasks. Configure each in Settings → Models or per-story in the workspace.

Role | What it does | What to look for
Writing | Generates and revises scene prose | Creativity, voice, prose quality
Planning | Creates scene outlines and refines plans | Reasoning ability, structured output
Extraction | Summarizes, extracts memories and world state | Fast, cheap, reliable (runs 3–4× per scene)

Tip: The extraction model is your biggest cost lever. Switch it to something fast and cheap — it doesn't need creativity, just reliable instruction-following.
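
To see why the extraction model dominates, it helps to count calls per role. The sketch below is a rough illustration in Python, not Storywright's actual accounting: only the 3–4 extraction runs per scene come from this guide, while the one-planning-call and one-writing-call-per-scene figures and the function names are assumptions made for the example.

```python
# Assumption: one planning call and one writing call per scene.
# Only the 3-4 extraction runs per scene come from the guide itself.
def call_counts(scenes: int, extraction_runs_per_scene: int) -> dict[str, int]:
    """Return the number of model calls per role for one story."""
    return {
        "planning": scenes,
        "writing": scenes,
        "extraction": scenes * extraction_runs_per_scene,
    }


def extraction_share(scenes: int, extraction_runs_per_scene: int) -> float:
    """Fraction of all model calls that go to the extraction model."""
    counts = call_counts(scenes, extraction_runs_per_scene)
    return counts["extraction"] / sum(counts.values())


for runs in (3, 4):
    share = extraction_share(scenes=10, extraction_runs_per_scene=runs)
    print(f"{runs} runs per scene: {share:.0%} of calls are extraction")
```

Under these assumptions, extraction accounts for well over half of all calls, which is why a cheaper extraction model moves the total more than any other single change.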


Budget Tiers

Free — Local Models

Run everything locally with Ollama or LM Studio. No API costs. Quality depends on your hardware: models with 70B+ parameters can compete with cloud models, while smaller models run faster but produce lower-quality output.

Budget — $0.50–$2.00 per story

Affordable cloud models. Good for daily writing and drafting. Save money by using a budget extraction model — it runs most often and doesn't need to be creative.

Premium — $2.00–$5.00 per story

Top-tier writing models for the best prose quality. Save money by mixing: premium writing model + budget extraction model.
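
The effect of mixing can be sketched with simple arithmetic. The per-call prices below are made-up placeholders chosen only to keep the numbers round; they are not real provider prices, and the one-writing-call and one-planning-call-per-scene figures are assumptions. Substitute your own provider's rates to get a meaningful estimate.

```python
# Placeholder prices in cents per call. These are NOT real provider prices;
# they exist only to illustrate the comparison.
PREMIUM_CENTS_PER_CALL = 10
BUDGET_CENTS_PER_CALL = 1


def story_cost_cents(
    scenes: int,
    writing_cents: int,
    planning_cents: int,
    extraction_cents: int,
    extraction_runs_per_scene: int = 4,
) -> int:
    """Estimate the cost of one story in cents.

    Assumes one writing call and one planning call per scene (an
    assumption for this sketch) plus several extraction runs per scene.
    """
    per_scene = (
        writing_cents
        + planning_cents
        + extraction_runs_per_scene * extraction_cents
    )
    return scenes * per_scene


all_premium = story_cost_cents(
    10, PREMIUM_CENTS_PER_CALL, PREMIUM_CENTS_PER_CALL, PREMIUM_CENTS_PER_CALL
)
mixed = story_cost_cents(
    10, PREMIUM_CENTS_PER_CALL, PREMIUM_CENTS_PER_CALL, BUDGET_CENTS_PER_CALL
)

print(f"All premium: ${all_premium / 100:.2f}")
print(f"Premium writing and planning, budget extraction: ${mixed / 100:.2f}")
```

With these placeholder numbers, swapping only the extraction model removes more than half of the estimated cost while leaving the writing model untouched.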


Google Gemini

Role | Models
Writing | Gemini 2.5 Pro
Planning | Gemini 2.5 Pro
Extraction | Gemini 2.0 Flash

Anthropic

Role | Models
Writing | Claude Sonnet 4, Claude Sonnet 4.5
Planning | Claude Opus 4
Extraction | Claude Haiku 3.5

OpenAI

Role | Models
Writing | GPT-4.1, GPT-4o
Planning | o4-mini, o3
Extraction | GPT-4.1-nano, GPT-4.1-mini

NanoGPT (open-weight)

Role | Models
Writing | Euryale 70B, Magnum 72B, Mistral Small Creative
Planning | GLM-5.1 Thinking, Gemini 2.5 Flash
Extraction | Gemini Flash, any small fast model

Local (Ollama / LM Studio)

Role | Models
Writing | Llama 3.3 70B, Mistral Large, Qwen 2.5 72B
Planning | DeepSeek R1, Qwen 2.5 72B
Extraction | Llama 3.1 8B, Phi-3 Mini, Gemma 2 9B

Tips

  • Test with a short story first. Generate 3 scenes to evaluate a model before committing to a full project.
  • Avoid reasoning/thinking models for extraction — they're slow and expensive for tasks that don't benefit from deep reasoning.
  • Per-story overrides let you use a premium model for your main project and a budget model for experiments.
  • In demo mode, models are pre-selected for you. Add your own API key to unlock the full model picker.
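
One way to picture per-story overrides is as a lookup that prefers the story's own setting and falls back to the global default. The sketch below is a hypothetical illustration in Python; the function `resolve_models`, the role keys, and the model names are placeholders invented for this example and do not describe how Storywright stores or resolves its settings.

```python
from typing import Optional

ROLES = ("writing", "planning", "extraction")


def resolve_models(
    defaults: dict[str, str],
    overrides: Optional[dict[str, str]] = None,
) -> dict[str, str]:
    """Pick a model per role: a per-story override wins, else the default."""
    overrides = overrides or {}
    unknown = set(overrides) - set(ROLES)
    if unknown:
        raise ValueError(f"unknown roles: {sorted(unknown)}")
    return {role: overrides.get(role, defaults[role]) for role in ROLES}


# Placeholder model names, not real model identifiers.
defaults = {
    "writing": "premium-writer",
    "planning": "premium-planner",
    "extraction": "budget-extractor",
}

# An experimental story swaps only the writing model.
experiment = resolve_models(defaults, {"writing": "budget-writer"})
print(experiment)
```

In this sketch the main project keeps the defaults unchanged, while the experiment overrides a single role and inherits the rest.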