Model Selection Guide¶
Storywright uses different AI models for different tasks. Choosing the right model for each role is the single biggest lever you have over quality, speed, and cost.
The Three Model Roles¶
Every story in Storywright uses three model roles (plus an embedding model, which is selected automatically). You can set each of the three independently — in Settings → Models for global defaults, or per-story in the story workspace.
Writing Model¶
Used for scene generation, revision, and rewriting. This is where quality matters most — the writing model produces the actual prose your readers will see.
What to look for: creativity, voice consistency, ability to follow complex instructions, strong prose quality.
Planning Model¶
Used for scene planning, plan refinement, and story metadata derivation. Needs good reasoning ability to break down story arcs into scenes, balance pacing, and respect your story structure template.
What to look for: reasoning ability, structured output, ability to follow multi-step instructions. Thinking/reasoning models work well here.
Extraction Model¶
Used for summarization, memory extraction, continuity updates, and thread extraction. These are structured tasks — the model follows a schema and pulls facts from text. This model runs 3–4 times per scene (once for each post-processing step), so cost and speed matter.
What to look for: fast, cheap, reliable instruction-following. Creativity is irrelevant. Avoid reasoning/thinking models here — they're slow and expensive for tasks that don't benefit from deep reasoning.
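Because the extraction model is called several times per scene, its price dominates the post-processing bill. A rough back-of-envelope comparison in Python — the per-token prices and token counts below are illustrative assumptions, not Storywright measurements:

```python
# Rough per-story cost comparison for the extraction role.
# All prices and token counts below are illustrative assumptions.

def extraction_cost(scenes, calls_per_scene, tokens_per_call, price_per_mtok):
    """Total extraction cost for a story, in dollars."""
    total_tokens = scenes * calls_per_scene * tokens_per_call
    return total_tokens * price_per_mtok / 1_000_000

# A 20-scene story, 4 post-processing calls per scene (summary,
# memory, continuity, threads), ~3k tokens per call.
cheap = extraction_cost(20, 4, 3_000, price_per_mtok=0.10)   # small fast model
pricey = extraction_cost(20, 4, 3_000, price_per_mtok=3.00)  # premium model

print(f"cheap model:   ${cheap:.2f}")   # ~$0.02
print(f"premium model: ${pricey:.2f}")  # ~$0.72
```

Same calls, same tokens — the only variable is the model's price, which is why a small fast model is the right default here.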
Embedding Model¶
Used for semantic memory recall — converting text into vectors so memories from earlier scenes can be recalled by meaning, not just keywords.
The embedding model is auto-selected based on your provider and doesn't appear in the model picker. If you use on-device embedding (Settings → Memory), the app uses a built-in MiniLM model with no API calls required.
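Recall by meaning boils down to ranking stored memory vectors by similarity to a query vector. A minimal sketch — in the app the vectors come from the embedding model (e.g. MiniLM); the toy 3-dimensional vectors here are made up for illustration:

```python
# Minimal sketch of semantic memory recall: rank stored memories by
# cosine similarity to a query embedding. Real embeddings have hundreds
# of dimensions; these toy 3-d vectors are illustrative only.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

memories = {
    "Mara lost her locket at the harbor": [0.9, 0.1, 0.0],
    "The council voted to close the mines": [0.1, 0.8, 0.3],
    "A storm destroyed the fishing fleet": [0.7, 0.0, 0.5],
}
query = [0.8, 0.1, 0.4]  # pretend embedding of "what happened at sea?"

ranked = sorted(memories, key=lambda m: cosine(memories[m], query), reverse=True)
print(ranked[0])  # the memory closest in meaning, not keyword overlap
```

Note that the top match shares no keywords with the query — that is the point of embedding-based recall.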
Recommended Models by Provider¶
NanoGPT / Open Weights¶
NanoGPT gives you access to many open-weight and commercial models under one API key. Here are the best picks for each role:
| Role | Recommended Models |
|---|---|
| Writing | Euryale 70B (Sao10K), Magnum 72B (anthracite-org), Anubis 70B, Mistral Small Creative |
| Planning | GLM-5.1 Thinking, Gemini 2.5 Flash, DeepSeek R1 |
| Extraction | Gemini Flash, Llama 3.2 8B, any small fast model |
OpenAI¶
| Role | Recommended Models |
|---|---|
| Writing | GPT-4.1, GPT-4o |
| Planning | o4-mini, o3 |
| Extraction | GPT-4.1-nano, GPT-4.1-mini |
Anthropic (via NanoGPT or direct)¶
| Role | Recommended Models |
|---|---|
| Writing | Claude Sonnet 4, Claude Sonnet 4.5 |
| Planning | Claude Opus 4 |
| Extraction | Claude Haiku 3.5 |
Google Gemini¶
| Role | Recommended Models |
|---|---|
| Writing | Gemini 2.5 Pro |
| Planning | Gemini 2.5 Pro |
| Extraction | Gemini 2.0 Flash |
Local Models (Ollama / LM Studio)¶
| Role | Recommended Models |
|---|---|
| Writing | Llama 3.3 70B, Mistral Large, Qwen 2.5 72B — larger models produce better fiction |
| Planning | DeepSeek R1, Qwen 2.5 72B — reasoning-capable models |
| Extraction | Llama 3.2 8B, Phi-3 Mini, Gemma 2 9B — small and fast |
Note: Local models need sufficient VRAM. A 70B model typically requires 40+ GB of VRAM (or RAM with CPU inference, which is much slower). If you have limited hardware, use a smaller writing model and pair it with an 8B extraction model.
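The 40+ GB figure follows from simple arithmetic: parameter count times bytes per weight, plus overhead for the KV cache and activations. A sketch using rule-of-thumb numbers (the 20% overhead factor is an assumption, and real requirements vary by runtime and context length):

```python
# Rough VRAM estimate for a local model: parameters x bytes per weight,
# plus ~20% overhead for KV cache and activations. These are rules of
# thumb, not exact requirements.

def vram_gb(params_billions, bytes_per_weight, overhead=1.2):
    return params_billions * bytes_per_weight * overhead

print(f"70B @ 4-bit: ~{vram_gb(70, 0.5):.0f} GB")   # ~42 GB
print(f"70B @ fp16:  ~{vram_gb(70, 2.0):.0f} GB")   # ~168 GB
print(f"8B  @ 4-bit: ~{vram_gb(8, 0.5):.0f} GB")    # ~5 GB
```

This is why 4-bit quantization is the usual way to run 70B models locally, and why an 8B extraction model fits comfortably on consumer GPUs.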
Budget Tiers¶
Free — Local Models¶
Run everything locally with Ollama or LM Studio. No API costs.
- Best for: privacy, experimentation, unlimited generation
- Trade-off: quality depends on your hardware. 8B models are fast but produce noticeably lower-quality fiction than cloud models. 70B+ models are competitive with cloud but require serious hardware.
- Tip: Use a large model for writing and a small model for extraction. The extraction model runs frequently but doesn't need creativity.
Budget — $0.50–$2.00 per story¶
Use affordable API models: Gemini Flash, GPT-4.1-mini, or NanoGPT's cheaper open-weight models.
- Best for: daily writing, high-volume generation, drafting
- Tip: Set extraction to the cheapest model available (GPT-4.1-nano, Gemini Flash). It runs 3–4x per scene and doesn't need quality — just reliability.
Premium — $2.00–$5.00 per story¶
Use top-tier models for writing: GPT-4.1, Claude Sonnet 4, Gemini 2.5 Pro.
- Best for: polished output, final drafts, maximum quality
- Tip: Mix tiers. Use a premium writing model, a mid-tier planning model, and a budget extraction model. This gets you top prose quality without premium prices on every API call.
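To see why mixing tiers pays off, consider a per-role cost split. The token volumes and prices below are assumptions for the sake of the arithmetic, not app measurements:

```python
# Illustrative per-story cost when mixing tiers across roles.
# Token volumes and prices are assumptions, not app measurements.

PRICES = {"premium": 3.00, "mid": 0.60, "budget": 0.10}  # $ per 1M tokens

def story_cost(tokens_by_role, tier_by_role):
    return sum(tokens * PRICES[tier_by_role[role]] / 1_000_000
               for role, tokens in tokens_by_role.items())

tokens = {"writing": 400_000, "planning": 100_000, "extraction": 240_000}

mixed = story_cost(tokens, {"writing": "premium", "planning": "mid",
                            "extraction": "budget"})
all_premium = story_cost(tokens, {role: "premium" for role in tokens})

print(f"mixed tiers: ${mixed:.2f}")   # ~$1.28
print(f"all premium: ${all_premium:.2f}")  # ~$2.22
```

Under these assumptions, the writing model accounts for nearly all of the mixed-tier cost — exactly where the quality shows up in the final prose.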
Auto-Select¶
Storywright can automatically pick the best model for each role based on what's available from your configured providers.
How it works: When you click Auto-select in the model picker (or when models are first loaded), the app scans your available model list and picks the best match for each role using a preference chain — checking for known high-quality models first, then falling back to the best available option.
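A preference chain like this can be sketched as an ordered walk over known-good names with a fallback. The chain contents and matching logic below are hypothetical — Storywright's actual chain is internal:

```python
# Hypothetical sketch of preference-chain auto-select: try each
# preferred name pattern in order, pick the first available match,
# else fall back to the first available model. Patterns and model
# names here are illustrative, not Storywright's real chain.

def auto_select(available, preference_chain):
    for preferred in preference_chain:
        for model in available:
            if preferred in model.lower():
                return model
    return available[0] if available else None

available = ["gemini-2.0-flash", "gpt-4.1-nano", "llama-3.2-8b"]
extraction_chain = ["nano", "flash", "mini", "8b"]
print(auto_select(available, extraction_chain))
```

The chain encodes role-specific priorities, so the same provider list can yield different picks for writing, planning, and extraction.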
Models marked with ✦ Recommended in the model picker are ones Storywright considers particularly well-suited for that role based on the model's known characteristics.
When to use auto-select:

- First-time setup — let the app pick sensible defaults
- After adding a new provider — auto-select will consider the new models
- When you're not sure which model to use
When to pick manually:

- You know exactly which model you want
- You want to experiment with a specific model
- You're optimizing cost by mixing budget and premium models
Tips¶
- The extraction model is your biggest cost lever. It runs 3–4 times per scene (summary, memory, continuity, threads). Switching from a premium model to a fast/cheap one here can cut your per-story cost in half with no noticeable quality loss.
- Thinking/reasoning models (o3, DeepSeek R1, models with "thinking" in the name) are great for planning but terrible for extraction — they're slow and expensive for tasks that don't benefit from deep reasoning. Storywright automatically avoids them for the extraction role during auto-select.
- Per-story overrides let you use a premium model for your main project and a budget model for experiments — without changing global settings.
- Test with a short story first. Generate a 3-scene story to evaluate a model before committing to a full-length project.
- Local + cloud hybrid: Use a local extraction model (free, fast) and a cloud writing model (best quality). Set up two providers — one local, one cloud — and assign each to its role.
Related Guides¶
- Settings — configuring providers and model roles
- Generation — how models are used during generation
- Context Systems — how memory and continuity work