Reducing token usage in OpenClaw
Two goals here:
- make runs cheaper
- keep the assistant useful while doing it
Biggest win first: start new chats for unrelated work. That is usually a better token-saving move than obsessing over one clever prompt tweak.
At a glance
- Start new chats for unrelated work to stop context bloat.
- Keep prompts and instructions tight instead of repeating giant background dumps.
- Pick cheaper model paths where the task does not need an expensive brain.
Why token usage gets high
Token burn usually comes from one or more of these:
- long chat history
- repeated context being re-sent every turn (the sketch after this list makes the math concrete)
- giant pasted logs or files
- overlong system/project context
- tool-heavy loops that keep adding transcript weight
- using a more expensive model than the task needs
- asking one conversation to carry too many unrelated jobs
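To see why history dominates, here is a back-of-envelope cost model. It is a sketch, not OpenClaw's actual accounting: it just assumes the full history is re-sent on every turn, at a made-up average of 500 tokens per turn.

```python
# Back-of-envelope model, not OpenClaw's actual accounting.
# Assumption: the full history is re-sent on every turn.

def chat_cost(turns: int, tokens_per_turn: int = 500) -> int:
    """Total input tokens paid across a conversation."""
    return sum(turn * tokens_per_turn for turn in range(1, turns + 1))

one_long_chat = chat_cost(40)        # one 40-turn conversation
two_split_chats = 2 * chat_cost(20)  # same work split into two chats

print(f"one 40-turn chat:  {one_long_chat:,} input tokens")   # 410,000
print(f"two 20-turn chats: {two_split_chats:,} input tokens")  # 210,000
```

The exact numbers do not matter; the shape does. Input cost grows roughly with the square of thread length, which is why splitting work is the single biggest lever.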
Biggest ways to reduce token usage
1) Start fresh chats more often
This is the most boring advice, and also one of the most effective. If a conversation has grown huge and the current task is basically unrelated to the old one, start a fresh chat. Why it helps:
- less old context gets dragged forward
- less transcript baggage
- fewer irrelevant turns in the model window
2) Split unrelated work into separate sessions
Do not use one monster chat for:
- debugging
- writing docs
- planning marketing
- shell troubleshooting
- product ideation
3) Stop pasting giant blobs unless they matter
If you paste:
- huge logs
- full config dumps
- giant stack traces
- entire source files
you pay for every line, and most of it is noise (a trimming sketch follows this list). Usually all the model needs is:
- the exact error
- the relevant snippet
- the command output around the failure
- the minimal file section needed
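One way to do that trimming before you paste. This is a generic sketch; the path and error pattern are placeholders for whatever failure you are actually chasing.

```python
import re

def trim_log(path: str, pattern: str, context: int = 10) -> str:
    """Return only the lines around the first match of `pattern`."""
    with open(path, encoding="utf-8", errors="replace") as f:
        lines = f.read().splitlines()
    for i, line in enumerate(lines):
        if re.search(pattern, line):
            lo, hi = max(0, i - context), i + context + 1
            return "\n".join(lines[lo:hi])
    return ""  # no match: paste nothing rather than everything

# Hypothetical path and pattern; swap in your own.
print(trim_log("app.log", r"ERROR|Traceback"))
```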
4) Use targeted file reads instead of dumping everything into chat
If the environment supports tool-based reads, use those instead of pasting giant files manually. That keeps the working context tighter and easier to reason over.
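If no tool-based read is available, you can still approximate one by hand. A minimal sketch; the file name and line range are hypothetical:

```python
from itertools import islice

def read_lines(path: str, start: int, end: int) -> str:
    """Read only lines start..end (1-indexed, inclusive)."""
    with open(path, encoding="utf-8") as f:
        return "".join(islice(f, start - 1, end))

# Just the function under discussion, not the whole module.
print(read_lines("server.py", 120, 160))
```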
5) Use cheaper/faster models for routine work
Not every task needs the fanciest brain in the building (a routing sketch follows these lists). Use lighter models for:
- formatting
- simple summaries
- list cleanup
- repetitive transformation work
- low-risk drafting
Save the heavier model for:
- architecture
- tricky debugging
- ambiguous reasoning
- high-value writing where quality matters
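The routing sketch. The model names are placeholders, not real OpenClaw identifiers; the point is that the decision can be a boring lookup table.

```python
# Placeholder model names, not real OpenClaw identifiers.
ROUTES = {
    "formatting": "small-fast-model",
    "summaries": "small-fast-model",
    "cleanup": "small-fast-model",
    "architecture": "big-reasoning-model",
    "debugging": "big-reasoning-model",
}

def pick_model(task_kind: str) -> str:
    # Default cheap; escalate only for the hard cases.
    return ROUTES.get(task_kind, "small-fast-model")

assert pick_model("formatting") == "small-fast-model"
assert pick_model("architecture") == "big-reasoning-model"
```

Defaulting to the cheap model keeps the expensive path opt-in rather than habitual.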
6) Keep project/reference context lean
Sometimes token burn is coming from the environment itself (a quick size audit is sketched after this list):
- giant always-loaded instructions
- bloated project docs
- duplicated memory/context files
- oversized startup context
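The promised size audit. A rough sketch: the file names are examples of always-loaded context, so point it at whatever your setup actually injects on startup, and the four-characters-per-token figure is a common rule of thumb, not an exact count.

```python
from pathlib import Path

# Example file names only; substitute your own startup context files.
CANDIDATES = ["AGENTS.md", "SOUL.md", "TOOLS.md", "memory.md"]

for name in CANDIDATES:
    p = Path(name)
    if p.exists():
        chars = len(p.read_text(encoding="utf-8", errors="replace"))
        print(f"{name}: ~{chars // 4:,} tokens")  # chars/4 heuristic
```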
7) Avoid repeated re-explaining inside the same thread
If the assistant already knows the local objective in the current session, avoid re-pasting the whole backstory every few messages unless the context genuinely changed.
8) Summarize long work before continuing
If a thread became long but you still need continuity, make or use a summary instead of carrying every raw turn forever. A good compact summary beats endless transcript drag.
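What a handoff summary can look like in practice. The field names here are just a suggested shape, not an OpenClaw feature:

```python
# A suggested shape for the handoff, not an OpenClaw feature.
HANDOFF_PROMPT = """Summarize this thread for a fresh session:
- goal: what we are trying to do
- done: what is finished and verified
- open: what remains, in priority order
- gotchas: constraints and failed approaches to avoid repeating
Keep it under 200 words."""
```

Ask for that at the end of the old thread, then open the new chat with only the answer.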
9) Be careful with tool loops
Tool use is useful, but repeated loops can quietly inflate the transcript. Common examples (two cheap guards are sketched after this list):
- checking status too often
- re-reading the same files repeatedly
- repeated retries with long outputs
- verbose command output copied back into context again and again
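The two guards. Sketches of the habit, not OpenClaw internals: cache repeated reads so identical output is not re-emitted, and cap retries so a failing step does not loop forever.

```python
import functools

@functools.lru_cache(maxsize=64)
def read_file_once(path: str) -> str:
    """Cache reads so checking the same file twice adds nothing new."""
    with open(path, encoding="utf-8") as f:
        return f.read()

def run_with_budget(attempt, max_tries: int = 3):
    """Stop retrying the same failing step after a fixed budget."""
    last_error = None
    for _ in range(max_tries):
        try:
            return attempt()
        except Exception as exc:
            last_error = exc
    raise RuntimeError(f"gave up after {max_tries} tries: {last_error}")
```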
10) Keep prompts specific
A vague request often creates more back-and-forth than a specific one. Compare:
Vague: “help with this project”
Better: “draft a troubleshooting article for the error ‘Origin not allowed’ in OpenClaw, keep it practical and SEO-friendly”
More specific input often means fewer turns and fewer tokens overall.
Practical habits that save tokens fast
Good habit: fresh thread per major task
One chat for docs. One for debugging. One for marketing.
Good habit: only include relevant snippets
Do not send 500 lines when 20 lines prove the point.
Good habit: use summaries when handing off or resuming
A compact state summary is much cheaper than raw history.
Good habit: pick the right model for the job
Do not use a sledgehammer to crack a peanut.
Good habit: stop retrying blindly
If something failed three times the same way, new information matters more than more repetition.
Common mistakes
Keeping one immortal chat forever
Comforting. Expensive.
Treating every task like it needs the full backstory
Usually false.
Pasting giant files instead of relevant excerpts
Classic token bonfire.
Using premium reasoning for routine cleanup work
Overkill.
Repeating tool calls with nearly identical output
Also expensive.
If you are debugging unusually high usage
Check these first (a small audit script follows this list):
- is the thread too long?
- are there giant tool outputs in history?
- is startup/project context oversized?
- are unrelated tasks mixed together?
- is the selected model heavier than needed?
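The audit script mentioned above. It assumes you have the session history as a JSON list of messages with a "content" field; adjust the field names to whatever your export actually contains.

```python
import json

# Assumes a JSON export: a list of messages with a "content" field.
with open("session-export.json", encoding="utf-8") as f:
    messages = json.load(f)

biggest = sorted(messages, key=lambda m: len(m.get("content") or ""),
                 reverse=True)
for m in biggest[:5]:
    text = m.get("content") or ""
    print(f"~{len(text) // 4:>6,} tokens  {text[:60]!r}")  # chars/4 heuristic
```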
A simple low-waste workflow
If you want a sane default approach:
- start a fresh chat for each major task
- include only the minimum necessary context
- use targeted reads/snippets
- summarize before long handoffs
- switch to lighter models for routine work
Final thought
Reducing token usage in OpenClaw is usually not about becoming weirdly stingy. It is about removing junk context, splitting work cleanly, and not paying the model to repeatedly remember things that no longer matter. Which, honestly, is a decent life lesson too.
Related pages
- Why exec tools may not appear on Windows or WSL1 in OpenClaw
- ACP vs normal OpenClaw agents: what’s the difference?
- OpenClaw not responding after an update? Start here
- How to enable the ACPX plugin in OpenClaw
