OpenClaw Upgrade Playbook
This page is the actual skill we invoke before every openclaw update. I’m publishing it because I’ve watched this playbook grow from a three-line shell script into a nine-phase ritual, and every phase was bought with an incident.
If you’re running OpenClaw in production, you don’t have to copy this wholesale — but you probably do need your own version of most of it, and the incident history below should save you from learning the same lessons the expensive way.
The lifecycle at a glance
Nothing moves to the next phase until the current one is clean. No power-through on a failed smoke test. No “I’ll fix that later” on a clobbered auth profile.
Why each phase exists
Phase 1 — Backup
The origin phase. Nothing controversial: openclaw backup runs before anything else touches the install, and the script auto-verifies the archive. If the backup fails, the upgrade stops. Every other phase can be rolled forward after a partial failure; a broken upgrade with no backup is data loss.
The one thing worth saying out loud: always record the current version (openclaw --version) before updating. That matters for the changelog, but even more for the rollback story: if you have to run openclaw backup restore a week later, you’ll want to know which patched state you were on.
Phase 2 — Pre-update repair, update, doctor
Three sub-steps: run a pre-update config-repair script (detects breaking config changes since the last known-good version and auto-fixes what it can), run openclaw update, then run openclaw doctor --fix --yes to apply any migrations the new version brought.
The “no non-release builds” rule. Never use --tag main, --tag beta, or --tag dev. Stable release tags (e.g. v2026.4.15) or no tag at all. A dev build installed with --tag main once broke the gateway install on this deployment and required a full npm install -g openclaw@latest to recover. Dev tags are for contributors running against source, not for production gateways.
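A wrapper around the update command could enforce this rule mechanically. The sketch below is one way to do it, assuming stable tags always have the shape of the example above (a `v` followed by three dot-separated numbers). That pattern and the name `is_allowed_tag` are illustrative assumptions, not part of the OpenClaw CLI.

```python
import re

# Assumption: stable tags look like the example in the text, e.g. v2026.4.15.
STABLE_TAG = re.compile(r"v\d{4}\.\d{1,2}\.\d{1,2}")

# The three tags the rule forbids outright.
FORBIDDEN_TAGS = {"main", "beta", "dev"}


def is_allowed_tag(tag):
    """Return True for no tag (latest stable) or a stable-looking release tag."""
    if tag is None:
        return True
    if tag in FORBIDDEN_TAGS:
        return False
    # fullmatch rejects suffixes such as "-beta.1" as well as stray whitespace.
    return STABLE_TAG.fullmatch(tag) is not None
```

A wrapper would call `is_allowed_tag` on its argument and refuse to run the update when it returns `False`.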
Phase 3 — Re-check temporary mitigations, then re-apply patches
This is the phase that exists because upstream fixes land on their own schedule, and you have both defensive disables (features you turned off pending an upstream fix) and local patches (code you’re carrying on top of a release to work around a bug).
Re-check temporary mitigations. Maintain a table of features you’ve disabled with a citation to the upstream issue. On every upgrade, check whether the issue is closed or the CHANGELOG mentions the fix — and re-enable if so. If you don’t do this, you accumulate dead weight forever. A feature worth disabling is worth turning back on the moment the fix ships.
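The re-enable decision is simple enough to write down as code. This is a minimal sketch, assuming each table row is stored as a small record; the `Mitigation` class and its field names are hypothetical. The one detail worth getting right is comparing versions numerically, because plain string comparison puts `2026.4.15` before `2026.4.5`.

```python
from dataclasses import dataclass


@dataclass
class Mitigation:
    feature: str             # config key that was switched off
    upstream_issue: str      # citation for why it was disabled
    fix_expected_after: str  # hypothetical field: re-check once past this version


def parse_version(text):
    """Turn "v2026.4.15" or "2026.4.15" into (2026, 4, 15) for numeric comparison."""
    if text.startswith("v"):
        text = text[1:]
    return tuple(int(part) for part in text.split("."))


def can_reenable(mitigation, current_version, issue_closed, changelog_mentions_fix):
    """Re-enable when the issue is closed, or a newer version visibly ships the fix."""
    newer = parse_version(current_version) > parse_version(mitigation.fix_expected_after)
    return issue_closed or (newer and changelog_mentions_fix)
```

Whether the issue is closed and whether the CHANGELOG mentions the fix are passed in as plain booleans here; how you look those up is left to your own tooling.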
The patch registry. At peak on this deployment there were 8 simultaneously-active patches against the OpenClaw dist (BlueBubbles reply threading, WhatsApp reaction handling, inbound dedup, a WA listener fix, a webhook-route tweak, and more). Without a registry it would have been impossible to remember which patches were still needed after each release — upstream would fix one, I’d miss the fix, and either re-apply an unnecessary patch or forget to re-apply a still-necessary one.
The registry is a JSON file (config/patches.json in our setup) with one entry per patch:
```json
{
  "id": "bb-inbound-dedup",
  "status": "active",
  "upstreamIssue": "https://github.com/openclaw/openclaw/pull/31159",
  "introducedIn": "2026.4.10",
  "retiredIn": null,
  "scriptPath": "~/.local/bin/openclaw-patch-bb-inbound-dedup"
}
```

Each patch script is idempotent and supports --check (exit 0 if already patched or upstream fixed it, exit 1 if it needs applying). The playbook runs --check on every active patch after every upgrade, re-applies what’s needed, and flags retirement candidates (patches where --check now exits 0 because upstream shipped the fix). Retired patches stay in the file with retiredIn set so you can see the history.
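The `--check` contract described above maps naturally onto a small planning loop. The following is a sketch rather than the script that runs on this deployment: it assumes the registry file is a JSON array of entries like the one shown (the text does not say what the top-level container is), and the function names are made up for illustration.

```python
import json
import subprocess
from pathlib import Path


def load_registry(path):
    """Read the registry file. Assumes the top level is a JSON array of entries."""
    return json.loads(Path(path).expanduser().read_text())


def active_patches(entries):
    """Keep only entries that are still marked active."""
    return [entry for entry in entries if entry.get("status") == "active"]


def check_patch(entry, run=subprocess.run):
    """Run `<scriptPath> --check` and map the exit code to an action."""
    script = str(Path(entry["scriptPath"]).expanduser())
    code = run([script, "--check"]).returncode
    if code == 0:
        return "skip"    # already patched, or upstream fixed it
    if code == 1:
        return "apply"   # patch is missing and still needed
    return "failed"      # any other exit code is treated as a failure


def plan_patches(entries, run=subprocess.run):
    """Return {patch id: action} for every active patch."""
    return {entry["id"]: check_patch(entry, run) for entry in active_patches(entries)}
```

Directly after an upgrade, a `skip` result is worth a second look: it is either a patch that survived the update or a retirement candidate. The `run` parameter exists only so the loop can be exercised without real patch scripts.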
Phase 3b — Auth health check
Upgrades and re-auth flows are the two most common triggers for OAuth token sink on this deployment: a process that’s supposed to refresh a single profile instead creates a duplicate, the allowlist gets clobbered to its default, and the next model call fails in a way that’s hard to root-cause.
The check verifies:
- No duplicate OAuth profiles per provider/account.
- Auth mode matches credential type (api_key profiles don’t have OAuth-typed creds and vice versa; mismatches throw FailoverError at call time, not at config-load time).
- Model allowlist intact (all models + aliases + params). Re-auth flows have historically clobbered this to a single-model default.
- Plugin slots (contextEngine, memory) still explicitly set. An upstream upgrade once left these unset, and unset = dormant = silent failure.
- auth.order still pins the correct profile per provider.
If any of these fail, fix them before running smoke tests. Fixing after smoke tests means you can’t trust the smoke tests.
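The first two checks are easy to automate once the profiles are in memory. The sketch below assumes each profile has been loaded into a dict with the keys `id`, `provider`, `account`, `mode`, and `credType`. Those key names and the `"oauth"` mode value are hypothetical stand-ins, not the real OpenClaw config schema, so adapt them to whatever your config actually contains.

```python
from collections import Counter


def find_auth_problems(profiles):
    """Return human-readable problems: duplicate OAuth profiles and mode mismatches."""
    problems = []

    # Check 1: at most one OAuth profile per (provider, account) pair.
    oauth_pairs = Counter(
        (p["provider"], p["account"]) for p in profiles if p["mode"] == "oauth"
    )
    for (provider, account), count in sorted(oauth_pairs.items()):
        if count > 1:
            problems.append(f"duplicate oauth profile: {provider}/{account} x{count}")

    # Check 2: the declared auth mode must match the stored credential type.
    for p in profiles:
        if p["mode"] != p["credType"]:
            problems.append(f"mode/credential mismatch: {p['id']}")

    return problems
```

An empty list means these two checks passed; the allowlist, plugin-slot, and `auth.order` checks still need their own comparisons against your known-good values.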
Phase 4 — Smoke tests
The minimum bar: a script that exercises one tool from every loaded plugin against a dedicated smoke-test agent. Plugin SDK breaking changes sometimes ship quietly — the plugin loads, the gateway starts clean, but the first tool call fails because a function signature changed. Smoke tests catch this before the user does.
On this deployment the script is scripts/smoke-test-plugins.sh — a bash file that makes one gateway call per plugin and checks exit codes. Common post-upgrade failures:
- Plugin dist filenames changed → patch scripts target the old filename.
- New SDK imports need an npm install in the plugin directory.
- Plugin IDs changed → config still references the old name.
Also verify channel health: openclaw channels status --probe hits each configured channel and reports connected/disconnected. A BlueBubbles or WhatsApp regression that disconnects the webhook is cosmetic-looking in logs but catastrophic for inbound messages.
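The "one call per plugin, check exit codes" idea fits in a few lines. The real script on this deployment is bash; the Python sketch below only illustrates the shape. The commands you pass in are your own: the text does not show the exact gateway call used per plugin, so none is invented here.

```python
import subprocess


def run_smoke_tests(commands, run=subprocess.run):
    """Run one command per plugin and return the names of plugins that failed.

    commands: dict mapping a plugin name to the argv list that exercises
    one of its tools. A non-zero exit code counts as a failure.
    """
    failed = []
    for plugin, argv in commands.items():
        result = run(argv)
        if result.returncode != 0:
            failed.append(plugin)
    return failed
```

A caller would treat any non-empty result as a hard stop, in line with the rule that a failed smoke test blocks the rest of the upgrade.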
Phase 5 — Refresh knowledge base
If your agent references the upstream docs to answer user questions about its own platform (ours does: the CLAUDE.md points at a local clone of ~/GitHub/openclaw/docs/), the agent’s mental model of what’s possible drifts with every release. New CLI commands, new config keys, renamed files — the agent will keep trying old patterns until you refresh.
The refresh is:
- git pull on the upstream docs clone.
- Walk the docs/ tree and diff against the last-known doc map.
- For every new file, read the first 20-30 lines and add it to the map.
- Check openclaw --help for new top-level commands.
- Scan the configuration-reference.md diff for new config keys.
- If anything changed that’s operationally relevant, update your project’s CLAUDE.md.
This is the phase that gets skipped most often on other people’s upgrades, and it’s the phase that causes the most “why is the agent trying a command that doesn’t exist” friction three weeks later.
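The walk-and-diff step from the list above can be sketched as follows. It assumes the last-known doc map can be reduced to a set of relative paths; how your map is actually stored is up to you. The skipped translation directories match the ones named in the skill further down, and the function names are illustrative.

```python
from pathlib import Path


def scan_docs(docs_root, skip_dirs=("ja-JP", "zh-CN")):
    """Return the set of .md files under docs_root as POSIX-style relative paths."""
    root = Path(docs_root)
    found = set()
    for path in root.rglob("*.md"):
        rel = path.relative_to(root)
        # Skip translated copies of the docs.
        if rel.parts[0] in skip_dirs:
            continue
        found.add(rel.as_posix())
    return found


def diff_doc_map(known, current):
    """Compare the last-known set of doc paths against the current one."""
    return {
        "added": sorted(current - known),
        "removed": sorted(known - current),
    }
```

Files listed under `added` are the ones to open and summarize; a rename shows up as one `removed` entry plus one `added` entry.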
Phase 6 — “What’s new for you” briefing
Release notes are written for maintainers, not operators. You need a filtered view against your actual usage — and an explicit decision on each new feature: enable now / worth investigating / good to know / not relevant.
The briefing checks:
- New CLI commands or flags that could simplify an existing workflow.
- New config keys with default false/off (opt-in features you might want).
- New channel capabilities for iMessage/WhatsApp/Telegram.
- New plugin SDK features that could improve your custom plugins.
- New skills or tools that could unlock automation you haven’t considered.
- Memory/context improvements.
- New default behaviors the openclaw doctor output flagged as migrated.
- Entirely new doc pages (often signal new capabilities not obvious from the changelog).
Output is categorized:
```
### Enable Now (low-risk, clear benefit)
- [Feature]: [why you want it]
  Enable: openclaw config set path.to.key value

### Worth Investigating (try when you have time)
- [Feature]: [potential benefit]
  Docs: path/to/doc.md

### Good to Know (no action needed)
- [Change]: [how it affects you]

### Not Relevant (skipped)
- [One-liner per feature that doesn't apply]
```

Phase 7 — Update config baseline
Drift detection only works against a snapshot. After a successful upgrade:
```bash
cp ~/.openclaw/openclaw.json <your-config-repo>/config/config-baseline.json
```

Then bump meta.lastTouchedVersion in the repo copy of openclaw.json so the pre-update repair script in Phase 2a knows what to migrate from next time. Forget this once and every future drift check compares against the pre-upgrade state and tells you the same lies forever.
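To make "drift detection against a snapshot" concrete, here is a minimal sketch that compares two already-parsed JSON configs and lists the dotted key paths that differ. It is an illustration, not the drift checker used on this deployment, and the function names are made up.

```python
_MISSING = object()  # sentinel for "key absent on this side"


def flatten(obj, prefix=""):
    """Flatten nested dicts into {dotted.path: value}. Lists and empty dicts are leaves."""
    if isinstance(obj, dict) and obj:
        flat = {}
        for key, value in obj.items():
            flat.update(flatten(value, f"{prefix}.{key}" if prefix else key))
        return flat
    return {prefix: obj}


def config_drift(baseline, live):
    """Return the sorted key paths whose values differ between baseline and live."""
    base_flat, live_flat = flatten(baseline), flatten(live)
    all_keys = base_flat.keys() | live_flat.keys()
    return sorted(
        key for key in all_keys
        if base_flat.get(key, _MISSING) != live_flat.get(key, _MISSING)
    )
```

Load both files with `json.load` and pass the results in. If the baseline was never refreshed after the last upgrade, every key that upgrade migrated shows up here as drift, which is the "same lies forever" failure mode.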
Phase 8–9 — Changelog and report
Record what you did. Previous → new version, patches re-applied, patches retired, patches that failed, config changes, smoke test results, knowledge base updates, notable new features from the briefing. Future-you will thank you when the next incident turns out to be a regression from a version you upgraded through three weeks ago.
The rollback
If anything goes catastrophically wrong:
```bash
openclaw backup restore <latest-backup-path>
openclaw gateway restart
openclaw health && openclaw channels status --probe
```

This is the reason Phase 1 is non-negotiable.
Safety rules
- Never upgrade without a verified backup.
- Never use non-release tags (main, beta, dev) for production.
- Never skip patch re-application. Patches exist because upstream hasn’t fixed the issue yet; if upstream fixed it, the --check will tell you and the patch gets retired.
- Never mark the upgrade complete if smoke tests fail.
- Always restart the gateway exactly once after all patches are applied (not per-patch — each restart is disruptive).
- Always verify plugin slots are explicitly set after an upgrade. Unset slots can drop silently to “dormant.”
- If any phase fails, stop and report. Don’t power through.
What to adapt for your setup
The skill below assumes:
- Backup script: ~/.local/bin/openclaw-backup. Yours might just be openclaw backup create.
- Patch registry: <config-repo>/config/patches.json. You might not have one yet; if so, the first patch you need to carry is the moment to start the file, not a reason to skip this phase.
- Smoke test script: scripts/smoke-test-plugins.sh. Could be as simple as openclaw health + one tool call per plugin you care about.
- Config repo: we keep a Git repo with the gateway’s openclaw.json committed separately from the live ~/.openclaw/openclaw.json and sync both. If you only have the live copy, the baseline-snapshot step is still useful: just snapshot to the same file you already have.
- Agent knowledge-refresh target: we update a CLAUDE.md that the agent reads. If your agent doesn’t reference the upstream docs, skip Phase 5.
The skill, verbatim
The Claude Code skill file we actually invoke. Lightly scrubbed for PII (user-specific paths and naming), otherwise unchanged from the version that runs on this deployment.
---
name: openclaw-upgrade
description: |
  Perform a full OpenClaw upgrade with backup, verification, patch re-application,
  smoke tests, and knowledge base refresh. Use when ready to actually upgrade
  (after reviewing changes with /openclaw-release). Covers the complete lifecycle:
  backup -> update -> doctor -> patches -> smoke tests -> knowledge refresh -> docs.
argument-hint: "[version-tag]"
allowed-tools: Read, Edit, Write, Bash(openclaw *), Bash(~/.local/bin/openclaw-*), Bash(bash */smoke-test-plugins.sh*), Bash(git *), Bash(python3 *), Bash(find *), Bash(gh *), Bash(tailscale *), Agent, WebFetch
---
# OpenClaw Upgrade
End-to-end upgrade executor. Run this after reviewing the release with `/openclaw-release` and deciding to proceed.
## Pre-flight Checklist
Before starting, confirm with the user:
- Have they reviewed the release notes? (suggest `/openclaw-release` if not)
- Is now a good time? (gateway will restart, brief message delivery pause)
## Phase 1 — Backup
Create a verified backup before touching anything.
```bash
~/.local/bin/openclaw-backup
```

Confirm the backup succeeded (script auto-verifies). If it fails, **stop** and report the error; never upgrade without a backup.

Also snapshot the current version for the changelog:

```bash
openclaw --version
```
## Phase 2 — Update
### 2a. Pre-update config repair
Run `check-update.sh` to auto-fix any breaking config issues before the upgrade (detects version changes since last run, explains broken config, auto-repairs with `--fix`):

```bash
bash "${CLAUDE_PLUGIN_ROOT}/scripts/check-update.sh" --fix
```

### 2b. Run the update

If a specific version tag was provided:
```bash
openclaw update --tag <version>
```

Otherwise, update to latest stable:
```bash
openclaw update
```

**CRITICAL:** Never use `--tag main`, `--tag beta`, or `--tag dev`. Only stable release tags (e.g., `v2026.4.5`) or no tag at all.

### 2c. Run doctor

```bash
openclaw doctor --fix --yes
```

Review the doctor output for:
- Config migrations applied
- Deprecation warnings
- Any failures that need manual intervention

Note the new version:
```bash
openclaw --version
```
## Phase 3 — Re-apply Patches
### 3.0 Re-check temporary mitigations
Before patches, check whether any **disabled-pending-upstream-fix** features can be re-enabled on the new version. Maintain a table (examples below) where each row cites the upstream issue that caused the feature to be disabled and the exact re-enable command:

| Feature | Disabled because | Re-enable check | Command to re-enable |
|---------|------------------|-----------------|----------------------|
| `plugins.entries.<name>.enabled` | openclaw/openclaw#NNNNN: short description of the regression | Issue closed OR version > X.Y.Z with a visible fix in CHANGELOG | `openclaw config set plugins.entries.<name>.enabled true && openclaw gateway restart` |

If you re-enable something here, delete its row and note it in the Phase 8 changelog entry.

### 3.1 Patch registry

Read `<config-repo>/config/patches.json` for the patch registry.

For each patch with `"status": "active"`:

1. **Check** if still needed:
   ```bash
   ~/.local/bin/openclaw-patch-<id> --check
   ```
   - Exit 0 = already patched or not needed (upstream may have fixed it)
   - Exit 1 = needs patching

2. **Apply** if needed:
   ```bash
   ~/.local/bin/openclaw-patch-<id>
   ```

3. Track results:
   - Which patches were re-applied
   - Which patches appear to be fixed upstream (candidates for retirement)
   - Any patches that failed (flag as **URGENT**)

After all patches are applied:
```bash
openclaw gateway restart
```

Wait 5 seconds, then verify the gateway is healthy:
```bash
openclaw health
```
## Phase 3b — Auth Health Check
Run an auth-health skill (e.g. `/openclaw-auth`) to catch post-upgrade auth damage. Upgrades and re-auth flows are the two most common triggers for OAuth token sink and config clobber. This check must pass before continuing.

Verify:
1. No duplicate OAuth profiles per provider (token sink)
2. Auth mode matches credential type (FailoverError prevention)
3. Model allowlist not clobbered (all models + aliases + params intact)
4. Plugin slots still explicitly set (LCM dormancy prevention)
5. `auth.order` still pins the correct profile per provider

If any critical issues are found, fix them before proceeding to smoke tests. After fixes, restart the gateway once more:
```bash
openclaw gateway restart
```
## Phase 4 — Smoke Tests
Run your plugin smoke test suite:
```bash
bash <config-repo>/scripts/smoke-test-plugins.sh
```

Expected: all tests pass. If any fail:
- Check if plugin dist filenames changed (patch scripts may need updating)
- Check if new SDK imports need `npm install` in plugin directories
- Check if plugin IDs changed (config needs updating)
- Report failures; do not silently continue

Also verify channels are connected:
```bash
openclaw channels status --probe
```
## Phase 5 — Refresh Knowledge Base
### 5a. Pull latest OpenClaw docs
```bash
git -C ~/GitHub/openclaw pull --rebase
```

### 5b. Rebuild the documentation map

Scan the OpenClaw docs directory and regenerate your local knowledge map.

Walk all directories under `~/GitHub/openclaw/docs/` (skip `ja-JP/` and `zh-CN/`). For each directory:
- List all `.md` files
- Read the first 20-30 lines of any **new** files (not in the current map)
- Update the tables in the knowledge map with any new/removed/renamed files

Preserve the existing structure of the map file (sections, quick lookup recipes). Only modify rows that changed.

### 5c. Check for new CLI commands

```bash
openclaw --help 2>&1
```

Compare against the CLI section in the docs map. Add any new commands.

### 5d. Check for new config keys

If the release notes mentioned new config options, read the relevant section of `~/GitHub/openclaw/docs/gateway/configuration-reference.md` and note any keys relevant to your setup.

### 5e. Update CLAUDE.md if needed

If the upgrade revealed:
- New CLI commands the agent uses regularly
- Changed config key names or semantics
- New features that affect the setup
- Deprecated features the agent relies on

Then update the relevant sections in your agent's CLAUDE.md. Keep changes minimal and focused on what's operationally relevant.
## Phase 6 — What's New for You
After the upgrade is stable and knowledge is refreshed, produce a practical briefing on new capabilities worth knowing about or investigating. This goes beyond the release-note triage in `/openclaw-release`: it's about actionable opportunities, not just risk assessment.

### 6a. Identify new features worth exploring

Read the full CHANGELOG.md diff between the old and new version:
```bash
git -C ~/GitHub/openclaw log v{old}..v{new} --oneline -- CHANGELOG.md
git -C ~/GitHub/openclaw diff v{old}..v{new} -- CHANGELOG.md
```

For each new feature or change, evaluate against your actual usage patterns:

| Signal | Why It Matters |
|--------|----------------|
| New CLI command or flag | Could simplify an existing workflow or script |
| New config key with default `false`/`off` | Opt-in feature you might want to enable |
| New channel capability | Could improve channel UX |
| New plugin SDK feature | Could improve custom plugins |
| New skill or tool | Could unlock automation not considered |
| Memory/context improvements | Could improve long-conversation quality |
| New provider support | Could add model options |
| Performance improvements | Could reduce token costs or latency |
| New security hardening options | Evaluate for your threat model |
### 6b. Check for new default behaviors
Some releases change defaults. Read `openclaw doctor` output from Phase 2c for "migrated" or "default changed" messages. For each:
- What was the old default?
- What is the new default?
- Is the new default better, or should the old behavior be pinned?
### 6c. Explore new docs
Check if the release added entirely new documentation pages:
```bash
git -C ~/GitHub/openclaw diff v{old}..v{new} --name-status -- docs/ | grep "^A"
```

New docs often signal new capabilities not obvious from the changelog alone.

### 6d. Check for new slash commands or directives

```bash
git -C ~/GitHub/openclaw diff v{old}..v{new} -- docs/tools/slash-commands.md
```
### 6e. Present the briefing
Output a practical "What's New" section organized by action type:
```
## What's New in v{new}

### Enable Now (low-risk, clear benefit)
- **[Feature]**: [what it does, why you want it]
  Enable: `openclaw config set path.to.key value`

### Worth Investigating (try when you have time)
- **[Feature]**: [what it does, potential benefit]
  Docs: ~/GitHub/openclaw/docs/path/to/doc.md

### Good to Know (no action needed, but awareness helps)
- **[Change]**: [what changed, how it affects you]

### Not Relevant (skipped)
- [One-liner per feature that doesn't apply to your setup]
```

For "Enable Now" items, include the exact `openclaw config set` command. For "Worth Investigating" items, include the doc path. Keep it focused: 3-5 items per category max. Don't pad with noise.
## Phase 7 — Update Config Baseline
After a successful upgrade, update the config baseline so future drift detection works against the new known-good state:

```bash
cp ~/.openclaw/openclaw.json <config-repo>/config/config-baseline.json
```

Also update `meta.lastTouchedVersion` in `<config-repo>/config/openclaw.json` to match the new version.
## Phase 8 — Documentation
Add a changelog entry for the upgrade. Include:
- Previous version -> new version
- Patches re-applied, retired, or failed
- Any config changes made
- Smoke test results
- Knowledge base updates
- Notable new features from the "What's New" briefing
## Phase 9 — Report
Present a summary:
```
## OpenClaw Upgrade Complete

**Version:** v{old} -> v{new}
**Backup:** {path}

### Patches
- {patch}: re-applied | retired | FAILED

### Smoke Tests
- {result}

### Channels
- {channel status}

### Knowledge Base
- Docs map: {files added/removed/unchanged}
- CLAUDE.md: {updated sections or "no changes needed"}

### Action Items
- [ ] {any remaining manual steps}
```
## Rollback
If the upgrade fails catastrophically:
1. Restore from backup: `openclaw backup restore <latest-backup-path>`
2. Restart gateway: `openclaw gateway restart`
3. Verify: `openclaw health && openclaw channels status --probe`
## Safety Rules
- **Never** upgrade without a verified backup (Phase 1)
- **Never** use non-release tags (`main`, `beta`, `dev`)
- **Never** skip patch re-application
- **Never** mark the upgrade complete if smoke tests fail
- **Always** restart the gateway exactly once after all patches are applied
- **Always** check plugin slots are still explicitly set after upgrade
- If anything fails, stop and report; don't try to power through