Skip to content

OpenClaw Upgrade Playbook

OpenClaw Upgrade Playbook

This page is the actual skill we invoke before every openclaw update. Published here because I’ve watched this playbook grow from a three-line shell script to a nine-phase ritual, and every phase was bought with an incident.

If you’re running OpenClaw in production, you don’t have to copy this wholesale — but you probably do need your own version of most of it, and the incident history below should save you from learning the same lessons the expensive way.

The lifecycle at a glance

Phase 1: Backup

Phase 2: Pre-repair + update + doctor

Phase 3: Re-check mitigations + re-apply patches

Phase 3b: Auth health check

Phase 4: Smoke tests

Phase 5: Refresh knowledge base

Phase 6: 'What's new for you' briefing

Phase 7: Update config baseline

Phase 8–9: Changelog + report

Nothing moves to the next phase until the current one is clean. No power-through on a failed smoke test. No “I’ll fix that later” on a clobbered auth profile.

Why each phase exists

Phase 1 — Backup

The origin phase. Nothing controversial: openclaw backup runs before anything else touches the install, and the script auto-verifies the archive. If the backup fails, the upgrade stops. Every other phase can be rolled forward after a partial failure; a broken upgrade with no backup is data loss.

The one thing worth saying out loud: always snapshot the current version before updating. Not just for the changelog — for the rollback story. If you have to openclaw backup restore a week later you’ll want to know which patched state you were on.

Phase 2 — Pre-update repair, update, doctor

Three sub-steps: run a pre-update config-repair script (detects breaking config changes since the last known-good version and auto-fixes what it can), run openclaw update, then run openclaw doctor --fix --yes to apply any migrations the new version brought.

The “no non-release builds” rule. Never use --tag main, --tag beta, or --tag dev. Stable release tags (e.g. v2026.4.15) or no tag at all. A dev build installed with --tag main once broke the gateway install on this deployment and required a full npm install -g openclaw@latest to recover. Dev tags are for contributors running against source, not for production gateways.

Phase 3 — Re-check temporary mitigations, then re-apply patches

This is the phase that exists because upstream fixes land on their own schedule, and you have both defensive disables (features you turned off pending an upstream fix) and local patches (code you’re carrying on top of a release to work around a bug).

Re-check temporary mitigations. Maintain a table of features you’ve disabled with a citation to the upstream issue. On every upgrade, check whether the issue is closed or the CHANGELOG mentions the fix — and re-enable if so. If you don’t do this, you accumulate dead weight forever. A feature worth disabling is worth turning back on the moment the fix ships.

The patch registry. At peak on this deployment there were 8 simultaneously-active patches against the OpenClaw dist (BlueBubbles reply threading, WhatsApp reaction handling, inbound dedup, a WA listener fix, a webhook-route tweak, and more). Without a registry it would have been impossible to remember which patches were still needed after each release — upstream would fix one, I’d miss the fix, and either re-apply an unnecessary patch or forget to re-apply a still-necessary one.

The registry is a JSON file (config/patches.json in our setup) with one entry per patch:

{
"id": "bb-inbound-dedup",
"status": "active",
"upstreamIssue": "https://github.com/openclaw/openclaw/pull/31159",
"introducedIn": "2026.4.10",
"retiredIn": null,
"scriptPath": "~/.local/bin/openclaw-patch-bb-inbound-dedup"
}

Each patch script is idempotent and supports --check (exit 0 if already patched or upstream fixed it, exit 1 if it needs applying). The playbook runs --check on every active patch after every upgrade, re-applies what’s needed, and flags retirement candidates (patches where --check now exits 0 because upstream shipped the fix). Retired patches stay in the file with retiredIn set so you can see the history.

Phase 3b — Auth health check

Upgrades and re-auth flows are the two most common triggers for OAuth token sink on this deployment: a process that’s supposed to refresh a single profile instead creates a duplicate, the allowlist gets clobbered to its default, and the next model call fails in a way that’s hard to root-cause.

The check verifies:

  1. No duplicate OAuth profiles per provider/account.
  2. Auth mode matches credential type (api_key profiles don’t have OAuth-typed creds and vice versa — mismatches throw FailoverError at call time, not at config-load time).
  3. Model allowlist intact (all models + aliases + params). Re-auth flows have historically clobbered this to a single-model default.
  4. Plugin slots (contextEngine, memory) still explicitly set. An upstream-upgrade once left these unset, and unset = dormant = silent failure.
  5. auth.order still pins the correct profile per provider.

If any of these fail, fix them before running smoke tests. Fixing after smoke tests means you can’t trust the smoke tests.

Phase 4 — Smoke tests

The minimum bar: a script that exercises one tool from every loaded plugin against a dedicated smoke-test agent. Plugin SDK breaking changes sometimes ship quietly — the plugin loads, the gateway starts clean, but the first tool call fails because a function signature changed. Smoke tests catch this before the user does.

On this deployment the script is scripts/smoke-test-plugins.sh — a bash file that makes one gateway call per plugin and checks exit codes. Common post-upgrade failures:

  • Plugin dist filenames changed → patch scripts target the old filename.
  • New SDK imports need a npm install in the plugin directory.
  • Plugin IDs changed → config still references the old name.

Also verify channel health: openclaw channels status --probe hits each configured channel and reports connected/disconnected. A BlueBubbles or WhatsApp regression that disconnects the webhook is cosmetic-looking in logs but catastrophic for inbound messages.

Phase 5 — Refresh knowledge base

If your agent references the upstream docs to answer user questions about its own platform (ours does: the CLAUDE.md points at a local clone of ~/GitHub/openclaw/docs/), the agent’s mental model of what’s possible drifts with every release. New CLI commands, new config keys, renamed files — the agent will keep trying old patterns until you refresh.

The refresh is:

  1. git pull on the upstream docs clone.
  2. Walk the docs/ tree and diff against the last-known doc map.
  3. For every new file, read the first 20-30 lines and add it to the map.
  4. Check openclaw --help for new top-level commands.
  5. Scan the configuration-reference.md diff for new config keys.
  6. If anything changed that’s operationally relevant, update your project’s CLAUDE.md.

This is the phase that gets skipped most often on other people’s upgrades, and it’s the phase that causes the most “why is the agent trying a command that doesn’t exist” friction three weeks later.

Phase 6 — “What’s new for you” briefing

Release notes are written for maintainers, not operators. You need a filtered view against your actual usage — and an explicit decision on each new feature: enable now / worth investigating / good to know / not relevant.

The briefing checks:

  • New CLI commands or flags that could simplify an existing workflow.
  • New config keys with default false/off — opt-in features you might want.
  • New channel capabilities for iMessage/WhatsApp/Telegram.
  • New plugin SDK features that could improve your custom plugins.
  • New skills or tools that could unlock automation you haven’t considered.
  • Memory/context improvements.
  • New default behaviors the openclaw doctor output flagged as migrated.
  • Entirely new doc pages (often signal new capabilities not obvious from the changelog).

Output is categorized:

### Enable Now (low-risk, clear benefit)
- [Feature] — [why you want it]
Enable: openclaw config set path.to.key value
### Worth Investigating (try when you have time)
- [Feature] — [potential benefit]
Docs: path/to/doc.md
### Good to Know (no action needed)
- [Change] — [how it affects you]
### Not Relevant (skipped)
- [One-liner per feature that doesn't apply]

Phase 7 — Update config baseline

Drift detection only works against a snapshot. After a successful upgrade:

Terminal window
cp ~/.openclaw/openclaw.json <your-config-repo>/config/config-baseline.json

Then bump meta.lastTouchedVersion in the repo copy of openclaw.json so the pre-update repair script in Phase 2a knows what to migrate from next time. Forget this once and every future drift check compares against the pre-upgrade state and tells you the same lies forever.

Phase 8–9 — Changelog and report

Record what you did. Previous → new version, patches re-applied, patches retired, patches that failed, config changes, smoke test results, knowledge base updates, notable new features from the briefing. Future-you will thank you when the next incident turns out to be a regression from a version you upgraded through three weeks ago.

The rollback

If anything goes catastrophically wrong:

Terminal window
openclaw backup restore <latest-backup-path>
openclaw gateway restart
openclaw health && openclaw channels status --probe

The reason Phase 1 is non-negotiable.

Safety rules

  • Never upgrade without a verified backup.
  • Never use non-release tags (main, beta, dev) for production.
  • Never skip patch re-application — patches exist because upstream hasn’t fixed the issue yet. If upstream fixed it, the --check will tell you and the patch gets retired.
  • Never mark the upgrade complete if smoke tests fail.
  • Always restart the gateway exactly once after all patches are applied (not per-patch — each restart is disruptive).
  • Always verify plugin slots are explicitly set after an upgrade. Unset slots can drop silently to “dormant.”
  • If any phase fails, stop and report. Don’t power through.

What to adapt for your setup

The skill below assumes:

  • Backup script: ~/.local/bin/openclaw-backup. Yours might just be openclaw backup create.
  • Patch registry: <config-repo>/config/patches.json. You might not have one yet — if so, the first patch you need to carry is the moment to start the file, not a reason to skip this phase.
  • Smoke test script: scripts/smoke-test-plugins.sh. Could be as simple as openclaw health + one tool call per plugin you care about.
  • Config repo: we check a Git repo with the gateway’s openclaw.json committed separately from the live ~/.openclaw/openclaw.json and sync both. If you only have the live copy, the baseline-snapshot step is still useful — just snapshot to the same file you already have.
  • Agent knowledge-refresh target: we update a CLAUDE.md that the agent reads. If your agent doesn’t reference the upstream docs, skip Phase 5.

The skill, verbatim

The Claude Code skill file we actually invoke. Lightly scrubbed for PII (user-specific paths and naming), otherwise unchanged from the version that runs on this deployment.

---
name: openclaw-upgrade
description: |
Perform a full OpenClaw upgrade with backup, verification, patch re-application,
smoke tests, and knowledge base refresh. Use when ready to actually upgrade
(after reviewing changes with /openclaw-release). Covers the complete lifecycle:
backup -> update -> doctor -> patches -> smoke tests -> knowledge refresh -> docs.
argument-hint: "[version-tag]"
allowed-tools: Read, Edit, Write, Bash(openclaw *), Bash(~/.local/bin/openclaw-*), Bash(bash */smoke-test-plugins.sh*), Bash(git *), Bash(python3 *), Bash(find *), Bash(gh *), Bash(tailscale *), Agent, WebFetch
---
# OpenClaw Upgrade
End-to-end upgrade executor. Run this after reviewing the release with
`/openclaw-release` and deciding to proceed.
## Pre-flight Checklist
Before starting, confirm with the user:
- Have they reviewed the release notes? (suggest `/openclaw-release` if not)
- Is now a good time? (gateway will restart, brief message delivery pause)
## Phase 1 — Backup
Create a verified backup before touching anything.
```bash
~/.local/bin/openclaw-backup
```
Confirm the backup succeeded (script auto-verifies). If it fails, **stop** and
report the error — never upgrade without a backup.
Also snapshot the current version for the changelog:
```bash
openclaw --version
```
## Phase 2 — Update
### 2a. Pre-update config repair
Run `check-update.sh` to auto-fix any breaking config issues before the upgrade
(detects version changes since last run, explains broken config, auto-repairs
with `--fix`):
```bash
bash "${CLAUDE_PLUGIN_ROOT}/scripts/check-update.sh" --fix
```
### 2b. Run the update
If a specific version tag was provided:
```bash
openclaw update --tag <version>
```
Otherwise, update to latest stable:
```bash
openclaw update
```
**CRITICAL:** Never use `--tag main`, `--tag beta`, or `--tag dev`. Only stable
release tags (e.g., `v2026.4.5`) or no tag at all.
### 2c. Run doctor
```bash
openclaw doctor --fix --yes
```
Review the doctor output for:
- Config migrations applied
- Deprecation warnings
- Any failures that need manual intervention
Note the new version:
```bash
openclaw --version
```
## Phase 3 — Re-apply Patches
### 3.0 Re-check temporary mitigations
Before patches, check whether any **disabled-pending-upstream-fix** features can
be re-enabled on the new version. Maintain a table (examples below) where each
row cites the upstream issue that caused the feature to be disabled and the
exact re-enable command:
| Feature | Disabled because | Re-enable check | Command to re-enable |
|---------|------------------|-----------------|----------------------|
| `plugins.entries.<name>.enabled` | openclaw/openclaw#NNNNN — short description of the regression | Issue closed OR version > X.Y.Z with a visible fix in CHANGELOG | `openclaw config set plugins.entries.<name>.enabled true && openclaw gateway restart` |
If you re-enable something here, delete its row and note it in the Phase 8
changelog entry.
### 3.1 Patch registry
Read `<config-repo>/config/patches.json` for the patch registry.
For each patch with `"status": "active"`:
1. **Check** if still needed:
```bash
~/.local/bin/openclaw-patch-<id> --check
```
- Exit 0 = already patched or not needed (upstream may have fixed it)
- Exit 1 = needs patching
2. **Apply** if needed:
```bash
~/.local/bin/openclaw-patch-<id>
```
3. Track results:
- Which patches were re-applied
- Which patches appear to be fixed upstream (candidates for retirement)
- Any patches that failed (flag as **URGENT**)
After all patches are applied:
```bash
openclaw gateway restart
```
Wait 5 seconds, then verify the gateway is healthy:
```bash
openclaw health
```
## Phase 3b — Auth Health Check
Run an auth-health skill (e.g. `/openclaw-auth`) to catch post-upgrade auth
damage. Upgrades and re-auth flows are the two most common triggers for OAuth
token sink and config clobber. This check must pass before continuing.
Verify:
1. No duplicate OAuth profiles per provider (token sink)
2. Auth mode matches credential type (FailoverError prevention)
3. Model allowlist not clobbered (all models + aliases + params intact)
4. Plugin slots still explicitly set (LCM dormancy prevention)
5. `auth.order` still pins the correct profile per provider
If any critical issues are found, fix them before proceeding to smoke tests.
After fixes, restart the gateway once more:
```bash
openclaw gateway restart
```
## Phase 4 — Smoke Tests
Run your plugin smoke test suite:
```bash
bash <config-repo>/scripts/smoke-test-plugins.sh
```
Expected: all tests pass. If any fail:
- Check if plugin dist filenames changed (patch scripts may need updating)
- Check if new SDK imports need `npm install` in plugin directories
- Check if plugin IDs changed (config needs updating)
- Report failures — do not silently continue
Also verify channels are connected:
```bash
openclaw channels status --probe
```
## Phase 5 — Refresh Knowledge Base
### 5a. Pull latest OpenClaw docs
```bash
git -C ~/GitHub/openclaw pull --rebase
```
### 5b. Rebuild the documentation map
Scan the OpenClaw docs directory and regenerate your local knowledge map.
Walk all directories under `~/GitHub/openclaw/docs/` (skip `ja-JP/` and
`zh-CN/`). For each directory:
- List all `.md` files
- Read the first 20-30 lines of any **new** files (not in the current map)
- Update the tables in the knowledge map with any new/removed/renamed files
Preserve the existing structure of the map file (sections, quick lookup
recipes). Only modify rows that changed.
### 5c. Check for new CLI commands
```bash
openclaw --help 2>&1
```
Compare against the CLI section in the docs map. Add any new commands.
### 5d. Check for new config keys
If the release notes mentioned new config options, read the relevant section of
`~/GitHub/openclaw/docs/gateway/configuration-reference.md` and note any keys
relevant to your setup.
### 5e. Update CLAUDE.md if needed
If the upgrade revealed:
- New CLI commands the agent uses regularly
- Changed config key names or semantics
- New features that affect the setup
- Deprecated features the agent relies on
Then update the relevant sections in your agent's CLAUDE.md. Keep changes
minimal and focused on what's operationally relevant.
## Phase 6 — What's New for You
After the upgrade is stable and knowledge is refreshed, produce a practical
briefing on new capabilities worth knowing about or investigating. This goes
beyond the release-note triage in `/openclaw-release` — it's about actionable
opportunities, not just risk assessment.
### 6a. Identify new features worth exploring
Read the full CHANGELOG.md diff between the old and new version:
```bash
git -C ~/GitHub/openclaw log v{old}..v{new} --oneline -- CHANGELOG.md
git -C ~/GitHub/openclaw diff v{old}..v{new} -- CHANGELOG.md
```
For each new feature or change, evaluate against your actual usage patterns:
| Signal | Why It Matters |
|--------|----------------|
| New CLI command or flag | Could simplify an existing workflow or script |
| New config key with default `false`/`off` | Opt-in feature you might want to enable |
| New channel capability | Could improve channel UX |
| New plugin SDK feature | Could improve custom plugins |
| New skill or tool | Could unlock automation not considered |
| Memory/context improvements | Could improve long-conversation quality |
| New provider support | Could add model options |
| Performance improvements | Could reduce token costs or latency |
| New security hardening options | Evaluate for your threat model |
### 6b. Check for new default behaviors
Some releases change defaults. Read `openclaw doctor` output from Phase 2b for
"migrated" or "default changed" messages. For each:
- What was the old default?
- What is the new default?
- Is the new default better, or should the old behavior be pinned?
### 6c. Explore new docs
Check if the release added entirely new documentation pages:
```bash
git -C ~/GitHub/openclaw diff v{old}..v{new} --name-status -- docs/ | grep "^A"
```
New docs often signal new capabilities not obvious from the changelog alone.
### 6d. Check for new slash commands or directives
```bash
git -C ~/GitHub/openclaw diff v{old}..v{new} -- docs/tools/slash-commands.md
```
### 6e. Present the briefing
Output a practical "What's New" section organized by action type:
```
## What's New in v{new}
### Enable Now (low-risk, clear benefit)
- **[Feature]** — [what it does, why you want it]
Enable: `openclaw config set path.to.key value`
### Worth Investigating (try when you have time)
- **[Feature]** — [what it does, potential benefit]
Docs: ~/GitHub/openclaw/docs/path/to/doc.md
### Good to Know (no action needed, but awareness helps)
- **[Change]** — [what changed, how it affects you]
### Not Relevant (skipped)
- [One-liner per feature that doesn't apply to your setup]
```
For "Enable Now" items, include the exact `openclaw config set` command. For
"Worth Investigating" items, include the doc path. Keep it focused — 3-5 items
per category max. Don't pad with noise.
## Phase 7 — Update Config Baseline
After a successful upgrade, update the config baseline so future drift
detection works against the new known-good state:
```bash
cp ~/.openclaw/openclaw.json <config-repo>/config/config-baseline.json
```
Also update `meta.lastTouchedVersion` in `<config-repo>/config/openclaw.json` to
match the new version.
## Phase 8 — Documentation
Add a changelog entry for the upgrade. Include:
- Previous version -> new version
- Patches re-applied, retired, or failed
- Any config changes made
- Smoke test results
- Knowledge base updates
- Notable new features from the "What's New" briefing
## Phase 9 — Report
Present a summary:
```
## OpenClaw Upgrade Complete
**Version:** v{old} -> v{new}
**Backup:** {path}
### Patches
- {patch}: re-applied | retired | FAILED
### Smoke Tests
- {result}
### Channels
- {channel status}
### Knowledge Base
- Docs map: {files added/removed/unchanged}
- CLAUDE.md: {updated sections or "no changes needed"}
### Action Items
- [ ] {any remaining manual steps}
```
## Rollback
If the upgrade fails catastrophically:
1. Restore from backup: `openclaw backup restore <latest-backup-path>`
2. Restart gateway: `openclaw gateway restart`
3. Verify: `openclaw health && openclaw channels status --probe`
## Safety Rules
- **Never** upgrade without a verified backup (Phase 1)
- **Never** use non-release tags (`main`, `beta`, `dev`)
- **Never** skip patch re-application
- **Never** mark the upgrade complete if smoke tests fail
- **Always** restart the gateway exactly once after all patches are applied
- **Always** check plugin slots are still explicitly set after upgrade
- If anything fails, stop and report — don't try to power through