feat(updater): tier 4 — autonomous update in maintenance window (#7607) #7753
…7607) Maps PR 4 of the auto-update design spec (§"Tier 4 — autonomous") to concrete files, tasks, and verification steps. Subsequent commits scaffold against this plan. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tier 4 Pure module: parseWindow, inWindow, nextWindowStart. Supports tz=local|utc and cross-midnight ranges. Used by upcoming Scheduler + UpdatePolicy changes. 22 vitest unit tests cover format validation, same-day + cross-midnight boundaries, and host-local vs UTC clock comparisons. DST handling is absorbed by JS Date constructor's wall-clock normalization (documented in the file header). Refs #7607 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
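The window math described above can be sketched as follows. This is a minimal illustration, not the contents of MaintenanceWindow.ts: the `HH:MM` string format, the inclusive-start / exclusive-end boundary, the rejection of `start === end`, and the internal `startMin` / `endMin` representation are all assumptions made for the example.

```typescript
// Hypothetical shapes; the real module may represent a window differently.
type Tz = 'local' | 'utc';
interface MaintenanceWindow { startMin: number; endMin: number; tz: Tz }

// Assumed input format: 24-hour "HH:MM".
const HHMM = /^([01]\d|2[0-3]):([0-5]\d)$/;

function toMinutes(s: string): number | null {
  const m = HHMM.exec(s);
  return m ? Number(m[1]) * 60 + Number(m[2]) : null;
}

// Returns null for any malformed field (assumed behaviour).
function parseWindow(raw: { start: string; end: string; tz: string }): MaintenanceWindow | null {
  const startMin = toMinutes(raw.start);
  const endMin = toMinutes(raw.end);
  if (startMin === null || endMin === null || startMin === endMin) return null;
  if (raw.tz !== 'local' && raw.tz !== 'utc') return null;
  return { startMin, endMin, tz: raw.tz as Tz };
}

// Minutes since midnight on the clock selected by tz.
function minutesOfDay(now: Date, tz: Tz): number {
  return tz === 'utc'
    ? now.getUTCHours() * 60 + now.getUTCMinutes()
    : now.getHours() * 60 + now.getMinutes();
}

function inWindow(w: MaintenanceWindow, now: Date): boolean {
  const t = minutesOfDay(now, w.tz);
  return w.startMin < w.endMin
    ? t >= w.startMin && t < w.endMin   // same-day range
    : t >= w.startMin || t < w.endMin;  // cross-midnight range
}
```

A cross-midnight window such as 22:00 to 02:00 is the case where `startMin > endMin`, so membership becomes "after start OR before end" rather than "after start AND before end".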
Wires MaintenanceWindow into the existing tier 3 backend so autonomous
updates only fire while `now` is inside `updates.maintenanceWindow`.
UpdatePolicy
- new optional `maintenanceWindow` input
- canAutonomous flips on only for git+tier=autonomous+parse-valid window
- new reasons `maintenance-window-missing` / `maintenance-window-invalid`
- rollback-failed still wins over window denial
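The gating order listed above can be sketched as a small pure function. The input and result shapes and the function name `gateAutonomous` are hypothetical; only the two `maintenance-window-*` reason strings and the `git` / `autonomous` values come from this commit message, and using `rollback-failed` as a reason string is an assumption.

```typescript
// Hypothetical shapes for illustration; the real UpdatePolicy input is larger.
interface WindowGateInput {
  installMethod: string;
  tier: string;
  rollbackFailed: boolean;   // terminal state from an earlier failed rollback
  windowConfigured: boolean; // updates.maintenanceWindow !== null
  windowParses: boolean;     // the configured window passed validation
}
interface WindowGateResult { canAutonomous: boolean; reason: string | null }

function gateAutonomous(p: WindowGateInput): WindowGateResult {
  // The terminal state is checked first so it wins over any window denial.
  if (p.rollbackFailed) return { canAutonomous: false, reason: 'rollback-failed' };
  // Lower tiers and non-git installs never reach the window checks.
  if (p.installMethod !== 'git' || p.tier !== 'autonomous') {
    return { canAutonomous: false, reason: null };
  }
  if (!p.windowConfigured) return { canAutonomous: false, reason: 'maintenance-window-missing' };
  if (!p.windowParses) return { canAutonomous: false, reason: 'maintenance-window-invalid' };
  return { canAutonomous: true, reason: null };
}
```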
Scheduler
- decideSchedule snaps scheduledFor forward to nextWindowStart when
canAutonomous + grace lands outside the window
- decideTriggerApply returns a new `{action: 'defer'}` when canAutonomous
+ fire-time is outside the window; carries nextStart for the runner
- canAutonomous=false preserves Tier 3 behavior unchanged
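A rough sketch of the two scheduler decisions follows. It is simplified to a UTC-only window expressed in minutes since midnight, and the signatures are assumptions; the real `decideSchedule` and `decideTriggerApply` take richer inputs and return more action variants.

```typescript
// Simplified, UTC-only window: minutes since 00:00 UTC (assumed representation).
interface UtcWindow { startMin: number; endMin: number }

const minuteOfDay = (d: Date): number => d.getUTCHours() * 60 + d.getUTCMinutes();

function inWindow(w: UtcWindow, at: Date): boolean {
  const t = minuteOfDay(at);
  return w.startMin < w.endMin
    ? t >= w.startMin && t < w.endMin
    : t >= w.startMin || t < w.endMin;
}

// First window opening strictly after `after`.
function nextWindowStart(w: UtcWindow, after: Date): Date {
  const next = new Date(Date.UTC(
    after.getUTCFullYear(), after.getUTCMonth(), after.getUTCDate(),
    Math.floor(w.startMin / 60), w.startMin % 60,
  ));
  if (next.getTime() <= after.getTime()) next.setUTCDate(next.getUTCDate() + 1);
  return next;
}

// Snap-forward: if now + grace lands outside the window, move to the next opening.
function decideSchedule(now: Date, graceMs: number, canAutonomous: boolean, w: UtcWindow | null): Date {
  const scheduledFor = new Date(now.getTime() + graceMs);
  if (!canAutonomous || w === null || inWindow(w, scheduledFor)) return scheduledFor;
  return nextWindowStart(w, scheduledFor);
}

type TriggerDecision = { action: 'fire' } | { action: 'defer'; nextStart: Date };

// Defer-at-fire: re-check the window when the timer actually fires.
function decideTriggerApply(now: Date, canAutonomous: boolean, w: UtcWindow | null): TriggerDecision {
  if (!canAutonomous || w === null || inWindow(w, now)) return { action: 'fire' };
  return { action: 'defer', nextStart: nextWindowStart(w, now) };
}
```

Checking the window twice (once when scheduling, once at fire-time) covers the case where the window closes while the grace period is still running.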
index.ts wires settings.updates.maintenanceWindow through both passes and
re-arms the timer on defer. Status endpoint surface (nextWindowOpensAt) +
admin UI picker land in a follow-up commit.
Settings adds `maintenanceWindow: {start, end, tz} | null`, defaulting to
null. settings.json.template / settings.json.docker document the shape.
Tests
- 22 vitest cases for MaintenanceWindow already cover the math
- 4 new UpdatePolicy cases for the window outcomes
- 6 new Scheduler cases for tier-4 schedule/trigger paths
- Full backend-new suite: 629 passed (35 files)
Refs #7607
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nner
GET /admin/update/status now returns:
- `maintenanceWindow`: the parsed window object (admin sessions only)
- `nextWindowOpensAt`: ISO of the next window opening when tier=autonomous
UpdatePage
- new "Maintenance window" section when tier=autonomous, shows current
window summary + next opens at, or "Not configured" when unset
- scheduled panel now appends a "deferred until <iso>" line when the
backend has snapped scheduledFor to the next window opening
UpdateBanner
- new variant when tier=autonomous and policy.reason is
`maintenance-window-missing` or `maintenance-window-invalid`, linking
to /admin/update
i18n
- 8 new keys under `update.banner.*`, `update.page.policy.*`,
`update.page.scheduled.*`, `update.window.*` (en.json only;
translations follow via the usual locale workflow)
Interactive picker is intentionally deferred — admins edit
`updates.maintenanceWindow` via the parsed JSONC settings editor (#7709).
A follow-up commit may add a thin write-through component if the JSONC
round-trip turns out to be too rough for typical operators.
Refs #7607
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CHANGELOG: flip Tier 4 from "designed, not yet implemented" to current.
Document maintenanceWindow shape, snap-forward, defer-at-fire, and the two
missing/invalid policy reasons.
doc/admin/updates.md: new "Tier 4 — autonomous in a maintenance window"
section with config example, policy gating, DST/timezone notes, admin UI
behavior.
runbook: §12 walks a disposable VM through missing-window, malformed,
outside-window deferral, fire-at-opening, and window-closes-mid-grace. Adds
five sign-off checklist items.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mocha integration covering the four scenarios called out in the spec
§"Tier 4 — autonomous":
- outside-window: decideSchedule snaps scheduledFor forward to the
next opening and the snapped value round-trips through saveState
- inside-window at fire-time: decideTriggerApply returns fire
- window-closes-mid-grace: decideTriggerApply returns defer with
nextStart at the next opening; persisted state moves forward
- cancel during deferred-grace: state returns to idle, and the next
decideSchedule pass re-emits a schedule snapped to the next opening
All 4 cases passing locally under tsx mocha.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the (would send email) stub introduced in PR #7601 with a
nodemailer-backed transport. The dependency is lazy-imported so installs
that don't set mail.host pay no runtime cost.
Settings additions
- new top-level mail block: host, port, secure, from, auth (user/pass)
- mail.host=null keeps the legacy log-only behaviour; the Notifier still
updates dedupe state so we don't re-evaluate every tick
- settings.json.template documents the shape inline
- settings.json.docker reads MAIL_HOST / MAIL_FROM / MAIL_PORT /
MAIL_SECURE from env so operators can configure via container env
Transport
- lazy import('nodemailer') on first send
- transport cached by host; settings reload picks up new host without
needing a restart
- send errors are swallowed (logged warn) so a transient SMTP failure can
never poison the surrounding updater state machine
- successful sends log at info; legacy "(would send email)" path remains
the visible signal when mail is disabled
Refs #7607
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Before mutating the working tree, runPreflight now reads the target tag's
package.json via `git show <tag>:package.json` and verifies that
process.versions.node satisfies its engines.node range. Failures land at
preflight-failed cleanly (no rollback needed — nothing has changed yet).
Motivation: a release that bumps the Node floor used to either fail
mid-`pnpm install` (which then rolls back successfully) or restart on the
new build and crash in the boot path (which then rolls back via the
health-check timer). Both paths recover, but they burn a drain + restart
cycle on a condition we can reject upfront.
Implementation
- new PreflightReason `node-engine-mismatch`
- new dep `readTargetEnginesNode(tag)` — runs the git-show as a child
process with stdio captured to a string; missing tag / missing file /
malformed JSON / missing engines.node all resolve to null (treated as
"no constraint, pass")
- uses existing semver dep with includePrerelease: true
- new PreflightInput field `currentNodeVersion`; threaded from
process.versions.node in both wirings (scheduler + manual apply)
- check runs *after* signature verification so we trust the package.json
- PreflightResult carries an optional `detail` string; applyPipeline
appends it to the lastResult.reason so the admin UI shows e.g.
"node-engine-mismatch: target requires Node >=26.0.0, running 25.0.0"
Tests: 6 new vitest cases (no engines.node, satisfies, fails below floor,
caret range, loose-spaced range, ordering after signature). Full
backend-new: 635 passed (was 629).
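The shape of the engine check can be sketched without dependencies. The commit uses the existing semver dep; this sketch swaps that for a hand-rolled comparison that understands only `>=X.Y.Z` floors, so it is not equivalent to the real check (caret ranges, for example, are not evaluated here). The function names and the `EngineCheck` shape are hypothetical.

```typescript
interface EngineCheck { ok: boolean; detail?: string }

// Parses "X.Y.Z" (optionally prefixed with "v") into numeric parts.
function parseVersion(v: string): [number, number, number] | null {
  const m = /^v?(\d+)\.(\d+)\.(\d+)/.exec(v.trim());
  return m ? [Number(m[1]), Number(m[2]), Number(m[3])] : null;
}

function checkNodeEngine(enginesNode: string | null, currentNodeVersion: string): EngineCheck {
  // null means the target declared no usable constraint: treated as a pass.
  if (enginesNode === null) return { ok: true };
  const floor = /^>=\s*(.+)$/.exec(enginesNode.trim());
  const want = floor ? parseVersion(floor[1]) : null;
  const have = parseVersion(currentNodeVersion);
  // Assumption for this sketch only: ranges it cannot evaluate do not block.
  if (want === null || have === null) return { ok: true };
  for (let i = 0; i < 3; i++) {
    if (have[i] > want[i]) return { ok: true };
    if (have[i] < want[i]) {
      return {
        ok: false,
        detail: `target requires Node ${enginesNode}, running ${currentNodeVersion}`,
      };
    }
  }
  return { ok: true }; // exactly at the floor
}
```

Because the check only reads data and compares strings, a failure leaves the working tree untouched, which is why no rollback is needed on this path.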
Refs #7607
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Before this commit, only the terminal rollback-failed state emailed the
admin. Auto-recovered failures (rolled-back-install-failed, rolled-back-
build-failed, rolled-back-health-check, rolled-back-crash-loop) and pre-
flight-failed surfaced only via the /admin/update banner — so a 3am
autonomous update that failed because of, say, a Node engine bump would
roll back silently and stay invisible until the admin next logged in.
Notifier
- new EmailKinds: 'update-preflight-failed', 'update-rolled-back',
'update-rollback-failed'
- new pure decideOutcomeEmail(input) → {toSend, newState}
- dedupe key `<outcome>:<targetTag>` in EmailSendLog.lastFailureKey:
same outcome on same tag emits one email per cycle (kills retry-loop
spam); a different outcome or different tag resets the key
- rollback-failed always fires (terminal — overrides dedupe)
- state.ts validator + loadState backfill the new field for legacy
state files (Tier 1/2/3 installs upgrading in place)
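The dedupe rule above might look roughly like this as a pure function. The `OutcomeInput` and `OutcomeDecision` shapes, the `kindFor` helper, and the choice to leave state untouched when `adminEmail` is null are assumptions for illustration; the key format, the three email kinds, and the rollback-failed bypass come from the list above.

```typescript
type EmailKind = 'update-preflight-failed' | 'update-rolled-back' | 'update-rollback-failed';
interface EmailSendLog { lastFailureKey: string | null }

// Hypothetical input/result shapes.
interface OutcomeInput {
  adminEmail: string | null;
  outcome: string;   // e.g. 'preflight-failed', 'rolled-back-health-check', 'rollback-failed'
  targetTag: string;
  state: EmailSendLog;
}
interface OutcomeDecision {
  toSend: { to: string; kind: EmailKind } | null;
  newState: EmailSendLog;
}

function kindFor(outcome: string): EmailKind {
  if (outcome === 'preflight-failed') return 'update-preflight-failed';
  if (outcome === 'rollback-failed') return 'update-rollback-failed';
  return 'update-rolled-back'; // all auto-recovered rolled-back-* outcomes
}

function decideOutcomeEmail(input: OutcomeInput): OutcomeDecision {
  // Assumption: with no admin address there is nothing to send or record.
  if (input.adminEmail === null) return { toSend: null, newState: input.state };
  const key = `${input.outcome}:${input.targetTag}`;
  const duplicate = input.state.lastFailureKey === key;
  // rollback-failed is terminal, so it bypasses the dedupe check.
  if (duplicate && input.outcome !== 'rollback-failed') {
    return { toSend: null, newState: input.state };
  }
  return {
    toSend: { to: input.adminEmail, kind: kindFor(input.outcome) },
    newState: { lastFailureKey: key },
  };
}
```

Keeping the function pure (state in, state out) lets the caller decide when to persist the new key, which matches the best-effort wiring described next.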
Wiring
- new index.ts helper notifyApplyFailure() loads state, runs the pure
notifier, sends (via the nodemailer-backed sendEmailViaSmtp from the
previous commit), persists the new dedupe key — all best-effort
- schedulerTriggerApply: fires on applyUpdate returning preflight-failed
or rolled-back
- /admin/update/apply HTTP handler: same
- boot path in expressCreateServer: if state.lastResult is a failure
outcome we haven't already emailed about, fire then. Covers:
- health-check timeout rollback (timer expired between boots)
- crash-loop forced rollback caught on a later boot
- preflight-failed where the process didn't get to email before exit
- unacknowledged rollback-failed terminal
Tests
- 8 new vitest cases for decideOutcomeEmail (adminEmail=null, each
outcome's content, dedupe by tag, dedupe by outcome, rollback-failed
bypass)
- Full backend-new suite: 643 passed (was 635)
Refs #7607
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Review Summary by Qodo
Tier 4 — autonomous update in maintenance window with SMTP notifications
Walkthroughs
Description
• Implements **Tier 4 (autonomous)** auto-update feature with maintenance window support, completing
the four-tier auto-update design
• New pure module MaintenanceWindow.ts with parseWindow, inWindow, and nextWindowStart
functions handling same-day, cross-midnight, and DST cases
• UpdatePolicy.canAutonomous now requires valid updates.maintenanceWindow; missing/invalid
windows degrade to Tier 3 with explicit reason values
• Scheduler.decideSchedule snaps scheduledFor forward to next window opening;
decideTriggerApply defers when fire-time slips outside window
• SMTP email transport integration via nodemailer with failure outcome notifications and dedupe
tracking
• Node engine version compatibility check in preflight validation
• Settings adds updates.maintenanceWindow: {start, end, tz} | null and mail SMTP configuration
• Admin UI surfaces maintenance window info, deferred-until subtitle, and misconfiguration banner
• Comprehensive test coverage: 22 MaintenanceWindow unit tests, Tier 4 scheduler/policy/notifier
tests, integration tests for window boundaries
• Documentation: Tier 4 section in doc/admin/updates.md, smoke test runbook, implementation plan,
i18n keys, and CHANGELOG entry
Diagram
flowchart LR
Settings["Settings<br/>maintenanceWindow<br/>mail config"]
Policy["UpdatePolicy<br/>canAutonomous gated<br/>by window"]
Scheduler["Scheduler<br/>snap-forward<br/>defer at fire"]
Notifier["Notifier<br/>failure emails<br/>dedupe tracking"]
Preflight["Preflight<br/>node engine check"]
Email["SMTP Transport<br/>nodemailer"]
AdminUI["Admin UI<br/>window display<br/>deferred subtitle<br/>config banner"]
Settings -- "provides window" --> Policy
Policy -- "gates autonomous" --> Scheduler
Scheduler -- "triggers apply" --> Preflight
Preflight -- "failure outcome" --> Notifier
Notifier -- "sends via" --> Email
Email -- "delivery" --> AdminUI
Settings -- "mail config" --> Email
Scheduler -- "defer action" --> AdminUI
File Changes
1. src/node/updater/MaintenanceWindow.ts
Resolves conflicts in src/package.json (@types/node bumped on develop, nodemailer added on this branch — keep both) and pnpm-lock.yaml (regenerated from the merged package.json via `pnpm install --lockfile-only`). Local verification: tsc --noEmit clean, vitest backend-new 643/643 green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…autonomous # Conflicts: # pnpm-lock.yaml
- UpdatePage: only show "deferred until" subtitle when scheduledFor
actually matches nextWindowOpensAt. The previous `scheduledFor > now + 60s`
heuristic misfired during a normal in-window 15-min grace period.
- applyPipeline: return the enriched preflight reason (`reason: detail`)
instead of only `pf.reason`, so /admin/update/apply 409 bodies and
failure-notify emails preserve diagnostics like the Node engine mismatch
detail.
- updater/index: key the cached nodemailer transport on the full set of
SMTP options (host + port + secure + auth) so runtime changes to
port/credentials via reloadSettings() invalidate the cache.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
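The transport-cache fix can be illustrated with a small keyed cache. This is a sketch under assumptions: the factory is injected rather than calling nodemailer directly so the block stays dependency-free, and `SmtpOptions`, `cacheKey`, and `makeTransportCache` are hypothetical names.

```typescript
interface SmtpOptions {
  host: string;
  port: number;
  secure: boolean;
  auth?: { user: string; pass: string };
}

// The key covers every option that changes how the transport connects.
function cacheKey(o: SmtpOptions): string {
  return JSON.stringify([o.host, o.port, o.secure, o.auth?.user ?? null, o.auth?.pass ?? null]);
}

// `create` stands in for whatever builds the real transport.
function makeTransportCache<T>(create: (o: SmtpOptions) => T): (o: SmtpOptions) => T {
  let cached: { key: string; transport: T } | null = null;
  return (o: SmtpOptions): T => {
    const key = cacheKey(o);
    if (cached === null || cached.key !== key) {
      cached = { key, transport: create(o) }; // any option change rebuilds
    }
    return cached.transport;
  };
}
```

Keying on the host alone would keep returning the old transport after a port or credential change, which is the behaviour this fix removes.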
…autonomous # Conflicts: # CHANGELOG.md
CI Feedback 🧐
A test triggered by this PR failed.
Summary
Ships Tier 4 (autonomous) of the four-tier auto-update design from
`docs/superpowers/specs/2026-04-25-auto-update-design.md` (§"Tier 4 — autonomous").
Completes the work tracked in #7607 after tiers 1 (#7601), 2 (#7704), and 3 (#7720).
- `MaintenanceWindow.ts` (`parseWindow`, `inWindow`, `nextWindowStart`) covers
same-day, cross-midnight, and DST cases.
- `UpdatePolicy.canAutonomous` now requires a parse-valid
`updates.maintenanceWindow`. Missing/invalid windows degrade to Tier 3
(`canAuto: true`) with explicit `reason` values `maintenance-window-missing` /
`maintenance-window-invalid`. The terminal `rollback-failed` state still wins.
- `Scheduler.decideSchedule` snaps `scheduledFor` forward to the next window
opening when grace would otherwise land outside the window.
- `Scheduler.decideTriggerApply` returns a new
`{action: 'defer', nextStart, reason: 'outside-maintenance-window'}` when
fire-time has slipped outside the window; `index.ts` persists the new
`scheduledFor` and re-arms the timer.
- `updates.maintenanceWindow: {start, end, tz} | null`, defaulting to `null`.
Documented in `settings.json.template` and `settings.json.docker`.
Phase tracking: this PR currently lands the backend for Tier 4.
Follow-up commits on the same PR will land:
- `MaintenanceWindowPicker.tsx`, scheduled-panel "deferred until" subtitle,
"configure window" banner, i18n keys under `update.window.*`.
- `GET /admin/update/status` surfaces `nextWindowOpensAt`.
- window-closes-mid-grace defers.
- `doc/admin/updates.md` Tier 4 section, runbook smoke entry, `CHANGELOG.md`
Unreleased.
Plan
`docs/superpowers/plans/2026-05-15-auto-update-pr4-tier4-autonomous.md`
(committed in this PR) maps the spec section to concrete files and
verification steps task-by-task.
Test plan
- `MaintenanceWindow.test.ts` (22 cases: parser, same-day, cross-midnight,
tz=utc vs local, DST host-clock notes).
- `UpdatePolicy.test.ts` extended (missing window, invalid window, lower-tier
ignore, rollback-failed precedence).
- `Scheduler.test.ts` extended (snap-forward, in-window no-snap,
canAutonomous=false bypass, defer at fire, fire in window, email dedupe
across defer).
- `pnpm exec tsc --noEmit` clean.
Closes #7607 once the follow-up commits land and CI is green.
🤖 Generated with Claude Code