Use this section as the newest summary of what changed in the workspace after the E2E recovery. If older notes elsewhere in this file conflict with it, trust this section.
- A green main browser gate was recovered by fixing real product and test mismatches, not by wrapper retries alone.
- The concrete regressions found in this recovery were:
-`DevicesPage` cursor flow could self-trigger a second page request and flood `/admin/devices`.
-`webhooks` and `social-accounts` services decoded the wrong backend response shapes.
-`settings` service unwrapped `data` twice even though the shared HTTP client had already returned `result.data`.
- Broad text-based Playwright assertions in later admin scenarios created brittle false negatives.
- The latest evidence set for this recovery was:
-`cd frontend/admin && npm.cmd run test:run -- src/pages/admin/DevicesPage/DevicesPage.test.tsx`
-`cd frontend/admin && npm.cmd run test:run -- src/services/webhooks.test.ts`
-`cd frontend/admin && npm.cmd run test:run -- src/pages/admin/WebhooksPage/WebhooksPage.test.tsx`
-`cd frontend/admin && npm.cmd run test:run -- src/services/social-accounts.test.ts`
-`cd frontend/admin && npm.cmd run lint`
-`cd frontend/admin && npm.cmd run build`
-`cd frontend/admin && npm.cmd run e2e:full:win`
- Practical rule: when `e2e:full:win` fails late in the suite, inspect both real application behavior and locator or route assumptions before blaming only browser or CDP instability.
## 2026-04-23 Governance Lessons From E2E Recovery
- A red browser gate can hide several different failure classes at once: product bug, integration-contract drift, selector drift, and browser-runtime instability.
- This recovery was closed by fixing real contract and locator problems, not by increasing retries around the wrapper.
- Pagination regressions are high-noise defects: they often show up as rate limiting, empty lists, or flaky E2E much earlier than they show up as obvious local exceptions.
- Response-envelope mismatches are easy to miss when pages silently fall back to empty arrays or partial data; service tests must pin the real backend field names.
- Documentation lag recreates stale priorities. Once the supported browser gate changes state, norms and experience docs need the same-day update.
- Browser-suite retry logic can create false failures when the first attempt mutates one-time backend state. Retry code has to re-read live preconditions instead of replaying stale assumptions.
## 2026-04-23 Password Reset Expansion Lessons
- Capability endpoints and mounted routes are one product contract. If the route is live but the capability bit is false, the browser surface is still effectively broken.
- A targeted green scenario is not enough evidence when the supported gate is the full suite. The 19th scenario only counted after `cd frontend/admin && npm.cmd run e2e:full:win` stayed green.
- Late-suite CDP page loss is best treated as a recoverable connection problem first, not as a reason to blindly multiply wrapper retries.
- Real auth coverage is worth the setup cost. The password-reset scenario now proves SMTP capture, token validation, password reset submission, and post-reset login in one browser chain.
## 2026-04-23 Permissions CRUD Closure Lessons
- A red browser scenario can come from product behavior, adapter drift, auth-header handling, or runner observation gaps. The fastest path was to separate those four possibilities instead of assuming every timeout meant browser flakiness.
- A successful browser fetch does not guarantee that Playwright CDP request or response listeners will observe the call under every proxy path. When the UI updated and the in-page fetch log showed `201` and `200`, the correct conclusion was "runner evidence gap", not "permission create is broken".
- Shared HTTP client state is easy to misread under concurrency. "A refresh is in flight" and "this request lacks a usable token" are different facts; merging them creates false auth regressions.
- Adapter normalization changes must update both focused service tests and aggregate service suites. Fixing only the local adapter test leaves a second failure surface in cross-service regression packs.
- Modal animations are a real source of E2E false negatives. A dialog that is visually closing can still block clicks long enough to break the next CRUD step unless the runner waits for the overlay to stop intercepting input.
- Build tooling can be a real release blocker. Vite root resolution on Windows became part of the supported gate the moment `npm.cmd run build` started failing under the documented command.
- The 20th browser scenario only counted after two proofs existed on the same branch state: the targeted `permissions-management-crud` run and the full `cd frontend/admin && npm.cmd run e2e:full:win` run.
- Browser E2E is often the first place where outbound write contracts are validated end to end. A service adapter can look fine in page-level tests while still sending the wrong backend field names.
- Service tests must assert the serialized write payload, not only the UI form model. Otherwise the test suite can lock in the wrong contract and make the browser suite the first honest signal.
- Orphaned async diagnostics waste debugging time. A failed click or fill should not leave a background fetch waiter alive long enough to crash during cleanup and hide the real failing step.
- A targeted scenario recovery is still not enough evidence on its own. The `profile-and-security` fix only counted after `cd frontend/admin && npm.cmd run e2e:full:win` returned green on the same workspace state.
## 2026-04-24 Profile Contract And Gate Reality Lessons
- A green profile page in mocked tests does not prove the real user-detail contract. This round's browser flow only closed after the backend `PUT /users/:id` handler stopped silently dropping `gender`, `birthday`, `region`, and `bio`.
- Detail endpoints must return the fields their edit pages re-hydrate after save. Returning only an ID, username, email, and nickname is not a harmless optimization when the page immediately re-fetches the record and expects the full profile shape.
- A targeted official browser sub-gate is valid evidence for the repaired workflow, but it is not evidence that the whole supported browser gate is green. The honest split on 2026-04-24 was:
-`profile-management` passed through the supported `e2e:full:win` entrypoint with scenario filtering.
- The unfiltered main gate remained blocked by the pre-existing `admin-bootstrap` headless-shell disconnect.
- Wrapper drift matters. Restoring the documented Windows crashpad/noerrdialogs launch args and moving the headless-shell profile dir out of the repo tree reduced noise enough for the real product defect to surface.
- When Chromium-family browsers all show the same host-level `crashpad` or `mojo platform_channel` access-denied signals, it is no longer rigorous to keep treating every E2E collapse as a product bug.
- Shared backend state does not require a shared browser process. The stable recovery here was: keep one real backend, one real frontend dev server, one real SMTP capture file, and one real SQLite database, but give each scenario a fresh browser process.
- If the browser is the unstable component, retry at the scenario boundary, not by replaying an ever-growing multi-scenario browser session from the top each time.
- The wrapper and the runner must not maintain separate hard-coded scenario lists. Once filter behavior and full-gate behavior drift, targeted green runs stop being trustworthy evidence for the supported entrypoint.
## 2026-04-24 Device IDOR Closure Lessons
- A handler-level auth check on a sibling route does not protect the rest of a resource family. `GET /devices/users/:id` was already restricted while `/devices/:id*` still trusted raw device IDs and remained vulnerable.
- Ownership-sensitive APIs need actor-aware service entry points. Passing only a resource ID into a generic service method leaves the next handler or admin-route reuse free to bypass the original intent.
- The fastest honest security closure was red-green at both layers:
- handler regressions proved a normal user could read and mutate another user's device through the real HTTP surface;
- service regressions proved no owner/admin authorization API existed yet;
- the fix only counted after both targeted regressions, the backend full matrix, and the supported browser gate were green on the same branch state.