docs/team/PROJECT_EXPERIENCE_SUMMARY.md

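# Project Experience Summary

Last updated: 2026-03-25

This summary records only lessons that actually occurred in this project and that have already influenced engineering decisions.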

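## 1. Real closure comes from evidence, not from feeling

- Changing code without running full verification does not count as closure.
- In this project, the closed loops that actually carried value came from: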
  - `go test ./... -count=1`
  - `go vet ./...`
  - `go build ./cmd/server`
  - `cd frontend/admin && npm.cmd run lint`
  - `cd frontend/admin && npm.cmd run build`
  - `cd frontend/admin && npm.cmd run e2e:full:win`

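## 2. Browser-level real E2E is not the same as OS-level automation

- The project now has a stable browser-level real E2E path.
- That path does not cover system file pickers, native permission dialogs, or desktop window-layer behavior.
- External communication must therefore distinguish between:
  - browser-level real verification: closed
  - full OS-level automation: not closed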

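## 3. Guessing error types from strings is fragile

- Email verification-code rate limiting once degraded from `429` to `500` because the encoding of the error text drifted.
- SMS sending carries the same risk, and at one point mapped a rate-limit error to `400` by mistake.
- Conclusions:
  - Error classification must prefer explicit error types.
  - Legacy string matching is acceptable only as short-term compatibility, not as a long-term dependency.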

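## 4. Fake success is more dangerous than outright failure

- For chains such as email, SMS, OAuth, and upload, returning success while a dependency is missing sends the wrong signal to the frontend, to tests, and to operations.
- This does not reduce failures; it only delays their exposure and raises the cost of diagnosis.
- Conclusions:
  - The runtime must fail closed.
  - When configuration is missing, either disable the capability or fail at startup.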

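## 5. Layering is a stability issue, not a matter of form

- The TOTP service once relied on type assertions against a concrete repository implementation, which made the service fragile against both replacement implementations and test mocks.
- After the dependency was pulled back to interface capabilities, the layering became more stable and the tests more natural.
- Conclusion:
  - Services depend on interfaces, not on concrete repo types.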
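
A small Go sketch of the interface-first shape, under the assumption that the service only needs to read a stored secret. `TOTPSecretStore`, `TOTPService`, and `memStore` are illustrative names, not the project's real types.

```go
package main

import "fmt"

// TOTPSecretStore is a hypothetical interface naming only what the service needs.
type TOTPSecretStore interface {
	GetSecret(userID int64) (string, error)
}

// TOTPService depends on the interface and never type-asserts to a concrete repo.
type TOTPService struct {
	store TOTPSecretStore
}

func NewTOTPService(store TOTPSecretStore) *TOTPService {
	return &TOTPService{store: store}
}

// IsEnabled reports whether the user has a stored TOTP secret.
func (s *TOTPService) IsEnabled(userID int64) (bool, error) {
	secret, err := s.store.GetSecret(userID)
	if err != nil {
		return false, fmt.Errorf("load TOTP secret for user %d: %w", userID, err)
	}
	return secret != "", nil
}

// memStore is an in-memory implementation, usable as a test double.
type memStore map[int64]string

func (m memStore) GetSecret(userID int64) (string, error) {
	return m[userID], nil // a missing user yields the empty secret
}

func main() {
	svc := NewTOTPService(memStore{1: "JBSWY3DPEHPK3PXP"})
	enabled, err := svc.IsEnabled(1)
	fmt.Println(enabled, err)
}
```

Any type with a matching `GetSecret` method satisfies the interface, so a test double needs no knowledge of the production repository.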

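## 6. Non-test `panic` amplifies production risk

- A `panic` in a compatibility entry point becomes a process-level risk on later reuse, in tests, or on a mistaken call, even if the current main path never reaches it.
- Conclusion:
  - `panic` in non-test code must be driven to zero and kept there.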

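## 7. `smoke` can stay, but must be explicitly downgraded

- Diagnostic scripts are valuable, but they must not be presented as a substitute for "main acceptance passed".
- Conclusions:
  - `smoke` serves only as supplementary diagnostics.
  - Main acceptance must go through the real main chain.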

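## 8. Frontend dialog problems are defects, not cosmetic flaws

- Native browser dialogs directly interrupt the real admin main flow and automated runs.
- In this project, adding blocking and logging to `window.alert/confirm/prompt/open` noticeably improved verification stability.
- Conclusion:
  - Native dialogs and popups both count as failure signals.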

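## 9. Documentation that is not updated with the code soon misleads the team

- If the real status, rules, and release thresholds are not updated promptly, later collaboration keeps repeating pitfalls that were already hit.
- Conclusion:
  - Status, rules, experience notes, and agents must all be maintained together with the code.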

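## 10. What still counts as a real gap

The following are not cases of "the code is unfinished"; they still lack complete external delivery evidence:

- Live browser validation against real third-party OAuth
- External Secrets Manager / KMS evidence
- Evidence of secret distribution across multi-environment CI/CD
- Rollback evidence for schema downgrade across historical versions
- Full OS-level automation evidence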

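## 11. Multi-agent parallelism is the critical path to efficiency

- Starting 2026-04-02, a Gitea remote repository was introduced as the collaboration baseline.
- Subsequent iterations use a multi-agent parallel model:
  - Option-comparison phase: multiple agents produce different proposals in parallel, and the decision maker chooses the best one.
  - Implementation phase: independent tasks run in parallel; dependent tasks run in topological order.
  - Verification phase: backend tests, frontend lint/build, and E2E tests run in parallel.
- Lessons learned:
  - Task decomposition must state dependencies explicitly; otherwise parallel work blocks itself.
  - When multiple agents modify the same file, that overlap must be identified and coordinated during task decomposition.
  - Running verification in parallel significantly shortens the feedback cycle.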

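## 12. Comparing options avoids detours

- New core features and architecture changes must start with an option comparison.
- Comparison dimensions: implementation complexity, performance impact, maintainability, compatibility with the existing architecture, and testing difficulty.
- The reason for choosing the selected option must be recorded, and so must the reason for rejecting each rejected option.
- Lessons learned:
  - Implementing without comparing options tends to surface a better option late, which forces rework.
  - Comparison records are an important part of the team's accumulated knowledge.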

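## 13. The core of fast iteration is small-step verification

- Each iteration cycle lasts no more than 2 hours.
- The verification matrix runs immediately after each iteration.
- If verification fails, roll back immediately to the last working state.
- A block lasting more than 30 minutes must be escalated with a request for help.
- Lessons learned:
  - Large commits raise rollback cost and make diagnosis harder.
  - Fast verification exposes broken design links and implementation deviations early.
  - Continuous verification is more reliable than final verification.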

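## 14. Test thoroughness determines release confidence

- New code must have corresponding tests.
- Bug fixes must have regression tests.
- Security-sensitive code must have boundary-condition tests.
- Lessons learned:
  - A code change without tests is a time bomb.
  - Regression tests keep fixed problems from coming back.
  - Boundary-condition tests find the most hidden defects.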

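## 15. Fake tests are more dangerous than no tests

- Fake tests create the illusion of "passed", delaying exposure of problems and raising the cost of diagnosis.
- Fake-test patterns found in this project:
  - Using mock responses in place of real API calls for E2E verification
  - Hard-coding expected results in tests instead of going through the real business chain
  - Skipping authentication, permission checks, and other security steps and asserting page state directly
  - Using `context.Background()` in tests to bypass context governance
- Conclusions:
  - E2E tests must start a real backend process and a real frontend server.
  - User actions must be executed through a real browser (CDP protocol).
  - Real API responses and real database state changes must be verified.
  - The project's current real E2E path is `cd frontend/admin && npm.cmd run e2e:full:win`.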

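## 16. Browser automation tools extend E2E capability

- Playwright CDP E2E already covers admin bootstrap, registration, email activation, login, the authentication workflow, responsive layout, and desktop/mobile navigation.
- Some complex interaction scenarios are still not covered:
  - Device trust management
  - Bulk operations
  - System settings page
  - Admin management page
  - Login log export
- Browser automation tools such as `agent-browser` (bb browse) should be introduced in the future to:
  - cover interaction scenarios Playwright does not reach
  - add end-to-end verification of complex business flows
  - provide more flexible simulation of user actions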

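## 17. New lessons from the 2026-04-10 multi-round review

- The consecutive reviews on 2026-04-08, 2026-04-09, and 2026-04-10 showed that the hard part is not turning a stub into a live implementation, but bringing the live chain to a state that is governable, rollback-capable, and verifiable.
- Once `GetUserRoles`, `AssignRoles`, `CreateAdmin`, and `DeleteAdmin` went from stub to live, the question escalated from "the feature is not implemented" to "do permission boundaries, transactional consistency, and admin governance hold".
- Lessons learned:
  - "The feature works" is not the end. The first round after going live should add negative tests for unauthorized reads, unauthorized writes, admin self-deletion, the last admin, and rollback on failure.
  - High-risk governance surfaces cannot rest on default assumptions; they must be guarded by explicit rules and tests.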

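## 18. A green main entry point matters more than local green lights

- The consecutive reviews repeatedly showed that green results from `go vet ./...`, `go build ./cmd/server`, and `go test ./... -short -count=1` cannot stand in for the full `go test ./... -count=1` and `npm.cmd run e2e:full:win`.
- In the 2026-04-10 review, `LL_001` still failed the full backend test run and `e2e:full:win` was still stuck at the wrapper entry point, which shows that "each step passes" and "the main entry point passes" are two different things.
- Lessons learned:
  - Release decisions must follow the main entry points the documentation supports.
  - Any failure in a script wrapper layer is a real failure and must not be masked by local green lights below it.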

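## 19. Test noise is also a quality problem

- Even when frontend `test:run` and `test:coverage` finally report success, jsdom `Not implemented` noise for `window.alert` in the output means the codebase still carries a defect signal that would break real interaction.
- Lessons learned:
  - "Noise after the success summary" is not a clean pass.
  - Native dialogs and popups should continue to be handled as defects, not as low-priority cosmetic issues.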

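## 20. Documentation that lags behind code causes a second round of rework

- Another stable conclusion from the multi-round review: if status documents, quality standards, release checklists, and technical guides are not updated to match the real conclusions, they soon mislead later collaboration.
- Lessons learned:
  - Once a review changes the real conclusion, the documents must be synchronized in the same round.
  - Documentation is not wrap-up material; it is the input to the next round of decisions.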

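## 21. Partially done means not done

- Found in this project: a claim of "swagger annotations added" when only some methods had been annotated.
- Found in this project: a claim of "response format unified" while 3 endpoints in the SSO handler were still not unified.
- Found in this project: a claim of "test infrastructure defined" while the `IntegrationRedisSuite` type had never been defined.
- Lessons learned:
  - In a quality context, "80% done" equals "not done".
  - Verification must be item by item, not based on an aggregate number.
  - The integrity check must run before every commit.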

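## 22. Integrity checks must be automated

- Manual checks are easily skipped or missed.
- Lessons learned:
  - An automated script must verify swagger annotation completeness.
  - The integrity check must be integrated into CI.
  - The PR checklist must explicitly list the integrity verification commands.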

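## 23. Where the gap between claimed and actual comes from

False completion usually comes from:

1. **Calling partial work complete**: claiming "done" when swagger annotations are 80% complete
2. **Inconsistent format**: claiming "unified" when most cases are unified but exceptions remain
3. **Undefined types**: claiming "tests pass" when a referenced type is undefined and the tests never ran
4. **Distorted coverage numbers**: a high share of mock tests still counted toward coverage

Preventive measures:

- Integrity checks must be item by item
- Coverage must be verified against tests that actually ran
- Type references must be verified against existing definitions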
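
## 2026-04-18 Lessons from review to remediation

This appendix records engineering lessons drawn from the 2026-04-17 report review and the 2026-04-18 documentation alignment.

### 1. A review report is not a live status page

- A report can remain technically valuable while its gate summary goes stale quickly.
- The team must keep two kinds of facts apart:
  - findings as of the report date
  - the real gate status of the current workspace
- When the two are mixed, execution order and priority judgments drift quickly.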

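### 2. Fresh command evidence beats inherited conclusions

- `go test ./... -count=1` was treated as red in the review material, but a fresh run in the current workspace had already turned green.
- Meanwhile, frontend `lint` had turned red again.
- Lesson:
  Refresh the real gates before scheduling the order of fixes.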

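### 3. Going from stub to live brings a second wave of risk

- `AssignRoles`, `CreateAdmin/DeleteAdmin`, and `UploadAvatar` have moved past the old "not implemented" stage.
- Once an implementation is live, the dominant risk shifts from "missing feature" to:
  - authorization boundaries
  - transactionality
  - public exposure surface
  - self-operation / last-admin governance
- Lesson:
  A live implementation must be re-reviewed as a new security and governance surface; it cannot be marked "closed" just because the stub is gone.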

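### 4. Release blockers are often broken policy chains, not unwritten code

- Password login bypassing the TOTP/device-trust check is more of a real release blocker than many conspicuous "missing features".
- Fail-open refresh token revocation is also a release blocker, even though the code path itself already exists.
- Lesson:
  In an authentication system, "implemented" does not mean "complete"; a broken security policy chain is a critical defect.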

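### 5. A true finding does not excuse sloppy wording

- The LIKE search problem is real, but describing it loosely as generic SQL injection overstates the specific defect type.
- The password-reset replay problem is also real, but the vulnerable path must be identified precisely.
- Lesson:
  The severity level can stay the same, but the wording must be more precise; precise wording speeds up the fix and reduces unproductive debate.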
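
The source does not spell out the exact LIKE defect. Assuming it is of the unescaped-wildcard kind, where `%` and `_` in user input act as pattern characters even though the value is bound as a parameter, a Go sketch of the narrower fix looks like this. `escapeLike` and `containsPattern` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"strings"
)

// likeEscaper escapes the escape character first, then the two LIKE wildcards.
// strings.Replacer works in a single pass, so inserted backslashes are not
// escaped a second time.
var likeEscaper = strings.NewReplacer(`\`, `\\`, `%`, `\%`, `_`, `\_`)

// escapeLike makes user input match literally inside a LIKE pattern.
// It assumes the query declares backslash as the escape character.
func escapeLike(s string) string {
	return likeEscaper.Replace(s)
}

// containsPattern builds a "contains" pattern around the escaped keyword.
func containsPattern(keyword string) string {
	return "%" + escapeLike(keyword) + "%"
}

func main() {
	// The value is still passed as a bound parameter; escaping only removes
	// the wildcard meaning of % and _ in the user's keyword.
	const query = `SELECT id FROM users WHERE username LIKE ? ESCAPE '\'`
	fmt.Println(query)
	fmt.Println(containsPattern("50%_off"))
}
```

This is the distinction the lesson draws: the parameter binding already prevents generic SQL injection, while the wildcard handling is a separate and smaller defect class. The `ESCAPE` clause syntax varies between databases, so the query string above is illustrative only.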

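### 6. A green main entry point matters more than local green lights

- A successful local command cannot replace success of the project's officially supported main command.
- A wrapper-layer or top-level command failure is a real project failure, even if deeper subcommands pass on their own.
- Lesson:
  Every conclusion must align with the main acceptance entry points declared in the documentation.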

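### 7. Documentation drift causes rework

- `REAL_PROJECT_STATUS`, the review report, and the team standards have already started to drift apart.
- That drift steers the next round of fixes toward outdated priorities.
- Lesson:
  Updating documentation is not post-delivery cleanup; it is part of the delivery itself.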

## 0. 2026-04-23 E2E Recovery Lessons

Use this section as the newest summary of what changed in the workspace after the E2E recovery. If older notes elsewhere in this file conflict with it, trust this section.

- A green main browser gate was recovered by fixing real product and test mismatches, not by wrapper retries alone.
- The concrete regressions found in this recovery were:
  - `DevicesPage` cursor flow could self-trigger a second page request and flood `/admin/devices`.
  - `webhooks` and `social-accounts` services decoded the wrong backend response shapes.
  - `settings` service unwrapped `data` twice even though the shared HTTP client had already returned `result.data`.
  - Broad text-based Playwright assertions in later admin scenarios created brittle false negatives.
- The latest evidence set for this recovery was:
  - `cd frontend/admin && npm.cmd run test:run -- src/pages/admin/DevicesPage/DevicesPage.test.tsx`
  - `cd frontend/admin && npm.cmd run test:run -- src/services/webhooks.test.ts`
  - `cd frontend/admin && npm.cmd run test:run -- src/pages/admin/WebhooksPage/WebhooksPage.test.tsx`
  - `cd frontend/admin && npm.cmd run test:run -- src/services/social-accounts.test.ts`
  - `cd frontend/admin && npm.cmd run lint`
  - `cd frontend/admin && npm.cmd run build`
  - `cd frontend/admin && npm.cmd run e2e:full:win`
- Practical rule: when `e2e:full:win` fails late in the suite, inspect both real application behavior and locator or route assumptions before blaming only browser or CDP instability.

## 2026-04-23 Governance Lessons From E2E Recovery

- A red browser gate can hide several different failure classes at once: product bug, integration-contract drift, selector drift, and browser-runtime instability.
- This recovery was closed by fixing real contract and locator problems, not by increasing retries around the wrapper.
- Pagination regressions are high-noise defects: they often show up as rate limiting, empty lists, or flaky E2E much earlier than they show up as obvious local exceptions.
- Response-envelope mismatches are easy to miss when pages silently fall back to empty arrays or partial data; service tests must pin the real backend field names.
- Documentation lag recreates stale priorities. Once the supported browser gate changes state, norms and experience docs need the same-day update.
- Browser-suite retry logic can create false failures when the first attempt mutates one-time backend state. Retry code has to re-read live preconditions instead of replaying stale assumptions.

## 2026-04-23 Password Reset Expansion Lessons

- Capability endpoints and mounted routes are one product contract. If the route is live but the capability bit is false, the browser surface is still effectively broken.
- A targeted green scenario is not enough evidence when the supported gate is the full suite. The 19th scenario only counted after `cd frontend/admin && npm.cmd run e2e:full:win` stayed green.
- Late-suite CDP page loss is best treated as a recoverable connection problem first, not as a reason to blindly multiply wrapper retries.
- Real auth coverage is worth the setup cost. The password-reset scenario now proves SMTP capture, token validation, password reset submission, and post-reset login in one browser chain.

## 2026-04-23 Permissions CRUD Closure Lessons

- A red browser scenario can come from product behavior, adapter drift, auth-header handling, or runner observation gaps. The fastest path was to separate those four possibilities instead of assuming every timeout meant browser flakiness.
- A successful browser fetch does not guarantee that Playwright CDP request or response listeners will observe the call under every proxy path. When the UI updated and the in-page fetch log showed `201` and `200`, the correct conclusion was "runner evidence gap", not "permission create is broken".
- Shared HTTP client state is easy to misread under concurrency. "A refresh is in flight" and "this request lacks a usable token" are different facts; merging them creates false auth regressions.
- Adapter normalization changes must update both focused service tests and aggregate service suites. Fixing only the local adapter test leaves a second failure surface in cross-service regression packs.
- Modal animations are a real source of E2E false negatives. A dialog that is visually closing can still block clicks long enough to break the next CRUD step unless the runner waits for the overlay to stop intercepting input.
- Build tooling can be a real release blocker. Vite root resolution on Windows became part of the supported gate the moment `npm.cmd run build` started failing under the documented command.
- The 20th browser scenario only counted after two proofs existed on the same branch state: the targeted `permissions-management-crud` run and the full `cd frontend/admin && npm.cmd run e2e:full:win` run.

## 2026-04-24 Profile Security Contract Recovery Lessons

- Browser E2E is often the first place where outbound write contracts are validated end to end. A service adapter can look fine in page-level tests while still sending the wrong backend field names.
- Service tests must assert the serialized write payload, not only the UI form model. Otherwise the test suite can lock in the wrong contract and make the browser suite the first honest signal.
- Orphaned async diagnostics waste debugging time. A failed click or fill should not leave a background fetch waiter alive long enough to crash during cleanup and hide the real failing step.
- A targeted scenario recovery is still not enough evidence on its own. The `profile-and-security` fix only counted after `cd frontend/admin && npm.cmd run e2e:full:win` returned green on the same workspace state.

## 2026-04-24 Profile Contract And Gate Reality Lessons

- A green profile page in mocked tests does not prove the real user-detail contract. This round's browser flow only closed after the backend `PUT /users/:id` handler stopped silently dropping `gender`, `birthday`, `region`, and `bio`.
- Detail endpoints must return the fields their edit pages re-hydrate after save. Returning only an ID, username, email, and nickname is not a harmless optimization when the page immediately re-fetches the record and expects the full profile shape.
- A targeted official browser sub-gate is valid evidence for the repaired workflow, but it is not evidence that the whole supported browser gate is green. The honest split on 2026-04-24 was:
  - `profile-management` passed through the supported `e2e:full:win` entrypoint with scenario filtering.
  - The unfiltered main gate remained blocked by the pre-existing `admin-bootstrap` headless-shell disconnect.
- Wrapper drift matters. Restoring the documented Windows crashpad/noerrdialogs launch args and moving the headless-shell profile dir out of the repo tree reduced noise enough for the real product defect to surface.

## 2026-04-24 Scenario-Isolated Browser Orchestration Lessons

- When Chromium-family browsers all show the same host-level `crashpad` or `mojo platform_channel` access-denied signals, it is no longer rigorous to keep treating every E2E collapse as a product bug.
- Shared backend state does not require a shared browser process. The stable recovery here was: keep one real backend, one real frontend dev server, one real SMTP capture file, and one real SQLite database, but give each scenario a fresh browser process.
- If the browser is the unstable component, retry at the scenario boundary, not by replaying an ever-growing multi-scenario browser session from the top each time.
- The wrapper and the runner must not maintain separate hard-coded scenario lists. Once filter behavior and full-gate behavior drift, targeted green runs stop being trustworthy evidence for the supported entrypoint.

## 2026-04-24 Device IDOR Closure Lessons

- A handler-level auth check on a sibling route does not protect the rest of a resource family. `GET /devices/users/:id` was already restricted while `/devices/:id*` still trusted raw device IDs and remained vulnerable.
- Ownership-sensitive APIs need actor-aware service entry points. Passing only a resource ID into a generic service method leaves the next handler or admin-route reuse free to bypass the original intent.
- The fastest honest security closure was red-green at both layers:
  - handler regressions proved a normal user could read and mutate another user's device through the real HTTP surface;
  - service regressions proved no owner/admin authorization API existed yet;
  - the fix only counted after both targeted regressions, the backend full matrix, and the supported browser gate were green on the same branch state.