31 Commits

Author SHA1 Message Date
Your Name
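3b70fe1865 P4-A: Shared auth/logging capabilities across the three services - shared package boundary definition / golden tests / contract tests
- gateway/internal/shared/: add the new shared/auth and shared/logging packages
- shared/logging: LogEntry/Logger/NewLogger/sanitizeFields, 7 golden-output tests
- shared/auth: ExtractBearerToken/HasExternalQueryKey/WriteAuthError/AuditEvent, 8 contract tests
- docs/plans/2026-04-21-shared-auth-logging-analysis.md: full P4-A analysis document

Migration order: logging (step 1) -> auth basics (step 2) -> audit (step 3) -> contract tests (step 4)
Shared boundary: JWT verification, token status lookup, authorization policy, and BruteForce remain service-specific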
2026-04-21 19:00:25 +08:00
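A minimal sketch of what a shared bearer-token extractor such as the `ExtractBearerToken` named in commit 3b70fe1865 might look like. Only the function name comes from the commit; the signature (a raw header value in, a token and a boolean out) and the rejection rules are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// ExtractBearerToken returns the token from an Authorization header value of
// the form "Bearer <token>". The boolean is false when the value is empty,
// uses another scheme, or carries no usable token.
// NOTE: hypothetical signature; the commit only names the function.
func ExtractBearerToken(authHeader string) (string, bool) {
	const prefix = "bearer "
	v := strings.TrimSpace(authHeader)
	// The scheme is compared case-insensitively.
	if len(v) <= len(prefix) || !strings.EqualFold(v[:len(prefix)], prefix) {
		return "", false
	}
	token := strings.TrimSpace(v[len(prefix):])
	// Reject empty tokens and tokens with embedded whitespace.
	if token == "" || strings.ContainsAny(token, " \t") {
		return "", false
	}
	return token, true
}

func main() {
	for _, h := range []string{"Bearer abc123", "Basic dXNlcjpwdw==", ""} {
		token, ok := ExtractBearerToken(h)
		fmt.Printf("%q -> token=%q ok=%v\n", h, token, ok)
	}
}
```

Keeping the helper free of JWT verification matches the shared boundary stated in the commit: parsing the header is shared, while validating the token stays service-specific.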
Your Name
e249a9160b P3-C: Unify the observability surface of the three services - unified metrics endpoint / health check aliases / trace ID propagation
Gateway:
- remote_runtime.go: P3-C-08 propagate X-Request-Id from the request context to platform-token-runtime

Supply-api:
- Add internal/metrics/metrics.go: HTTP request count / latency / token issuance / worker queue metrics (Prometheus text format)
- Add internal/metrics/metrics_test.go: 6 tests
- bootstrap.go: register /metrics (P3-C-01/04) and the /health and /healthz aliases (P3-C-05)

Platform-token-runtime:
- bootstrap.go: add the /health and /livez aliases (P3-C-05)

/metrics on all three services now serves text/plain; version=0.0.4
/health endpoint aliases are unified across the three services
Gateway → platform-token-runtime propagates the trace ID
2026-04-21 18:40:43 +08:00
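A rough sketch of a `/metrics` handler that serves the Prometheus text format with the `text/plain; version=0.0.4` content type mentioned in commit e249a9160b. The `Counter` type, `MetricsHandler`, the `scrape` helper, and the metric name are all hypothetical; the actual internal/metrics package may be organized quite differently.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// Counter is a hypothetical monotonically increasing metric.
type Counter struct {
	Name, Help string
	value      uint64
}

// Inc adds one to the counter; safe for concurrent use.
func (c *Counter) Inc() { atomic.AddUint64(&c.value, 1) }

// MetricsHandler writes every counter in the Prometheus text exposition format.
func MetricsHandler(counters ...*Counter) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "text/plain; version=0.0.4")
		for _, c := range counters {
			fmt.Fprintf(w, "# HELP %s %s\n# TYPE %s counter\n%s %d\n",
				c.Name, c.Help, c.Name, c.Name, atomic.LoadUint64(&c.value))
		}
	})
}

// scrape performs one in-memory GET /metrics and returns the content type and body.
func scrape(h http.Handler) (contentType, body string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/metrics", nil))
	return rec.Header().Get("Content-Type"), rec.Body.String()
}

func main() {
	requests := &Counter{Name: "http_requests_total", Help: "Total HTTP requests."}
	requests.Inc()
	ct, body := scrape(MetricsHandler(requests))
	fmt.Println(ct)
	fmt.Print(body)
}
```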
Your Name
472d9ad4c1 P3-B: Router circuit breaker implementation - health checks / state machine / half-open probing
Gateway:
- ProviderHealth: add circuit breaker fields (CircuitState, ConsecutiveFailures, LastStateChange, OpenReason)
- CircuitBreakerConfig: circuit breaker configuration (FailureRateThreshold=50%, ConsecutiveFailureThreshold=5, HalfOpenSuccessThreshold=3, OpenTimeout=30s)
- circuit.go: circuit breaker state machine (Closed→Open→HalfOpen→Closed)
- healthcheck.go: background health check loop (probes every ProviderHealthCheckInterval + automatic half-open transition)
- RecordResult: integrate circuit breaker state transitions
- isProviderAvailable: CircuitOpen=false, CircuitHalfOpen=true (probing allowed)
- GetCircuitState/SetCircuitConfig: management interfaces
- metrics.go: add the circuit_state_changes_total metric
- bootstrap.go: BuildServer returns a ServerBundle (containing the Router and ShutdownFunc)
- main.go: adapt to ServerBundle; graceful shutdown stops the health checker
- bootstrap_test.go: adapt to ServerBundle

17 new tests; all 50 router tests pass
2026-04-21 17:46:02 +08:00
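The Closed→Open→HalfOpen→Closed state machine from commit 472d9ad4c1 can be sketched roughly as follows. The field names mirror the commit, but the `Breaker` type and its methods are assumptions. This sketch also omits FailureRateThreshold and moves to half-open lazily inside `Allow`, whereas the commit describes a background health check loop doing that transition.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type CircuitState int

const (
	CircuitClosed CircuitState = iota
	CircuitOpen
	CircuitHalfOpen
)

// CircuitBreakerConfig carries a subset of the settings named in the commit.
type CircuitBreakerConfig struct {
	ConsecutiveFailureThreshold int
	HalfOpenSuccessThreshold    int
	OpenTimeout                 time.Duration
}

// Breaker is a hypothetical per-provider circuit breaker.
type Breaker struct {
	mu        sync.Mutex
	cfg       CircuitBreakerConfig
	state     CircuitState
	failures  int // consecutive failures
	successes int // successes observed while half-open
	openedAt  time.Time
}

func NewBreaker(cfg CircuitBreakerConfig) *Breaker { return &Breaker{cfg: cfg} }

// Allow reports whether a request may be sent. An open circuit becomes
// half-open once OpenTimeout has elapsed, which admits probe requests.
func (b *Breaker) Allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.state == CircuitOpen {
		if time.Since(b.openedAt) < b.cfg.OpenTimeout {
			return false
		}
		b.state = CircuitHalfOpen
		b.successes = 0
	}
	return true
}

// RecordResult feeds one request outcome into the state machine.
func (b *Breaker) RecordResult(success bool) {
	b.mu.Lock()
	defer b.mu.Unlock()
	switch {
	case !success && b.state == CircuitHalfOpen:
		b.trip() // a failed probe reopens the circuit
	case !success:
		b.failures++
		if b.state == CircuitClosed && b.failures >= b.cfg.ConsecutiveFailureThreshold {
			b.trip()
		}
	case b.state == CircuitHalfOpen:
		b.successes++
		if b.successes >= b.cfg.HalfOpenSuccessThreshold {
			b.state = CircuitClosed
			b.failures = 0
		}
	default:
		b.failures = 0 // a success resets the consecutive-failure count
	}
}

// trip opens the circuit; callers must hold b.mu.
func (b *Breaker) trip() {
	b.state = CircuitOpen
	b.openedAt = time.Now()
	b.failures, b.successes = 0, 0
}

func (b *Breaker) State() CircuitState {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.state
}

func main() {
	b := NewBreaker(CircuitBreakerConfig{
		ConsecutiveFailureThreshold: 5,
		HalfOpenSuccessThreshold:    3,
		OpenTimeout:                 30 * time.Second,
	})
	for i := 0; i < 5; i++ {
		b.RecordResult(false)
	}
	fmt.Println("open:", b.State() == CircuitOpen, "allowed:", b.Allow())
}
```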
Your Name
ae2b1bfe75 P3-A: Token Runtime cache layer implementation - HTTPTimeout / LRU eviction / hit-rate metrics
Full Phase 3-A implementation, including:

Gateway (lijiaoqiao/gateway):
- RemoteTokenRuntime cache: TTL eviction with active=30s/expired=2m/revoked=10m
- LRU capacity eviction (max_entries=10000, evicted in insertion order)
- HTTPTimeoutConfig: 4 environment variables (Dial/KeepAlive/Read/Write/MaxIdle)
- Cache hit-rate metric: GetCacheHitRate() + per-instance statistics
- Upstream latency metric: RecordTokenRuntime() histogram
- buildTimeoutClient: HTTP client factory based on HTTPTimeoutConfig
- New tests: 22 matrix tests (remote_runtime_matrix_test.go, config_test.go)

Platform Token Runtime (lijiaoqiao/platform-token-runtime):
- metrics/metrics.go: GetCacheHitRate() method
- inmemory_runtime.go: GetCacheHitRate() implementation

Changed files (8 modified + 5 new):
- gateway/internal/middleware/remote_runtime.go    # core cache implementation
- gateway/internal/middleware/remote_runtime_test.go
- gateway/internal/middleware/remote_runtime_cache_test.go
- gateway/internal/middleware/remote_runtime_matrix_test.go
- gateway/internal/middleware/remote_runtime_metrics_test.go
- gateway/internal/metrics/metrics.go             # new
- gateway/internal/config/config.go                # HTTPTimeoutConfig
- gateway/internal/config/config_test.go
- gateway/internal/app/bootstrap.go                # initialization order
- gateway/internal/router/router.go                # metrics injection
- platform-token-runtime/internal/metrics/metrics.go  # new
- platform-token-runtime/internal/app/bootstrap.go
- platform-token-runtime/internal/auth/service/inmemory_runtime.go
2026-04-21 17:27:51 +08:00
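A minimal sketch of the caching behaviour described in commit ae2b1bfe75: a per-status TTL (active=30s, expired=2m, revoked=10m), capacity eviction in insertion order, and a hit-rate figure. The `TokenCache` type and its method names are hypothetical, and the sketch is not safe for concurrent use, which the real RemoteTokenRuntime cache presumably needs to be.

```go
package main

import (
	"container/list"
	"fmt"
	"time"
)

type entry struct {
	key       string
	status    string
	expiresAt time.Time
}

// TokenCache is a hypothetical single-goroutine cache of token statuses.
type TokenCache struct {
	maxEntries   int
	ttl          map[string]time.Duration
	order        *list.List // front = oldest insertion
	items        map[string]*list.Element
	hits, misses uint64
}

func NewTokenCache(maxEntries int) *TokenCache {
	return &TokenCache{
		maxEntries: maxEntries,
		ttl: map[string]time.Duration{
			"active":  30 * time.Second,
			"expired": 2 * time.Minute,
			"revoked": 10 * time.Minute,
		},
		order: list.New(),
		items: make(map[string]*list.Element),
	}
}

// Put stores a status and evicts the oldest insertions when the cache is full.
// A status without a configured TTL gets a zero TTL and expires immediately.
func (c *TokenCache) Put(key, status string) {
	if el, ok := c.items[key]; ok {
		c.order.Remove(el)
		delete(c.items, key)
	}
	for c.maxEntries > 0 && c.order.Len() >= c.maxEntries {
		oldest := c.order.Front()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
	e := &entry{key: key, status: status, expiresAt: time.Now().Add(c.ttl[status])}
	c.items[key] = c.order.PushBack(e)
}

// Get returns the cached status unless the entry is missing or past its TTL.
func (c *TokenCache) Get(key string) (string, bool) {
	el, ok := c.items[key]
	if ok && time.Now().Before(el.Value.(*entry).expiresAt) {
		c.hits++
		return el.Value.(*entry).status, true
	}
	if ok { // present but expired: drop it
		c.order.Remove(el)
		delete(c.items, key)
	}
	c.misses++
	return "", false
}

// HitRate returns hits / (hits + misses), or 0 before any lookup.
func (c *TokenCache) HitRate() float64 {
	total := c.hits + c.misses
	if total == 0 {
		return 0
	}
	return float64(c.hits) / float64(total)
}

func main() {
	c := NewTokenCache(2)
	c.Put("tok-a", "active")
	c.Put("tok-b", "revoked")
	c.Put("tok-c", "active") // evicts tok-a, the oldest insertion
	_, okA := c.Get("tok-a")
	statusB, okB := c.Get("tok-b")
	fmt.Println(okA, statusB, okB, c.HitRate())
}
```

Because `Get` does not reorder entries, eviction here follows insertion order as the commit states, which is closer to FIFO than to a strict least-recently-used policy.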
Your Name
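1fec3e981d feat(ci): implement all code for the strict Phase 1/2 exit criteria
Phase 1 Criterion 4: contract test scenario list → backend-verify.sh --phase1-contract-gate (four scenarios: valid token end-to-end, revocation rejected, insufficient scope rejected, runtime fail-fast), invoked from repo_integrity_check.sh

Phase 2 Criterion 1: manifest.json system (lib/manifest_lib.sh + staging_release_pipeline.sh); run_id is a hard gate, and manifest_hard_gate_run_id() verifies that it is non-empty

Phase 2 Criterion 2: the exit 1 condition in superpowers_stage_validate.sh is extended from NO_GO to also cover CONDITIONAL_GO; the staging hard gate no longer lets conditional passes through

Phase 2 Criterion 3: DEFERRED semantics corrected; CONDITIONAL_GO no longer appears among the re-review conclusion options, and CONDITIONAL_GO forces exit 1 in the pipeline

Phase 2 Criterion 5: cross_service_smoke.sh changes from DESIGN_ONLY to executable (exit 0=PASS/1=FAIL/2=SKIP_LOCAL_PLACEHOLDER) and is included in staging_release_pipeline.sh as STEP-03

Phase 2 Criterion 4: configuration separation (landed earlier; confirmed in this commit)

Environment issues log: docs/plans/2026-04-21-environmental-issues-log.md
- P3-A: HTTP timeout + cache eviction (requires a real staging env + env var hot-reload support)
- P3-B/C: /metrics endpoint (requires Prometheus scrape configuration + ops involvement)
- P3-D: graceful shutdown (requires load-test validation with staging traffic)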
2026-04-21 12:14:50 +08:00
Your Name
b3e34c6e36 feat(ci): normalize shared environment semantics 2026-04-21 09:34:29 +08:00
Your Name
b864a4ef1b docs(plan): tighten token authority contract
Record the OpenAPI vs canonical principal gap, add tenant_id to the introspection response contract, and make the gateway README explicit that non-dev environments must use remote introspection.
2026-04-21 08:01:07 +08:00
Your Name
1f56b32257 feat(logging): unify structured startup logs 2026-04-20 19:55:38 +08:00
Your Name
414ecbb08c fix(token-runtime): preserve fingerprint on refresh and revoke 2026-04-20 10:47:59 +08:00
Your Name
014c183c84 fix: correct environment issues doc and add missing config improvements
- Remove fabricated etcd/Kafka/AWS issues from TEST_ENVIRONMENT_ISSUES.md
  (codebase contains zero references to these dependencies)
- Add Kafka/etcd/CloudWatch clarification: early design docs discuss
  these but actual implementation uses none of them
- Add getEnvInt() for GATEWAY_PORT env variable support
- Add devtest stack scripts for local development
- Update verification report and repair plan status
2026-04-18 11:34:58 +08:00
Your Name
ebd11867c3 docs(gateway): clarify advanced routing strategy status 2026-04-17 20:05:56 +08:00
Your Name
7434496470 feat(gateway): serve models from registered providers 2026-04-17 20:04:05 +08:00
Your Name
0b8de726a8 fix(gateway): fail closed on secret and cors defaults 2026-04-17 20:00:43 +08:00
Your Name
a31ea09045 test(gateway): realign mux and error response assertions 2026-04-17 16:24:05 +08:00
Your Name
ad776e4079 fix: P0/P1 security fixes across gateway, token-runtime, and supply-api
P0 fixes:
- platform-token-runtime: Add store.Save() after Refresh token update (P0-3)
- platform-token-runtime: Add sync.RWMutex to InMemoryRuntimeStore (P0-4)
- platform-token-runtime: Add bearer token auth to /audit-events endpoint (P0-5)
- gateway: Fail startup in production if PASSWORD_ENCRYPTION_KEY uses default (P0-1)
- gateway: Require explicit CORS_ALLOW_ORIGINS in production (P0-2)

P1 fixes:
- gateway: Add TrustedProxies config field + env var GATEWAY_TRUSTED_PROXIES (P1-5)
- gateway: Sanitize X-Request-ID header to prevent log injection (P1-6)
- gateway: Strip internal error details from error responses to clients (P1-7)
- supply-api: Upgrade deriveDEK from trivial byte-rotation to HKDF-SHA256 (P1-1)
- supply-api: Reject HS256/HS384/HS512 in production, require RSA (P1-2)

Code quality fixes:
- supply-api: Add BruteForceMaxAttempts + BruteForceLockoutDuration to AuthConfig (MED-12)
- supply-api: Add TrustedProxies to token_auth_middleware (IP spoofing protection)
- supply-api: Use shared pathutil.SplitPath instead of duplicate splitPath
- supply-api: Fix query_key_reject_middleware call sites with trustedProxies param
- gateway: Wire TrustedProxies into AuthMiddlewareConfig and extractClientIP
- gateway: Add CORSAllowOrigins to AuthConfig, wire into CORSMiddleware
- gateway: Fix CompletionsHandle to have context and RecordResult like ChatCompletions
- gateway: Add sanitizeRequestID helper for X-Request-ID log injection prevention
- gateway: Add os import for PASSWORD_ENCRYPTION_KEY check
- gateway: Add strings import to handler.go for sanitizeRequestID

Environment issues documented in TEST_ENVIRONMENT_ISSUES.md
2026-04-17 14:36:02 +08:00
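A minimal sketch of an X-Request-ID sanitizer in the spirit of the `sanitizeRequestID` helper named in commit ad776e4079 (P1-6). The allowed character set and the 64-character cap are assumptions; the commit only states that the header is sanitized to prevent log injection.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeRequestID keeps only characters that are safe to write into a log
// line and caps the length. CR, LF, spaces, and '=' are dropped, so a caller
// cannot forge extra log lines or key=value fields.
// NOTE: the character set and maxLen are illustrative assumptions.
func sanitizeRequestID(id string) string {
	const maxLen = 64
	var b strings.Builder
	for _, r := range id {
		if b.Len() >= maxLen {
			break
		}
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9',
			r == '-', r == '_', r == '.':
			b.WriteRune(r)
		}
	}
	return b.String()
}

func main() {
	fmt.Printf("%q\n", sanitizeRequestID("req-42\r\nlevel=error msg=forged"))
}
```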
Your Name
567446bb25 test(repo): cover untested core packages 2026-04-15 10:17:32 +08:00
Your Name
88d842648d chore(repo): align integrity entrypoints with current state
Rewrite module READMEs around the current verified run and test paths, tighten repo_integrity_check.sh with fact-source checks, update supply-api migration baseline, and remove the platform-token-runtime audit query placeholder response.
2026-04-14 12:29:13 +08:00
Your Name
cd26802d7d refactor(gateway): extract bootstrap and provider registry 2026-04-14 10:45:30 +08:00
Your Name
d28f83a6a8 chore(repo): add integrity baseline check 2026-04-14 10:38:24 +08:00
Your Name
dfa8a891ab fix(gateway): harden cors origin validation
Reject non-whitelisted origins on actual requests and format Access-Control-Max-Age correctly. This keeps wildcard subdomain matching explicit and avoids silently serving blocked origins.
2026-04-11 09:33:33 +08:00
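The explicit wildcard-subdomain matching mentioned in commit dfa8a891ab could look roughly like this. `originAllowed` and the `https://*.example.com` whitelist syntax are assumptions for illustration; the gateway's cors.go may use a different pattern format.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// originAllowed reports whether origin matches the whitelist. Entries are
// either exact origins or "<scheme>://*.<domain>" wildcard patterns.
// NOTE: hypothetical helper; the pattern syntax is an assumption.
func originAllowed(origin string, allowed []string) bool {
	u, err := url.Parse(origin)
	if err != nil || u.Scheme == "" || u.Host == "" {
		return false
	}
	for _, a := range allowed {
		if a == origin {
			return true
		}
		if i := strings.Index(a, "://*."); i > 0 {
			scheme, suffix := a[:i], a[i+4:] // suffix keeps the leading dot
			// The leading dot stops "evilexample.com" from matching
			// "*.example.com"; the length check requires a real subdomain label.
			if u.Scheme == scheme && strings.HasSuffix(u.Host, suffix) && len(u.Host) > len(suffix) {
				return true
			}
		}
	}
	return false
}

func main() {
	allowed := []string{"https://app.test", "https://*.example.com"}
	for _, o := range []string{"https://api.example.com", "https://evilexample.com", "http://api.example.com"} {
		fmt.Println(o, originAllowed(o, allowed))
	}
}
```

In this sketch a wildcard entry does not cover the bare domain or origins with an explicit port, so those would need their own exact entries.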
Your Name
4adeee2e06 fix: close p0 auth and release gate gaps 2026-04-11 09:25:31 +08:00
Your Name
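d90cc382a4 fix: verify and fix comprehensive_review_v4 issues
Status of verified issues:
1. P0-07 compensation handler - integrated into main.go
2. P0-09 foreign key validator - integrated into main.go and invoked
3. Idempotency protocol (Idempotency-Key) - implemented in idempotency.go
4. Idempotency unique index - defined in SQL

Gateway fixes:
- Fix the cors.go syntax error (duplicate function definition)
- Fix the parameter mismatch in middleware_test.go
- Downgrade go.mod to go 1.21 to resolve dependency issues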
2026-04-08 20:17:07 +08:00
Your Name
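9931075e94 feat(gateway): optimize the OpenAI adapter implementation
1. Use bufio.Scanner instead of io.ReadLine for streaming reads, improving efficiency
2. MapError returns structured ProviderError error codes for easier error handling and tracing
3. Update go.mod to add the required dependencies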
2026-04-03 09:59:32 +08:00
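A minimal sketch of reading a server-sent event stream line by line with `bufio.Scanner`, the approach named in commit 9931075e94. The helper names, the 1 MiB line limit, and the `[DONE]` sentinel handling are assumptions about how an OpenAI-style stream might be consumed, not the adapter's actual code.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// readSSE calls onData for each "data:" payload until the stream ends or a
// "[DONE]" sentinel arrives. Non-data lines (blank separators, comments) are skipped.
func readSSE(r io.Reader, onData func(payload string)) error {
	sc := bufio.NewScanner(r)
	// Raise the per-line limit above the 64 KiB default (assumed 1 MiB cap).
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for sc.Scan() {
		line := sc.Text()
		if !strings.HasPrefix(line, "data:") {
			continue
		}
		payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
		if payload == "[DONE]" {
			return nil
		}
		onData(payload)
	}
	return sc.Err()
}

// collectSSE is a small convenience wrapper used for demonstration.
func collectSSE(stream string) ([]string, error) {
	var chunks []string
	err := readSSE(strings.NewReader(stream), func(p string) { chunks = append(chunks, p) })
	return chunks, err
}

func main() {
	chunks, err := collectSSE("data: {\"delta\":\"Hel\"}\n\ndata: {\"delta\":\"lo\"}\n\ndata: [DONE]\n")
	fmt.Println(chunks, err)
}
```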
Your Name
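a9d304fdfa fix(gateway): fix P2-03, regexp.MustCompile may panic
Replace regexp.MustCompile with regexp.Compile and handle the error,
so that an invalid regular expression no longer causes a panic. The fallback
is a regular expression that never matches (a^), which keeps the service available.

Issue fixed: P2-03 regexp.MustCompile may panic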
2026-04-03 09:58:13 +08:00
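The fallback described in commit a9d304fdfa can be sketched as below. The name `compileRegex` appears in a related commit (b2d32be14f), but the signature and the decision to return both the fallback and the error are assumptions.

```go
package main

import (
	"fmt"
	"regexp"
)

// neverMatch requires an 'a' followed by start-of-text, which is impossible,
// so it matches no input. MustCompile is acceptable here only because the
// pattern is a fixed literal known to be valid.
var neverMatch = regexp.MustCompile(`a^`)

// compileRegex compiles pattern without panicking. On failure it returns the
// never-matching fallback together with the error, so callers can log the
// problem and keep serving.
// NOTE: hypothetical signature, for illustration only.
func compileRegex(pattern string) (*regexp.Regexp, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return neverMatch, fmt.Errorf("invalid pattern %q: %w", pattern, err)
	}
	return re, nil
}

func main() {
	re, err := compileRegex(`sk-[`) // unterminated character class
	fmt.Println(re.MatchString("sk-abc"), err != nil)
}
```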
Your Name
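d44e9966e0 fix(security): fix multiple MED security issues
MED-03: database password configured in plaintext
- Add AES-GCM encryption support in gateway/internal/config/config.go
- Add the EncryptedPassword field and the GetPassword() method
- Support storing the password encrypted and decrypting it on read

MED-04: audit log Route field not validated
- Add the sanitizeRoute() function in supply-api/internal/middleware/auth.go
- Prevent path traversal attacks (.., ./, \, etc.)
- Prevent null byte and newline injection

MED-05: request body size unlimited
- Add a MaxRequestBytes limit (1MB) in gateway/internal/handler/handler.go
- Add the maxBytesReader wrapper
- Add the COMMON_REQUEST_TOO_LARGE error code

MED-08: missing CORS configuration
- Create the CORS middleware in gateway/internal/middleware/cors.go
- Support an origin whitelist and wildcard subdomains
- Support preflight request handling and credentials configuration

MED-09: error messages leak internal details
- Add tests verifying that JWT error messages contain no sensitive information
- The current implementation already returns safe error messages

MED-10: risk of leaking database credentials in logs
- Use GetPassword() instead of Password in gateway/cmd/gateway/main.go
- Avoid logging the plaintext password in the DSN

MED-11: missing token refresh mechanism
- The current verifyToken() already validates token expiry correctly
- Token refresh requires additional refresh token infrastructure

MED-12: missing brute-force protection
- Add the BruteForceProtection struct
- Support configuring the maximum number of attempts and the lockout duration
- Integrate brute-force protection into TokenVerifyMiddleware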
2026-04-03 09:51:39 +08:00
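A rough sketch of the request body limit from MED-05 in commit d44e9966e0, using the standard library's `http.MaxBytesReader`. The 1MB limit and the COMMON_REQUEST_TOO_LARGE code come from the commit; the middleware shape, handler, and helper names are assumptions, and the commit's own maxBytesReader wrapper may be implemented differently.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// MaxRequestBytes is the 1MB limit stated in the commit.
const MaxRequestBytes = 1 << 20

// limitBody caps how much of the request body the next handler may read.
func limitBody(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		r.Body = http.MaxBytesReader(w, r.Body, MaxRequestBytes)
		next.ServeHTTP(w, r)
	})
}

// readAll is a stand-in handler that reads the whole body and answers 413
// with the commit's error code when the limit is exceeded.
func readAll(w http.ResponseWriter, r *http.Request) {
	body, err := io.ReadAll(r.Body)
	if err != nil {
		var tooLarge *http.MaxBytesError // available since Go 1.19
		if errors.As(err, &tooLarge) {
			http.Error(w, "COMMON_REQUEST_TOO_LARGE", http.StatusRequestEntityTooLarge)
			return
		}
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}
	fmt.Fprintf(w, "%d", len(body))
}

// statusForBodySize posts n bytes through the middleware in memory and
// returns the resulting HTTP status code.
func statusForBodySize(n int) int {
	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions",
		strings.NewReader(strings.Repeat("x", n)))
	rec := httptest.NewRecorder()
	limitBody(http.HandlerFunc(readAll)).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusForBodySize(1024), statusForBodySize(MaxRequestBytes+1))
}
```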
Your Name
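b2d32be14f fix(P2): fix 4 minor P2 issues
P2-01: wildcard scope security risk (scope_auth.go)
- Add hasWildcardScope() to detect wildcard scopes
- Add logWildcardScopeAccess() to write audit logs
- Call the audit logging in the RequireScope/RequireAllScopes/RequireAnyScope middleware

P2-02: isSamePayload compares an incomplete set of fields (audit_service.go)
- Add comparison of the ActionDetail field
- Add comparison of the ResultMessage field
- Add comparison of the Extensions field
- Add the compareExtensions() helper function

P2-03: regexp.MustCompile may panic (sanitizer.go)
- Add compileRegex(), a safe compile function that replaces MustCompile
- Handle compile errors to avoid panics

P2-04: StrategyRoundRobin not implemented (router.go)
- Add the selectByRoundRobin() method
- Add the roundRobinCounter atomic counter
- Use atomic.AddUint64 for thread-safe round-robin

P2-05: error messages leak internal details - already handled in MED-09, skipped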
2026-04-03 09:39:32 +08:00
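The round-robin selection from P2-04 in commit b2d32be14f can be sketched with `atomic.AddUint64` as follows. The `RoundRobin` type and the string provider list are simplifications; the router's selectByRoundRobin() presumably works on its own provider type.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// RoundRobin hands out providers in rotation. The counter is advanced with
// an atomic add, so concurrent callers never need a mutex.
// NOTE: hypothetical type for illustration.
type RoundRobin struct {
	counter   uint64
	providers []string
}

func NewRoundRobin(providers []string) *RoundRobin {
	return &RoundRobin{providers: providers}
}

// Next returns the next provider, or false when none are registered.
func (r *RoundRobin) Next() (string, bool) {
	if len(r.providers) == 0 {
		return "", false
	}
	n := atomic.AddUint64(&r.counter, 1) // returns the incremented value
	return r.providers[(n-1)%uint64(len(r.providers))], true
}

func main() {
	rr := NewRoundRobin([]string{"openai", "azure", "local"})
	for i := 0; i < 4; i++ {
		p, _ := rr.Next()
		fmt.Println(p)
	}
}
```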
Your Name
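6924b2bafc fix: fix 6 code quality issues
P1-01: extract the duplicated role hierarchy definition into a package-level constant
- Extract roleHierarchy into the package-level variable roleHierarchyLevels
- Remove the duplicate definitions

P1-02: fix the pseudo-random numbers used for weighted selection
- Replace the timestamp with math/rand's thread-safe random number generator
- Ensure an even distribution for weighted routing

P1-03: fix the FailureRate initialization calculation error
- Change the recovery factor on success from 0.9 to 0.5
- Speed up recovery after failures

P1-04: add concurrency control to DefaultIAMService
- Add a sync.RWMutex to protect map operations
- Ensure all service methods are thread-safe

P1-05: fix the IP spoofing vulnerability
- Add the TrustedProxies configuration
- Use X-Forwarded-For only when the request comes from a trusted proxy

P1-06: fix the rate-limit key extraction logic error
- Extract the Bearer token from the Authorization header
- Avoid using the full header as the rate-limit key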
2026-04-03 07:58:46 +08:00
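A minimal sketch of the rule from P1-05 in commit 6924b2bafc: honour X-Forwarded-For only when the direct peer is a trusted proxy. The `ClientIP` function, its CIDR-string configuration, and the right-to-left walk are assumptions; the gateway's extractClientIP may differ.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// isTrusted reports whether ip falls inside any of the given CIDR ranges.
// Unparseable ranges are ignored.
func isTrusted(ip net.IP, cidrs []string) bool {
	for _, c := range cidrs {
		if _, n, err := net.ParseCIDR(c); err == nil && n.Contains(ip) {
			return true
		}
	}
	return false
}

// ClientIP returns the address to attribute a request to. The forwarded
// header is consulted only when the direct peer is a trusted proxy.
// NOTE: hypothetical helper for illustration.
func ClientIP(remoteAddr, forwardedFor string, trustedCIDRs []string) string {
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		host = remoteAddr // no port present
	}
	peer := net.ParseIP(host)
	if peer == nil || !isTrusted(peer, trustedCIDRs) {
		return host // untrusted peer: ignore the client-supplied header
	}
	// Walk right to left, skipping hops that are themselves trusted proxies.
	parts := strings.Split(forwardedFor, ",")
	for i := len(parts) - 1; i >= 0; i-- {
		ip := net.ParseIP(strings.TrimSpace(parts[i]))
		if ip == nil {
			break // malformed entry: fall back to the peer address
		}
		if !isTrusted(ip, trustedCIDRs) {
			return ip.String()
		}
	}
	return host
}

func main() {
	trusted := []string{"10.0.0.0/8"}
	fmt.Println(ClientIP("203.0.113.9:4444", "1.2.3.4", trusted))
	fmt.Println(ClientIP("10.0.0.5:4444", "1.2.3.4, 10.0.0.7", trusted))
}
```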
Your Name
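90490ce86d fix(gateway): fix regexp compile errors and concurrency safety issues in RuleEngine
P0-05: regexp.Compile errors silently ignored
- extractMatch now returns (string, error)
- Handle regexp.Compile errors properly and return a formatted error message
- Fix the panic caused by invalid regular expressions

P0-06: compiledPatterns not thread-safe
- Add a sync.RWMutex to protect concurrent map access
- Guard matchRegex and extractMatch with read/write locks
- Implement the double-checked locking pattern to optimize performance

Test verification:
- Verified with the -race flag that there are no data races
- Concurrency test with 100 goroutines passes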
2026-04-03 07:48:05 +08:00
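The double-checked locking described for compiledPatterns in commit 90490ce86d might be sketched like this. The `patternCache` type is hypothetical; only the `extractMatch` name, its `(string, error)` return, and the use of `sync.RWMutex` come from the commit.

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

// patternCache memoizes compiled regular expressions for concurrent use.
type patternCache struct {
	mu       sync.RWMutex
	compiled map[string]*regexp.Regexp
}

func newPatternCache() *patternCache {
	return &patternCache{compiled: make(map[string]*regexp.Regexp)}
}

// get returns the compiled pattern, compiling it at most once per pattern.
func (c *patternCache) get(pattern string) (*regexp.Regexp, error) {
	// First check under the cheap read lock.
	c.mu.RLock()
	re, ok := c.compiled[pattern]
	c.mu.RUnlock()
	if ok {
		return re, nil
	}
	// Second check under the write lock: another goroutine may have
	// compiled the pattern between the two lock acquisitions.
	c.mu.Lock()
	defer c.mu.Unlock()
	if re, ok := c.compiled[pattern]; ok {
		return re, nil
	}
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, fmt.Errorf("compile pattern %q: %w", pattern, err)
	}
	c.compiled[pattern] = re
	return re, nil
}

// extractMatch returns the first match of pattern in s, or an error when the
// pattern is invalid, instead of silently ignoring the compile failure.
func (c *patternCache) extractMatch(pattern, s string) (string, error) {
	re, err := c.get(pattern)
	if err != nil {
		return "", err
	}
	return re.FindString(s), nil
}

func main() {
	c := newPatternCache()
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if _, err := c.extractMatch(`gpt-\d+`, "model=gpt-4"); err != nil {
				panic(err)
			}
		}()
	}
	wg.Wait()
	_, err := c.extractMatch("(", "x")
	fmt.Println("cached patterns:", len(c.compiled), "invalid pattern rejected:", err != nil)
}
```

Running the program with `go run -race` is one way to check for data races, in line with the verification step listed in the commit.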
Your Name
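bc59b57d4d fix(gateway): fix routing engine P0 issues
P0-07: add mutex protection to RegisterStrategy, fixing the data race when strategies are registered concurrently
P0-08: add a nil check on decision in SelectProvider so that a nil pointer is not passed on

Using the TDD approach:
1. Write tests that verify the issue exists
2. Fix the code
3. Verify that the tests pass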
2026-04-03 07:46:16 +08:00
Your Name
89104bd0db feat(P1/P2): complete TDD development and the P1/P2 design documents
## Design documents
- multi_role_permission_design: multi-role permission design (CONDITIONAL GO)
- audit_log_enhancement_design: audit log enhancement (CONDITIONAL GO)
- routing_strategy_template_design: routing strategy template (CONDITIONAL GO)
- sso_saml_technical_research: SSO/SAML research (CONDITIONAL GO)
- compliance_capability_package_design: compliance capability package design (CONDITIONAL GO)

## TDD deliverables
- IAM module: supply-api/internal/iam/ (111 tests)
- Audit log module: supply-api/internal/audit/ (40+ tests)
- Routing strategy module: gateway/internal/router/ (33+ tests)
- Compliance capability package: gateway/internal/compliance/ + scripts/ci/compliance/

## Standards documents
- parallel_agent_output_quality_standards: output quality standards for parallel agents
- project_experience_summary: project experience summary (v2)
- 2026-04-02-p1-p2-tdd-execution-plan: TDD execution plan

## Review reports
- Review reports for the 5 CONDITIONAL GO design documents
- fix_verification_report: fix verification report
- full_verification_report: full quality verification report
- tdd_module_quality_verification: TDD module quality verification
- tdd_execution_summary: TDD execution summary

Basis: Superpowers execution framework + TDD standards
2026-04-02 23:35:53 +08:00
Your Name
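0484c7be74 feat(gateway): implement the gateway core modules
Implemented:
- internal/adapter: Provider Adapter abstraction layer and OpenAI implementation
- internal/router: multi-provider routing (supports latency/weighted/availability strategies)
- internal/handler: OpenAI-compatible API endpoints (/v1/chat/completions, /v1/completions)
- internal/ratelimit: Token Bucket and Sliding Window rate limiters
- internal/alert: alerting system (supports email/DingTalk/Feishu)
- internal/config: configuration management
- pkg/error: complete error code system
- pkg/model: API request/response models

PRD alignment:
- P0-1: unified API access (OpenAI-compatible)
- P0-2: basic routing and stability (multi-provider routing + fallback)
- P0-4: budget and rate limiting (Token Bucket rate limiting)

Note: cost attribution and bill export will be completed once the supply chain module is available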
2026-04-01 10:04:52 +08:00
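A minimal sketch of a token bucket limiter of the kind listed under internal/ratelimit in commit 0484c7be74. The `TokenBucket` type, its constructor parameters, and the one-token-per-request cost are assumptions; the commit does not describe the limiter's API.

```go
package main

import (
	"fmt"
	"math"
	"sync"
	"time"
)

// TokenBucket allows bursts up to capacity and refills at refillPerSec.
// NOTE: hypothetical type for illustration.
type TokenBucket struct {
	mu           sync.Mutex
	capacity     float64
	tokens       float64
	refillPerSec float64
	last         time.Time
}

// NewTokenBucket returns a bucket that starts full.
func NewTokenBucket(capacity, refillPerSec float64) *TokenBucket {
	return &TokenBucket{
		capacity:     capacity,
		tokens:       capacity,
		refillPerSec: refillPerSec,
		last:         time.Now(),
	}
}

// Allow consumes one token if available and reports whether the request may proceed.
func (b *TokenBucket) Allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	now := time.Now()
	// Refill in proportion to elapsed time, never exceeding capacity.
	b.tokens = math.Min(b.capacity, b.tokens+now.Sub(b.last).Seconds()*b.refillPerSec)
	b.last = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

func main() {
	tb := NewTokenBucket(3, 1) // burst of 3, one token per second
	for i := 0; i < 5; i++ {
		fmt.Println(i, tb.Allow())
	}
}
```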