# Backup and Restore Runbook

## Triggers

- Data corruption or loss
- A failed upgrade that needs to be rolled back
- Disaster recovery

## Warning

**A restore overwrites the current data!**

Before running a restore:

1. Confirm that the current data cannot be repaired
2. Record the current state
3. Notify the relevant people
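
One way to record the current state is to capture the size and checksum of each file a restore would overwrite. The following is a minimal sketch, assuming GNU `sha256sum` is installed; the function name `record_state` and the output file name are hypothetical and not part of the project's scripts.

```bash
#!/usr/bin/env bash
# Hypothetical helper: record the size and SHA-256 of each given file.
# Usage: record_state OUTPUT_FILE FILE...
record_state() {
  local out="$1"
  shift
  local f
  {
    echo "# state recorded at $(date -u +%Y-%m-%dT%H:%M:%SZ)"
    for f in "$@"; do
      if [ -f "$f" ]; then
        # wc -c < file prints only the byte count; tr strips padding on some platforms.
        printf '%s  size=%s  sha256=%s\n' "$f" \
          "$(wc -c < "$f" | tr -d ' ')" \
          "$(sha256sum "$f" | cut -d' ' -f1)"
      else
        printf '%s  MISSING\n' "$f"
      fi
    done
  } > "$out"
}
```

For example, `record_state ./pre-restore-state.txt ./data/user_management.db ./configs/config.yaml` writes one line per file, which can be attached to the incident notes.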

## Restore Steps

### 1. Confirm that a backup exists

```bash
# List all backups
./scripts/backup/backup.sh --list

# Verify the latest backup
./scripts/backup/backup.sh --verify
```
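
What `--verify` checks internally is not documented here. As an independent sanity check, an archive can also be tested with standard tools. This sketch assumes the backups are gzip-compressed tar files, as the `.tar.gz` names used elsewhere in this runbook suggest; `check_archive` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the archive is a readable, non-empty .tar.gz.
# Usage: check_archive ARCHIVE
check_archive() {
  local archive="$1"
  [ -s "$archive" ] || { echo "missing or empty: $archive" >&2; return 1; }
  # gzip -t tests the compressed stream; tar -tzf walks every entry header.
  gzip -t "$archive" 2>/dev/null || { echo "gzip check failed: $archive" >&2; return 1; }
  tar -tzf "$archive" >/dev/null 2>&1 || { echo "tar listing failed: $archive" >&2; return 1; }
}
```

A passing check only shows that the archive is structurally readable; it says nothing about whether the database inside is consistent.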

### 2. Stop the services

```bash
# Stop the services (containers are stopped but not removed, so they can be restarted for a rollback)
docker compose stop
```
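
Before touching any files it may be worth confirming that nothing is still running. A small sketch using Docker Compose v2's `ps` filters; the helper name `all_stopped` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only if no Compose service container is still running.
all_stopped() {
  local running
  # -q prints container IDs only; --status running filters to running containers.
  running="$(docker compose ps --status running -q)" || return 1
  [ -z "$running" ]
}
```

Running `all_stopped || echo "containers still running"` from the project directory gives a quick go/no-go signal before step 3.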

### 3. Back up the current data (just in case)

```bash
# Copy the current database
cp ./data/user_management.db "./data/user_management.db.bak.$(date +%Y%m%d)"

# Copy the current configuration
cp ./configs/config.yaml "./configs/config.yaml.bak.$(date +%Y%m%d)"
```
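
The date-only suffix means a second copy made on the same day overwrites the first. In addition, a SQLite database running in WAL mode can keep recent changes in `-wal` and `-shm` sidecar files next to the main file. The sketch below handles both cases, assuming the services are already stopped; `snapshot_file` and the optional timestamp argument are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: copy FILE (plus any SQLite -wal/-shm sidecars) to FILE.bak.<timestamp>.
# Prints the path of the main copy so it can be noted for a later rollback.
# Usage: snapshot_file FILE [TIMESTAMP]
snapshot_file() {
  local src="$1"
  local stamp="${2:-$(date +%Y%m%d_%H%M%S)}"
  [ -f "$src" ] || { echo "not found: $src" >&2; return 1; }
  local dest="$src.bak.$stamp"
  if [ -e "$dest" ]; then
    echo "refusing to overwrite: $dest" >&2
    return 1
  fi
  cp -p "$src" "$dest" || return 1
  local ext
  for ext in -wal -shm; do
    # Sidecar files exist only when the database uses WAL mode.
    if [ -f "$src$ext" ]; then
      cp -p "$src$ext" "$dest$ext" || return 1
    fi
  done
  echo "$dest"
}
```

For example, `snapshot_file ./data/user_management.db` prints the path of the copy it created, and the same call works for `./configs/config.yaml`.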

### 4. Run the restore

```bash
# Restore from the latest backup
./scripts/backup/backup.sh --restore

# Or restore from a specific backup
# 1. Extract the backup into an empty temporary directory
#    (leftovers from an earlier run would make the globs below match more than one file)
rm -rf /tmp/restore
mkdir -p /tmp/restore
tar -xzf ./backups/user-management_YYYYMMDD_HHMMSS.tar.gz -C /tmp/restore

# 2. Copy the files manually
cp /tmp/restore/*/database.db ./data/user_management.db
cp /tmp/restore/*/config.yaml ./configs/config.yaml

# 3. Clean up the temporary directory
rm -rf /tmp/restore
```
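
The manual steps above can be wrapped so that the scratch directory is always cleaned up and an unexpected archive layout aborts the restore. This is a sketch, not the project's restore logic: it assumes the archive holds a single top-level directory containing `database.db` and `config.yaml`, as the `/tmp/restore/*/` globs imply, and `restore_from_archive` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: restore database.db and config.yaml from one archive.
# Usage: restore_from_archive ARCHIVE DB_DEST CONFIG_DEST
restore_from_archive() {
  local archive="$1" db_dest="$2" cfg_dest="$3"
  local work
  work="$(mktemp -d)" || return 1
  (
    # The subshell's EXIT trap removes the scratch directory on any outcome.
    trap 'rm -rf "$work"' EXIT
    tar -xzf "$archive" -C "$work" || exit 1
    # Expect exactly one top-level directory that holds database.db.
    set -- "$work"/*/database.db
    if [ "$#" -ne 1 ] || [ ! -f "$1" ]; then
      echo "expected exactly one database.db in $archive" >&2
      exit 1
    fi
    db_src="$1"
    cfg_src="$(dirname "$db_src")/config.yaml"
    [ -f "$cfg_src" ] || { echo "config.yaml missing in $archive" >&2; exit 1; }
    cp "$db_src" "$db_dest" && cp "$cfg_src" "$cfg_dest"
  )
}
```

A call would look like `restore_from_archive ./backups/user-management_YYYYMMDD_HHMMSS.tar.gz ./data/user_management.db ./configs/config.yaml`, with the placeholder replaced by a real backup name.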

### 5. Verify the restore

```bash
# Restart the services
docker compose restart

# Check service status
docker compose ps

# Check the logs for errors (no output means no matching lines)
docker compose logs | grep -i error

# Verify the database
sqlite3 ./data/user_management.db "SELECT COUNT(*) FROM users;"

# Test the API
curl http://localhost:8080/api/v1/health
```
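
The service may need a few seconds after the restart before the health endpoint answers, so a single `curl` can fail spuriously. A generic retry wrapper is one way to handle that; `retry` is a hypothetical helper, and the attempt count and delay are arbitrary.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command until it succeeds or the attempts run out.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    "$@" && return 0
    # Sleep between attempts, but not after the final failure.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

For example, `retry 10 3 curl -fsS http://localhost:8080/api/v1/health` polls the endpoint up to ten times; `-f` makes `curl` exit non-zero on an HTTP error status. For the database itself, `sqlite3 ./data/user_management.db "PRAGMA integrity_check;"` prints `ok` when SQLite finds no structural problems.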

### 6. Verify data integrity

```bash
# Check the number of users
curl -s http://localhost:8080/api/v1/users | jq '.total'

# Check the recent login logs
curl -s http://localhost:8080/api/v1/logs/login | jq '.total'
```

## Point-in-Time Recovery

To restore to a specific point in time:

1. **Find the most recent backup from before the target time**

   ```bash
   ls -la ./backups/
   ```

2. **Identify the data that predates the recovery point**

   - Check the timestamps of the data in the backup

3. **Run the restore**

   ```bash
   # Extract the backup
   mkdir -p /tmp/restore
   tar -xzf ./backups/user-management_YYYYMMDD_HHMMSS.tar.gz -C /tmp/restore
   ```

4. **Restore the data manually**

   ```bash
   # Open the database in the sqlite3 shell and recover the data by hand
   sqlite3 ./data/user_management.db
   ```
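
Picking the right archive for step 1 can be scripted, because the `YYYYMMDD_HHMMSS` timestamps in the file names sort chronologically as plain strings. A sketch, assuming every backup follows the `user-management_YYYYMMDD_HHMMSS.tar.gz` pattern shown above; `latest_backup_before` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the newest backup whose name timestamp is <= TARGET.
# Usage: latest_backup_before BACKUP_DIR TARGET   (TARGET as YYYYMMDD_HHMMSS)
latest_backup_before() {
  local dir="$1" target="$2"
  local best="" best_stamp="" f stamp
  for f in "$dir"/user-management_*.tar.gz; do
    [ -e "$f" ] || continue            # the glob matched nothing
    stamp="${f##*/user-management_}"   # strip directory and prefix
    stamp="${stamp%.tar.gz}"           # strip extension
    # Fixed-width timestamps compare chronologically as strings.
    if [[ ! "$stamp" > "$target" ]] && [[ -z "$best_stamp" || "$stamp" > "$best_stamp" ]]; then
      best="$f"
      best_stamp="$stamp"
    fi
  done
  [ -n "$best" ] && echo "$best"
}
```

For example, `latest_backup_before ./backups 20240102_235959` prints the last archive taken on or before that moment, and returns a non-zero status when no backup is old enough.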

## Rollback

If the restore fails:

```bash
# Stop the services before replacing any files
docker compose stop

# Restore the manual backups taken in step 3
# (replace YYYYMMDD with the date suffix of that backup; a bare .bak.* glob
#  fails as soon as more than one backup file matches)
cp ./data/user_management.db.bak.YYYYMMDD ./data/user_management.db
cp ./configs/config.yaml.bak.YYYYMMDD ./configs/config.yaml

# Restart the services
docker compose restart
```
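
If several `.bak.*` copies have accumulated, the newest one can be selected by name instead of being typed by hand. A sketch, assuming all suffixes share one fixed-width date format so that string order matches age; `rollback_file` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: restore TARGET from its newest TARGET.bak.* copy.
# Usage: rollback_file TARGET
rollback_file() {
  local target="$1"
  local newest="" f
  for f in "$target".bak.*; do
    [ -f "$f" ] || continue   # the glob matched nothing
    # Keep the lexicographically greatest name, i.e. the latest suffix.
    if [[ -z "$newest" || "$f" > "$newest" ]]; then
      newest="$f"
    fi
  done
  [ -n "$newest" ] || { echo "no backup found for $target" >&2; return 1; }
  cp "$newest" "$target" && echo "restored $target from $newest"
}
```

For example, `rollback_file ./data/user_management.db` followed by `rollback_file ./configs/config.yaml`, run while the services are stopped.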

## Post-Restore Checklist

- [ ] Services are running normally
- [ ] Health check passes
- [ ] User data is complete
- [ ] Configuration is correct
- [ ] Logs look normal
- [ ] Relevant people have been notified that the restore is complete

## Disaster Recovery (Total Failure)

If the server is completely unavailable:

1. **Deploy on a new server**

   ```bash
   # Clone the code
   git clone <repository-url>
   cd user-management

   # Install Docker (handled by the deployment script)
   ./scripts/deploy/simple_deploy.sh
   ```

2. **Restore the data**

   ```bash
   # Copy the backup files from the backup server
   # (the quotes let the remote side expand the wildcard)
   mkdir -p ./backups
   scp 'user@backup-server:/path/to/backups/*.tar.gz' ./backups/

   # Run the restore
   ./scripts/backup/backup.sh --restore
   ```

3. **Verify the service**

   ```bash
   curl http://localhost:8080/api/v1/health
   ```
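
Archives copied between servers can be truncated or corrupted in transit, so it may be worth comparing checksums before restoring from them. The sketch below assumes GNU `sha256sum` on both machines and a manifest file named `SHA256SUMS` that travels with the archives; neither the manifest nor the helper names are part of the existing backup script.

```bash
#!/usr/bin/env bash
# Hypothetical helpers: write and check a SHA-256 manifest for the archives in a directory.

# Run on the backup server. Usage: write_manifest BACKUP_DIR
write_manifest() {
  # The subshell keeps the caller's working directory unchanged.
  ( cd "$1" && sha256sum -- *.tar.gz > SHA256SUMS )
}

# Run on the new server after copying. Usage: verify_manifest BACKUP_DIR
verify_manifest() {
  # --quiet prints only failures; the exit status is non-zero on any mismatch.
  ( cd "$1" && sha256sum --quiet -c SHA256SUMS )
}
```

With this in place the manifest would be copied alongside the archives, and `verify_manifest ./backups` would be run before `backup.sh --restore`.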

## Contacts

- Operations lead: [fill in]
- DBA (if any): [fill in]
- Project manager: [fill in]