Liflode Engineering Docs

Incident Response¶

Detect¶

Monitoring alerts via ntfy (push notification)
CI failure emails sent to it@liflode.com → Linear issue auto-created
User reports via hello@liflode.com
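
As an illustration of the first detection channel, here is a minimal sketch of how a monitoring check might publish an alert to ntfy. ntfy accepts a plain HTTP POST to a topic URL, with the message as the request body and optional `Title` and `Priority` headers. The topic URL, alert text, and the helper name `build_ntfy_alert` are hypothetical; the actual topic used by Liflode is not stated in this document.

```python
import urllib.request


def build_ntfy_alert(topic_url: str, title: str, message: str,
                     priority: str = "default") -> urllib.request.Request:
    """Build (but do not send) an HTTP POST that publishes a message to an ntfy topic."""
    return urllib.request.Request(
        topic_url,
        data=message.encode("utf-8"),  # ntfy uses the request body as the message text
        headers={"Title": title, "Priority": priority},
        method="POST",
    )


# Hypothetical topic URL and alert text, for illustration only.
req = build_ntfy_alert(
    "https://ntfy.example.com/liflode-alerts",
    title="api: health check failing",
    message="Health check failed 3 times in a row",
    priority="urgent",
)
# To actually send the alert:
# urllib.request.urlopen(req, timeout=10)
```

Building the request separately from sending it keeps the sketch testable without network access.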

Triage¶

Severity	Definition	Response time
P0	Production down, data loss risk	Immediate
P1	Major feature broken	Within 2 hours
P2	Minor feature broken	Within 1 business day
P3	Cosmetic/UX issue	Next sprint
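
The severity table can be expressed as a small lookup, which may be useful for tooling that flags overdue incidents. This is a sketch, not an existing Liflode tool: the function name `response_deadline` is hypothetical, and "1 business day" is interpreted here as the same clock time on the next Monday-to-Friday day, which is an assumption (holidays are ignored).

```python
from datetime import datetime, timedelta
from typing import Optional


def response_deadline(severity: str, reported_at: datetime) -> Optional[datetime]:
    """Return the latest acceptable response time, or None for P3 (next sprint)."""
    if severity == "P0":
        return reported_at  # immediate response
    if severity == "P1":
        return reported_at + timedelta(hours=2)
    if severity == "P2":
        # Assumption: one business day = same clock time on the next weekday.
        deadline = reported_at + timedelta(days=1)
        while deadline.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
            deadline += timedelta(days=1)
        return deadline
    if severity == "P3":
        return None  # scheduled into the next sprint, no clock deadline
    raise ValueError(f"unknown severity: {severity!r}")


# A P2 reported on Friday afternoon is due the following Monday afternoon.
friday = datetime(2024, 3, 1, 15, 0)
print(response_deadline("P2", friday))
```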

Communicate¶

Create a Linear issue with a severity label
Notify rachel@liflode.com for P0 and P1 incidents
For P0, add a status comment every 30 minutes
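
The communication rules above reduce to two checks: whether an incident needs direct notification, and whether a P0 status update is overdue. A minimal sketch follows; the function and constant names are hypothetical, and it assumes the time of the last status comment is tracked somewhere (for example, read from the Linear issue).

```python
from datetime import datetime, timedelta

# Severities that require notifying rachel@liflode.com, per the rules above.
NOTIFY_SEVERITIES = {"P0", "P1"}
# Cadence for P0 status comments.
P0_STATUS_INTERVAL = timedelta(minutes=30)


def needs_notification(severity: str) -> bool:
    """True if the incident severity calls for direct notification."""
    return severity in NOTIFY_SEVERITIES


def status_comment_overdue(severity: str, last_comment_at: datetime,
                           now: datetime) -> bool:
    """True if a P0 incident has gone 30 minutes or more without a status comment."""
    return severity == "P0" and now - last_comment_at >= P0_STATUS_INTERVAL
```

A reminder job could call `status_comment_overdue` on each open incident every few minutes and ping the incident owner when it returns `True`.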

Mitigate¶

Follow the relevant service runbook
When in doubt: revert and restore from backup
Backup location: Cloudflare R2 (see restic-restore.md for restore procedure)

Post-Mortem¶

After resolution:

Write a post-mortem doc at docs/runbooks/<service>-<date>.md
Root cause analysis: what happened, why it happened, and what was missed
Action items: create a Linear issue for each prevention measure
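
To keep post-mortem files consistent, the path and section skeleton could be generated by a small helper. This is a sketch under assumptions: the document does not specify the `<date>` format, so ISO 8601 (`YYYY-MM-DD`) is assumed here, and the function names and the example service name `api` are hypothetical.

```python
from datetime import date
from pathlib import PurePosixPath


def postmortem_path(service: str, resolved_on: date) -> PurePosixPath:
    """Build docs/runbooks/<service>-<date>.md, assuming an ISO 8601 date."""
    return PurePosixPath("docs/runbooks") / f"{service}-{resolved_on.isoformat()}.md"


def postmortem_skeleton(service: str, resolved_on: date) -> str:
    """Return an empty post-mortem outline covering the sections listed above."""
    return "\n".join([
        f"Post-mortem: {service} ({resolved_on.isoformat()})",
        "",
        "Root cause analysis",
        "What happened:",
        "Why it happened:",
        "What was missed:",
        "",
        "Action items (one Linear issue per prevention measure)",
        "",
    ])


print(postmortem_path("api", date(2024, 3, 4)))
```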