Discussion-based drill template (no live failover required). Target 60–90 minutes. Fill fields before the exercise; share only the sections participants need.
| Field | Value |
|---|---|
| Date | |
| Facilitator | |
| Participants (names / roles) | |
| Scenario summary | |
| Declared severity | |
Read aloud (~2 minutes). In scope: systems listed in section 4. Out of scope (explicit): e.g. physical safety, HR/legal unless your org includes them.
One person is incident commander (IC); others own workstreams.
| Role | Name / backup | Contact |
|---|---|---|
| Incident commander | | |
| Comms (internal + customer) | | |
| Technical lead (recovery) | | |
| Vendor / cloud / carrier | | |
| Security / legal (if applicable) | | |
Escalation path: IC → manager → executive → board notification threshold (if any).
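Add RTO (recovery time objective: the maximum acceptable time to restore service) and RPO (recovery point objective: the maximum acceptable data loss, measured as a span of time) if your org uses them.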
| System / app | Owner | Depends on | RTO | RPO | Primary recovery |
|---|---|---|---|---|---|
| | | | | | |
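
The "Depends on" column can also suggest an order of operations for recovery: bring a system back only after everything it depends on is up. The following is a minimal sketch of that idea, not a prescribed procedure. The system names and the `recovery_order` helper are hypothetical, chosen only for illustration; it uses Python's standard-library `graphlib` (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical inventory: system -> systems it depends on
# (the "Depends on" column). Replace with your own table rows.
depends_on = {
    "web-frontend": {"api"},
    "api": {"database", "auth"},
    "auth": {"database"},
    "database": set(),
}

def recovery_order(deps):
    """Return systems ordered so each follows everything it depends on.

    Raises graphlib.CycleError if the dependencies form a loop,
    which is itself worth raising during the drill.
    """
    return list(TopologicalSorter(deps).static_order())

order = recovery_order(depends_on)
print(order)
```

With these example rows the database comes first and the web frontend last. A real inventory will often allow several valid orders, so treat the result as a starting point for discussion.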
Trigger:
What we know at start:
| Field | Value |
|---|---|
| Time discovered | |
| Who reported | |
| Customer impact | |
| Evidence | |
What we do not know yet (save for injects):
Introduce one inject every 10–15 minutes or when discussion stalls. Do not read all at once.
| Time / trigger | Inject |
|---|---|
| T+10m | |
| T+25m | |
| T+40m | |
Work in order. Time-box each block.
| Time | Topic | Questions to answer |
|---|---|---|
| 0–10m | Triage | Severity? Customer impact? Freeze changes? Who is IC? |
| 10–25m | Stabilize | What do we stop? What monitoring / logs do we need? |
| 25–45m | Recover | Restore path (failover, rebuild, vendor)? Order of operations? |
| 45–60m | Comms | Who to notify, when, what to say? Status page? |
| 60–75m | Validate | How do we prove service is good? Smoke tests? Sign-off? |
| 75–90m | Post-incident | Evidence preservation? Timeline? When is the postmortem? |
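
To help the facilitator keep to the time boxes, the offsets above can be turned into wall-clock times once the start time is known. This is a small sketch under stated assumptions: the start time is made up, and `BLOCKS` and `schedule` are hypothetical names, not part of the template.

```python
from datetime import datetime, timedelta

# Blocks from the agenda table: (start offset, end offset, topic), in minutes.
BLOCKS = [
    (0, 10, "Triage"),
    (10, 25, "Stabilize"),
    (25, 45, "Recover"),
    (45, 60, "Comms"),
    (60, 75, "Validate"),
    (75, 90, "Post-incident"),
]

def schedule(start, blocks=BLOCKS):
    """Map each block to a (start, end, topic) tuple of HH:MM strings."""
    rows = []
    for begin, end, topic in blocks:
        rows.append((
            (start + timedelta(minutes=begin)).strftime("%H:%M"),
            (start + timedelta(minutes=end)).strftime("%H:%M"),
            topic,
        ))
    return rows

# Hypothetical start time, for illustration only.
for begin, end, topic in schedule(datetime(2025, 1, 15, 14, 0)):
    print(f"{begin}-{end}  {topic}")
```

For a 60-minute drill, shorten or drop blocks in `BLOCKS` rather than compressing every block equally.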
Fill during the exercise.
| # | Decision | Options considered | Owner | Time |
|---|---|---|---|---|
| 1 | | | | |
| 2 | | | | |
What went well?
What was unclear? (runbooks, contacts, architecture)
Action items
| Action | Owner | Due |
|---|---|---|
| | | |
Before next drill: update runbooks, contact list, CMDB, backup/restore test date.