Runbook Generation
AtlasAI can automatically generate remediation runbooks from root cause analysis (RCA) results. Each generated runbook contains step-by-step actions, risk classifications, target specifications, and rollback instructions, and is ready for human review or automated execution.
How Generation Works
When you click Generate Runbook after an RCA completes, the AI:
- Analyzes the root cause hypothesis — Understands what needs to be fixed and which systems are involved
- Retrieves historical runbooks — Searches the RAG knowledge base for runbooks that previously resolved similar issues
- Composes a step sequence — Generates an ordered list of remediation steps
- Classifies risk — Each step is tagged Low, Medium, or High based on the action type and target
- Adds rollback instructions — For reversible steps, generates the corresponding undo action
- Inserts approval gates — Automatically places approval checkpoints before High-risk steps
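The approval-gate stage at the end of this list can be illustrated with a short sketch. This is not AtlasAI's implementation: the `Step` class, the `Risk` enum, the `insert_approval_gates` function, and the sample commands are all hypothetical. It assumes only the rule stated above, that a checkpoint is placed before each High-risk step.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Risk(Enum):
    LOW = "Low"
    MEDIUM = "Medium"
    HIGH = "High"


@dataclass
class Step:
    # Hypothetical model of one runbook step; the field names are assumptions.
    title: str
    risk: Risk
    command: Optional[str] = None   # None for approval gates
    rollback: Optional[str] = None  # None for read-only or irreversible steps
    is_gate: bool = False


def insert_approval_gates(steps: List[Step]) -> List[Step]:
    """Return a new list with an approval gate placed before each High-risk step."""
    result: List[Step] = []
    for step in steps:
        if step.risk is Risk.HIGH and not step.is_gate:
            gate = Step(title=f"Approve: {step.title}", risk=Risk.HIGH, is_gate=True)
            result.append(gate)
        result.append(step)
    return result


# Illustrative draft: one read-only step and one High-risk step.
draft = [
    Step("Check disk usage", Risk.LOW, command="df -h"),
    Step("Reboot host", Risk.HIGH, command="systemctl reboot"),
]
gated = insert_approval_gates(draft)
for s in gated:
    print(("GATE " if s.is_gate else "STEP ") + f"[{s.risk.value}] {s.title}")
```

The function returns a new list and leaves the draft untouched, so a reviewer could compare the two versions before accepting the gated one.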
Generated Runbook Structure
Each generated runbook follows this layout, shown here for a high-CPU incident:
Runbook: Resolve high CPU on prod-api-03
Generated from: INC-00042 RCA Hypothesis #1
Step 1: [LOW RISK] Check current CPU usage
Target: prod-api-03 (Edge Agent)
Command: top -bn1 | head -20
Rollback: N/A (read-only)
Step 2: [LOW RISK] Identify top CPU-consuming processes
Target: prod-api-03 (Edge Agent)
Command: ps aux --sort=-%cpu | head -10
Rollback: N/A (read-only)
Step 3: [MEDIUM RISK] Restart the API service
Target: prod-api-03 (Edge Agent)
Command: systemctl restart api-server
Rollback: systemctl restart api-server
⚠️ Requires approval at L1/L2
Step 4: [LOW RISK] Verify service recovery
Target: prod-api-03 (Edge Agent)
Command: curl -s http://localhost:8080/health
Rollback: N/A (read-only)
Customizing Generated Runbooks
Generated runbooks are starting points — you can edit them before saving:
- Add steps — Insert additional verification or notification steps
- Remove steps — Delete steps that don’t apply to your environment
- Reorder steps — Drag steps to change execution order
- Change risk levels — Override the AI’s risk classification if you disagree
- Edit commands — Modify commands to match your specific environment
- Add conditions — Insert conditional branches (e.g., “If CPU > 90%, then scale up; else, restart”)
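A conditional branch like the one in the last bullet could be represented as data plus a small evaluator. The sketch below is only one way to model it. `ConditionalStep`, `choose_branch`, and the `cpu_percent` metric name are hypothetical and not part of any documented AtlasAI interface.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConditionalStep:
    metric: str        # name of the metric to test, e.g. "cpu_percent"
    threshold: float   # the condition holds when the metric is strictly greater
    if_true: str       # action label taken when the condition holds
    if_false: str      # action label taken otherwise


def choose_branch(step: ConditionalStep, metrics: Dict[str, float]) -> str:
    """Pick the branch for the current metric values.

    Raises KeyError if the metric is missing, so a missing reading is never
    silently treated as "below threshold".
    """
    value = metrics[step.metric]
    return step.if_true if value > step.threshold else step.if_false


# "If CPU > 90%, then scale up; else, restart"
cpu_branch = ConditionalStep("cpu_percent", 90.0, if_true="scale up", if_false="restart")

print(choose_branch(cpu_branch, {"cpu_percent": 95.0}))  # scale up
print(choose_branch(cpu_branch, {"cpu_percent": 40.0}))  # restart
```

Keeping the condition as plain data rather than free-form script text would make it easier to review and to display in an editor, although the actual storage format is not specified here.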
Saving and Reusing
After review, save the runbook to the Runbook Library. AtlasAI will then suggest it automatically for future incidents with matching RCA patterns. If the AI generated the runbook from a high-confidence RCA, the suggested version will already be customized for the specific incident context.
Generation Quality
The quality of generated runbooks improves over time as AtlasAI learns from:
- Operator edits — When you modify a generated runbook, the AI learns your preferences
- Execution outcomes — Successful runbook executions reinforce the step patterns; failures trigger learning
- Feedback — Explicit thumbs-up/thumbs-down on generated runbooks adjusts the generation model