Using MITRE ATT&CK for Incident Response Playbooks

A structured approach to incident response enables you to create consistent, repeatable processes. Your incident response playbook defines responsibilities and guides your security team through a list of activities to reduce uncertainty when an incident occurs. The MITRE ATT&CK Framework outlines the tactics and techniques that threat actors use during different stages of an attack.

By incorporating MITRE ATT&CK into their incident response playbooks, organizations can use insights about attacker motivations and objectives to speed up investigation and response.

What is an incident response playbook?

An incident response playbook is a step-by-step guide containing the standard procedures that security teams use to respond to and resolve incidents. The playbook should cover all stages of incident response, including:

Preparation
Identification/detection
Investigation/analysis
Containment
Remediation
Recovery
Lessons learned

The playbook includes predefined response actions, decision trees, and checklists tailored to various types of incidents, such as malware infections, data breaches, or unauthorized access attempts. While the specifics of the playbook differ from team to team, most contain similar, fundamental information and steps.

Definition of incident

Since an incident triggers the rest of the activities listed in the playbook, the organization first needs to define the type of event that requires IT and security teams to follow the steps. Additionally, IT and security teams need to consider all types of incidents, not just security incidents. For example, an “incident” could include events that impact service quality, like network speed, outside of the security context.

Assigned roles

Assigning roles and responsibilities in the playbook documents expectations before an incident occurs. By ensuring people know and understand their responsibilities, teams can respond to the incident faster. Some key roles and responsibilities might be:

Incident manager: individual responsible for overseeing the response activities, including deciding when to bring in more staff or keeping responders focused on restoring service
Senior technical responder: individual who works closely with the incident manager and is responsible for developing hypotheses about the incident, identifying changes, and managing the technical response team
Communications manager: individual responsible for writing and sending internal and external messages about the incident

Steps and phases

While each incident is unique, the response process typically follows similar activities across each phase. By outlining the processes that responders should follow for each phase, teams can respond faster and reduce the incident’s impact.

For example, an organization might define a workflow with the following steps:

Detect: systems generate an alert
Open ticket: incident manager initiates response and opens a ticket to assign a responder
Assess: incident manager reviews the incident to assess potential impact
Investigate: responder investigates the incident’s root cause and identifies impacted assets
Respond: responder contains the threat
Remediate: responder takes actions to fix the security weakness that led to the incident

Depending on the incident, the activities that the responders take may look different. For example, the activities that contain a password spray attack might be different than the ones used for responding to a malware attack.
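
As a rough illustration, the workflow above can be modeled as an ordered set of phases that an incident record moves through. This is a minimal sketch, not a prescribed implementation: the `Phase` and `Incident` names, the `advance` method, and the example notes are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Phase(Enum):
    # Phases in the order listed in the workflow above.
    DETECT = 1
    OPEN_TICKET = 2
    ASSESS = 3
    INVESTIGATE = 4
    RESPOND = 5
    REMEDIATE = 6


@dataclass
class Incident:
    # Hypothetical incident record; field names are illustrative only.
    title: str
    phase: Phase = Phase.DETECT
    history: list = field(default_factory=list)

    def advance(self, note: str) -> Phase:
        """Record a note for the current phase, then move to the next one.

        The incident stays in REMEDIATE once it reaches the last phase.
        """
        self.history.append((self.phase, note))
        if self.phase is not Phase.REMEDIATE:
            self.phase = Phase(self.phase.value + 1)
        return self.phase


incident = Incident("Password spray against VPN")
incident.advance("Alert generated by authentication monitoring")
incident.advance("Ticket opened and responder assigned")
print(incident.phase.name)  # ASSESS
```

Keeping a per-phase history like this is one way to preserve the notes that later feed the post-incident discussion.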

Templates and checklists

Templates and checklists create consistency across activities and communications. Checklists give responders a set of required or expected activities to complete. Some incident response processes that could benefit from checklists include:

Triage and investigation
Containment and eradication

Templates ensure that communications remain consistent across various incidents. Some templates that could help include:

Management reports
Customer emails
Internal emails

Post-incident discussions

Discussing “lessons learned” is critical to identifying potential areas of improvement. However, the organization should set clear expectations about when and how to hold these discussions. For example, the incident response playbook might include:

Meetings: people involved, timing
Metrics: determining success and areas of improvement
Next steps: prioritizing activities aligned to areas of improvement

What are some common incident response playbook scenarios?

Scenarios outline the security events that would trigger the organization’s incident response.

Distributed Denial of Service (DDoS)

A playbook for responding to a DDoS incident might include the following steps:

Preparation:
- Create an asset inventory
- Establish an escalation and reporting communication strategy
Detection:
- Network activity: packets/second for layers 3, 4, and 7, number of new TCP and UDP flows from clients to endpoints, total number of TCP flows
- Web application firewall (WAF): allowed requests, blocked requests, total counted requests, passed requests
Analysis:
- Review source of incoming traffic during the event
- Review protocols, source ports, and TCP flags
Containment:
- Create Web Application Firewall (WAF) rules that match detected behavior
- Add the conditions to the WAF rules
- Add the rules to a web Access Control List (ACL) and count the requests matching the rules
- Monitor the counts and block offending sources
Eradication:
- Not applicable
Recovery:
- Not applicable
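
The containment steps above (match requests against a rule, count the matches, then block) can be sketched in a few lines. This is an illustrative example only and does not call any real WAF API: the `sources_to_block` function, the `looks_like_flood` rule, the request fields, and the threshold are assumptions, and the IP addresses come from documentation ranges.

```python
from collections import Counter


def sources_to_block(requests, rule, threshold):
    """Return source IPs whose count of rule-matching requests exceeds threshold."""
    counts = Counter(r["src_ip"] for r in requests if rule(r))
    return sorted(ip for ip, n in counts.items() if n > threshold)


def looks_like_flood(request):
    # Hypothetical rule condition: login requests sent with an empty User-Agent.
    return request["path"] == "/login" and request["user_agent"] == ""


sample = (
    [{"src_ip": "203.0.113.7", "path": "/login", "user_agent": ""}] * 5
    + [{"src_ip": "198.51.100.2", "path": "/login", "user_agent": "Mozilla/5.0"}] * 5
    + [{"src_ip": "192.0.2.9", "path": "/login", "user_agent": ""}] * 2
)

print(sources_to_block(sample, looks_like_flood, threshold=3))  # ['203.0.113.7']
```

Counting before blocking, as the playbook suggests, gives responders a chance to confirm that a rule matches only the unwanted traffic before it starts dropping requests.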

Compromised Credentials

A playbook for responding to compromised credentials might include the following steps:

Preparation:
- Implement Identity and Access Management (IAM) best practices
- Establish an escalation and reporting communication strategy
Detection:
- Unusual user creation
- Users with more than one access key
- Unfamiliar roles created or accessed
- Unusual changes to permissions attached to roles
- Unrecognized or unauthorized resources added to cloud environment
Analysis:
- Review for unusual activity associated with logins
- Correlate user ID with suspicious activities, like creating new cloud resources
- Review user ID to identify the last service accessed
Containment:
- Disable the user account and compromised access key
- Change the access information
- Revoke the affected role or roles in any active sessions, including application sessions and role sessions
- Isolate affected resources by detaching them from other resources and blocking inbound/outbound traffic
Eradication:
- Remove resources created by compromised ID
- Check for and remove unrecognized services
- Remove unnecessary permissions related to cloud resources
- Remove exposed data not necessary for operations
- Scan for vulnerabilities on public facing resources
Recovery:
- Restore data from known clean backups predating the event
- Rebuild systems, if necessary, including redeploying from trusted sources
- Restore appropriate access and permissions
- Address vulnerabilities
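
One of the detection signals above, users with more than one access key, is straightforward to check once key metadata has been exported from the identity provider. The sketch below assumes a hypothetical export format (a list of records with `user` and `key_id` fields); it does not use any particular cloud provider's API.

```python
from collections import defaultdict


def users_with_multiple_keys(access_keys):
    """Group key IDs by user and return only the users holding more than one key."""
    keys_by_user = defaultdict(set)
    for key in access_keys:
        keys_by_user[key["user"]].add(key["key_id"])
    return {
        user: sorted(ids)
        for user, ids in keys_by_user.items()
        if len(ids) > 1
    }


# Hypothetical exported key inventory.
inventory = [
    {"user": "alice", "key_id": "key-001"},
    {"user": "bob", "key_id": "key-002"},
    {"user": "alice", "key_id": "key-003"},
]

print(users_with_multiple_keys(inventory))  # {'alice': ['key-001', 'key-003']}
```

A result like this is a prompt for review rather than proof of compromise, since some users may hold a second key for legitimate reasons such as key rotation.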

Ransomware

A playbook for responding to ransomware might include the following steps:

Preparation:
- Create and maintain asset inventory
- Perform regular vulnerability scans
- Install and update antivirus on endpoints
- Perform and verify data backups regularly
- Apply security updates regularly
- Disable unnecessary applications and functionalities
- Establish an escalation and reporting communication strategy
Detection:
- Review network traffic for “spikes” in activity that may indicate data exfiltration
- Review endpoint detection and response (EDR) log data for suspicious activity
Analysis:
- Analyze log data related to number of bytes for source and destination IP addresses and ports
- Review for deleted objects, files, filesystems, and data
- Review for unauthorized activity, like creation of IAM users, policies, roles, or temporary credentials
- Review API calls for requests to delete objects, files, filesystems, and data
Containment:
- Isolate affected resources
- Create network access control lists (NACLs) to limit traffic to and from affected resources
- Add rules to limit traffic by protocol, like HTTP or TCP
- Add rules to limit traffic source (inbound) or destination (outbound)
Eradication:
- Remove compromised systems from network
- Identify the necessary forensic data
- Remove compromised Domain Controller metadata from the domain
- Inspect backups for potential infection
Recovery:
- Restore data from known clean backups predating the event
- Rebuild systems, if necessary, including redeploying from trusted sources
- Delete unauthorized users, policies, roles
- Revoke temporary credentials
- Create new resources from trusted source
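
The detection and analysis steps above mention exfiltration “spikes” and byte counts per source and destination. A minimal sketch of that idea follows, assuming flow records with hypothetical `src_ip`, `dst_ip`, `dst_port`, and `bytes` fields. The three-standard-deviation threshold is an arbitrary illustration, not a recommended setting.

```python
from collections import defaultdict
from statistics import mean, pstdev


def bytes_per_flow(records):
    """Sum bytes for each (source IP, destination IP, destination port) tuple."""
    totals = defaultdict(int)
    for r in records:
        totals[(r["src_ip"], r["dst_ip"], r["dst_port"])] += r["bytes"]
    return dict(totals)


def is_spike(history, current, sigmas=3.0):
    """Return True when current exceeds the historical mean by `sigmas` standard deviations."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    return current > mean(history) + sigmas * pstdev(history)


flows = [
    {"src_ip": "10.0.0.5", "dst_ip": "203.0.113.50", "dst_port": 443, "bytes": 100},
    {"src_ip": "10.0.0.5", "dst_ip": "203.0.113.50", "dst_port": 443, "bytes": 200},
]
print(bytes_per_flow(flows))

# Hypothetical outbound byte totals for earlier intervals of the same flow.
baseline = [100, 120, 110, 90, 105]
print(is_spike(baseline, 5000))  # True
print(is_spike(baseline, 130))   # False
```

In practice the baseline would come from the log data the team already collects, and the threshold would be tuned to the environment's normal traffic patterns.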

Using ATT&CK in an Incident Response Playbook

Security teams can use ATT&CK to inform the Detection and Analysis sections of an incident playbook.

For example, when creating a playbook that responds to a specific incident type, security teams can use ATT&CK to help:

Detect an incident: ATT&CK defines the tactics, techniques, and procedures (TTPs) threat actors use during an incident. By building detections, like Sigma rules, around TTPs, security teams can map activity to a type of incident, like a DDoS, compromised credential, or ransomware attack.
Compare detected activity to common TTPs: Since TTPs provide insight into the why, what, and how of an attack, security teams can use them to make hypotheses about what threat actors plan to do next. For example, detecting a Phishing technique indicates an objective of Initial Access while Account Manipulation indicates Persistence. Initial Access occurs before Persistence, meaning that the TTP activity provides insight into the attack stage.
Perform technical analysis: Correlating anomalous activity with known TTPs provides technical context. TTPs can help security teams identify where in the attack chain the detected activity falls, enabling them to prioritize the next response steps.
Correlate events and document timeline: By mapping Tactics and Techniques to the log and event sources, security teams create a knowledge base that they can reference during their response activities. The log and event sources provide insight into the attack stage, giving security teams a way to create timelines for adversary activity.
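
To make the mapping concrete, the sketch below tags detections with an ATT&CK technique, looks up the tactic, and orders the events into a timeline. The technique IDs (T1566 Phishing, T1098 Account Manipulation) and the tactic names come from the Enterprise ATT&CK matrix. Everything else is an assumption for illustration: the event format, the function names, and the timestamps are hypothetical, and the matrix's column order is treated as a rough stage order even though real intrusions do not always progress strictly left to right. ATT&CK also lists some techniques under more than one tactic (T1098 appears under Privilege Escalation as well), so a single-tactic lookup is a simplification.

```python
from datetime import datetime

# Enterprise ATT&CK tactics in matrix order, used here as an approximate stage order.
TACTIC_ORDER = [
    "Reconnaissance", "Resource Development", "Initial Access", "Execution",
    "Persistence", "Privilege Escalation", "Defense Evasion", "Credential Access",
    "Discovery", "Lateral Movement", "Collection", "Command and Control",
    "Exfiltration", "Impact",
]

# Simplified lookup: one tactic per technique.
TECHNIQUE_TO_TACTIC = {
    "T1566": ("Phishing", "Initial Access"),
    "T1098": ("Account Manipulation", "Persistence"),
}


def build_timeline(events):
    """Sort events by time and annotate each with its technique name and tactic."""
    timeline = []
    for event in sorted(events, key=lambda e: datetime.fromisoformat(e["time"])):
        name, tactic = TECHNIQUE_TO_TACTIC.get(
            event["technique_id"], ("Unknown", "Unmapped")
        )
        timeline.append((event["time"], event["technique_id"], name, tactic))
    return timeline


def furthest_stage(timeline):
    """Return the latest tactic (by matrix order) seen in the timeline, or None."""
    mapped = [row[3] for row in timeline if row[3] in TACTIC_ORDER]
    return max(mapped, key=TACTIC_ORDER.index) if mapped else None


# Hypothetical detections, deliberately out of order.
detections = [
    {"time": "2024-05-01T10:30:00", "technique_id": "T1098"},
    {"time": "2024-05-01T09:00:00", "technique_id": "T1566"},
]

timeline = build_timeline(detections)
for row in timeline:
    print(row)
print(furthest_stage(timeline))  # Persistence
```

Knowing the furthest stage reached helps responders decide which containment steps to prioritize and which later-stage activity to hunt for next.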

Graylog: Incorporate ATT&CK into Incident Response Processes

With Graylog Security, you can use prebuilt content to map security events to MITRE ATT&CK. By combining Sigma rules and MITRE ATT&CK, you can create high-fidelity alerting rules that enable robust threat detection, lightning-fast investigations, and streamlined threat hunting. For example, with Graylog’s security analytics, you can monitor user activity for anomalous behavior indicating a potential security incident. By mapping this activity to the MITRE ATT&CK Framework, you can detect and investigate adversary attempts at using Valid Accounts to gain Initial Access, mitigating risk by isolating compromised accounts earlier in the attack path and reducing impact.

Graylog’s risk scoring capabilities enable you to streamline your threat detection and incident response (TDIR) by aggregating and correlating the severity of log messages and event definitions with the associated asset, reducing alert fatigue and allowing security teams to focus on high-value, high-risk issues.

The post Using MITRE ATT&CK for Incident Response Playbooks appeared first on Graylog.