Incident Response

2023.3

LifeOmic implements an information security incident response process to consistently detect, respond to, and report incidents, minimize loss and destruction, mitigate the weaknesses that were exploited, and restore information system functionality and business continuity as soon as possible.

The incident response process addresses:

Continuous monitoring of threats through intrusion detection systems (IDS) and other monitoring applications;
Establishment of an information security incident response team;
Establishment of procedures to respond to media inquiries;
Establishment of clear procedures for identifying, responding to, assessing, analyzing, and following up on information security incidents;
Workforce training, education, and awareness on information security incidents and required responses; and
Facilitation of clear communication of information security incidents with internal, as well as external, stakeholders.

These policies were adapted from work by the HIPAA Collaborative of Wisconsin Security Networking Group. Refer to the linked document for additional copyright information.

Policy Statements

LifeOmic policy requires that:

(a) All computing environments and systems must be monitored in accordance with the policies and procedures specified in the following LifeOmic policies and procedures:

Auditing
System Access
End-user Computing and Acceptable Use

(b) All alerts must be reviewed to identify security incidents. These investigations are stored in the GitHub IR Project.

(d) Incident response team and management must comply with any additional requests by law enforcement in the event of criminal investigation or national security, including but not limited to warranted data requests, subpoenas, and breach notifications.

Controls and Procedures

Security Incident Response Team (SIRT)

The Security Incident Response Team (SIRT) is responsible for:

Reviewing, analyzing, and logging all received reports and tracking their statuses.
Performing investigations, creating and executing action plans, and conducting post-incident activities.
Collaborating with law enforcement agencies.

Current members of the LifeOmic SIRT:

Security and Privacy Officer
Security Engineers
Director of Engineering
DevOps and Production Support Team

Incident Management Process

The LifeOmic incident response process follows the process recommended by SANS, an industry leader in security.

LifeOmic’s incident response process defines security events and incident types as:

Events - Any observable computer security-related occurrence in a system or network with a negative consequence. Examples:
- Hardware component failing causing service outages.
- Software error causing service outages.
- General network or system instability.
- Alerts raised from a security control source based on its monitoring policy, such as
  - Authentication Software
  - Anti-Virus Software
  - Firewall Software and/or Hardware
- Alerts for modified system files or unusual system accesses.
- Antivirus alerts for infected files or devices.
- Excessive network traffic directed at unexpected geographic locations.

Incidents with No Impact - An event that did result in degraded performance, SIRT team activity, or follow-up tickets. This may not be a malicious incident, but could be a reported critical vulnerability that requires triage. No customer-facing utility was involved or degraded. Examples:
- Brute Force attempts that result in extra authentication filtering rules created
- Missing endpoint protection agents that result in installed agents
- Critical Vulnerabilities requiring triage and patching
- Misconfigurations deployed to production that require a redeploy

Incidents with Impact - A confirmed attack / indicator of compromise, often resulting in data breaches. Examples:
- A Denial-of-Service (DoS) attack causing a critical service to become unreachable or increasing our AWS bill significantly
- Unauthorized changes that result in production downtime or data loss
- Malicious software discovered on endpoints that require eradication
- EC2 instance credentials being used outside of intended scope that result in containment procedures

Breach - A confirmed attack or event that resulted in ePHI, CHD, or other regulatory data being exposed or in some way altered. Examples:
- Lost laptops that had locally stored healthcare data
- Disclosure of AWS DynamoDB Tables through an attack
- Unauthorized change or destruction of ePHI
- Production systems hosting ePHI that are missing encryption
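
The four definitions above can be read as an ordered check from most to least severe. The following is a minimal sketch of that reading, not LifeOmic's actual tooling: the class, field, and function names are hypothetical, and the three boolean attributes are an assumed simplification of the prose definitions.

```python
from dataclasses import dataclass
from enum import Enum


class IssueType(Enum):
    EVENT = "Event"
    INCIDENT_NO_IMPACT = "Incident with No Impact"
    INCIDENT_WITH_IMPACT = "Incident with Impact"
    BREACH = "Breach"


@dataclass
class Observation:
    # Hypothetical attributes, chosen only for this illustration.
    regulated_data_exposed_or_altered: bool = False  # ePHI, CHD, or other regulatory data
    confirmed_attack_or_compromise: bool = False     # confirmed attack / indicator of compromise
    required_sirt_follow_up: bool = False            # SIRT activity or follow-up tickets


def classify(obs: Observation) -> IssueType:
    """Return the most severe definition that matches the observation."""
    if obs.regulated_data_exposed_or_altered:
        return IssueType.BREACH
    if obs.confirmed_attack_or_compromise:
        return IssueType.INCIDENT_WITH_IMPACT
    if obs.required_sirt_follow_up:
        return IssueType.INCIDENT_NO_IMPACT
    return IssueType.EVENT
```

For example, a brute-force attempt that only leads to new filtering rules would be modeled as `Observation(required_sirt_follow_up=True)` and classified as an Incident with No Impact. In practice the Security Officer makes this determination; the sketch only illustrates the ordering of the definitions.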

LifeOmic employees must report any unauthorized or suspicious activity seen on production systems or associated with related communication systems (such as email or Slack). In practice this means keeping an eye out for security events, and letting the Security team know about any observed precursors or indications as soon as they are discovered.

Any event escalated to any type of incident shall trigger an associated playbook. Playbooks are stored in the LifeOmic Internal Security Wiki, and follow the basic SANS incident response flow, outlined below:

I - Identification and Triage

Immediately upon observation, LifeOmic members report suspected and known Events, Precursors, Indications, and Incidents in one of the following ways:
1. Direct report to management, the Security Officer, Privacy Officer, or other;
2. Email;
3. Phone call;
4. Submit an incident report online via LifeOmic Internal ServiceDesk;
5. Secure chat; or
6. Anonymously through workforce members’ desired channels.
The individual receiving the report facilitates the collection of additional information about the incident, as needed, and notifies the Security Officer (if not already done).
The Security Officer determines if the issue is an Event, Precursor, Indication, or Incident.
1. If the issue is an event or is non-security-related, the Security Officer forwards it to the appropriate resource for resolution.
  1. Non-Technical Event (minor infringement): the Security Officer creates an appropriate issue in Github and further investigates the event as needed.
  2. Technical Event: Assign the issue to a technical resource for resolution. This resource may also be a contractor or an outsourced technical resource, in the event of a lack of resource or expertise in the area.
2. If the issue is a security incident, the Security Officer activates the Security Incident Response Team (SIRT) and notifies senior leadership by email or Slack.
  1. If a non-technical security incident is discovered, the SIRT completes the investigation, implements preventative measures, and resolves the security incident.
  2. Once the investigation is completed, progress to Phase V, Follow-up.
  3. If the issue is a technical security incident, commence to Phase II: Containment.
  4. The Containment, Eradication, and Recovery Phases are highly technical. It is important to have them completed by a highly qualified technical security resource with oversight by the SIRT team.
  5. Each individual on the SIRT and the technical security resource document all measures taken during each phase, including the start and end times of all efforts.
  6. The lead member of the SIRT team facilitates initiation of an Incident ticket in the Jira IR Board and documents all findings and details in the ticket.
    - The intent of the Incident ticket is to provide a summary of all events, efforts, and conclusions of each Phase of this policy and procedures.
    - Each Incident ticket should contain sufficient details following the SANS Security Incident Forms templates, as appropriate.
The Security Officer, Privacy Officer, or an appointed LifeOmic representative notifies any affected Customers and Partners.
If a threat is identified, the Security Officer forms a team to investigate and involves the necessary resources, both internal to LifeOmic and potentially external.
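
The routing described in this phase (non-technical incidents go from investigation straight to Phase V, while technical incidents proceed through Containment, Eradication, and Recovery first) can be sketched as a small lookup. This is an illustrative reading of the procedure only; the `Phase` enum and `phases_for_incident` function are hypothetical names, not part of any LifeOmic system.

```python
from enum import Enum


class Phase(Enum):
    IDENTIFICATION = "I - Identification and Triage"
    CONTAINMENT = "II - Containment (Technical)"
    ERADICATION = "III - Eradication (Technical)"
    RECOVERY = "IV - Recovery (Technical)"
    POST_INCIDENT = "V - Post-Incident Analysis"


def phases_for_incident(is_technical: bool) -> list[Phase]:
    """Return the ordered phases an incident passes through.

    Technical incidents go through all five phases. Non-technical
    incidents are investigated and resolved by the SIRT during
    Phase I and then move directly to Phase V.
    """
    if is_technical:
        return list(Phase)  # Enum iteration preserves definition order
    return [Phase.IDENTIFICATION, Phase.POST_INCIDENT]
```

Calling `phases_for_incident(False)` yields only the first and last phases, which matches the non-technical path in the steps above.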

II - Containment (Technical)

In this Phase, LifeOmic’s engineers and security team attempt to contain the security incident. It is extremely important to take detailed notes during the security incident response process. This helps ensure that the evidence gathered during the security incident can be used successfully during prosecution, if appropriate.

Review any information that has been collected by the Security team or any other individual investigating the security incident.
Secure the blast radius (i.e., a physical or logical network perimeter or access zone).
Perform a documented forensic analysis, as outlined in the appropriate LifeOmic Playbook.
Complete any documentation relative to the security incident containment on the Incident ticket, using SANS IH Containment Form as a template.
Continuously apprise Senior Management of progress.
Continue to notify affected Customers and Partners with relevant updates as needed.

III - Eradication (Technical)

The Eradication Phase represents the SIRT’s effort to remove the cause, and the resulting security exposures, that are now on the affected system(s).

Determine symptoms and cause related to the affected system(s).
Strengthen the defenses surrounding the affected system(s), where possible (a risk assessment may be needed and can be determined by the Security Officer). This may include the following:
1. An increase in network perimeter defenses.
2. An increase in system monitoring defenses.
3. Remediating (“fixing”) any security issues within the affected system, such as removing unused services or applying general host hardening techniques.
Conduct a detailed vulnerability assessment to verify that all holes/gaps that could be exploited have been addressed.
1. If additional issues or symptoms are identified, take appropriate preventative measures to eliminate or minimize potential future compromises.
Update the Incident ticket with Eradication details, using SANS IH Eradication Form as a template.
Update the documentation with the information learned from the vulnerability assessment, including the cause, symptoms, and the method used to fix the problem with the affected system(s).
Apprise Senior Management of the progress.
Continue to notify affected Customers and Partners with relevant updates as needed.
Move to Phase IV, Recovery.

IV - Recovery (Technical)

The Recovery Phase represents the SIRT’s effort to restore the affected system(s) back to operation after the resulting security exposures, if any, have been corrected.

The technical team determines if the affected system(s) have been changed in any way.
1. If they have, the technical team restores the system to its proper, intended functioning (“last known good”).
2. Once restored, the team validates that the system functions the way it was intended/had functioned in the past. This may require the involvement of the business unit that owns the affected system(s).
3. If operation of the system(s) had been interrupted (i.e., the system(s) had been taken offline or dropped from the network while triaged), restart the restored and validated system(s) and monitor for behavior.
4. If the system had not been changed in any way, but was taken offline (i.e., operations had been interrupted), restart the system and monitor for proper behavior.
5. Update the documentation with the detail that was determined during this phase.
6. Apprise Senior Management of progress.
7. Continue to notify affected Customers and Partners with relevant updates as needed.
8. Move to Phase V, Follow-up.

V - Post-Incident Analysis (Technical and Non-Technical)

The Follow-up phase represents the review of the security incident to look for “lessons learned” and to determine whether the process that was taken could have been improved in any way. It is recommended all security incidents be reviewed shortly after resolution to determine where response could be improved. Timeframes may extend to one to two weeks post-incident.

Responders to the security incident (SIRT Team and technical security resource) meet to review the documentation collected during the security incident.
A “lessons learned” form is completed and submitted for peer review and discussion following the LifeOmic Blameless PostMortems methodology.
- This form should be completed by the technical resource assigned to the ticket, with interviews from key personnel driving detailed documentation.
Ensure all incident related information is recorded and retained as described in LifeOmic Auditing requirements and Data Retention standards.
Close the security incident.

Periodic Evaluation

It is important to note that the processes surrounding security incident response should be periodically reviewed and evaluated for effectiveness. This also involves appropriate training of resources expected to respond to security incidents, as well as training of the general population regarding LifeOmic’s expectations of them relative to security responsibilities. The incident response plan is tested annually.

Incident Categories and Playbooks

The IRT reviews and analyzes security events as part of its daily operations.
Based on the initial analysis, an event may be dismissed because it is a false positive, reflects normal business operations, is covered by an exception already in place, is permitted per policy, or is a duplicate. An audit trail will be kept for event dismissal.
A valid security event may be upgraded to a security incident. At that point, an incident classification and severity are assigned as specified below.
A record of the decision must be stored with details on the date(s) and the name(s) of the person(s) who conducted the assessment.
A containment, eradication and recovery procedure is triggered based on the Category classification of the incident.
In addition to the general incident management procedures previously described, one or more of the following playbooks are consulted based on the classification of a particular incident.

Classification

Incident Classifications are based on the work done by ENISA in developing a comprehensive IR Taxonomy.

Reference: ENISA Taxonomy Guide

Abusive Content – Spam, Harassment, Child Porn, Sexual or Violent content, etc.
Malicious Code – Virus, Worm, Trojan, Spyware, etc.
Information Gathering – Scanning, Sniffing, Social Engineering
Intrusions and Intrusion Attempts – Brute Force, 0-days, Account Compromise
Availability Attack – DoS, DDoS, Sabotage
Change Management – Unauthorized changes to production systems, deployment of misconfigurations
Vulnerable – Reported vulnerabilities
Fraud – Unauthorized use of resources, Copyright, Masquerade
Other – All other incidents that don’t fit above descriptions

Severity Levels:

Critical – incident that involves immediate and significant interruption to business operations and/or breach of critical or confidential data
Major – incident that involves immediate interruption to business operations but will not likely result in immediate data breach
Minor – all other confirmed incidents
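
The three severity levels can be expressed as a simple decision function. The sketch below is one possible reading of the definitions above, with hypothetical parameter names; real severity assignment remains a judgment made by the IRT.

```python
from enum import Enum


class Severity(Enum):
    CRITICAL = "Critical"
    MAJOR = "Major"
    MINOR = "Minor"


def assign_severity(
    immediate_interruption: bool,
    significant_interruption: bool,
    critical_data_breach: bool,
) -> Severity:
    """Map a confirmed incident's attributes to a severity level.

    Assumed interpretation of the policy text:
    - Critical: immediate AND significant interruption, and/or a breach
      of critical or confidential data.
    - Major: immediate interruption without such a breach.
    - Minor: every other confirmed incident.
    """
    if critical_data_breach or (immediate_interruption and significant_interruption):
        return Severity.CRITICAL
    if immediate_interruption:
        return Severity.MAJOR
    return Severity.MINOR
```

Under this reading, an immediate but limited outage with no data exposure is Major, while any breach of critical or confidential data is Critical regardless of the level of interruption.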

Response Procedures: Special Cases

The following special cases are considered when responding to an incident:

1) PHI/ePHI –

When a data breach occurs that involves unsecured PHI or ePHI, breach notifications must be performed according to HIPAA regulation requirements, including notification of each impacted individual and, as applicable, the covered entity and OCR (see Appendix for additional details).

If the breach or potential breach impacts PHI/ePHI that belongs to a Covered Entity for which LifeOmic is a Business Associate, the IRT and management team will inform the Covered Entity per the timeframe and contact method established in the Business Associate Agreement or as described in Breach Notification (HIPAA §164.410(b)).

2) Criminal activities –

In the event of an attack that involves suspected criminal activities, the IRT and management team will inform law enforcement.

3) Insider Threat –

Members of the cross-discipline insider threat incident handling team include:

Security and Privacy Officer,
COO, and
Director of Engineering as appropriate

Emergency Operations Modes

If an incident constitutes an emergency (for example, an ongoing sabotage campaign where data is being deleted), LifeOmic plans to utilize an Emergency Operations Mode. The two operational modes are outlined below. In both cases, activation must be approved by the CISO or higher levels of authority before being enacted.

In emergency operations mode, temporary access may be granted to the security and/or engineering teams to access the production environments to perform forensics, root cause analysis, eradication/remediation, or other necessary activities for incident recovery.

Read Only Mode

LifeOmic’s Read Only Mode pauses all write activity in a production AWS account. Customers can still read their data, but no further edits can be made. This is accomplished by access policies in production AWS environments.

An example for when this Emergency Mode might be activated: A threat actor is writing continuous short bursts of data, at a large scale, causing increasing costs and system instability. While we investigate and eradicate the threat we will implement a Read-Only Mode in order to prevent the threat from continuing.
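
This document states only that Read Only Mode is accomplished by access policies in production AWS environments; it does not give the policies themselves. As a rough sketch of what such a policy could look like, the following builds an IAM-style policy document that denies a handful of write actions. The action list is illustrative and far from exhaustive, and the function name, `Sid`, and approval flag are assumptions made for this example.

```python
import json

# Illustrative, non-exhaustive set of AWS write actions (assumption:
# the real policy would cover every service used in production).
WRITE_ACTIONS = [
    "s3:PutObject",
    "s3:DeleteObject",
    "dynamodb:PutItem",
    "dynamodb:UpdateItem",
    "dynamodb:DeleteItem",
    "dynamodb:BatchWriteItem",
]


def build_read_only_policy(write_actions, approved_by_ciso):
    """Return an IAM-style policy document that denies the given write actions.

    Activation requires approval by the CISO or a higher authority,
    so the sketch refuses to build the policy without it.
    """
    if not approved_by_ciso:
        raise PermissionError("Emergency mode requires CISO (or higher) approval")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "EmergencyReadOnlyMode",  # hypothetical statement ID
                "Effect": "Deny",
                "Action": list(write_actions),
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    policy = build_read_only_policy(WRITE_ACTIONS, approved_by_ciso=True)
    print(json.dumps(policy, indent=2))
```

An explicit `Deny` takes precedence over any `Allow` in IAM policy evaluation, which is why a deny statement is a natural fit for temporarily pausing writes while leaving read permissions untouched.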

System Offline Mode

LifeOmic’s System Offline Mode completely isolates a production system. This is accomplished by a combination of access control policies and firewall policies. During this time period customers will not be able to access their data.

An example for when this Emergency Mode might be activated: A threat actor has been able to compromise a production account and is exfiltrating large amounts of data.

Tabletop Exercise

At least once per year, LifeOmic security and engineering teams jointly perform a Red Team exercise and/or a simulated “drill” of an emergency cyberattack that results in one or more CRITICAL incidents. Depending on the type of exercise, the duration may range from 2-4 hours (simulated “drill”) to a couple of weeks (full Red Teaming exercise).

The exercise will follow a cyberattack playbook. It may be conducted with all internal resources or with the help of an external security consulting firm. The goal of the exercise is to ensure all parties involved receive proper training to handle an actual incident and to test the documented procedures in order to identify gaps ahead of a real event. Depending on the nature of the exercise, the senior leadership team may be invited to participate in the “drill” or may receive a readout of the outcome.

Incident Tracking and Records

A record is created for each reported incident in Jira. Each incident record contains details about the incident capturing the incident attributes and progression, including the following as applicable:

Summary
Description
Impact
Priority / Urgency
Categorization
Analysis Notes and Comments
Cause / Determination
Outcome / Resolution
Lessons Learned

If a more detailed post-mortem is applicable, the Security and/or DevOps team will create the write-up and link it in the incident record.
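
The incident record attributes listed above map naturally onto a small data structure. The sketch below is purely illustrative: the records themselves live in Jira, and the class name, field names, and `missing_fields` helper are hypothetical.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class IncidentRecord:
    """One record per reported incident, mirroring the listed attributes."""
    summary: str
    description: str = ""
    impact: str = ""
    priority: str = ""                 # Priority / Urgency
    categorization: str = ""
    analysis_notes: list[str] = field(default_factory=list)
    cause: str = ""                    # Cause / Determination
    outcome: str = ""                  # Outcome / Resolution
    lessons_learned: str = ""
    post_mortem_link: Optional[str] = None  # only when a detailed write-up exists

    def missing_fields(self) -> list[str]:
        """Names of attributes still empty (the optional link is ignored)."""
        return [
            f.name
            for f in fields(self)
            if f.name != "post_mortem_link" and not getattr(self, f.name)
        ]
```

A helper like `missing_fields` could be used before closing an incident to check which attributes have not yet been filled in, keeping in mind that the policy requires each attribute only "as applicable".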
