Manage and resolve an incident
Whatever your role in the incident response process—incident commander, resolver, or stakeholder—the Incident Console lets you clearly see what's going on from the moment an incident is initiated. It displays the critical information you and your teams need to stay aligned, allowing you to prioritize tasks and focus on what's important: resolving the incident.
Having everything in one place means you can easily monitor an incident's progress, collaborate with resolvers, and make relevant updates, all from the same screen. Here, we'll walk you through the entire life cycle of an incident, from hitting the initiate incident button, right through to resolution.
- Initiate an incident
- Review an incident's details
- Add and manage resolvers
- Join collaboration channels
- Monitor and record progress
- Resolve and view post-incident metrics
- Create a Post-Incident Report
When an event occurs that impacts your business, there are multiple ways you can quickly initiate an incident in xMatters. How you initiate an incident determines its initial details, and depending on the method, the details you enter also shape the notification your resolvers receive.
Let's start by initiating our incident. One of the quickest ways to do this is from the Incidents page. If you're working in the web user interface, initiating an incident can be done with the click of a button:
Here, you can easily add key details about the incident, like a summary, description, and severity, and include any resolvers you want to target. This information will be available to view and edit in the Incident Console once the incident is initiated.
Alternatively, we can automatically initiate an incident as part of a flow. The Initiate Incident step in Flow Designer allows us to create an incident at any point in a flow, using information from previous steps in the summary and description. We can also connect collaboration channels, like Slack or an xMatters-hosted conference call, to the incident workflow.
To learn more about connecting the Initiate Incident step to your flow in Flow Designer, see Initiate Incident step.
The Initiate Incident Widget can also be used to easily initiate an incident by triggering a flow from your dashboard. For more information about configuring and using the widget, see Initiate Incident Widget.
For more details about how to initiate an incident, and the different ways you can do this, see Initiate an incident.
Once you've initiated your incident, you can review and update its details in the Incident Console. The Incident Console displays the key information about the incident, including the details that were entered when the incident was initiated (like summary, description, status, and severity). Other details are created automatically (like the incident ID, which is assigned either automatically by xMatters or another system as part of a flow, or manually by the initiator), but some of them can be edited. For example, by default the impact duration starts when the incident is initiated and ends when the incident status is changed to Mitigated (or Resolved, if you bypass Mitigated), but you can change this to match the actual impact duration.
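To make the default impact duration rule concrete, here is a minimal Python sketch of the calculation described above. It is an illustration only, not xMatters code: the function name `default_impact_duration` and its parameters are hypothetical, and it simply assumes we know the time the incident was initiated and the times (if any) its status changed to Mitigated or Resolved.

```python
from datetime import datetime, timedelta
from typing import Optional


def default_impact_duration(
    initiated_at: datetime,
    mitigated_at: Optional[datetime] = None,
    resolved_at: Optional[datetime] = None,
) -> Optional[timedelta]:
    """Return the default impact duration, or None if the impact has not ended.

    Hypothetical helper for illustration; not part of any xMatters API.
    """
    # The impact ends at Mitigated; if Mitigated was bypassed, it ends at Resolved.
    ended_at = mitigated_at if mitigated_at is not None else resolved_at
    if ended_at is None:
        # Neither status has been reached yet, so there is no end time.
        return None
    return ended_at - initiated_at


# Example timestamps (made up for this sketch).
initiated = datetime(2024, 1, 1, 9, 0)
mitigated = datetime(2024, 1, 1, 9, 45)
resolved = datetime(2024, 1, 1, 11, 0)

print(default_impact_duration(initiated, mitigated, resolved))   # 0:45:00
print(default_impact_duration(initiated, resolved_at=resolved))  # 2:00:00
```

In the first call the incident passes through Mitigated, so the impact ends there even though it is resolved later. In the second call Mitigated is bypassed, so the impact runs until Resolved. If the real impact started earlier or ended later than these defaults suggest, that is when you would edit the impact duration in the Incident Console.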
The Incident Console also includes links to any collaboration channels, who is responsible for specific roles, any resolvers who were notified or have engaged, and a timeline that shows any status updates, responses, details, comments, or additional notes received during the incident.
We can edit the details of our incident by clicking Edit next to the information we want to change, or by selecting a new value from a drop-down menu.
Here, we've updated the severity from Medium to Critical, the status from Open to In Progress, and updated the incident owner (by default, this is set to the user who initiated it).
If you'd like to learn more about what specific details mean and how to edit them, see Update incident details.
The resolvers section of the Incident Console shows which users have been notified, what their response was, and if they're engaged in the incident resolution process. If groups or dynamic teams were targeted when the incident was initiated, the names of any users who have responded will also be visible here.
As our incident progresses, this section reflects who is actively engaged in the incident resolution process.
We can see that a user from the IT team is engaged, but there is no response from the other team, so we can choose to add new users, or renotify the current resolvers who haven't responded yet.
For more information about managing resolvers, see Engage resolvers.
The Collaboration section provides links to the active chat channels and conference bridges resolvers are using to communicate.
If an incident was initiated by a flow that has a collaboration step configured in Flow Designer, the collaboration channels will automatically appear in the Collaboration section. If there are no collaboration channels configured (like in the case of our example incident), the area is blank.
To learn more about how to access collaboration channels, or add one to an incident using Flow Designer, see Collaborations.
The best way to monitor an incident's progress is through the incident Timeline. This displays all the changes and comments made during the lifetime of the incident. You can filter the timeline by Notes, Resolvers, or Updates to find specific results.
Here we can see when our incident was created, the response and comments from Emma Rowley and Polly Jones, the change in status from Open to In Progress, and the change in ownership. We can also add a note to the timeline to communicate extra information to resolvers by clicking the Add Note button.
To learn more about working with the timeline, see Timeline.
Once an incident is resolved, the incident metrics are added to the top of the Incident Console, so stakeholders can review how the incident progressed. This allows you to see a clear summary of the incident's data and perform a detailed analysis of an incident's lifetime. Once you've reviewed the data, you can easily export it for further reporting or filing purposes.
Now that we've updated the incident's status to Resolved, the post-incident metrics are displayed at the top of the screen. Once we've reviewed the data and made any necessary updates, we can click Export to export the data to a spreadsheet.
For more information about the incident metrics, or to understand how they're calculated, see incident metrics.
Finally, Advanced plan customers can create a Post-Incident Report. This allows you to share information about the response process with stakeholders, and helps your team understand what went wrong and what can be improved to ensure the incident doesn't reoccur. Fill out the Analysis, Timeline, and Actions sections to complete a detailed retrospective and assign any required post-incident activity.
On the Incident Console, click Create Post-Incident Report to create the report:
When we first create the report, the sections will be blank. From here, we can add information to the Analysis section, add in relevant entries from the incident timeline, and create post-incident actions which can be tracked in xMatters:
To learn about how to complete the sections of the report, see Post-Incident Report.