Efficient Incident Management with Catchpoint and PagerDuty
The ability to detect and alert on performance issues quickly is key to reducing the Mean Time to Resolve (MTTR). Proactive monitoring catches incidents early, but triggering the right alerts and notifying the relevant incident management team is just as critical. Enterprises rely on multiple disparate tools to monitor different systems, and the resulting volume of data and noise can make incident management inefficient. For better on-call management and incident response, enterprises use tools such as PagerDuty.
PagerDuty is an alert aggregation and dispatching service for system administrators and support teams. It collects alerts from your monitoring tools and gives you an overall view of incidents impacting system health. When integrated with monitoring tools like Catchpoint, PagerDuty allows you to:
- Aggregate, classify, and correlate events and manage what matters.
- Guarantee alert delivery to the right team, with the right information, every time.
- Configure painless custom on-call schedules, rotations, and escalations.
- Manage the incident workflow on the go.
- Engage the right teams through built-in integrations with ChatOps tools and helpdesk services.
- Analyze system efficiency and recognize employee productivity.
Catchpoint-PagerDuty Integration
Catchpoint offers a range of built-in alerting options, including advanced features such as alerting based on trends and dynamic thresholds. Integrating Catchpoint with PagerDuty lets you manage these alerts alongside those from all of your other monitoring solutions in one system. There are two ways to feed Catchpoint alerts to PagerDuty: email and the Alert Webhook.
Using Email
In PagerDuty, navigate to Configuration/Services and create a new Service with the “Integration type” set to “Integrate via email”. Then set the “Integration Email” in PagerDuty to the same email address used in the Catchpoint alerts that you want delivered to PagerDuty.
If you are using email parsing in PagerDuty, you can leverage the “Initial Trigger” time at the bottom of the alert email to link Catchpoint reminder and improved alerts to the original alert.
Using Alert Webhook
Catchpoint’s Alert Webhook allows pushing data to any other tool when a test triggers an alert. Alert Webhook templates can be customized to fit any format and content-type using Macros.
To get started, log in to the Catchpoint portal. Under the Settings menu, click API, then create a new alert webhook by enabling “Alert Webhook”.
Then enter the Endpoint URL: https://events.pagerduty.com/generic/2010-04-15/create_event.json
Next, select Template from the Format options.
Click “Add new” in the template selection drop-down menu to create a new template and define the contents in JSON format. PagerDuty accepts JSON data with three required fields – service_key, event_type, and description. Existing templates can be edited by hovering over the template name and selecting the “Edit/View Properties” icon.
The JSON or XML fields can be hardcoded or filled in dynamically with data from the system such as test name, alert severity, conditions that triggered the alert, the node that triggered the alert, etc.
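Before saving a template, it can help to confirm that its rendered output carries the three required fields. The following is a minimal Python sketch of such a check, not part of Catchpoint or PagerDuty. The helper name `missing_required_fields` and the sample values are hypothetical, and the sketch assumes the macros have already been replaced with concrete values.

```python
import json

# The three fields the article lists as required by PagerDuty.
REQUIRED_FIELDS = ("service_key", "event_type", "description")


def missing_required_fields(rendered_body: str) -> list[str]:
    """Return the required fields that are absent or empty in a rendered JSON body."""
    payload = json.loads(rendered_body)
    return [name for name in REQUIRED_FIELDS if not payload.get(name)]


# Hypothetical rendered output: what a template might look like
# after Catchpoint has filled in the macros.
sample = json.dumps({
    "service_key": "Your-Integration-Key",
    "event_type": "trigger",
    "description": "CRITICAL: https://www.example.com",
})

print(missing_required_fields(sample))                       # []
print(missing_required_fields('{"event_type": "trigger"}'))  # ['service_key', 'description']
```

An empty list means all three fields are present. Any names returned point to fields the template still needs.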
Here is an example of an Alert Webhook Template built with JSON. It includes the timestamp and severity level for an alert. Macros use the syntax ${macroName}. The AlertInitialTriggerDateLocal or AlertInitialTriggerDateUtc macro can be used to link reminders and improved alerts to the original trigger alert. In this template, it populates the incident_key field, which links related alerts and prevents alert duplication.
{
  "service_key": "Your-Integration-Key",
  "event_type": "${switch("${NotificationLevelId}","0","trigger","1","trigger","3","resolve")}",
  "description": "${switch("${NotificationLevelId}","0","WARNING","1","CRITICAL","3","OK")}: ${TestUrl}",
  "incident_key": "${AlertInitialTriggerDateUtc}",
  "client": "${TestName}",
  "client_url": "${TestUrl}",
  "details": {
    "NodeName": "${NodeDetails("${NodeName}")}",
    "NodeClientAddress": "${NodeDetails("${NodeClientAddress}")}",
    "NodeMean": "${NodeDetails("${NodeMean}")}",
    "Test Name": "${TestName}",
    "Test URL": "${TestUrl}"
  }
}
The value for “service_key” is the integration key generated in PagerDuty. Follow PagerDuty’s instructions to obtain it.
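To make the template's behavior easier to reason about, the Python sketch below mirrors what it is expected to render. It is an illustration under stated assumptions, not Catchpoint's implementation. The lookup tables copy the value pairs from the two switch macros, the function names `build_event` and `build_request` are hypothetical, and the test name, URL, and timestamp format are placeholder values. The request is built but not sent.

```python
import json
import urllib.request

# Endpoint URL given in the article.
PAGERDUTY_URL = "https://events.pagerduty.com/generic/2010-04-15/create_event.json"

# Mirror the two switch(...) macros in the template: NotificationLevelId -> value.
# Ids not listed in the template raise KeyError in this sketch.
EVENT_TYPES = {"0": "trigger", "1": "trigger", "3": "resolve"}
SEVERITY_LABELS = {"0": "WARNING", "1": "CRITICAL", "3": "OK"}


def build_event(service_key, level_id, test_name, test_url, initial_trigger_utc):
    """Build the dict the template is expected to render for one alert."""
    return {
        "service_key": service_key,
        "event_type": EVENT_TYPES[level_id],
        "description": f"{SEVERITY_LABELS[level_id]}: {test_url}",
        # The same initial trigger time on every related alert ties them together.
        "incident_key": initial_trigger_utc,
        "client": test_name,
        "client_url": test_url,
    }


def build_request(event):
    """Wrap an event in a POST request; pass it to urllib.request.urlopen to send."""
    return urllib.request.Request(
        PAGERDUTY_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical alert values: a critical alert followed by its improved alert.
common = ("Homepage", "https://www.example.com", "2024-01-01T00:00:00Z")
critical = build_event("Your-Integration-Key", "1", *common)
improved = build_event("Your-Integration-Key", "3", *common)

print(critical["event_type"], improved["event_type"])        # trigger resolve
print(critical["incident_key"] == improved["incident_key"])  # True
```

Because both events carry the same incident_key, the resolve event refers to the incident opened by the trigger event instead of creating a new one.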
Summary
The Catchpoint-PagerDuty integration combines Catchpoint’s alert system with PagerDuty’s on-call management and incident response features. Catchpoint integrates with PagerDuty to accelerate troubleshooting using proven workflows, notifications, and benchmarks to reduce mean time to resolve (MTTR). Consolidating alerts from multiple tools into a single management platform also reduces alert fatigue.
With a consolidated tool, DevOps and incident management teams gain more insight into the health of their systems. Responding quickly to an incident is vital to protecting customer and employee experience, and this integration fast-tracks incident detection and resolution.
Download the PagerDuty integration datasheet for more details, or watch a walkthrough of the integration.