Mastering IPM: Monitor what matters from where it matters
In the first installment of our IPM Best Practices Series, we explored the vast expanse of the Internet Stack and how its complex layers work in unison to keep digital services running. We laid the groundwork for understanding why Internet Performance Monitoring (IPM) is pivotal for the resilience of our interconnected world. This post zeroes in on the imperative of monitoring precisely what matters, and from the right vantage points.
Monitor what matters – but why?
Monitoring is a given in IT: you must monitor your systems to ensure your users have a good experience. However, organizations often encounter a fundamental problem: an overwhelming amount of data from diverse sources, coupled with the difficulty of identifying and monitoring what genuinely matters to their specific operational and business needs.
In the face of this complexity, many organizations default to monitoring only the systems within their direct control. At a glance, this seems logical—it’s simpler. However, this approach overlooks a crucial aspect: it doesn’t reflect the user experience.
Your entire network can look ‘green’, and your users can still be unable to connect. That’s because they’re coming to your systems from different geographic locations and via different routes while using different resources – any of which can degrade their experience. This means you MUST monitor what matters to your users, not just to you!
By monitoring from your users’ perspective, you’ll be positioned to prevent outages and improve user experience. You’ll also better understand your network and how your users are accessing it so you can optimize performance in the future.
How to effectively monitor what matters with IPM
To monitor your digital ecosystem effectively, you need a comprehensive approach that extends beyond the traditional inside-out perspective of Application Performance Monitoring (APM). While APM is invaluable for monitoring the performance and health of your internal infrastructure and applications, it doesn’t provide the complete picture.
For complete, 360-degree visibility of your digital services, it’s essential to integrate APM with Internet Performance Monitoring (IPM). IPM shifts the focus to the end-user perspective, often the missing piece in the monitoring puzzle.
IPM provides deep visibility into the entire Internet Stack, encompassing components like BGP, SASE, DNS, CDNs, WANs, MQTT, and more. It provides a holistic view of how the Internet, with all its complexities, impacts the performance and availability of digital services, enabling businesses to understand the various elements that influence service delivery.
Beyond availability: monitoring for actionable insights
Monitoring what matters is not merely about collecting vast amounts of data; it’s about extracting actionable insights from that data. By focusing on the key metrics that align with strategic goals, organizations can make informed decisions, optimize performance, and ultimately achieve business success.
Businesses often consider their applications available if they receive an “HTTP 200” response. However, the definition of availability has evolved significantly: an “HTTP 200 OK” status alone no longer proves it. True application availability now depends on every component, including third-party ones, functioning as expected.
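To make the distinction concrete, here is a minimal sketch of a check that treats a page as available only when the base document returns 200, the expected content is present, and every dependent request succeeded. The `PageResult` structure, its field names, and the sample URLs are hypothetical illustrations, not part of any Catchpoint API.

```python
from dataclasses import dataclass, field


@dataclass
class PageResult:
    """Hypothetical result of one synthetic page load."""
    status: int   # HTTP status of the base document
    body: str     # HTML returned for the base document
    # Dependent request URL -> HTTP status (0 means no response was received).
    dependencies: dict[str, int] = field(default_factory=dict)


def availability_issues(result: PageResult, required_markers: list[str]) -> list[str]:
    """Return the reasons a page is not truly available (empty list if healthy)."""
    issues = []
    if result.status != 200:
        issues.append(f"base document returned {result.status}")
    for marker in required_markers:
        if marker not in result.body:
            issues.append(f"missing expected content: {marker!r}")
    for url, status in result.dependencies.items():
        if not 200 <= status < 400:
            issues.append(f"dependency failed: {url} ({status})")
    return issues


# Example: the base page returns 200, but a third-party script fails to load.
result = PageResult(
    status=200,
    body="<html><div id='checkout'></div></html>",
    dependencies={
        "https://cdn.example.com/app.js": 200,
        "https://thirdparty.example.net/tag.js": 503,
    },
)
issues = availability_issues(result, required_markers=["id='checkout'"])
print(issues or "available")
```

A status-only check would report this page as healthy; the dependency check is what surfaces the failure.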
A case in point: The Adobe outage
Consider the Adobe outage on December 8, 2023, as an illustration. It reaffirmed that even when managed services return HTTP 200 status codes, the user experience can be severely compromised if, for instance, a third-party service fails to load, leaving users facing a blank or incomplete website.
Catchpoint’s IPM and the detection of end-user impact
Catchpoint’s Internet Sonar rapidly pinpointed the Adobe service disruptions, immediately raising alerts and highlighting the extent of the outage on the dashboard.
Catchpoint transaction tests, which emulate the user’s journey and check that specific elements on the page load, began to register failures. These were traced back to Adobe requests that failed to load, which prevented the page from loading as intended, as depicted in the screenshot below.
The application teams responded by temporarily removing the Adobe requests, which ensured that content was delivered to end users as intended. Catchpoint’s tests confirmed that the adjustment worked, as seen in the screenshot below.
Notably, the tests resumed running successfully once the “Adobe” requests were removed.
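The tracing step in this case, attributing failed requests to the service they belong to, can be sketched by grouping failures by hostname. This is an illustration only: the request log format, the placeholder hosts, and the convention that a status of 0 means "no response" are assumptions, not Catchpoint's data model.

```python
from collections import Counter
from urllib.parse import urlsplit


def failing_hosts(requests):
    """Count failed requests per hostname.

    `requests` is an iterable of (url, status) pairs. A status of 0 stands
    for a request that never received a response (an assumption of this sketch).
    """
    counts = Counter()
    for url, status in requests:
        if status == 0 or status >= 400:
            counts[urlsplit(url).hostname] += 1
    return counts


# Hypothetical request log from one page load; all hosts are placeholders.
requests = [
    ("https://www.example.com/", 200),
    ("https://www.example.com/main.css", 200),
    ("https://tags.vendor.example/launch.js", 0),
    ("https://tags.vendor.example/analytics.js", 504),
    ("https://cdn.example.com/app.js", 200),
]

for host, count in failing_hosts(requests).most_common():
    print(f"{host}: {count} failed request(s)")
```

When every failure clusters under one third-party host while first-party requests succeed, the dependency is the likely cause rather than your own infrastructure.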
Monitor from where it matters
Monitoring what matters is critical, but equally important is the location from which we monitor. The right vantage points can shed light on crucial questions during an outage or performance degradation:
- Is the issue localized to a specific geography, or is its impact global?
- Are only users from certain Internet Service Providers (ISPs) affected?
- What is the quality of the user experience over mobile networks?
- Should we solely depend on cloud agent data, considering users aren’t restricted to cloud environments?
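The first two questions can be approached by grouping test results by vantage-point attributes and comparing failure rates. The sketch below is illustrative: the `Probe` record, the region and ISP labels, the sample data, and the 50% threshold are all assumptions chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    """Hypothetical result of one test run from one vantage point."""
    region: str
    isp: str
    ok: bool


def failure_rates(probes, key):
    """Failure rate per group, where key is 'region' or 'isp'."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for probe in probes:
        group = getattr(probe, key)
        totals[group] += 1
        if not probe.ok:
            failures[group] += 1
    return {group: failures[group] / totals[group] for group in totals}


def affected_groups(probes, key, threshold=0.5):
    """Groups whose failure rate meets the (assumed) threshold."""
    rates = failure_rates(probes, key)
    return sorted(group for group, rate in rates.items() if rate >= threshold)


# Sample data: every probe in one region fails, regardless of ISP.
probes = [
    Probe("us-east", "ISP-A", True),
    Probe("us-east", "ISP-B", True),
    Probe("eu-west", "ISP-A", False),
    Probe("eu-west", "ISP-B", False),
    Probe("ap-south", "ISP-A", True),
    Probe("ap-south", "ISP-B", True),
]

regions = affected_groups(probes, "region")
isps = affected_groups(probes, "isp")
print("affected regions:", regions)
print("affected ISPs:", isps)
```

Here the failures concentrate in a single region while no ISP crosses the threshold, which points to a geographic issue rather than a provider-specific one. The same grouping only answers these questions if the probes actually span the regions and ISPs your users come from.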
Catchpoint’s expansive global observability network provides unmatched visibility, with over 2,500 vantage points across users, networks, digital services, and applications. This network allows Catchpoint users to monitor from where it matters, adopting an outside-in perspective to authentically gauge the end-user experience.
Unlike other monitoring solutions, our observability network is independent, ensuring the data that fuels our IPM platform is unbiased and reliable. This independence eliminates any ‘fox guarding the henhouse’ concerns. Moreover, having a network decoupled from hosting cloud providers enhances our network’s resilience, enabling us to detect, diagnose, troubleshoot, and confirm issue resolutions, even amid cloud provider outages—delivering extraordinary reliability and assurance to our customers.
Leverage IPM for true application availability
In summary, businesses must embrace the fact that true application availability requires monitoring from the end-user’s perspective, from as many diverse vantage points as possible. The Adobe outage is a poignant reminder of the necessity for a proactive IPM strategy to understand how interconnected components of the Internet Stack impact the end-user experience. It’s not just about data collection; it’s about extracting actionable insights to optimize performance and drive business success.
Stay tuned for our next post in the IPM Best Practices Series, where we’ll delve into the fundamentals of Internet Resilience. We’ll focus on improving customer experience and explore the essential components of reachability, availability, performance, and reliability.