How to Implement Network Monitoring for Full End-to-End Visibility
As we all continue to adjust to a new reality of telecommuting and working from home, it’s become clearer than ever that we are at the mercy of networks that are ultimately out of our control. Performance, reachability, and latency issues in these network layers take a heavy toll on our daily lives, because they can disrupt the business-critical sites and SaaS applications on which we all rely. Monitoring these networks is therefore essential, but doing so presents a unique set of challenges that must be overcome with the proper Digital Experience Monitoring (DEM) tools.
Catchpoint launched Network Insights in 2019 with this express purpose in mind. By combining DNS, Traceroute, BGP, and Endpoint monitoring into one suite of tools powered by the largest and most diverse monitoring infrastructure in the industry, Catchpoint users can be sure that they are proactively detecting and resolving issues by testing from the perspective that matters the most: that of the end users.
Last week, Catchpoint was honored to be joined by leading DNS provider NS1, as well as Constellation Research, for the webinar Monitoring Your Tech Stack’s Network Layer. In it, we covered many of the challenges faced by network monitoring teams, how Network Insights helps overcome them, and how NS1 ensures their own network performance by implementing the tools within the Catchpoint Digital Experience Monitoring platform.
Network Monitoring Challenges & Strategies
When approaching a network monitoring strategy, it’s important to first recognize that the specific technologies, mechanisms, or policies within the network layer are not all that important as individual components. What network architects, network engineers, and network operators must focus on is the effect that these components have on the end-user experience.
Remember, the only visible effect that the network has on customer experience is to lose and delay data. The best-performing networks keep this loss and delay to an absolute minimum, so a properly focused network performance management strategy is designed to do exactly that.
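To make "loss and delay" concrete, here is a minimal sketch of how a set of probe results might be reduced to those two numbers. It is not Catchpoint's implementation; the function names and the thresholds are hypothetical, and a probe that received no reply is assumed to be recorded as `None`.

```python
from statistics import median


def summarize_probes(rtts_ms):
    """Reduce probe results to loss and delay; None marks a probe with no reply."""
    if not rtts_ms:
        raise ValueError("need at least one probe")
    replies = sorted(r for r in rtts_ms if r is not None)
    lost = len(rtts_ms) - len(replies)
    summary = {
        "sent": len(rtts_ms),
        "loss_pct": 100.0 * lost / len(rtts_ms),
        "median_ms": None,
        "max_ms": None,
    }
    if replies:
        summary["median_ms"] = median(replies)
        summary["max_ms"] = replies[-1]
    return summary


def breaches(summary, max_loss_pct=1.0, max_median_ms=100.0):
    """List which of the (hypothetical) thresholds the summary exceeds."""
    problems = []
    if summary["loss_pct"] > max_loss_pct:
        problems.append("loss")
    # No replies at all means delay cannot be measured, so flag it too.
    if summary["median_ms"] is None or summary["median_ms"] > max_median_ms:
        problems.append("delay")
    return problems
```

In practice the thresholds would come from your own baselines per location and network, not from fixed defaults like the ones assumed here.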
However, because of the highly heterogeneous network architectures that are commonplace in the age of cloud migration, many different factors affect network performance. A comprehensive network monitoring strategy must capture data on all of them, alert on them in real time, and make it possible to troubleshoot quickly:
- Device / Link availability
- Latency or delay
- Packet loss and retransmissions
- Errors and discards
- Device starving for resources
- Bandwidth / Traffic
- Configuration changes
- Device link traffic
- Application architecture/design
- Network topology
- Number of transmitting nodes
Collecting and analyzing data for these is a key part of network management, which in turn is vital to any company that has a digital presence. Digital customer experience is one of the most important areas of focus in our ever-changing world, and network performance is the foundation of an excellent customer experience.
And while establishing and maintaining a comprehensive network strategy is not easy, the good news is that it pays for itself in the form of significantly better business results.
Catchpoint Network Insights
Given all of these challenges, Catchpoint helps network operators and engineers solve them with a four-pronged approach that combines DNS Monitoring, BGP Monitoring, Traceroute Monitoring, and Endpoint Monitoring all in a single platform. These data collection and analysis tools are powered by over 825 testing locations around the world, which IT teams can use to emulate the real customer experience from whatever geographies and networks their users are accessing.
It’s important to note that the majority of these monitoring nodes are located on backbone and broadband transit providers that real end user traffic travels through because that’s the only way to understand what customers are actually experiencing. Monitoring from cloud locations the way the vast majority of monitoring vendors do is a start, but it’s ultimately only one piece of the puzzle. Cloud monitoring is really a supplement to the more important backbone, last mile, and wireless networks because that is where user traffic is actually flowing.
DNS Monitoring
As the first engagement that customers have with a brand, a website or application’s DNS resolution process is probably the most important part of digital customer engagement. NS1 has built one of the most advanced DNS services in the world, and they rely on Catchpoint to maintain the incredible performance and availability of that service.
To do so, Catchpoint has two different types of DNS tests: DNS Direct and DNS Experience.
- DNS Direct Name Servers tests directly query the Name Servers to provide availability data and ensure the responses delivered are accurate.
- DNS Experience tests run recursive queries to resolve DNS (just as recursive DNS servers do) to measure latency, performance, and availability of the various DNS servers in the pathway.
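The difference between the two test types largely comes down to which server is asked and whether recursion is requested. The sketch below shows only the single-query building block using the Python standard library: it builds a DNS query by hand, sends it over UDP, and times the reply. The function names are hypothetical and this is not how Catchpoint implements its tests; a full "experience"-style test would also walk the delegation chain, which is not shown.

```python
import socket
import struct
import time


def build_query(name, txid=0x1234, recursion_desired=False, qtype=1):
    """Build a minimal DNS query packet (qtype 1 = A record, class IN)."""
    flags = 0x0100 if recursion_desired else 0x0000  # RD bit
    header = struct.pack("!HHHHHH", txid, flags, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def parse_header(response):
    """Extract the header fields a health check typically cares about."""
    if len(response) < 12:
        raise ValueError("truncated DNS header")
    txid, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return {
        "txid": txid,
        "is_response": bool(flags & 0x8000),    # QR bit
        "authoritative": bool(flags & 0x0400),  # AA bit
        "rcode": flags & 0x000F,                # 0 = NOERROR, 3 = NXDOMAIN
        "answers": ancount,
    }


def query_server(server_ip, name, recursion_desired=False, timeout=2.0):
    """Send one UDP query to server_ip and return (parsed header, elapsed ms)."""
    txid = 0x1234
    packet = build_query(name, txid=txid, recursion_desired=recursion_desired)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.perf_counter()
        sock.sendto(packet, (server_ip, 53))
        response, _ = sock.recvfrom(4096)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
    header = parse_header(response)
    if header["txid"] != txid:
        raise ValueError("response does not match the query")
    return header, elapsed_ms
```

Calling `query_server` with the address of one of your authoritative name servers and `recursion_desired=False` approximates a direct check (availability, response code, authoritative flag). Pointing it at a recursive resolver with `recursion_desired=True` gives a rough view of the latency an end user would see.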
However, even though NS1 uses a best-of-breed, innovative platform to ensure its service delivery and meet its SLA requirements with customers, anyone who uses a third-party DNS service must monitor that service as well, because you can’t put blind trust in your vendors to report on their own SLA metrics. Vendors and customers must agree on a neutral third party to measure and report on that data.
BGP Monitoring
As the default protocol for how packets are routed around the world, BGP is one of the pillars of the modern internet, yet one that often seems like a house of cards itself. That’s because while BGP works very well at its core, it was designed without any security measures in mind, which makes it very hard to maintain consistent performance given the prevalence of malicious actors and simple human error.
We saw an example of the former just last week when a BGP hijack took down large swaths of the internet, and there were even more significant issues last year, such as a route leak caused by a misconfigured Autonomous System that spread to Verizon, Cloudflare, Facebook, and most major US financial institutions.
Catchpoint helps network teams overcome this problem with real-time BGP monitoring and alerting. Unlike other tools that only provide open source BGP data every 15 minutes, Catchpoint has established a combination of public and private route collectors around the world that can alert you to different types of BGP issues:
- Route hijacks
- Policy configuration issues
- Route flaps
- Peering issues
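Two of the issue types above can be illustrated with very little code. The following is a simplified sketch, not Catchpoint's detection logic: it assumes you already receive BGP updates from some route collector as plain tuples, and it uses documentation-only prefixes and AS numbers. The names `EXPECTED_ORIGINS`, `check_origin`, and `flapping_prefixes` are hypothetical.

```python
from collections import Counter
from ipaddress import ip_network

# Hypothetical table of prefixes you originate and the AS expected to announce them.
EXPECTED_ORIGINS = {ip_network("192.0.2.0/24"): 64500}


def check_origin(prefix, as_path, expected=EXPECTED_ORIGINS):
    """Classify one announcement; the origin AS is the last entry in the AS path."""
    announced = ip_network(prefix)
    origin = as_path[-1]
    for known, asn in expected.items():
        if announced.version != known.version:
            continue
        if announced.subnet_of(known):
            if origin != asn:
                return "possible-hijack"
            # Same origin but a longer prefix: often a policy or config issue.
            return "ok" if announced == known else "more-specific"
    return "unmonitored"


def flapping_prefixes(updates, window_s=300, threshold=4):
    """Return prefixes with at least `threshold` updates in the most recent window.

    `updates` is an iterable of (timestamp_seconds, prefix) pairs covering both
    announcements and withdrawals.
    """
    updates = list(updates)
    if not updates:
        return []
    latest = max(ts for ts, _ in updates)
    counts = Counter(p for ts, p in updates if latest - ts <= window_s)
    return sorted(p for p, n in counts.items() if n >= threshold)
```

A real detector would also need to account for legitimate multi-origin announcements and for how quickly each collector delivers updates, which is where the collector coverage and update frequency described above matter.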
Traceroute Monitoring
Given the prominence of cloud, multi-cloud, and hybrid-cloud architectures, end-to-end visibility into your entire digital customer journey is an important step toward understanding your true customer experience.
Catchpoint’s traceroute visualizations are the most effective way to consume and analyze that data, as they provide visibility into every hop and Autonomous System (AS) in the network path so that you can quickly and easily detect and identify the root cause of any performance, reachability, or latency issues along the way.
There are two types of traceroute visualizations in the Catchpoint DEM platform:
- Logical AS Sankey view shows the different Autonomous Systems in the path so that you can share data with the correct provider to solve the issue
- IP hop-by-hop view is more granular, showing each router within those ASes as well; hop-by-hop is more useful for internal monitoring such as that done by NS1, as they need to be able to see the specific router that may be causing problems so that they can remediate
These traceroute tests can be run as standalone monitors, or in conjunction with other types of synthetic tests such as Transaction, Browser, DNS, BGP, etc. Moreover, they can be run on different network protocols such as UDP, ICMP, TCP, etc.
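The relationship between the two views can be sketched in a few lines: the AS-level view is the hop-by-hop data collapsed by Autonomous System, and the hop-level view is where a specific latency jump is located. The example below assumes traceroute output has already been parsed into `(ip, asn, rtt_ms)` tuples, with `None` for a hop that did not answer. The function names and sample data are hypothetical and use documentation addresses and AS numbers.

```python
from itertools import groupby


def as_path(hops):
    """Collapse hop-level data into the sequence of ASes traversed."""
    known = (asn for _ip, asn, _rtt in hops if asn is not None)
    return [asn for asn, _group in groupby(known)]


def largest_jump(hops):
    """Return (ip, asn, delta_ms) for the hop with the biggest RTT increase
    over the previous responding hop, or None if no hop responded."""
    worst = None
    prev_rtt = 0.0
    for ip, asn, rtt in hops:
        if rtt is None:  # hop did not answer; skip it
            continue
        delta = rtt - prev_rtt
        if worst is None or delta > worst[2]:
            worst = (ip, asn, delta)
        prev_rtt = rtt
    return worst


hops = [
    ("192.0.2.1", 64496, 1.0),
    ("192.0.2.9", 64496, 2.0),
    ("198.51.100.1", 64500, None),
    ("198.51.100.7", 64500, 42.0),
    ("203.0.113.5", 64510, 45.0),
]
print(as_path(hops))       # AS-level view of the path
print(largest_jump(hops))  # hop-level view of where latency was added
```

Treat a large jump as a lead rather than a verdict: routers often deprioritize replies to traceroute probes, so a single slow hop that does not slow the hops after it may not reflect real forwarding delay.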
Endpoint Monitoring
Visibility into private enterprise networks such as LANs, WANs, and SD-WANs is also an important component of a comprehensive network monitoring strategy, as these networks play a key role in a company’s ability to visualize the entire delivery chain for its business-critical SaaS applications. Catchpoint does this with Endpoint monitoring, which combines internal network data from behind the firewall, end-user device data from a browser plug-in, and out-of-box SaaS templates for application performance and availability.
Watch the full webinar, Monitoring Your Tech Stack’s Network Layer, to learn more about these network monitoring challenges, how Catchpoint helps enterprises overcome them, and how NS1 has put these solutions into action.