UltraDNS Went Down and Took Netflix (and Half the Internet) With It

Published October 16, 2015

Neustar UltraDNS suffered a major outage on Thursday, October 15, beginning at approximately 4:20 PM EST and lasting roughly 90 minutes. Initial reports blamed a DDoS attack, but the company later attributed the outage to a technical malfunction. Either way, it prevented the company from serving its customers for a significant amount of time, causing some of its biggest clients, such as Netflix, to experience DNS failures.

[Figure: UltraDNS outage]

The outage degraded both the response times and the availability of UltraDNS. As a result, Netflix saw dramatic drops in availability for the duration of the outage, and service to its 65.5 million subscribers was interrupted in several instances.

[Figure: Netflix DNS outage response]

[Figure: UltraDNS Netflix waterfall]

The ramifications extended well beyond Netflix, affecting businesses to varying degrees. Ensighten, a third-party tag management company, was among those affected, which means some of its customers likely felt the outage as well through malfunctioning third-party tags.

The company announced that the outage was due to a technical malfunction, refuting initial reports of a DDoS attack. A DDoS was not a far-fetched theory, though, since it would not have been the first the company has seen recently: last year, it was hit with a 100 Gbps attack that caused latency issues for a large portion of its clients. UltraDNS handles over 14 billion DNS queries daily for clients such as Allstate, Rackspace, Nike Store, Mercedes, Forever 21, BBC News, CNN Money, and E*Trade.

According to Threatpost.com, large-scale DDoS attacks are occurring more frequently. While the motivations are varied and often undetermined, attackers can use them as cover for other illegal activity such as intellectual property theft and financial fraud.

Regardless of the cause of Thursday’s event, the UltraDNS outage is a harsh reminder of how many variables are in play within DNS and how difficult they are to manage. As IT professionals, we’re taught that redundancy is your network’s lifeline; however, the architecture of DNS makes it incredibly expensive to build a reliable backup strategy. The only way to fully mitigate a failure is to have a backup provider work in tandem with your primary service. Since most companies simply can’t afford that, the SLA with your DNS provider is crucial to recouping the revenue you lose when an outage does occur.
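To make the "backup in tandem" idea concrete, here is a minimal sketch of a check that tells you whether a zone's nameservers all belong to a single provider. The nameserver hostnames are hypothetical, and grouping by the last two labels of the hostname is a simplifying assumption (it would misgroup names under suffixes like `co.uk`); treat it as an illustration rather than a complete audit.

```python
from collections import defaultdict


def group_by_provider(nameservers):
    """Group nameserver hostnames by a rough 'provider' key.

    Assumption: the provider is identified by the last two labels of the
    hostname. This is a heuristic for illustration only.
    """
    groups = defaultdict(list)
    for ns in nameservers:
        labels = ns.rstrip(".").lower().split(".")
        provider = ".".join(labels[-2:])
        groups[provider].append(ns)
    return dict(groups)


def has_provider_redundancy(nameservers):
    """True if the nameserver set spans at least two distinct providers."""
    return len(group_by_provider(nameservers)) >= 2


if __name__ == "__main__":
    # Hypothetical NS set for a zone served by a single provider.
    single = ["ns1.provider-a.example", "ns2.provider-a.example"]
    print(has_provider_redundancy(single))
```

A zone that fails this check depends entirely on one provider's infrastructure, which is the situation many UltraDNS customers found themselves in during the outage.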

DNS providers typically run hundreds of servers across multiple points of presence (PoPs) around the globe, so a micro-outage contained to a small geographic area may go undetected by the provider while wreaking havoc on your site’s performance. Deploying a synthetic monitoring solution that can detect such an issue and alert you to it is therefore crucial to protecting your profits and your business.
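As a rough illustration of what a synthetic DNS check does, the sketch below sends a single A-record query straight to one nameserver over UDP and records whether it answered cleanly and how long it took. It uses only the Python standard library and the RFC 1035 wire format. The function names, the 2-second timeout, and the result dictionary are assumptions made for this example. A real monitoring product would also probe from many locations, retry, and alert on thresholds.

```python
import socket
import struct
import time


def build_query(hostname, query_id=0x1234):
    """Build a minimal DNS query packet for an A record."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label is length-prefixed; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A record), QCLASS=1 (Internet).
    return header + qname + struct.pack("!HH", 1, 1)


def response_ok(query, response):
    """True if the reply matches the query ID, is a response, and has RCODE 0."""
    if len(response) < 12:
        return False
    resp_id, flags = struct.unpack("!HH", response[:4])
    (query_id,) = struct.unpack("!H", query[:2])
    is_response = bool(flags & 0x8000)
    rcode = flags & 0x000F
    return resp_id == query_id and is_response and rcode == 0


def probe(server_ip, hostname, timeout=2.0):
    """Query one nameserver directly and report success and elapsed time."""
    query = build_query(hostname)
    start = time.monotonic()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server_ip, 53))
            response, _ = sock.recvfrom(512)
        except OSError:
            # Covers timeouts and unreachable-network errors alike.
            return {"server": server_ip, "ok": False, "elapsed_ms": None}
    elapsed_ms = (time.monotonic() - start) * 1000
    return {"server": server_ip, "ok": response_ok(query, response),
            "elapsed_ms": elapsed_ms}
```

Calling `probe()` against each of your provider's nameserver IPs on a schedule, and from more than one region, is what surfaces the localized failures described above.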

In the aftermath of an outage of this magnitude, we can’t help but think that perhaps it’s time to rethink the DNS specifications to better handle these types of global outages. After all, it took only 20 years for HTTP/2 to arrive; 30 years is plenty of time for a DNS 2.0 to be born.

