Network World From the data center to the edge Thu, 13 Jul 2023 20:48:06 +0000 http://backend.userland.com/rss092 Copyright (c) 2024 IDG Communications, Inc. en-US

HPE Aruba adds genAI search tools to network management platform Tue, 26 Mar 2024 14:31:03 +0000

HPE Aruba is deploying genAI-driven search tools in its management platform to help customers access detailed responses to queries about network configurations, documentation, and other IT operational issues.

The company is incorporating multiple HPE-trained large language models (LLM) into the search tools that are part of HPE Aruba Networking Central, its core cloud-based management and orchestration platform for wired and wireless networks spanning campus, branch and data center sites.

Because the tools use HPE’s own LLMs rather than publicly available ones, they have access to one of the largest data lakes in the industry, which includes telemetry data from some four million network-managed access points, switches, and edge devices that support one billion unique customer endpoints, said Alan Ni, senior director of edge marketing for HPE Aruba.

“Our data science teams went back and looked at some of the three million questions we’ve collected over time, and there’s a significant amount of questions that we see time and time again, around topics such as configuration management. Or ‘how do I do this in my network environment,’ and ‘what happens when I add some feature that I’m not familiar with?’” Ni said. “In the past, we would point you to a document that might be five pages, or it could be 50 pages, right? And it was incumbent on the end user to kind of comb through that documentation to figure out how to configure something or locate that specific section. Now, we generate specific, optimized responses and specific documentation to vastly improve accuracy, speed and detail.”

“Accurately understanding the intent of a user’s question is paramount for better responses,” Ni added. “This can be a significant time saver for network operators trying to find a documentation answer they’re looking for.” 

The genAI LLM search support is available now. It’s built into HPE Aruba Networking Central’s AI Search feature and expands upon existing ML-based AI capabilities to provide deeper insights, better analytics, and more proactive skills, Ni said.

HPE Aruba Networking Central ensures customer data security with proprietary, purpose-built LLMs that remove personally identifiable information, Ni added.

Data Center Management, Generative AI, Network Management Software
https://www.networkworld.com/article/2074758/hpe-aruba-adds-genai-search-tools-to-network-management-platform.html 2074758
EdgeConnect SD-WAN with SWG: building a SASE foundation Mon, 25 Mar 2024 21:55:14 +0000

In today’s dynamic cybersecurity landscape, safeguarding modern business networks demands a robust, unified solution. In recent years, organizations have faced a dramatic increase in web-based threats, with over 490 million ransomware attacks worldwide [1] and around 30 percent of adults worldwide encountering phishing scams in 2022 [2].

Traditional standalone secure web gateway (SWG) solutions often struggle to offer a cohesive security approach for both managed and unmanaged devices, leaving organizations vulnerable. The explosion of unmanaged devices in organizations (including IoT, BYOD, and guest devices) accessing enterprise networks amplifies the challenge of preventing access to malicious websites.

In this blog, we’ll explore the benefits of integrating SWG into a secure SD-WAN for a unified, efficient, and comprehensive approach to network security.

Understanding SWG and secure SD-WAN

A secure web gateway (SWG) stands as a frontline defense against web-based threats, including malware, phishing attacks, and malicious websites. It conducts several security inspections, encompassing URL filtering, malicious code detection, and web access control. With a three-layer protection system—DNS filtering, URL filtering, and content filtering—SWG effectively blocks domains and IPs, and filters web access and content, based on policies. Advanced SWG solutions can even prevent unauthorized transmission of sensitive data through data loss prevention (DLP).
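
The three-layer check described above can be sketched in a few lines of Python. This is only an illustration of the layering, not how any particular SWG product is implemented: the blocklists, the `swg_verdict` function, and the example domains are all hypothetical, and a real gateway would draw on threat-intelligence feeds and far richer policy.

```python
from urllib.parse import urlparse

# Hypothetical policy data for illustration only.
BLOCKED_DOMAINS = {"malware.example"}
BLOCKED_URL_PREFIXES = ("http://shop.example/phish",)
BLOCKED_CONTENT_MARKERS = ("EICAR-TEST",)


def swg_verdict(url: str, body: str = "") -> str:
    """Return the first layer that blocks the request, or 'allow'."""
    host = urlparse(url).hostname or ""
    # Layer 1: DNS filtering blocks by domain, before any content is fetched.
    if host in BLOCKED_DOMAINS:
        return "blocked: dns"
    # Layer 2: URL filtering looks at the full URL, not just the domain.
    if url.startswith(BLOCKED_URL_PREFIXES):
        return "blocked: url"
    # Layer 3: content filtering inspects the payload itself.
    if any(marker in body for marker in BLOCKED_CONTENT_MARKERS):
        return "blocked: content"
    return "allow"


print(swg_verdict("http://malware.example/x"))         # blocked: dns
print(swg_verdict("http://shop.example/phish/login"))  # blocked: url
print(swg_verdict("http://shop.example/ok"))           # allow
```

Ordering the checks from cheapest to most expensive means most unwanted traffic is rejected before any payload has to be inspected.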

Secure SD-WAN revolutionizes network connectivity and security by seamlessly protecting local branches with a built-in next-generation firewall and connecting branch locations to the data center and multi-cloud environments through internet links or using a combination of multiple links (MPLS, Internet, 4G/5G, satcom).

The need for protecting all devices, managed and unmanaged

Standalone SWG solutions often fall short in providing comprehensive security for both managed devices and unmanaged devices in the enterprise network. Even if managed devices running an SSE agent are generally well protected, unmanaged devices remain unprotected, leading to increased security risks.

Unmanaged devices, such as those belonging to guests and third-party contractors or brought in under BYOD policies, can reach malicious websites as they connect to the enterprise network, introducing new threats in the organization. IoT devices are also prone to web-based threats as they generate web traffic when they communicate with cloud services for updates, telemetry, or other purposes. And because managed and unmanaged devices share the same enterprise network, enterprises face additional cybersecurity risks by not protecting unmanaged devices.

Comprehensive security with secure SD-WAN and SWG integration

The integration of SWG into a secure SD-WAN ensures consistent and comprehensive protection for all devices on the enterprise network. As devices connect to the enterprise network, secure SD-WAN automatically directs the traffic to an SWG through dedicated tunnels without requiring an SSE agent.

Unmanaged devices, often challenging to secure, receive the same level of protection as managed devices. Whether they are guest devices, third-party contractor devices, or IoT devices, the integrated solution fortifies the network against potential vulnerabilities.

The secure SD-WAN’s built-in next-generation firewall adds another layer of security by providing advanced features such as IDS/IPS, DDoS defense and Zero Trust segmentation. Regardless of the device type or managed status, every user or device connecting to the enterprise network benefits from advanced threat detection and prevention capabilities.

To fortify security and align with evolving digital needs, the integrated SWG and SD-WAN solution can seamlessly extend capabilities to include Zero Trust Network Access (ZTNA) and Cloud Access Security Broker (CASB). ZTNA ensures a Zero Trust-centric model, rigorously verifying every user, device, or application attempting to access the enterprise network. CASB protects sensitive data hosted in SaaS applications and prevents data loss, while enforcing policies related to access controls. This comprehensive integration transforms the solution into a robust SASE architecture, securing the entire spectrum of data access and usage.

HPE Aruba Networking secure SD-WAN augmented with SWG

The HPE Aruba Networking EdgeConnect SD-WAN family (EdgeConnect SD-WAN, EdgeConnect SD-Branch and EdgeConnect Microbranch) now integrates SWG, part of HPE Aruba Networking SSE, through a SASE SWG site license. The solution offers comprehensive protection to all users and things on the network. It is easy to deploy and doesn’t require an agent installed on each device. Instead, EdgeConnect SD-WAN forms a bandwidth-licensed tunnel between SD-WAN and HPE Aruba Networking SWG, while the traffic from managed devices (with an HPE Aruba Networking SSE user-based license) is sent directly to HPE Aruba Networking SSE, bypassing this tunnel.

Protect all devices with integrated SWG in the EdgeConnect SD-WAN fabric

In addition, HPE Aruba Networking can protect devices for organizations with third-party SD-WANs by establishing an IPsec bandwidth-licensed tunnel from the SD-WAN solution to HPE Aruba Networking SWG. This enables organizations to easily protect all devices, closing the gap left by otherwise unprotected devices (guests, third-party contractors, IoT).

Protect all devices with third-party SD-WAN integrated with SWG, without the need for an SSE agent

Advanced threat protection with HPE Aruba Networking SD-WAN

EdgeConnect SD-WAN’s built-in next-generation firewall enables organizations to go beyond web content filtering and malware protection. The solution provides IDS/IPS, DDoS defense and role-based segmentation, enforcing Zero Trust in the organization. IDS/IPS operates on a signature-based system, actively monitoring network traffic to identify patterns indicative of specific attack signatures. For immediate response, an IDS/IPS inline mode is available, swiftly blocking traffic upon intrusion detection. In addition, the DDoS defense mechanism identifies and thwarts various attacks, including protocol attacks, SYN floods, IP spoofing attacks, and more. EdgeConnect SD-WAN also includes robust support for role-based segmentation, aligning with Zero Trust principles to minimize lateral movements. This approach adheres to the principles of least privilege access, ensuring that both users and IoT devices establish communications solely with destinations consistent with their roles in the business.
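
Role-based segmentation under least privilege amounts to a default-deny lookup: a flow is permitted only if the source's role explicitly lists the destination. The sketch below illustrates that idea only; the role names, destinations, and the `is_allowed` function are hypothetical and do not reflect EdgeConnect's actual policy model or configuration syntax.

```python
# Hypothetical role-to-destination allow-list, for illustration only.
ALLOWED_DESTINATIONS = {
    "camera": {"video-recorder"},
    "pos-terminal": {"payment-gateway"},
    "employee": {"intranet", "internet"},
}


def is_allowed(role: str, destination: str) -> bool:
    """Default deny: permit a flow only if the role explicitly lists the destination."""
    # Unknown roles map to an empty set, so they can reach nothing.
    return destination in ALLOWED_DESTINATIONS.get(role, set())


print(is_allowed("camera", "video-recorder"))   # True
print(is_allowed("camera", "payment-gateway"))  # False
print(is_allowed("unknown-device", "intranet")) # False
```

Because a compromised camera in this sketch can reach only the video recorder, lateral movement toward the payment gateway is denied by construction.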

EdgeConnect SD-WAN also securely breaks out internet traffic by identifying and classifying applications and web domains based on the first packet, enabling automatic traffic steering to HPE Aruba Networking SSE. Using multiple techniques, the solution can identify more than 10,000 applications and more than 300 million web domains.

EdgeConnect SD-WAN also monitors and optimizes network performance with AppExpress. The feature leverages synthetic polling and real-time user traffic observations to steer traffic to the closest SSE Point of Presence (PoP) while selecting the best path across multi-cloud environments.

Expanding SD-WAN and SWG to HPE Aruba Networking unified SASE

By implementing a secure SD-WAN solution augmented with SWG capabilities, organizations can seamlessly transition to HPE Aruba Networking unified SASE by including ZTNA and CASB capabilities. This integrated approach streamlines the security framework, enabling organizations to consolidate their diverse security services into a cohesive platform. This platform not only accelerates deployment, but also ensures unified security policies, centralized management, consistent Zero Trust access, and the ability to adapt seamlessly to the evolving threat landscape. With EdgeConnect SD-WAN and HPE Aruba Networking SWG as the foundation of HPE Aruba Networking unified SASE, enterprises can adopt a future-proof strategy for their security.

Deploy EdgeConnect SD-WAN with the cloud-native HPE Aruba Networking SSE solution for a unified SASE platform

To learn more, please watch this lightboard video on SWG.

Other resources:

[1] Annual number of ransomware attacks worldwide from 2017 to 2022, Statista

[2] Phishing – Statistics & Facts, Statista

SD-WAN
https://www.networkworld.com/article/2074605/edgeconnect-sd-wan-with-swg-building-a-sase-foundation.html 2074605
4 reasons to consider a network digital twin Mon, 25 Mar 2024 19:09:13 +0000

The use of digital twins – digital representations of physical objects or systems – is on the rise. Enterprises can use digital twins to replicate their IT environments, including infrastructure, network equipment, and Internet of Things (IoT) devices, and then run simulations to test the impact of changes and to optimize performance. They can be used to validate the current state of a network, for example, and test configuration changes, firmware updates, or adjustments to security policies.

Digital twin technologies are gaining traction because of their potential to bridge the gap between physical and virtual worlds, according to Grand View Research, which says the global digital-twin market is forecast to expand at a compound annual growth rate (CAGR) of 38% from 2023 to 2030. Incorporating technologies such as artificial intelligence (AI), cloud computing and IoT into digital twin systems is expected to boost market growth in the forecast period, Grand View says.
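
To put the 38% figure in perspective, the compounding can be worked out directly. The sketch below assumes 2023 is the base year, giving seven annual compounding periods through 2030; that period count is an assumption, since the forecast's exact base-year convention is not stated here, and `growth_multiple` is an illustrative helper name.

```python
def growth_multiple(cagr: float, years: int) -> float:
    """Total growth factor after compounding `cagr` annually for `years` years."""
    return (1 + cagr) ** years


# Assumption: 2023 is the base year, so 2030 is seven compounding periods later.
multiple = growth_multiple(0.38, 2030 - 2023)
print(f"{multiple:.1f}x")  # 9.5x
```

Under that assumption, a 38% CAGR implies a market roughly 9.5 times its 2023 size by 2030.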

Digital twins have had appeal in certain industries – manufacturing, oil and gas, utilities, mining – “basically physical, high-capital, asset-intensive verticals,” says Jonathan Lang, research director, worldwide IT/OT convergence strategies, at research firm IDC.

In these settings, the rationale for digital twins has been clear, thanks to potential benefits that include better visibility into the health of assets, improved reliability, cost savings, and the ability to ensure stable operations, Lang says. “IT environments such as infrastructure, network equipment, connected devices, etc., have the same value drivers,” he says.

Digital twin news

Although digital twins have been around for some time, they’re still an early-adopter technology. But the number of vendors that offer digital twin solutions is growing, and recent upgrades to digital twin offerings include:

  • Forward Networks launched AI Assist, a generative AI feature built into its Forward Enterprise digital twin platform. The addition is designed to give network and security operations professionals comprehensive insights into network performance via natural language prompts. With AI Assist, network engineers of varying skill levels can conduct sophisticated network queries, so they can quickly assess network behavior and identify potential issues.
  • Juniper Networks introduced Marvis Minis, an AI-native networking digital experience twin that uses the company’s Mist AI technology to proactively simulate user connections. That way it can instantly validate network configurations and detect problems without users being present. The Minis product simulates end-user, client, device and application traffic to learn the network configuration through unsupervised ML, and to proactively highlight network issues. Data from Minis is continuously fed back into Mist AI, providing an additional source of insight for the best responses.
  • Nokia extended the capabilities of its existing Nokia Network Digital Twin to include all Android devices, the company announced late last year. Coverage and performance data for Wi-Fi, private and public cellular networks can be automatically collected in real time and processed on Nokia’s edge platform to give enterprises a view of how changes in their operations impact network performance.

Here are some key reasons why organizations should consider deploying digital twins.

Stronger security

Enhancing cybersecurity is always a high priority for organizations, and network digital twins can improve the security posture of IT infrastructures in a number of ways.

“Today’s security, network, and cloud operations teams lack access to a single source of truth for network topology, behavior, configuration, segmentation and policy information,” says Chiara Regale, senior vice president, product and user experience at Forward Networks. “This means they are expected to manage and secure a network they cannot see, and [that] often includes unknown devices.”

Enterprise networks might support tens to hundreds of thousands of devices running billions of lines of configuration code, and include multiple clouds, Regale says. “Even if an organization had unlimited resources and money, there’s no way that the human brain can keep up with that complexity,” Regale adds.

Network, security and other teams might have several monitoring tools at their disposal, “but because they are siloed and have varying degrees of data accuracy and timeliness, they create more complexity,” Regale says.

Unlike mapping, verification, or observability tools, network digital twins help make sense of all network behavior, providing contextualized, reliable and actionable data to operations engineers, Regale says. They collect configuration and state information across all network devices, including load balancers, routers, firewalls, and switches, as well as cloud environments.

“This data is then used to calculate all possible paths within the network, analyze detailed behavioral information, and make network configuration and behavior searchable and verifiable,” Regale says. “Network digital twins offer noteworthy security benefits, including critical vulnerability identification and prioritized remediation plans specific to individual device configurations and features in use.”

Network digital twins can also accelerate incident response analysis by defining the reach of a compromised host in an instant, Regale says, significantly reducing post-incident remediation time and limiting exposure by isolating the host much faster.
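
Determining the reach of a compromised host is, at its core, a graph reachability question: once the twin has modeled which flows the network permits, everything reachable from the compromised node is potentially exposed. The following is a minimal sketch of that idea using breadth-first search. The host names, the `ALLOWED_FLOWS` graph, and the `blast_radius` function are hypothetical, and this is not a description of how Forward Networks or any other vendor computes it.

```python
from collections import deque

# Hypothetical permitted-flow graph: an edge A -> B means policy lets A reach B.
ALLOWED_FLOWS = {
    "laptop-17": ["file-server", "print-server"],
    "file-server": ["backup-server"],
    "print-server": [],
    "backup-server": [],
    "hr-database": ["backup-server"],
}


def blast_radius(graph: dict[str, list[str]], compromised: str) -> set[str]:
    """Return every host reachable from the compromised host, excluding itself."""
    seen = {compromised}
    queue = deque([compromised])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    seen.discard(compromised)
    return seen


print(sorted(blast_radius(ALLOWED_FLOWS, "laptop-17")))
# ['backup-server', 'file-server', 'print-server']
```

In this toy graph, `hr-database` is outside the reach of `laptop-17`, so responders could deprioritize it while isolating the hosts that are exposed.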

“Because a network digital twin collects on every device in the network, you not only get an always current topology—with the ability to view the entire network or drill down to a single location or device—but you also get current information on network inventory,” Regale says.

Improved documentation

Digital twin technology can provide insights into the infrastructure beyond just configurations, including what the environment is doing at any given time. This is essential for successful documentation.

“Every enterprise is terrible at documentation, due to priorities around delivery, lack of standards on how to record infrastructure changes, and sprawl,” says Michael Wynston, director of network architecture and automation at financial services firm Fiserv.

The firm is using information from Forward Networks’ digital twin platform to help with documentation. This has enabled the company to identify devices that had never been properly decommissioned or onboarded, unmanaged devices for removal or remediation, circuits that had never been properly decommissioned, and equipment that had been decommissioned.

“Without knowing what your infrastructure is doing, you know nothing about your infrastructure,” Wynston says. “Without a digital twin, you do not know your perimeter, you do not know the risks from CVE [common vulnerabilities and exposures], you cannot implement automation. It can paralyze the infrastructure.”

Lack of up-to-date documentation for an infrastructure leads to a lack of understanding of what was built, why it was built and what it should be doing, Wynston says. “This leads further to an overall lack of infrastructure hygiene, as we cannot sustain and secure what we do not know exists,” he says.

Better digital experiences

Much has been made of the importance of providing excellent user experiences in digital environments, whether it be customers navigating an online purchase process or employees trying to access vital information from a server.

Twins can help enhance digital experience, which is the sum of a user’s digital-based interactions with a product, service, device, etc. Given how many points of interaction exist within a typical enterprise, digital experience is a priority.

“Digital experience twins are a new concept that virtualizes an end user, application, or IoT device to validate the network experience and predict problems before they impact user experience,” says Bob Friday, chief AI officer at Juniper Networks.

“These digital experience twins are versatile, seamlessly integrating into live networks operating on existing IT infrastructure,” Friday says. “In today’s world, ensuring seamless connectivity and optimizing user, device, and application experiences are paramount for driving business success.”

Digital twins and digital experience twins are vital tools to ensure the expected behavior of the network, validate security, and assure user experience before users or devices experience issues on the network that can impact the business, Friday says.

Greater efficiency

Digital twins enable simulation of data across multiple business systems. IDC research has shown that IT organizations are losing lots of time searching for necessary information to perform a job function, Lang says.

“By unifying the data in a single interface, as well as performing analysis across multiple data sets, digital twins improve worker efficiency and the quality and accuracy of analytical outputs,” he says.

Digital twins offer user interfaces into complex processes and data sets that are more intuitive and approachable to interact with for non-technical audiences, Lang says. “This means the skilled labor barrier is lower,” he says. “Lines of business can be more self-sufficient, and people can more rapidly and accurately interpret data to drive improved decision-making.”

One example of increased efficiency from digital twins comes from a large multi-national automotive manufacturer cited by Dan Isaacs, general manager and CTO of the Digital Twin Consortium, a global ecosystem of users who are driving best practices for digital twin usage and defining requirements for new digital twin standards.

The automotive company’s IT infrastructure includes more than 5,000 servers, with each twin of a server having more than 400 data points from multiple systems and running 2,000 events per second, Isaacs says.

The digital twin of the IT infrastructure brings “the integration of the multiple disparate IT management systems into a single view for cross system event monitoring, prediction and action triggering to achieve optimized outcomes,” Isaacs says. It enables operational efficiency through the ability to predict, and even help prevent, unplanned infrastructure downtime, he says.

In general, digital twins provide a comprehensive view of network performance and usage patterns, potentially providing improved analysis, greater coverage, more accurate predictive analytics, and enhanced management approaches, Isaacs says.

IoT Security, Network Management Software, Network Security, Networking
https://www.networkworld.com/article/2074545/4-reasons-to-consider-a-network-digital-twin.html 2074545
Cisco taps former Microsoft, Broadcom exec to grow networking hardware portfolio Mon, 25 Mar 2024 15:06:04 +0000

Cisco today said it has hired industry veteran Martin Lund to run its Common Hardware Group. Lund, whose past experience includes executive roles at Microsoft and Broadcom, will be an executive vice president at Cisco and report directly to CEO Chuck Robbins.

The Common Hardware Group at Cisco is responsible for delivering the silicon, optics, and hardware platforms for Cisco’s switching, routing, and wireless products. The group also is involved in ASIC design, system/board design, circuit board layout, hardware automation, validation and testing, signal integrity, and power design, according to Cisco. 

“I am very pleased to announce that Martin Lund, an industry veteran with decades of experience driving innovation and business growth in the networking and semiconductor industries, is joining Cisco,” Robbins wrote in a blog about executive changes.

As the SVP and GM of Broadcom’s network switching business, Martin helped acquire Dune Networks in 2009 and most recently was corporate vice president of Microsoft’s Azure for Operators, where he was responsible for delivering Azure-based solutions for public and private 5G, packet core, voice, and AI operations, Robbins stated.

Martin was also CEO of Metaswitch Networks, a cloud-native communications software company, before its acquisition by Microsoft. “In addition to his 12 years at Broadcom, where he was instrumental in building the Network Switching brand and growing the business to more than $1 billion, Martin has also held leadership positions at companies such as Cadence Design Systems and Intel,” Robbins stated.

Prior to Lund’s hiring, Eyal Dagan ran the Common Hardware Group, and he has now been promoted to executive vice president, strategic projects, according to Robbins. Dagan was the CEO and founder of Dune Networks and has also worked with Broadcom; he has been senior vice president of Cisco engineering for the past eight years, according to his LinkedIn profile. 

Robbins said that Dagan will work with his team on critical projects that require “deep technical expertise to ensure our innovation and leadership in the technology industry remains strong.”

Careers, Networking
https://www.networkworld.com/article/2074497/cisco-taps-former-microsoft-broadcom-exec-to-grow-networking-hardware-portfolio.html 2074497
2024 global network outage report and internet health check Fri, 22 Mar 2024 20:27:15 +0000

The reliability of services delivered by ISPs, cloud providers and conferencing services is critical for enterprise organizations. ThousandEyes, a Cisco company, monitors how providers are handling any performance challenges and provides Network World with a weekly roundup of events that impact service delivery. Read on to see the latest analysis, and stop back next week for another update.

(Note: We have archived prior-year updates, including the 2023 outage report and our coverage during the Covid-19 years, when we began tracking the performance of cloud providers and ISPs.)

Internet report for March 11-17, 2024

After weeks of decreasing, global outages increased significantly last week. ThousandEyes reported 206 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of March 11-17. That’s up 45% from 142 outages the week prior. Specific to the U.S., there were 87 outages, which is up 38% from 63 outages the week prior. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages increased from 91 to 131 outages, a 44% increase compared to the week prior. In the U.S., the number of ISP outages climbed slightly from 44 to 46 outages.

Public cloud network outages: Globally, cloud provider network outages increased from six to 10 outages. In the U.S., they increased from four to six outages.

Collaboration app network outages: Globally, collaboration app network outages spiked from six to 34 outages. In the U.S., collaboration app network outages jumped from three to 28 outages.
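
The week-over-week percentages quoted in these reports follow from the raw outage counts. As a quick check, the sketch below recomputes them; the `pct_change` helper is illustrative, and rounding to the nearest whole percent is an assumption about how the figures were derived.

```python
def pct_change(previous: int, current: int) -> int:
    """Week-over-week change, rounded to the nearest whole percent."""
    return round((current - previous) / previous * 100)


print(pct_change(142, 206))  # 45  (global outages)
print(pct_change(63, 87))    # 38  (U.S. outages)
print(pct_change(91, 131))   # 44  (global ISP outages)
```

A negative result indicates a week-over-week decrease.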

Two notable outages

On March 16, Cogent Communications, a U.S.-based multinational transit provider, experienced an outage that impacted multiple downstream providers as well as Cogent customers across multiple regions, including the U.S., Ireland, the U.K., Sweden, Austria, Germany, and Italy. The outage, lasting a total of 12 minutes, was divided into two occurrences over a one-hour and ten-minute period. The first occurrence was observed at around 6:30 PM EDT and appeared to initially be centered on Cogent nodes located in Baltimore, MD, and New York, NY. Five minutes into the first occurrence, the New York, NY, nodes stopped exhibiting outage conditions and nodes located in Philadelphia, PA, began exhibiting them. One hour after the issue initially appeared to have cleared, a second occurrence was observed. This second occurrence lasted approximately four minutes and appeared to be centered around nodes located in Baltimore, MD, Philadelphia, PA, New York, NY, and Newark, NJ. The outage was cleared around 7:45 PM EDT. Click here for an interactive view.

On March 12, Hurricane Electric, a network transit provider headquartered in Fremont, CA, experienced an outage that impacted customers and downstream partners across the U.S. and Canada. The outage, first observed around 2:00 AM EDT, lasted 7 minutes in total and was divided into two occurrences over a thirty-minute period. The first occurrence appeared to initially center on Hurricane Electric nodes located in Chicago, IL. Twenty minutes after the outage appeared to clear, the Chicago, IL, nodes again exhibited outage conditions, this time joined by nodes located in Seattle, WA. This increase in impacted nodes appeared to coincide with an increase in the number of impacted downstream customers and partners. The outage was cleared at around 2:30 AM EDT. Click here for an interactive view.

Additional details from ThousandEyes are available here.

Internet report for March 4-10, 2024

ThousandEyes reported 142 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of March 4-10. That’s down 8% from 155 outages the week prior. Specific to the U.S., there were 63 outages, which is down 10% from 70 outages the week prior. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages decreased from 95 to 91 outages, a 4% decrease compared to the week prior. In the U.S., the number of ISP outages stayed the same at 44 outages.

Public cloud network outages: Globally, cloud provider network outages fell from 13 to six outages. In the U.S., they decreased from seven to four outages.

Collaboration app network outages: Globally, collaboration app network outages decreased from eight outages to six. In the U.S., collaboration app network outages stayed at the same level as the week before: three outages.

Three notable outages

On March 5, several Meta services, including Facebook and Instagram, experienced a disruption that impacted users attempting to log in, preventing them from accessing those applications. The disruption was first observed around 10:00 AM EST. During the disruption, Meta’s web servers remained reachable, with network paths to Meta services showing no significant error conditions, suggesting that a backend service, such as authentication, was the cause of the issue. The service was fully restored around 11:40 AM EST. More detailed analysis here.

On March 5, Comcast Communications experienced an outage that impacted a number of downstream partners and customers as well as the reachability of many applications and services, including Webex, Salesforce, and AWS. The outage, lasting 1 hour and 48 minutes, was first observed around 2:45 PM EST and appeared to impact traffic as it traversed Comcast’s network backbone in Texas, with Comcast nodes located in Dallas, TX, and Houston, TX, exhibiting outage conditions. The outage was completely cleared around 4:40 PM EST. More detailed analysis here.

On March 6, LinkedIn experienced a service disruption that impacted its mobile and desktop global user base. The disruption was first observed around 3:45 PM EST, with users experiencing service unavailable error messages. The major portion of the disruption lasted around one hour, during which time no network issues were observed connecting to LinkedIn web servers, further indicating the issue was application related. At around 4:38 PM EST, the service started to recover and was totally clear for all users around 4:50 PM EST. More detailed analysis here.

Additional details from ThousandEyes are available here.

Internet report for February 26-March 3, 2024

ThousandEyes reported 155 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of February 26-March 3. That’s down 6% from 165 outages the week prior. Specific to the U.S., there were 70 outages, which is up 19% from 59 outages the week prior. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages decreased from 111 to 95 outages, a 14% decrease compared to the week prior. In the U.S., ISP outages increased 10%, climbing from 40 to 44 outages.

Public cloud network outages: After weeks of decreasing, cloud provider network outages began increasing again last week. Globally, cloud provider network outages climbed from eight to 13 outages. In the U.S., they increased from four to seven outages.

Collaboration app network outages: Globally, collaboration app network outages increased from five outages to eight. In the U.S., collaboration app network outages rose from two to three outages.

Two notable outages

On February 27, Level 3 Communications, a U.S. based Tier 1 carrier acquired by Lumen, experienced an outage that impacted multiple downstream partners and customers across the U.S. The outage, lasting a total of 18 minutes over a twenty-five-minute period, was first observed around 2:25 AM EST and appeared to be centered on Level 3 nodes located in Cleveland, OH. The outage was cleared around 2:50 AM EST. Click here for an interactive view.

On February 28, Time Warner Cable, a U.S. based ISP, experienced a disruption that impacted a number of customers and partners across the U.S. The outage was first observed at around 2:00 PM EST and appeared to center on Time Warner Cable nodes located in New York, NY.  Five minutes into the outage, the number of nodes located in New York, NY, exhibiting outage conditions increased. The outage lasted 14 minutes and was cleared at around 2:15 PM EST. Click here for an interactive view.

Additional details from ThousandEyes are available here.

Internet report for February 19-25, 2024

ThousandEyes reported 165 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of February 19-25. That’s down significantly from 243 outages in the week prior – a decrease of 32%. Specific to the U.S., there were 59 outages, which is down 34% from 90 outages the week prior. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages decreased from 121 to 111 outages, an 8% decrease compared to the week prior. In the U.S., ISP outages decreased from 48 to 40 outages, a 17% decrease compared to the previous week.

Public cloud network outages: Globally, cloud provider network outages decreased significantly from 42 to eight outages, an 81% decrease compared to the week prior. In the U.S., they fell from eight to four outages.

Collaboration app network outages: Globally, collaboration app network outages decreased from seven outages to five. In the U.S., collaboration app network outages remained at the same level as the week prior: two outages.

Two notable outages

On February 22, Hurricane Electric, a network transit provider headquartered in Fremont, CA, experienced an outage that impacted customers and downstream partners across multiple regions, including the U.S., Australia, China, the U.K., Japan, Singapore, India, France, and Canada. The outage, first observed around 9:10 AM EST, lasted 32 minutes in total and was divided into two occurrences over a forty-five-minute period. The first occurrence appeared to initially center on Hurricane Electric nodes located in New York, NY, Phoenix, AZ and Indianapolis, IN. Ten minutes after appearing to clear, the nodes located in New York, NY, were joined by nodes located in San Jose, CA in exhibiting outage conditions. Five minutes into the second occurrence, the disruption appeared to radiate out, and the nodes located in New York, NY, Phoenix, AZ and Indianapolis, IN, were joined by nodes located in Seattle, WA, Denver, CO, Ashburn, VA, Kansas City, MO and Omaha, NE in exhibiting outage conditions. This increase in impacted nodes appeared to coincide with an increase in the number of impacted downstream customers and partners. The outage was cleared at around 9:55 AM EST. Click here for an interactive view.

On February 21, Time Warner Cable, a U.S. based ISP, experienced a disruption that impacted a number of customers and partners across the U.S. The outage was first observed at around 2:45 PM EST and appeared to center on Time Warner Cable nodes located in New York, NY.  Fifteen minutes into the outage, the number of nodes located in New York, NY, exhibiting outage conditions increased. The outage lasted 23 minutes and was cleared at around 3:10 PM EST. Click here for an interactive view.

Additional details from ThousandEyes are available here.

Internet report for February 12-18, 2024

ThousandEyes reported 243 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of February 12-18. That’s down from 319 outages in the week prior – a decrease of 24%. Specific to the U.S., there were 90 outages, which is down slightly from 91 the week prior. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages decreased from 134 to 121 outages, a 10% decrease compared to the week prior. In the U.S., ISP outages decreased from 60 to 48 outages, a 20% decrease compared to the previous week.

Public cloud network outages: Globally, cloud provider network outages decreased significantly from 107 to 42 outages, a 61% decrease compared to the week prior. In the U.S., they doubled from four to eight outages.

Collaboration app network outages: Globally, collaboration app network outages decreased from 11 outages to seven. In the U.S., collaboration app network outages decreased from five to two outages.

Two notable outages

On February 16, Hurricane Electric, a network transit provider headquartered in Fremont, CA, experienced an outage that impacted customers and downstream partners across multiple regions, including the U.S., Egypt, Sweden, the U.K., Japan, Mexico, Australia, Argentina, the Netherlands, Belgium, and Canada. The outage, first observed around 8:25 AM EST, lasted 23 minutes in total and was divided into two occurrences over a thirty-minute period. The first occurrence appeared to initially center on Hurricane Electric nodes located in New York, NY. Fifteen minutes into the first occurrence, the nodes located in New York, NY, were joined by nodes located in Paris, France and Amsterdam, the Netherlands in exhibiting outage conditions.  Five minutes after appearing to clear, nodes located in New York, NY once again began exhibiting outage conditions. The outage was cleared at around 8:55 AM EST. Click here for an interactive view.

On February 17, AT&T experienced an outage on their network that impacted AT&T customers and partners across the U.S. The outage, lasting around 14 minutes, was first observed around 3:40 PM EST, appearing to center on AT&T nodes located in Little Rock, AR. Five minutes after first being observed, the number of nodes exhibiting outage conditions located in Little Rock, AR, appeared to rise. This increase in nodes exhibiting outage conditions appeared to coincide with a rise in the number of impacted partners and customers. The outage was cleared at around 3:55 PM EST. Click here for an interactive view.

Additional details from ThousandEyes are available here.

Internet report for February 5-11, 2024

ThousandEyes reported 319 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of February 5-11. That’s up from 265 outages in the week prior – an increase of 20%. Specific to the U.S., there were 91 outages. That’s up from 45 outages the week prior, an increase of 102%. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages increased from 106 to 134 outages, a 26% increase compared to the week prior. In the U.S., ISP outages more than doubled from 28 to 60 outages, a 114% increase compared to the previous week.

Public cloud network outages: Globally, cloud provider network outages decreased slightly from 117 to 107, a 9% decrease compared to the week prior. In the U.S., they decreased from five to four outages.

Collaboration app network outages: Globally, collaboration app network outages climbed from three outages to 11. In the U.S., there were five collaboration app network outages, up from zero the week prior.

Two notable outages

On February 7, Time Warner Cable, a U.S. based ISP, experienced a disruption that impacted a number of customers and partners across multiple regions, including the U.S., Ireland, the U.K., Canada, India, Australia, Singapore, Japan, the Netherlands, France, Germany, Indonesia, Hong Kong, South Korea, China, and Brazil. The outage was observed across a series of occurrences over the course of forty-five minutes. First observed at around 4:50 PM EST, the outage, consisting of five equally spaced four-minute periods, appeared to initially center on Time Warner Cable nodes in New York, NY. Five minutes after appearing to clear, nodes located in New York, NY, were again observed exhibiting outage conditions, joined by nodes located in San Jose, CA. By the third period, the nodes located in San Jose, CA, had appeared to clear and were instead replaced by nodes located in Los Angeles, CA, in exhibiting outage conditions, in addition to nodes located in New York, NY. The outage lasted a total of 20 minutes and was cleared at around 5:35 PM EST. Click here for an interactive view.

On February 6, NTT America, a global Tier 1 ISP and subsidiary of NTT Global, experienced an outage that impacted some of its customers and downstream partners in multiple regions, including the U.S., Germany, the U.K., the Netherlands, and Hong Kong. The outage, lasting 24 minutes, was first observed around 8:10 PM EST and appeared to initially center on NTT nodes located in Chicago, IL and Dallas, TX. Around five minutes into the outage, the nodes located in Chicago, IL and Dallas, TX, were joined by nodes located in Newark, NJ, in exhibiting outage conditions. The apparent increase of nodes exhibiting outage conditions appeared to coincide with an increase in the number of impacted downstream customers and partners. The outage was cleared around 8:35 PM EST. Click here for an interactive view.

Additional details from ThousandEyes are available here.

Internet report for January 29-February 4, 2024

ThousandEyes reported 265 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of January 29-February 4. That’s more than double the number of outages in the week prior (126). Specific to the U.S., there were 45 outages. That’s down from 55 outages the week prior, a decrease of 18%. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages was 106, an increase of 15% compared to 92 outages the previous week. In the U.S., ISP outages decreased by 28%, dropping from 39 to 28 outages.

Public cloud network outages: Globally, cloud provider network outages skyrocketed from five to 117 last week (the increase appeared to be a result of an increase in outages in the APJC region). In the U.S., they increased from two to five outages.

Collaboration app network outages: Globally, collaboration app network outages decreased from five outages to three. In the U.S., collaboration app network outages decreased from one outage to zero.

Two notable outages

On January 31, Comcast Communications experienced an outage that impacted a number of downstream partners and customers across multiple regions including the U.S., Malaysia, Singapore, Hong Kong, Canada, Germany, South Korea, Japan, and Australia. The outage, lasting 18 minutes, was first observed around 8:00 PM EST and appeared to be centered on Comcast nodes located in Ashburn, VA. Ten minutes into the outage, the number of nodes located in Ashburn, VA, exhibiting outage conditions appeared to increase. This increase appeared to coincide with an increase in the number of impacted downstream customers and partners. The outage was cleared around 8:20 PM EST. Click here for an interactive view.

On February 2, NTT America, a global Tier 1 ISP and subsidiary of NTT Global, experienced an outage that impacted some of its customers and downstream partners in multiple regions, including the U.S., Germany, the Netherlands, and the U.K. The outage, lasting 23 minutes, was first observed around 1:25 PM EST and appeared to center on NTT nodes located in Dallas, TX and Chicago, IL. The outage was cleared around 1:50 PM EST. Click here for an interactive view.

Additional details from ThousandEyes are available here.

Internet report for January 22-28, 2024

ThousandEyes reported 126 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of January 22-28. That’s down from 156 the week prior, a decrease of 19%. Specific to the U.S., there were 55 outages. That’s down from 91 outages the week prior, a decrease of 40%. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages was 92, a decrease of 14% compared to 107 outages the previous week. In the U.S., ISP outages decreased by 35%, dropping from 60 to 39 outages.

Public cloud network outages: Globally, cloud provider network outages dropped from 14 to five last week. In the U.S., they decreased from seven to two outages.

Collaboration app network outages: Globally, collaboration app network outages remained the same as the week prior: five outages. In the U.S., collaboration app network outages decreased from four outages to one.

Three notable outages

On January 26, Microsoft experienced an issue that affected its customers in various regions around the globe. The outage was first observed around 11:00 AM EST and seemed to cause service failures in Microsoft Teams, which affected the usability of the application for users across the globe. While there was no packet loss when connecting to the Microsoft Teams edge servers, the failures were consistent with reported issues within Microsoft’s network that may have prevented the edge servers from reaching the application components on the backend. The incident was resolved for many users by 6:10 PM EST. Click here for an interactive view.

On January 24, Akamai experienced an outage on its network that impacted content delivery connectivity for customers and partners using Akamai Edge delivery services in the Washington D.C. area. The outage was first observed around 12:10 PM EST and appeared to center on Akamai nodes located in Washington D.C. The outage lasted a total of 24 minutes. Akamai announced that normal operations had resumed at 1:00 PM EST. Click here for an interactive view.

On January 23, Internap, a U.S. based cloud service provider, experienced an outage that impacted many of its downstream partners and customers in multiple regions, including the U.S. and Singapore. The outage, which was first observed around 2:30 AM EST, lasted 18 minutes in total and appeared to be centered on Internap nodes located in Boston, MA. The outage was at its peak around fifteen minutes after being observed, with the highest number of impacted regions, partners, and customers. The outage was cleared around 2:55 AM EST. Click here for an interactive view.

Additional details from ThousandEyes are available here.

Internet report for January 15-21, 2024

ThousandEyes reported 156 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of January 15-21. That’s up from 151 the week prior, an increase of 3%. Specific to the U.S., there were 91 outages. That’s up significantly from 63 outages the week prior, an increase of 44%. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages was 107, an increase of 29% compared to 83 outages the previous week. In the U.S., ISP outages increased by 58%, climbing from 38 to 60 outages.

Public cloud network outages: Globally, cloud provider network outages dropped from 30 to 14 last week. In the U.S., they increased from six to seven outages.

Collaboration app network outages: Globally, collaboration app network outages decreased from seven to five outages. In the U.S., collaboration app network outages stayed at the same level: four outages.

Two notable outages

On January 16, Oracle experienced an outage on its network that impacted Oracle customers and downstream partners interacting with Oracle Cloud services in multiple regions, including the U.S., Canada, China, Panama, Norway, the Netherlands, India, Germany, Malaysia, Sweden, and the Czech Republic. The outage was first observed around 8:45 AM EST and appeared to center on Oracle nodes located in various regions worldwide, including Ashburn, VA, Tokyo, Japan, San Jose, CA, Melbourne, Australia, Cardiff, Wales, London, England, Amsterdam, the Netherlands, Frankfurt, Germany, Slough, England, Phoenix, AZ, San Francisco, CA, Atlanta, GA, Washington D.C., Richmond, VA, Sydney, Australia, New York, NY, Osaka, Japan, and Chicago, IL. Thirty-five minutes after first being observed, all the nodes exhibiting outage conditions appeared to clear. A further ten minutes later, nodes located in Toronto, Canada, Phoenix, AZ, Frankfurt, Germany, Cleveland, OH, Slough, England, Ashburn, VA, Washington, D.C., Cardiff, Wales, Amsterdam, the Netherlands, Montreal, Canada, London, England, Sydney, Australia, and Melbourne, Australia began exhibiting outage conditions again. The outage lasted 40 minutes in total and was cleared at around 9:50 AM EST. Click here for an interactive view.

On January 20, Hurricane Electric, a network transit provider headquartered in Fremont, CA, experienced an outage that impacted customers and downstream partners across multiple regions, including the U.S., Thailand, Hong Kong, India, Japan, and Australia. The outage, first observed around 7:15 PM EST, lasted 11 minutes in total and was divided into two occurrences over a one-hour five-minute period. The first occurrence appeared to center on Hurricane Electric nodes located in Los Angeles, CA. Fifty minutes after the first occurrence appeared to clear, the second occurrence was observed. Lasting 8 minutes, the outage initially appeared to center on nodes located in Los Angeles, CA. Around five minutes into the second occurrence, the nodes in Los Angeles, CA were joined by nodes located in San Jose, CA, in exhibiting outage conditions. The outage was cleared at around 8:20 PM EST. Click here for an interactive view.

Additional details from ThousandEyes are available here.

Internet report for January 8-14, 2024

ThousandEyes reported 151 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of January 8-14. That’s up from 122 the week prior, an increase of 24%. Specific to the U.S., there were 63 outages. That’s up from 58 outages the week prior, an increase of 9%. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages was 83, an increase of 8% compared to the previous week, and in the U.S. they increased by 6%, climbing from 36 to 38 outages.

Public cloud network outages: Globally, cloud provider network outages jumped from 19 to 30 last week. In the U.S., they decreased from 10 to six outages.

Collaboration app network outages: Globally, collaboration app network outages increased from five to seven outages. In the U.S., collaboration app network outages increased from one to four outages.

Two notable outages

On January 14, Zayo Group, a U.S. based Tier 1 carrier headquartered in Boulder, Colorado, experienced an outage that impacted some of its partners and customers across multiple regions including the U.S., Canada, Sweden, and Germany. The outage lasted around 14 minutes, was first observed around 7:10 PM EST, and appeared to initially center on Zayo Group nodes located in Houston, TX. Ten minutes after first being observed, nodes located in Houston, TX, were joined by nodes located in Amsterdam, the Netherlands, in exhibiting outage conditions. This rise of the number of nodes exhibiting outage conditions appeared to coincide with an increase in the number of impacted downstream partners and customers. The outage was cleared around 7:25 PM EST. Click here for an interactive view.

On January 13, Time Warner Cable, a U.S. based ISP, experienced a disruption that impacted a number of customers and partners across the U.S. The outage was first observed at around 12:45 PM EST and appeared to center on Time Warner Cable nodes located in New York, NY.  Fifteen minutes into the outage, the number of nodes located in New York, NY, exhibiting outage conditions increased. The outage lasted 19 minutes and was cleared at around 1:05 PM EST. Click here for an interactive view.

Additional details from ThousandEyes are available here.

Internet report for January 1-7, 2024

ThousandEyes reported 122 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of January 1-7. Over the prior three weeks, all outage categories continuously decreased for two weeks before increasing in the last week. Specific to the U.S., there were 58 outages. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages was 77, an increase of 43% compared to the previous week, and in the U.S. they nearly doubled from 20 to 36.

Public cloud network outages: Globally, cloud provider network outages increased from 13 to 19 last week. In the U.S., they increased from 6 to 10.

Collaboration app network outages: Globally, collaboration app network outages increased from one to five outages. In the U.S., collaboration app network outages increased from zero to one. 

Two notable outages

On January 4, Time Warner Cable, a U.S. based ISP, experienced a disruption that impacted a number of customers and partners across the U.S. The outage was first observed at around 10:45 AM EST and appeared to center on Time Warner Cable nodes located in New York, NY.  Five minutes into the outage, the number of nodes located in New York, NY, exhibiting outage conditions increased. The outage lasted 13 minutes and was cleared at around 11:00 AM EST. Click here for an interactive view.

On January 4, Telecom Italia Sparkle, a Tier 1 provider headquartered in Rome, Italy, and part of the Italian-owned Telecom Italia, experienced an outage that impacted many of its downstream partners and customers in multiple regions, including the U.S., Argentina, Brazil, and Chile. The outage lasted 28 minutes in total and was divided into two episodes over a 35-minute period. It was first observed around 4:00 AM EST. The first period of the outage, lasting around 24 minutes, appeared to be centered on Telecom Italia Sparkle nodes located in Miami, FL. Five minutes after appearing to clear, nodes located in Miami, FL, again exhibited outage conditions. The outage was cleared around 4:35 AM EST. Click here for an interactive view.

Additional details from ThousandEyes are available here.

Cloud Computing, Internet Service Providers, Network Management Software, Networking
https://www.networkworld.com/article/2071380/2024-global-network-outage-report-and-internet-health-check.html 2071380
2023 global network outage report and internet health check Fri, 22 Mar 2024 20:26:56 +0000

Editor’s note: This is an archive of 2023 incidents as tracked by Cisco subsidiary ThousandEyes. For current trends, see the 2024 outage report and internet health check, which is updated weekly. We’ve also archived our coverage from the Covid-19 years, when we began tracking the performance of cloud providers and ISPs and reporting network outages and internet disruptions.

Internet report for December 11-17, 2023

ThousandEyes reported 175 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of December 11-17. That’s up from 157 the week prior, an increase of 11%. Specific to the U.S., there were 71 outages. That’s down from 78 outages the week prior, a decrease of 9%. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages dropped from 119 to 110 outages, a decline of 8%, and in the U.S. they decreased from 58 to 40, a decline of 31%.

Public cloud network outages: Globally, cloud provider network outages increased from seven to 19 last week. In the U.S., they increased from five to nine.

Collaboration app network outages: Globally, collaboration app network outages increased from three to four outages. In the U.S., collaboration app network outages increased from one to three. 

Two notable outages

On December 15, TATA Communications (America) Inc., a global ISP and part of the Indian-owned TATA Communications, experienced an outage that impacted many of its downstream partners and customers in multiple regions, including the U.S., Hong Kong, Vietnam, Japan, Singapore, India, South Korea, Sri Lanka, Indonesia, Malaysia, the U.K., Philippines, and Brazil. The outage lasted one hour and 8 minutes in total and was divided into two episodes over a one hour and 30-minute period. First observed around 11:05 AM EST, the first occurrence lasted around one hour and 4 minutes and appeared to be initially centered on TATA nodes located in Singapore. Ten minutes into the first occurrence, the nodes located in Singapore were joined by nodes located in Hong Kong in exhibiting outage conditions. Around 30 minutes after first being observed, the nodes exhibiting outage conditions increased to include nodes located in Singapore, Seville, Spain, Hong Kong, Chicago, IL, and Tokyo, Japan. The final part of the first occurrence saw all nodes, with the exception of those located in Singapore, appear to clear. Twenty minutes after appearing to clear, nodes located in Singapore once again exhibited outage conditions. The outage was cleared around 12:35 PM EST. Click here for an interactive view.

On December 17, Time Warner Cable, a U.S. based ISP, experienced a disruption that impacted a number of customers and partners across the U.S. The outage was observed across two periods over the course of twenty minutes. First observed at around 8:40 AM EST, the first occurrence lasted eight minutes and appeared to center on Time Warner Cable nodes located in New York, NY. Five minutes after appearing to clear, nodes located in New York, NY, were again observed exhibiting outage conditions. The outage lasted a total of 11 minutes and was cleared at around 9:00 AM EST. Click here for an interactive view.

Internet report for December 4-10, 2023

After a few weeks of trending downward, outages jumped in many categories during the week of December 4-10. ThousandEyes reported 157 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service). That’s up from 97 the week prior – a significant 62% increase. Specific to the U.S., there were 78 outages. That’s more than double (160% increase) the 30 outages in the U.S. during the week prior. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages jumped from 65 to 119 outages, an increase of 83%, and in the U.S., they skyrocketed from 16 to 58, an increase of 263%.

Public cloud network outages: Globally, cloud provider network outages increased from five to seven last week. In the U.S., they increased from one to five.

Collaboration app network outages: Globally, collaboration app network outages remained the same at three outages. In the U.S., collaboration app network outages increased from zero to one outage. 

Two notable outages

On December 6, Cogent Communications, a U.S. based multinational transit provider, experienced an outage that impacted multiple downstream providers as well as Cogent customers across multiple regions, including the U.S., Canada, South Africa, the U.K., Turkey, South Korea, Japan, China, Taiwan, and Australia. The outage, lasting a total of 36 minutes, was divided into four occurrences over a one-hour and fifteen-minute period. The first occurrence was observed around 11:40 PM EST and appeared to initially be centered on Cogent nodes located in San Francisco, CA. Fifteen minutes into the first occurrence, the nodes located in San Francisco, CA were joined by nodes located in San Jose, CA, and Seattle, WA, in exhibiting outage conditions. Around five minutes after the issue initially appeared to have cleared, a second occurrence was observed. This second occurrence lasted approximately four minutes and appeared to be centered around nodes located in San Francisco, CA and Oakland, CA. Fifteen minutes after the second occurrence appeared to clear, nodes located in San Francisco, CA were joined by nodes located in Salt Lake City, UT, in exhibiting outage conditions.  Another ten minutes later, a fourth occurrence was observed, this time appearing to be centered around nodes located in Oakland, CA. The outage was cleared around 12:55 AM EST. Click here for an interactive view.

On December 8, AWS experienced some disruption that appeared to impact users and customers leveraging CloudFront services located in Los Angeles, CA. First observed around 7:10 AM EST, the disruption lasted 19 minutes and appeared to center on Amazon nodes located in Los Angeles, CA. The outage was cleared around 7:30 AM EST. Click here for an interactive view.

Internet report for November 27-December 3, 2023

ThousandEyes reported 97 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of November 27-December 3. That’s down from 114 the week prior, a decrease of 15%. Specific to the U.S., there were 30 outages. That’s down from 38 outages the week prior, a decrease of 21%. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages dropped from 83 to 65 outages, a decline of 22%, and in the U.S. they decreased from 24 to 16, a decline of 33%.

Public cloud network outages: Globally, cloud provider network outages increased from three to five last week. In the U.S., they decreased from three to one.

Collaboration app network outages: Globally, collaboration app network outages decreased from four to three outages. In the U.S., collaboration app network outages remained at zero for the second week in a row. 

Two notable outages

On November 29, Comcast Communications experienced an outage that impacted a number of downstream partners and customers across the U.S. The outage, lasting 7 minutes, was first observed around 1:30 AM EST and appeared to be centered on Comcast nodes located in Dallas, TX. Five minutes into the outage, the number of nodes located in Dallas, TX, exhibiting outage conditions appeared to increase. This increase appeared to coincide with an increase in the number of impacted downstream customers and partners. The outage was cleared around 1:40 AM EST. Click here for an interactive view.

On November 28, Cogent Communications, a U.S. based multinational transit provider, experienced an outage that impacted multiple downstream providers as well as Cogent customers across multiple regions, including the U.S., Austria, and Singapore. The outage, lasting 19 minutes, was first observed around 7:55 PM EST and appeared to center on Cogent nodes located in Marseille, France. Approximately five minutes into the outage, there was an observed rise in the number of nodes located in Marseille, France, exhibiting outage conditions. This increase coincided with an increase in the number of impacted downstream customers and partners. The outage was cleared around 8:15 PM EST. Click here for an interactive view.

Internet report for November 20-26, 2023

ThousandEyes reported 114 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of November 20-26. That’s down from 153 the week prior, a decrease of 25%. Specific to the U.S., there were 38 outages. That’s down from 57 outages the week prior, a decrease of 33%. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages decreased from 101 to 83 outages, an 18% decrease, and in the U.S., they decreased from 33 to 24 outages, a decline of 27%.

Public cloud network outages: Cloud provider network outages decreased to three, both globally and in the U.S.

Collaboration app network outages: Globally, there were four collaboration app network outages, which is the same as the week prior. In the U.S., collaboration app network outages dropped down to zero.

Two notable outages

On November 22, Cogent Communications, a U.S. based multinational transit provider, experienced an outage that impacted multiple downstream providers as well as Cogent customers across multiple regions, including the U.S., Spain, Germany, Brazil, Nigeria, Canada, South Africa, the U.K., Australia, and Bulgaria. The outage, lasting a total of 28 minutes, was divided into three occurrences over a forty-minute period. The first occurrence was observed around 5:15 AM EST and appeared to initially be centered on Cogent nodes located in Cleveland, OH. Ten minutes after first being observed, nodes located in Cleveland, OH, appeared to clear and were replaced by nodes located in San Jose, CA, and Denver, CO, exhibiting outage conditions. Around five minutes after the issue initially appeared to have cleared, a second occurrence was observed. This second occurrence lasted approximately four minutes and appeared to be centered around nodes located in Oakland, CA, and Seattle, WA. Another five minutes later, a third occurrence was observed, this time appearing to be centered around nodes located in San Jose, CA, Oakland, CA, and San Francisco, CA. The outage was cleared around 5:55 AM EST. Click here for an interactive view.

On November 25, Time Warner Cable, a U.S. based ISP, experienced an outage that impacted multiple downstream providers as well as Time Warner Cable customers within the U.S. The outage was first observed at around 4:20 AM EST and appeared to center on Time Warner Cable nodes located in New York, NY. The outage lasted a total of 14 minutes and was cleared at around 4:35 AM EST. Click here for an interactive view.

Internet report for November 13-19, 2023

ThousandEyes reported 153 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of November 13-19. That’s down from 182 the week prior, a decrease of 16%. Specific to the U.S., there were 57 outages. That’s down from 70 outages the week prior, a decrease of 19%. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages declined from 122 to 101 outages, a decline of 17%, and in the U.S. they decreased from 36 to 33, a decline of 8%.

Public cloud network outages: Globally, cloud-provider network outages remained the same as the week prior, at 12 outages. In the U.S., they decreased from 10 to 6 outages.

Collaboration app network outages: Globally, collaboration app network outages fell from 13 to four outages. In the U.S., collaboration app network outages climbed from zero to two outages.

Two notable outages

On November 15, Oracle experienced an outage on its network that impacted Oracle customers and downstream partners interacting with Oracle Cloud services in the UK South (London) region. The outage was first observed around 5:30 AM EST and appeared to center on Oracle nodes located in London, England and Slough, England. Forty minutes after being observed, the nodes located in London, England and Slough, England, were joined by nodes located in York, England and Marseille, France, in exhibiting outage conditions for around five minutes. The disruption lasted a total of 46 minutes, and connectivity appeared to be restored around 6:45 AM EST. Click here for an interactive view.

On November 14, NTT America, a global Tier 1 ISP and subsidiary of NTT Global, experienced an outage that impacted some of its customers and downstream partners in multiple regions, including the U.S., Germany, Denmark, Taiwan, China, Romania, the Netherlands, Malaysia, Singapore, the U.K., Austria, the United Arab Emirates, and France. The outage, lasting 8 minutes, was first observed around 11:10 AM EST and appeared to initially center on NTT nodes located in Frankfurt, Germany. Around five minutes into the outage, the nodes located in Frankfurt, Germany, were joined by nodes located in Vienna, Austria, and the Capital Region of Denmark, in exhibiting outage conditions. The apparent increase of nodes exhibiting outage conditions appeared to coincide with an increase in the number of impacted downstream customers and partners. The outage was cleared around 11:20 AM EST. Click here for an interactive view.

Internet report for November 6-12, 2023

ThousandEyes reported 182 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of November 6-12. That’s up from 165 the week prior, an increase of 10%. Specific to the U.S., there were 70 outages. That’s down from 84 outages the week prior, a decrease of 17%. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages fell slightly from 124 to 122 outages, a dip of 2%, and in the U.S. they decreased from 54 to 36, a decline of 33%.

Public cloud network outages: Globally, cloud-provider network outages decreased from 14 to 12 outages, and in the U.S., they decreased from 12 to ten outages.

Collaboration app network outages: Globally, collaboration app network outages jumped from one to 13 outages. In the U.S., collaboration app network outages remained at zero for the second week in a row.

Three notable outages

On November 9, Time Warner Cable, a U.S. based ISP, experienced an outage that impacted multiple downstream providers, as well as Time Warner Cable customers within the U.S. The outage was first observed at around 1:20 PM EST and appeared to center on Time Warner Cable nodes located in New York, NY. The outage lasted a total of 18 minutes and was cleared at around 1:40 PM EST. Click here for an interactive view.

On November 8, Optus, an Australian telecommunications company, experienced an outage that began around 4:05 AM AEDT and lasted for several hours until connectivity began to return between approximately 12 PM and 1 PM AEDT, with service levels appearing to be at normal levels for most users by 2 PM AEDT. According to a statement from Optus, “the Optus network received changes to routing information from an international peering network following a software upgrade. These routing information changes propagated through multiple layers in our network and exceeded preset safety levels on key routers. This resulted in those routers disconnecting from the Optus IP Core network to protect themselves.”

On November 8, NTT America, a global Tier 1 ISP and subsidiary of NTT Global, experienced an outage that impacted some of its customers and downstream partners across multiple regions, including the U.S., Singapore, Japan, India, Malaysia, and France. The outage, lasting 19 minutes, was first observed around 11:15 AM EST and appeared to initially center on NTT nodes located in Tokyo, Japan. Fifteen minutes into the outage, nodes located in Tokyo, Japan, appeared to clear and were replaced by nodes located in San Jose, CA, in exhibiting outage conditions. The outage was cleared around 11:35 AM EST. Click here for an interactive view.

Internet report for October 30 – November 5, 2023

ThousandEyes reported 165 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of October 30 – November 5. That’s down from 221 the week prior, a decrease of 25%. Specific to the U.S., there were 84 outages. That’s down from 103 outages the week prior, a decrease of 18%. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages fell from 129 to 124 outages, a dip of 4%, and in the U.S. they remained at the same level as the week prior, coming in at 54 outages.

Public cloud network outages: Globally, cloud-provider network outages decreased from 20 to 14 outages, and in the U.S., they increased from nine to 12 outages.

Collaboration app network outages: Globally, collaboration app network outages decreased from nine to just one outage. In the U.S., collaboration app network outages declined all the way from six to zero outages.

Two notable outages

On November 4, Oracle experienced an outage on its network that impacted Oracle customers and downstream partners interacting with Oracle Cloud services in the US West (San Jose) region. The outage was first observed around 7:47 PM EDT and appeared to initially center on Oracle nodes located in San Jose, CA. Twenty-five minutes after first being observed, the nodes located in San Jose, CA, were joined by nodes located in San Francisco, CA, Los Angeles, CA, and Santa Clara, CA, in exhibiting outage conditions. The disruption appeared to last a total of 43 minutes and was cleared at around 9:05 PM EDT. Click here for an interactive view.

On October 31, Cogent Communications, a U.S. based multinational transit provider, experienced an outage that impacted multiple downstream providers as well as Cogent customers across multiple regions, including the U.S. and Taiwan. The outage, lasting 24 minutes, was first observed around 3:20 AM EDT and appeared to initially be centered on Cogent nodes located in San Francisco, CA. Five minutes into the outage, the nodes located in San Francisco, CA, appeared to clear but were replaced by nodes located in Oklahoma City, OK, in exhibiting outage conditions. The outage was cleared around 3:45 AM EDT. Click here for an interactive view.

Internet report for October 23-29, 2023

ThousandEyes reported 221 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of October 23-29. That’s up from 163 the week prior, an increase of 36%. Specific to the U.S., there were 103 outages. That’s up from 75 outages the week prior, an increase of 37%. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages climbed from 105 to 129 outages, a 23% increase, and in the U.S. they increased from 40 to 54, an increase of 35%.

Public cloud network outages: Globally, cloud-provider network outages increased from 17 to 20 outages, and in the U.S., they decreased from 13 to nine outages.

Collaboration app network outages: Globally, collaboration app network outages increased from four to nine outages. In the U.S., collaboration app network outages jumped from one to six outages.

Two notable outages

On October 29, TATA Communications (America) Inc., a global ISP and part of the Indian-owned TATA Communications, experienced an outage that impacted many of its downstream partners and customers in multiple regions, including the U.S., Hong Kong, Vietnam, Japan, Germany, Singapore, Mexico, Australia, the Philippines, Thailand, and China. The outage lasted 51 minutes in total and was divided into four episodes over a one hour and 20-minute period, first observed around 1:55 AM EDT. The outage was cleared around 3:15 AM EDT. Click here for an interactive view.

On October 24, NTT America, a global Tier 1 ISP and subsidiary of NTT Global, experienced an outage that impacted some of its customers and downstream partners in multiple regions, including the U.S., China, South Korea, the Netherlands, Singapore, Switzerland, France, Poland, Germany, Denmark, the U.K., Japan, and Ireland. The outage, lasting 9 minutes, was first observed around 3:05 PM EDT and appeared to initially center on NTT nodes located in San Jose, CA, Los Angeles, CA, Seattle, WA, Tokyo, Japan, and Osaka, Japan. Around five minutes into the outage, the nodes located in San Jose, CA, Los Angeles, CA, Seattle, WA, Tokyo, Japan, and Osaka, Japan, were joined by nodes located in Ashburn, VA, and Dallas, TX, in exhibiting outage conditions. The apparent increase of nodes exhibiting outage conditions appeared to coincide with an increase in the number of impacted downstream customers and partners. The outage was cleared around 3:15 PM EDT. Click here for an interactive view.

Internet report for October 16-22, 2023

ThousandEyes reported 163 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of October 16-22. That’s down from 192 the week prior, a decrease of 15%. Specific to the U.S., there were 75 outages, which is the same number as the week prior. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages fell from 159 to 105 outages, a 34% decline, and in the U.S. they fell from 61 to 40, a decrease of 34%.

Public cloud network outages: Globally, cloud-provider network outages more than doubled from seven to 17 outages, and in the U.S., they climbed from six to 13 outages.

Collaboration app network outages: Globally, collaboration app network outages increased from three to four outages. In the U.S., collaboration app network outages fell from two to one.

Two notable outages

On October 16, Hurricane Electric, a network transit provider headquartered in Fremont, CA, experienced an outage that impacted customers and downstream partners across multiple regions, including the U.S., Australia, China, Canada, Argentina, Hong Kong, Malaysia, and South Africa. The outage, first observed around 12:00 AM EDT, lasted 32 minutes in total and was divided into three occurrences over a 55-minute period. The first occurrence appeared to center on Hurricane Electric nodes located in New York, NY, and San Jose, CA. Fifteen minutes after the first occurrence appeared to clear, the second occurrence was observed. Lasting 24 minutes, the second occurrence initially appeared to center on nodes located in San Jose, CA, New York, NY, and Paris, France. Around fifteen minutes into the second occurrence, the nodes located in San Jose, CA, appeared to clear, leaving nodes located in New York, NY, and Paris, France, exhibiting outage conditions. Five minutes after the second occurrence appeared to clear, the third occurrence was observed, appearing to center on nodes located in New York, NY. The outage was cleared at around 12:55 AM EDT. Click here for an interactive view.

On October 19, Comcast Communications experienced an outage that impacted a number of downstream partners and customers across the U.S. The outage, lasting a total of 4 minutes, consisted of two occurrences over a 14-minute period. The first occurrence was observed around 8:35 PM EDT and appeared to center on Comcast nodes located in Dallas, TX. Five minutes after appearing to clear, the nodes located in Dallas, TX, once again appeared to exhibit outage conditions. The outage was cleared around 8:49 PM EDT. Click here for an interactive view.

Internet report for October 9-15, 2023

ThousandEyes reported 192 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of October 9-15. That’s up from 184 the week prior, an increase of 4%. Specific to the U.S., outages increased from 69 to 75, up 9%. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages climbed from 124 to 159 outages, a 28% increase, and in the U.S. they increased from 39 to 61, an increase of 56%.

Public cloud network outages: Globally, cloud-provider network outages dropped from 23 to seven outages, and in the U.S., they fell from 15 to six outages.

Collaboration app network outages: Globally, collaboration app network outages increased from two to three outages. In the U.S., collaboration app network outages went from zero to two.

Two notable outages

On October 14, Level 3 Communications, a U.S. based Tier 1 carrier, experienced an outage that impacted multiple downstream partners and customers in multiple regions including the U.S., France, Spain, Austria, Germany, Ireland, and India. The outage, lasting a total of 20 minutes over a 55-minute period, was divided into a series of occurrences. First observed around 4:40 AM EDT, the outage appeared to center on nodes located in Portland, OR. The outage appeared to clear completely around 5:35 AM EDT. Click here for an interactive view.

On October 10, Microsoft experienced an outage on its network that impacted some downstream partners and access to services running on Microsoft environments in multiple regions including the U.S., France, South Africa, Brazil, the Netherlands, Singapore, China, and the U.K.  The outage, which lasted 9 minutes, was first observed around 11:20 AM EDT and appeared to center on Microsoft nodes located in Des Moines, IA. The outage was cleared around 11:30 AM EDT. Click here for an interactive view.

Internet report for October 2-8, 2023

ThousandEyes reported 184 global outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of October 2-8. That’s up from 136 the week prior, an increase of 35%. U.S. outages climbed 19% from 58 to 69. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages climbed from 94 to 124 outages, a 32% increase, and in the U.S. they increased from 32 to 39, an increase of 22%.

Public cloud network outages: Globally, cloud-provider network outages jumped from 10 to 23 outages, and in the U.S., they more than doubled from 7 to 15 outages.

Collaboration app network outages: Globally, collaboration app network outages decreased from eight to two outages. In the U.S., collaboration app network outages dropped back to zero.

Two notable outages

On October 5, Hurricane Electric, a network transit provider headquartered in Fremont, CA, experienced an outage that impacted customers and downstream partners across multiple regions, including the U.S., Argentina, Malaysia, South Africa, Indonesia, Canada, Kenya, China, and Australia. The outage, first observed at around 6:25 PM EDT, lasted 33 minutes in total and was divided into three occurrences over one hour. The first occurrence appeared to center on Hurricane Electric nodes located in New York, NY. Fifteen minutes after the first occurrence appeared to clear, the second occurrence was observed. Lasting 29 minutes, the second occurrence initially appeared to center on nodes located in San Jose, CA. Around five minutes into the second occurrence, the nodes located in San Jose, CA, were joined by nodes located in New York, NY, and Chicago, IL, in exhibiting outage conditions. Around 7:00 PM EDT, the nodes located in New York, NY, and Chicago, IL, appeared to clear, leaving just the nodes located in San Jose, CA, exhibiting outage conditions. Five minutes later, the nodes in San Jose, CA, appeared to clear, and nodes located in New York, NY, Chicago, IL, and Paris, France, began exhibiting outage conditions. A further five minutes later, nodes located in New York, NY, and Chicago, IL, appeared to clear. The nodes located in Paris, France, were joined by nodes located in Marseille, France, in exhibiting outage conditions. Five minutes after the second occurrence appeared to clear, the third occurrence was observed, appearing to center on nodes located in New York, NY. The outage was cleared at around 7:25 PM EDT. Click here for an interactive view.

On October 4, Level 3 Communications, a U.S. based Tier 1 carrier, experienced an outage that impacted multiple downstream partners and customers across multiple regions including the U.S., Canada, Brazil, the U.K., Singapore, and India. The outage, lasting 8 minutes, was first observed around 12:16 AM EDT and appeared to be centered on Level 3 nodes located in Philadelphia, PA. The outage was cleared around 12:25 AM EDT. Click here for an interactive view.

Internet report for September 25 – October 1, 2023

ThousandEyes reported 136 global outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of September 25 – October 1. That’s a significant drop from 194 the week prior, a decrease of 30%. U.S. outages likewise fell 30%, decreasing from 83 to 58. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages decreased from 137 to 94, a decline of 31%, and in the U.S. they decreased from 47 to 32, a drop of 32%.

Public cloud network outages: Globally, cloud-provider network outages decreased from 16 to 10, and in the U.S., they fell from 15 to 7.

Collaboration app network outages: Globally, collaboration app network outages increased from zero to eight. In the U.S., there were seven collaboration-app network outages, up from zero the week prior.

Two notable outages

On September 27, Rackspace Technology, a managed cloud computing provider headquartered in San Antonio, Texas, experienced a series of outages over a period of fifteen minutes that impacted multiple downstream providers, as well as Rackspace customers within multiple regions including the U.S. and India. The outage, lasting a total of 8 minutes, was first observed around 3:50 AM EDT and appeared to center on Rackspace nodes located in Dallas, TX. The outage was cleared around 4:05 AM EDT. Click here for an interactive view.

On September 28, Lumen, a U.S. based Tier 1 carrier (previously known as CenturyLink), experienced two outage occurrences over a fifteen-minute period that impacted downstream partners and customers across the U.S. The outage, lasting a total of 8 minutes, was first observed around 8:20 PM EDT and appeared to initially be centered on CenturyLink nodes located in New York, NY, and Washington, DC. Ten minutes after first being observed, the nodes located in New York, NY, appeared to clear, leaving just the nodes located in Washington, DC, exhibiting outage conditions. The outage was cleared around 8:35 PM EDT. Click here for an interactive view.

Internet report for September 18-24, 2023

ThousandEyes reported 194 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of September 18-24. That’s down 15% from 229 outage events the week prior. Specific to the U.S., outages fell from 107 to 83, a decrease of 22%. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages decreased from 157 to 137, a decline of 13%, and in the U.S. they decreased from 67 to 47, a drop of 30%.

Public cloud network outages: Globally, cloud-provider network outages climbed from 13 to 16, and in the U.S., they climbed from 11 to 15.

Collaboration app network outages: Both globally and in the U.S., there were zero collaboration app network outages last week.

Two notable outages

On September 20, NTT America, a global Tier 1 ISP and subsidiary of NTT Global, experienced an outage that impacted some of its customers and downstream partners across multiple regions, including the U.S., Germany, the Netherlands, the U.K., and France. The outage, lasting around 19 minutes, was observed around 3:25 AM EDT and appeared to initially center on NTT nodes located in Dallas, TX, and Paris, France. Around fifteen minutes into the outage, the nodes located in Dallas, TX, appeared to clear, leaving just the nodes located in Paris, France, exhibiting outage conditions. The drop in nodes exhibiting outage conditions appeared to coincide with a decrease in the number of downstream customers and partners impacted. The outage was cleared at around 3:45 AM EDT. Click here for an interactive view.

On September 22, Rackspace Technology, a U.S. managed cloud computing provider headquartered in San Antonio, Texas, experienced a series of outages over a period of forty-five minutes that impacted multiple downstream providers as well as Rackspace customers within multiple regions, including the U.S. and Canada. The outage, lasting a total of 18 minutes, was first observed around 10:40 PM EDT and appeared to center on Rackspace nodes located in Dallas, TX. The outage was cleared around 11:25 PM EDT. Click here for an interactive view.

Internet report for September 11-17, 2023

ThousandEyes reported 229 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of September 11-17. That’s up 24% from 184 outage events the week prior. Specific to the U.S., outages climbed from 91 to 107, an increase of 18%. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages increased from 141 to 157, a climb of 11%, and in the U.S. they decreased from 69 to 67, a drop of 3%.

Public cloud network outages: Globally, cloud-provider network outages climbed from seven to 13, and in the U.S., they climbed from six to 11.

Collaboration app network outages: Globally, collaboration app network outages increased from four to nine. In the U.S., there were two collaboration-app network outages, which is the same as the week prior.

Two notable outages

On September 11, Arelion (formerly known as Telia Carrier), a global Tier 1 ISP headquartered in Stockholm, Sweden, experienced an outage that impacted customers and downstream partners across the U.S. and Canada. The disruption, lasting a total of 8 minutes, was first observed around 10:50 PM EDT and appeared to center on nodes located in Newark, NJ. Five minutes after first being observed, the number of nodes exhibiting outage conditions located in Newark, NJ, appeared to decrease. This drop also appeared to coincide with a decrease in the number of downstream customers, partners, and regions impacted. The outage was cleared around 11:00 PM EDT. Click here for an interactive view.

On September 17, Comcast Communications experienced an outage that impacted a number of downstream partners and customers across the U.S. The outage, lasting a total of 9 minutes, was first observed around 7:25 AM EDT and appeared to center on Comcast nodes located in Ashburn, VA.  Five minutes into the outage the number of nodes exhibiting outage conditions located in Ashburn, VA appeared to increase. This rise also appeared to coincide with an increase in the number of downstream partners and customers impacted. The outage was cleared around 7:35 AM EDT.  Click here for an interactive view.

Internet report for September 4-10, 2023

ThousandEyes reported 184 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of September 4-10. That’s up 12% from 164 outage events the week prior. Specific to the U.S., outages climbed from 66 to 91, an increase of 38%. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages increased from 113 to 141, a climb of 25%, and in the U.S. they jumped from 44 to 69, an increase of 57%.

Public cloud network outages: Globally, cloud-provider network outages dropped by half from 14 to seven, and in the U.S., they fell from 11 to six.

Collaboration app network outages: Globally, collaboration app network outages increased from two to four. In the U.S., there were two collaboration-app network outages, compared to zero the week prior.

Two notable outages

On September 9, NTT America, a global Tier 1 ISP and subsidiary of NTT Global, experienced an outage that impacted some of its customers and downstream partners across multiple regions, including the U.S., Hong Kong, South Korea, Indonesia, Singapore, China, Taiwan, the Philippines, and Japan. The outage, lasting around 23 minutes, was observed around 2:45 AM EDT and appeared to center on NTT nodes located in Dallas, TX. Around twenty minutes into the outage, some of the nodes located in Dallas, TX, appeared to clear. The drop in nodes exhibiting outage conditions appeared to coincide with a decrease in the number of downstream customers and partners impacted. The outage was cleared at around 3:10 AM EDT. Click here for an interactive view.

On September 6, Cogent Communications, a U.S. based multinational transit provider, experienced an outage that impacted multiple downstream providers as well as Cogent customers across multiple regions, including the U.S., India, Singapore, Spain, Mexico, Hong Kong, Australia, Indonesia, the U.K., Costa Rica, Switzerland, Canada, Egypt, Malaysia, Germany, Japan, and China. The outage, lasting a total of 57 minutes, was divided into three occurrences over a one hour and fifteen-minute period. The first occurrence was observed around 4:55 PM EDT and appeared to initially be centered on Cogent nodes located in Houston, TX. Five minutes after first being observed, nodes exhibiting outage conditions expanded to include nodes located in Phoenix, AZ, and El Paso, TX. This rise in the number of nodes exhibiting outage conditions appeared to coincide with an increase in the number of impacted regions, downstream partners, and customers. Around ten minutes after the first occurrence appeared to clear, the second occurrence was observed. In the second occurrence, the nodes located in El Paso, TX, Houston, TX, and Phoenix, AZ, were joined by Cogent nodes located in Columbus, OH, in exhibiting outage conditions. Five minutes into the second occurrence, all the nodes, except for the nodes located in Phoenix, AZ, and Columbus, OH, appeared to recover. Five minutes after the second occurrence appeared to clear, the third occurrence was observed, initially appearing to center on nodes located in Phoenix, AZ. Around ten minutes into the third occurrence, nodes located in Phoenix, AZ, were joined by nodes located in Los Angeles, CA, and Houston, TX, in exhibiting outage conditions. Five minutes later, nodes located in Los Angeles, CA, appeared to clear, while the nodes located in Phoenix, AZ, and Houston, TX, were joined by nodes located in El Paso, TX, in exhibiting outage conditions. The outage was cleared around 6:10 PM EDT. Click here for an interactive view.

Internet report for August 28 – September 3, 2023

ThousandEyes reported 164 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of August 28 – September 3. That’s up 27% from 129 outage events the week prior. Specific to the U.S., outages increased from 63 to 66, up 5%. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages increased from 77 to 113, a climb of 47%, and in the U.S. they spiked from 26 to 44, an increase of 69%.

Public cloud network outages: Globally, cloud-provider network outages increased from 11 to 14, and in the U.S., they increased from 10 to 11.

Collaboration app network outages: Globally, collaboration app network outages decreased from 10 to two. In the U.S., there were zero collaboration-app network outages, compared to nine the week prior.
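The week-over-week percentages quoted in these reports can be reproduced with simple arithmetic. The following is a minimal sketch, assuming the figures are ordinary percent changes rounded to the nearest whole number. The `percent_change` helper and the `weekly_counts` labels are illustrative names and not part of any ThousandEyes tooling. The counts are taken from the August 28-September 3 report.

```python
def percent_change(previous: int, current: int) -> int:
    """Return the week-over-week change as a whole-number percentage."""
    if previous == 0:
        # A change from zero has no meaningful percentage.
        raise ValueError("percent change is undefined when the prior count is zero")
    return round((current - previous) / previous * 100)


# (prior week, current week) counts from the August 28-September 3 report.
weekly_counts = {
    "global total": (129, 164),
    "U.S. total": (63, 66),
    "global ISP": (77, 113),
    "U.S. ISP": (26, 44),
}

for label, (previous, current) in weekly_counts.items():
    change = percent_change(previous, current)
    print(f"{label}: {previous} -> {current} ({change:+d}%)")
```

The small cloud and collaboration-app counts are reported without percentages, which is reasonable because a percent change on single-digit counts exaggerates small movements, and a change from zero cannot be expressed as a percentage at all.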

Two notable outages

On September 2, Level 3 Communications, a U.S. based Tier 1 carrier, experienced an outage that impacted multiple downstream partners and customers in multiple countries including the U.S., South Africa, Germany, Mexico, New Zealand, Argentina, Spain, Brazil, Poland, Canada, the U.K., the Netherlands, France, Australia, Egypt, Ireland, Hong Kong, India, Turkey, Singapore, and Japan. The outage, lasting a total of 64 minutes over a two hour and 30-minute period, was divided into a series of occurrences. First observed around 12:10 PM EDT, the outage initially appeared to center on nodes located in Raleigh, NC. Thirty-five minutes into the first occurrence, nodes exhibiting outage conditions increased to include nodes located in San Francisco, CA. This increase in nodes exhibiting outage conditions also appeared to coincide with an increase in the number of downstream customers, partners and regions impacted. This first occurrence was the longest, lasting 44 minutes. Around five minutes after appearing to clear, nodes located in Raleigh, NC, once again appeared to exhibit outage conditions. This brief second occurrence, lasting around 4 minutes, was followed ten minutes later by another 4-minute disruption, this time appearing to center on nodes located in San Francisco, CA. Around 25 minutes after appearing to clear, the nodes located in San Francisco, CA, appeared to exhibit outage conditions again. Forty minutes after the nodes located in San Francisco, CA, appeared to clear, the last occurrence of the outage was observed, appearing to center on nodes located in Raleigh, NC. The outage appeared to clear completely around 2:40 PM EDT. Click here for an interactive view.

On August 28, Time Warner Cable, a U.S. based ISP, experienced a disruption that impacted a number of customers and partners in multiple countries including the U.S., the U.K., Brazil, the Netherlands, Germany, India, Canada, Japan, France, Ireland, Australia, and Hong Kong. The outage was first observed at around 6:50 PM EDT and appeared to center on Time Warner Cable nodes located in New York, NY. The outage lasted a total of 20 minutes, with five evenly distributed occurrences, each lasting around 4 minutes, over a period of 45 minutes. The outage was cleared at around 7:35 PM EDT. Click here for an interactive view.

Internet report for August 21-27, 2023

ThousandEyes reported 129 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of August 21-27. That’s down 12% from 147 outage events the week prior. Specific to the U.S., outages increased from 62 to 63, up 2%. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages decreased from 85 to 77, down 9%, and in the U.S. they decreased from 31 to 26, down 16%.

Public cloud network outages: Globally, cloud-provider network outages decreased from 13 to 11, and in the U.S., they jumped from four to 10.

Collaboration app network outages: Globally, collaboration app network outages climbed from two to 10. In the U.S., there were nine collaboration-app network outages, up from two the week prior. 

Two notable outages

On August 24, Rackspace Technology, a U.S. managed cloud computing provider headquartered in San Antonio, Texas, experienced a series of outages over a period of fifteen minutes that impacted multiple downstream providers, as well as Rackspace customers within multiple regions including the U.S., France, Japan, Spain, South Africa, Vietnam, the U.K., Peru, Chile, Canada, Switzerland, Australia, the Netherlands, Singapore, Turkey, and Brazil. The outage, lasting a total of 8 minutes, was first observed around 1:00 AM EDT and appeared to center on Rackspace nodes located in Chicago, IL. Ten minutes after first being observed, an increased number of Rackspace nodes located in Chicago, IL, once again exhibited outage conditions, increasing the number of impacted customers and partners, before clearing at around 1:15 AM EDT. Click here for an interactive view.

On August 21, Arelion (formerly known as Telia Carrier), a global Tier 1 ISP headquartered in Stockholm, Sweden, experienced an outage that impacted customers and downstream partners across the U.S. The disruption, lasting a total of 8 minutes, was first observed around 6:10 AM EDT and appeared to center on nodes located in Newark, NJ. Five minutes after first being observed, the number of nodes exhibiting outage conditions located in Newark, NJ, appeared to decrease. This drop also appeared to coincide with a decrease in the number of downstream customers, partners and regions impacted. The outage was cleared around 6:20 AM EDT. Click here for an interactive view.

Internet report for August 14-20, 2023

ThousandEyes reported 147 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of August 14-20. That’s down 14% from 171 outage events the week prior. Specific to the U.S., outages increased from 59 to 62, up 5%. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages decreased from 105 to 85, down 19%, and in the U.S. they increased from 29 to 31, up 7%.

Public cloud network outages: Globally, cloud-provider network outages decreased from 17 to 13, and in the U.S., they dropped from 14 to four.

Collaboration app network outages: Globally, collaboration app network outages decreased from six to two. In the U.S., there were two collaboration-app network outages, up from one the week prior. 

Two notable outages

On August 20, Cogent Communications, a U.S. based multinational transit provider, experienced an outage that impacted multiple downstream providers as well as Cogent customers across multiple regions including the U.S., India, and Canada. The outage, lasting a total of 15 minutes, was divided into three occurrences over a thirty-five-minute period. The first occurrence was observed around 4:00 AM EDT and appeared to initially be centered on Cogent nodes located in Boise, ID and Phoenix, AZ. Five minutes after appearing to clear, the second occurrence was observed. In the second occurrence, only the nodes located in Boise, ID, appeared to exhibit outage conditions. Five minutes after appearing to clear, the nodes located in Boise, ID, once again began exhibiting outage conditions. The outage was cleared around 4:35 AM EDT. Click here for an interactive view.

On August 16, Level 3 Communications, a U.S. based Tier 1 carrier, experienced an outage that impacted multiple downstream partners and customers across the U.S. The outage, lasting 9 minutes, was first observed around 1:00 PM EDT and appeared to be centered on Level 3 nodes located in Raleigh, NC. The outage was cleared around 1:10 PM EDT. Click here for an interactive view.

Internet report for August 7-13, 2023

ThousandEyes reported 171 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of August 7-13. That’s the same as the week prior. Specific to the U.S., outages decreased from 63 to 59, down 6%. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages increased from 94 to 105, up 12%, and in the U.S. they decreased from 31 to 29, down 6%.

Public cloud network outages: Globally, cloud-provider network outages increased from 16 to 17, and in the U.S., they doubled from seven to 14.

Collaboration app network outages: Globally, collaboration app network outages tripled from two to six. In the U.S., there was one collaboration-app network outage, up from zero the week prior. 

Two notable outages

On August 10, NTT America, a global Tier 1 ISP and subsidiary of NTT Global, experienced an outage that impacted some of its customers and downstream partners across multiple regions including the U.S., Singapore, Japan, Thailand, India, the Netherlands, South Korea, Malaysia, Brazil, Denmark and Germany. The outage, lasting 10 minutes, was first observed around 8:00 AM EDT and appeared to initially center on NTT nodes located in Los Angeles, CA, Singapore, Tokyo, Japan, and Vienna, Austria. Five minutes into the outage, nodes located in Los Angeles, CA, and Vienna, Austria, appeared to clear, leaving just nodes located in Tokyo, Japan, and Singapore exhibiting outage conditions. The outage was cleared around 8:15 AM EDT. Click here for an interactive view.

On August 9, Google experienced an outage on its network that impacted access to services running on Google environments in multiple regions including the U.S. and Canada. The outage, lasting a total of 13 minutes, was divided into two occurrences spread over a thirty-minute period. The first occurrence, lasting 9 minutes, was observed around 6:15 AM EDT and appeared to be centered on Google nodes located in Los Angeles, CA, San Jose, CA and San Francisco, CA. Fifteen minutes after the first occurrence cleared, the second occurrence was observed, this time appearing to center on nodes located in Dallas, TX. The outage was cleared around 6:45 AM EDT. Click here for an interactive view.

Internet report for July 31-August 6, 2023

ThousandEyes reported 171 global network outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of July 31-August 6. That’s up from 156 the week prior, an increase of 10%. Specific to the U.S., outages increased from 60 to 63, up 5%. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages decreased from 108 to 94, down 13%, and in the U.S. they decreased from 39 to 31, down 21%.

Public cloud network outages: Globally, cloud-provider network outages jumped from nine to 16, and in the U.S. decreased from eight to seven.

Collaboration app network outages: Globally, collaboration-app network outages dropped from three to two, and in the U.S. they dropped from one to zero.

Two notable outages

On August 1, Zayo Group, a U.S. based Tier 1 carrier headquartered in Boulder, Colorado, experienced an outage that impacted some of its partners and customers across the U.S. and Canada. The outage lasted around 14 minutes, was first observed around 1:40 AM EDT, and appeared to initially center on Zayo Group nodes located in Paris, France. Five minutes into the outage, the nodes located in Paris, France, appeared to clear, replaced by nodes located in Minneapolis, MN, exhibiting outage conditions. Around 1:50 AM EDT, the nodes located in Minneapolis, MN, appeared to clear, replaced by nodes located in Seattle, WA, which exhibited outage conditions for the remainder of the outage. The outage was cleared around 1:55 AM EDT. Click here for an interactive view.

On August 3, NTT America, a global Tier 1 ISP and subsidiary of NTT Global, experienced an outage that impacted some of its customers and downstream partners across multiple regions including the U.S., South Korea, Japan, and the U.K. The outage, lasting 19 minutes, was first observed around 7:45 PM EDT, and appeared to center on NTT nodes located in Los Angeles, CA. The outage was cleared around 8:05 PM EDT. Click here for an interactive view.

Internet report for July 24-30, 2023

ThousandEyes reported 156 global outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of July 24-30. That’s down from 186 the week prior, a decline of 16%. Specific to the U.S., outages decreased from 74 to 60, down 19%. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages decreased from 118 to 108, down 8%, and in the U.S. they increased slightly from 37 to 39, up 5%.

Public cloud network outages: Globally, cloud provider-network outages decreased from 10 to nine, and in the U.S. they decreased from nine to eight.

Collaboration app network outages: Globally, collaboration-app network outages dropped from four to three, and in the U.S. they decreased from three to one.

Two notable outages

On July 24, Microsoft experienced an issue that impacted connectivity to SharePoint Online and OneDrive for Business services. First observed around 3:05 PM EDT, it appeared to impact connectivity for users globally. Users encountered a certificate error when attempting to access SharePoint Online and OneDrive due to an erroneous change in the SSL certificate that prevented the establishment of a secure connection to the services. Approximately ten minutes later, at around 3:15 PM EDT, the erroneous certificate appeared to be replaced with a valid one, and SharePoint and OneDrive service reachability was restored for most users by around 3:20 PM EDT. Around 5:34 PM EDT, Microsoft announced that the outage was the result of a configuration issue and had been resolved. Click here for an interactive view.

On July 26, GTT Communications, a Tier 1 ISP headquartered in Tysons, VA, experienced an outage that impacted some of its partners and customers across multiple regions, including the U.S., Canada, the U.K. and the Republic of Korea. The outage lasted 8 minutes in total and was divided into two episodes over a 15-minute period, first observed around 8:30 AM EDT. The first period of the outage, lasting around 4 minutes, appeared to be centered on GTT nodes located in Seattle, WA, and New York, NY. Five minutes after appearing to clear, the second occurrence was observed, this time with just nodes located in New York, NY, exhibiting outage conditions. This reduction in nodes exhibiting outage conditions appeared to coincide with a decrease in the number of regions and partners impacted. The outage was cleared around 8:45 AM EDT. Click here for an interactive view.

Internet report for July 17-23, 2023

ThousandEyes reported 186 global outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of July 17-23. That’s down from 192 the week prior, a decrease of 3%. Specific to the U.S., outages decreased from 96 to 74, a decline of 23%. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages increased from 115 to 118, up 3%, and in the U.S. they decreased from 60 to 37, a decline of 38%.

Public cloud network outages: Globally, cloud provider-network outages decreased from 17 to 10. In the U.S., however, cloud provider network outages increased from four to nine.

Collaboration app network outages: Globally, collaboration-app network outages increased from two to four, and in the U.S. they increased from one to three.

Two notable outages

On July 21, Cogent Communications, a U.S. based multinational transit provider, experienced an outage that impacted multiple downstream providers as well as Cogent customers across multiple regions including the U.S., Brazil, Singapore, India, Spain, the U.K., Hong Kong, Portugal, Germany, Switzerland, France, Mexico, Australia, Canada, China, Japan, Colombia, Costa Rica, Greece, Malaysia, and Luxembourg. The outage, lasting a total of 33 minutes, was divided into two occurrences over a fifty-minute period. The first occurrence was observed around 6:55 AM EDT and appeared to initially be centered on Cogent nodes located in El Paso, TX, Atlanta, GA, and Denver, CO. Five minutes after first being observed, nodes exhibiting outage conditions expanded to include nodes located in Cleveland, OH, Baltimore, MD, Miami, FL, and Phoenix, AZ. This rise in the number of nodes exhibiting outage conditions appeared to coincide with an increase in the number of impacted regions, downstream partners, and customers. Around fifteen minutes after appearing to clear, the second occurrence was observed. In the second occurrence, the nodes located in Atlanta, GA, El Paso, TX, Denver, CO, Miami, FL, and Phoenix, AZ, were joined by Cogent nodes located in Houston, TX, and Mexico City, Mexico, in exhibiting outage conditions. Fifteen minutes into the second occurrence, all the nodes, with the exception of the nodes located in El Paso, TX, appeared to recover. The outage was cleared around 7:45 AM EDT. Click here for an interactive view.

On July 21, Zayo Group, a U.S. based Tier 1 carrier headquartered in Boulder, Colorado, experienced an outage that impacted some of its partners and customers in multiple countries including the U.S., Germany, and Malaysia. The outage lasted around 8 minutes in total and was divided into two occurrences over a fifteen-minute period. First observed around 7:20 AM EDT, the outage appeared to center on Zayo Group nodes located in Phoenix, AZ. The outage was cleared around 7:35 AM EDT. Click here for an interactive view.

Internet report for July 10-16, 2023

ThousandEyes reported 192 global outage events across ISPs, cloud service provider networks, collaboration app networks and edge networks (including DNS, content delivery networks, and security as a service) during the week of July 10-16. That’s a jump from 117 the week prior, an increase of 64%. Specific to the U.S., outages spiked from 61 to 96, an increase of 57%. Here’s a breakdown by category:

ISP outages: Globally, the number of ISP outages increased from 68 to 115, up 69%, and in the U.S. they spiked from 32 to 60, up 88%.

Public cloud network outages: Globally, cloud-provider network outages increased from three to 17, and in the U.S. they increased from three to four.

Collaboration app network outages: Globally, collaboration-app network outages dropped from five to two, and in the U.S. there was one collaboration app network outage, same as the week prior. 

Two notable outages

On July 12, Hurricane Electric, a network transit provider headquartered in Fremont, CA, experienced an outage that impacted customers and downstream partners across the U.S. The outage, first observed at around 1:20 PM EDT, lasted a total of 32 minutes and was divided into three occurrences over a one-hour and 5-minute period. The first occurrence, lasting 24 minutes, appeared to center on Hurricane Electric nodes located in Dallas, TX, and Kansas City, MO. Around 1:25 PM EDT, the nodes located in Kansas City, MO, appeared to temporarily clear before once again exhibiting outage conditions around 1:30 PM EDT. Around 1:40 PM EDT, the nodes located in Dallas, TX, appeared to clear, leaving the nodes located in Kansas City, MO, exhibiting outage conditions for the remainder of the first occurrence. Around 2:00 PM EDT, fifteen minutes after the nodes located in Kansas City, MO, appeared to clear, the second occurrence was observed and appeared to center on nodes located in Dallas, TX. Five minutes after the Dallas, TX, nodes cleared, the third occurrence was observed and appeared to center on nodes located in Kansas City, MO. The outage was cleared at around 2:15 PM EDT. Click here for an interactive view.

On July 14, Zayo Group, a U.S. based Tier 1 carrier headquartered in Boulder, Colorado, experienced an outage that impacted some of its partners and customers across the U.S. The outage lasted around 14 minutes, was first observed around 3:50 AM EDT, and appeared to center on Zayo Group nodes located in Seattle, WA. Five minutes after first being observed, the number of nodes located in Seattle, WA, exhibiting outage conditions appeared to increase. This rise of the number of nodes exhibiting outage conditions appeared to coincide with an increase in the number of impacted downstream partners and customers. The outage was cleared around 4:05 AM EDT. Click here for an interactive view.

Updated June 27, 2023

Global outages across all three categories last week decreased from 210 to 146, down 30% compared to the week prior. In the US they decreased from 99 to 57, down 42%. Globally, the number of ISP outages decreased from 125 to 95, down 24%, and in the US they decreased from 52 to 37, down 29%. Globally, cloud provider-network outages decreased from 12 to seven, and in the US they decreased from six to four. Globally, collaboration-app network outages dropped from 12 to three, and in the US they decreased from nine to two.

Two notable outages

On June 24, GTT Communications experienced an outage affecting partners and customers across the US, Italy, and Canada. The 14-minute outage was first observed around 1:30 a.m. EDT and appeared centered on GTT nodes in Atlanta, Georgia. The outage was cleared around 1:35 a.m. EDT. Click here for an interactive view.

On June 23, Cogent Communications experienced an outage affecting multiple downstream providers and Cogent customers across the US and Singapore. The eight-minute outage was first observed around 2:20 a.m. EDT and appeared centered on Cogent nodes in Houston, Texas, and Oklahoma City, Oklahoma. Five minutes after being observed, the Oklahoma nodes appeared to recover. The outage was cleared around 2:30 a.m. EDT. Click here for an interactive view.

Updated June 20, 2023

Global outages across all three categories last week jumped from 130 to 210, up 62% compared to the week prior. In the US, outages jumped from 52 to 99, up 90%. Globally, the number of ISP outages increased from 82 to 125, up 52%, and in the US they increased from 28 to 52, up 86%. Globally, cloud-provider network outages increased from four to 12, and in the US increased from four to six. Globally, collaboration-app network outages increased from six to 12, and in the US they increased from three to nine.

Two notable outages

On June 13, Amazon Web Services (AWS) experienced an incident affecting services in its US-EAST-1 region. The incident, which lasted more than two hours, was first detected around 2:50 p.m. EDT and affected the availability of applications hosted within AWS. This was confirmed by AWS, which announced via its status page that it had identified the root cause to be an issue with a subsystem responsible for capacity management for AWS Lambda, which caused errors directly for customers and indirectly through the use of other AWS services. The issue was mostly resolved by 4:40 p.m. EDT, with availability returning to normal levels for a majority of AWS services, as well as subsequently affected applications. Click here for an interactive view, and here for a detailed analysis.

On June 14, GTT Communications experienced an outage affecting partners and customers across the US and Canada. The nine-minute outage was first observed around 3:25 p.m. EDT and initially appeared centered on GTT nodes located in Seattle, Washington. Five minutes later, nodes in Chicago, Illinois, also exhibited outage conditions. The outage was cleared around 3:35 p.m. EDT. Click here for an interactive view.

Updated June 13, 2023

Global outages across all three categories last week decreased from 176 to 130, a decline of 26% compared to the week prior. In the US, they decreased from 86 to 52, a drop of 40%. Globally, ISP outages decreased from 95 to 82, down 14%, and in the US they decreased from 44 to 28, down 36%. Globally, cloud-provider outages decreased from 10 to four, and in the US decreased from seven to four. Globally, collaboration-app network outages increased from three to six, and in the US rose from zero to two.

Two notable outages

On June 5, Microsoft experienced an outage impacting Microsoft 365 services. First observed around 10:15 a.m. EDT, the outage was made up of several sustained periods of disruption spread over 27 hours. The first occurrence lasted around an hour and 39 minutes, manifesting as decreased application availability across some Microsoft 365 services for global users. Around 3 hours and 30 minutes after appearing to clear, the outage reappeared, exhibiting the same symptoms as the previous occurrence. This second occurrence lasted around 3 hours and 3 minutes. Services appeared to return with access completely restored around 6:30 p.m. EDT. However, eight and a half hours later, a third occurrence lasted 68 minutes and appeared to clear around 5:10 a.m. EDT (June 6), before reoccurring at 11:00 a.m. EDT for 52 minutes and again around 12:10 p.m. EDT for 24 minutes. The disruption appeared to be cleared around 1:05 p.m. EDT. Click here for a more detailed description of the outage.

On June 6, Rackspace Technology, a U.S. managed cloud computing provider headquartered in San Antonio, Texas, experienced a series of outages over a period of fifteen minutes that impacted multiple downstream providers as well as Rackspace customers within the U.S. and India. The outage, lasting a total of 8 minutes, was first observed around 8:20 p.m. EDT and appeared to center on Rackspace nodes located in Dallas, TX. Ten minutes after first being observed, a reduced number of Rackspace nodes located in Dallas, TX, once again exhibited outage conditions, reducing the number of impacted customers and partners, before clearing at around 8:35 p.m. EDT. Click here for an interactive view.

Updated June 5, 2023

Global outages across all three categories last week increased from 170 to 176, up 4% compared to the week prior. In the US, they increased from 78 to 86, up 10%. Globally, ISP outages decreased from 108 to 95, down 12%, and in the US they decreased from 45 to 44, down 2%. Globally, cloud-provider outages increased from nine to 10, and in the US decreased from eight to seven. Globally, collaboration-app network outages decreased from six to four, and in the US dropped from two to zero.

Two notable outages

On May 31, Cloudflare suffered an interruption affecting customers in the US, Australia, the UK, India, China, and Canada. First observed around 6:55 a.m. EDT, the outage appeared centered on Cloudflare nodes located in New York, New York; Chicago, Illinois; Newark, New Jersey; Dallas, Texas; Kansas City, Missouri; London, England; and Mumbai, Pune and New Delhi, India. Five minutes into the outage, nodes in Melbourne and Brisbane, Australia; Los Angeles, California; and Montreal, Canada, were also affected. The outage lasted 19 minutes in total and was cleared around 7:15 a.m. EDT. Click here for an interactive view.

On May 31, GTT Communications experienced an outage affecting some partners and customers across the US. The nine-minute outage was first observed around 2:25 a.m. EDT and appeared centered on GTT nodes in Miami, Florida. The outage cleared around 2:35 a.m. EDT. Click here for an interactive view.

Updated May 29, 2023

Global outages across all three categories last week decreased from 174 to 170, down 2% compared to the week prior. In the US they increased from 71 to 78, up 10%. Globally, ISP outages increased from 102 to 108, up 6%, and in the US they remained the same at 45. Globally, cloud-provider network outages increased from eight to nine, and in the US increased from four to eight. Globally, collaboration-app network outages increased from two to six, and in the US they increased from zero to two.

Two notable outages

On May 25, TATA Communications (America) Inc., experienced an outage affecting downstream partners and customers in the US, Hong Kong, New Zealand, Australia, Singapore, Argentina, Chile, Mexico, Canada, Sweden, and China. First observed around 5:26 p.m. EDT, the outage lasted 32 minutes in total divided into two episodes over a 39-minute period. The first period lasted around 28 minutes and appeared centered on TATA nodes in New York, New York. Five minutes after it cleared, the outage returned. The outage was cleared around 6:05 p.m. EDT. Click here for an interactive view.

On May 24, Hurricane Electric experienced an 18-minute outage affecting customers and downstream partners in the US, Germany, Canada, the UK, France, Ireland, Sweden, Belgium, South Africa, and Australia. The outage was first observed around 3:15 p.m. EDT and appeared centered on Hurricane Electric nodes in Singapore. Five minutes later those nodes appeared to clear and those in San Jose, California, exhibited outage conditions. Ten minutes after first being observed, nodes exhibiting outage conditions included those in San Jose and New York, New York. The outage was cleared around 3:35 p.m. EDT. Click here for an interactive view.

Updated May 22, 2023

Global outages across all three categories last week dropped from 574 to 174, down 70% compared to the week prior. In the US, they dropped from 231 to 71, down 69%. Globally, ISP outages decreased from 393 to 102, down 74%, and in the US they dropped from 187 to 45, down 76%. Globally, cloud-provider network outages dropped from 17 to eight, and in the US they dropped from 10 to four. Globally, collaboration-app network outages decreased from three to two, and in the US they remained at zero for the second week.

Two notable outages

On May 18, PCCW experienced an outage affecting customers and networks in the US, UK, Luxembourg, Hong Kong, South Africa, and Mexico. The outage lasted around 16 minutes in total and was divided into four periods over a 35-minute span. The outage was first observed around 10:40 a.m. EDT and appeared centered on PCCW infrastructure in Ashburn, Virginia; Dallas, Texas; and London, England. The first period of the outage lasted around four minutes before recurring five minutes later. Fifteen minutes after first being observed, the Ashburn, Dallas, and London nodes appeared to recover. However, five minutes later, Ashburn nodes appeared to exhibit outage conditions again. The outage was cleared around 11:15 a.m. EDT. Click here for an interactive view.

On May 17, Arelion experienced an outage affecting customers and downstream partners across the US, UK, South Africa, Poland, Canada, Sweden, and Norway. The nine-minute disruption was first observed around 11:45 a.m. EDT and appeared centered on nodes in Newark, New Jersey. Five minutes later, the number of affected New Jersey nodes appeared to increase. The outage was cleared around 11:55 a.m. EDT. Click here for an interactive view.

Updated May 15, 2023

Global outages across all three categories last week increased from 310 to 574, up 85% compared to the week prior. In the US, they increased from 175 to 231, up 32%. Globally, ISP outages jumped from 200 to 393, up 97%, and in the US they increased from 128 to 187, up 46%. Globally, cloud-provider network outages remained the same at 17, and in the US they increased from nine to 10. Globally, collaboration-app network outages decreased from five to three, and there were none in the US, down from two the week before.

Two notable outages

On May 11, TATA Communications (America) Inc., experienced an outage affecting downstream partners and customers in countries including the US, the UK, Singapore, the Netherlands, Germany, Australia, Indonesia, the Philippines, and India. The 19-minute outage was first observed around 11:45 a.m. EDT and appeared centered on TATA nodes in Newark, New Jersey, and London, England. Five minutes later, nodes in Marseille, France, also exhibited outage conditions. Fifteen minutes into the outage, the Marseille and London nodes appeared to clear, and nodes in Laredo, Texas; Los Angeles, California; and Paris, France, exhibited outage conditions. The outage was cleared around 1:05 p.m. EDT. Click here for an interactive view.

On May 11, GTT Communications experienced an outage affecting some partners and customers across the US and the Netherlands. The 13-minute outage was first observed around 4:05 a.m. EDT, and was divided into two episodes over a 20-minute period. The first lasted around 9 minutes, and appeared centered on GTT nodes in Seattle, Washington. Five minutes later, nodes in Amsterdam, the Netherlands, also exhibited outage conditions. Five minutes after appearing to clear, the Seattle nodes started exhibiting outage conditions again. The outage was cleared around 4:25 a.m. EDT. Click here for an interactive view.

Updated May 8, 2023

Global outages across all three categories last week increased from 213 to 310, up 46% compared to the week prior. In the US, they increased from 95 to 175, up 84%. Globally, ISP outages increased from 139 to 200, up 44%, and in the US, they increased from 60 to 128, up 113%. Globally, cloud-provider network outages increased from six to 17, and in the US they increased from five to nine. Globally, collaboration-app network outages increased from four to five, and in the US, they increased from zero to two.

Two notable outages

On May 6, Cogent Communications experienced an outage affecting multiple downstream providers and customers in countries including the US, UK, Australia, Singapore, Japan, Argentina, Brazil, China, Thailand, Mexico, Turkey, Taiwan, Republic of Korea, Canada, South Africa, Germany, Spain, Poland, Denmark, and Luxembourg. The 23-minute outage was divided into two occurrences over a 30-minute period. The first occurrence was observed around 6:45 a.m. EDT, apparently centered on Cogent nodes in San Francisco and San Jose, California. Five minutes later, nodes in Los Angeles, California, and Seattle, Washington, also exhibited outage conditions. Five minutes after appearing to clear, the second occurrence was observed, initially centering on San Francisco, San Jose, and Los Angeles nodes. The outage was cleared around 7:15 a.m. EDT. Click here for an interactive view.

On May 2, Cox Communications experienced a disruption affecting Cox Communications customers and partners. The outage was first observed at around 5:41 a.m. EDT, apparently centered on Cox nodes in Ashburn, Virginia. Twenty minutes later, Cox nodes in Ohio were also observed exhibiting outage conditions. The 21-minute outage was cleared around 6:05 a.m. EDT. Click here for an interactive view.

Updated May 1, 2023

Global outages across all three categories last week decreased from 239 to 213, down 11% compared to the week prior. US outages decreased from 109 to 95, down 13%. Globally, ISP outages decreased from 146 to 139, down 5%, and in the US decreased from 63 to 60, down 5%. Globally, cloud-provider network outages dropped from 13 to six, and in the US from nine to five. Globally, collaboration-app network outages decreased from eight to four and in the US they remained at zero.

Two notable outages

On April 26, Qwest Communications experienced an outage affecting downstream partners and customers across the US. The 14-minute outage was first observed around 12:20 a.m. EDT, apparently centered on Qwest nodes in Atlanta, Georgia. The outage was cleared around 12:35 a.m. EDT. Click here for an interactive view.

On April 26, Microsoft experienced an outage on its network affecting some downstream partners and access to services running in Microsoft environments in multiple countries including the US, Taiwan, and India. The nine-minute outage was first observed around 1:20 a.m. EDT and appeared to initially center on Microsoft nodes in Atlanta, Georgia, and Cleveland, Ohio. Around five minutes later the Cleveland nodes appeared to clear. The outage was cleared around 1:30 a.m. EDT. Click here for an interactive view.

Updated April 24, 2023

Global outages across all three categories last week increased from 235 to 239, up 2% compared to the week prior. In the US, outages increased from 100 to 109, up 9%. Globally, the number of ISP outages increased from 137 to 146, up 7%, and in the US they increased from 60 to 63, up 5%. Globally, cloud-provider network outages increased from 10 to 13 and in the US from eight to nine. Globally, collaboration-app network outages increased from seven to eight and in the US dropped from one to zero.

Two notable outages

On April 18, NTT America experienced an outage affecting some customers and downstream partners across the US. The 14-minute outage was first observed around 12:30 a.m. EDT and appeared centered on NTT nodes in San Jose and Los Angeles, California. Around five minutes later, the San Jose nodes appeared to recover. Ten minutes after first being observed, the Los Angeles nodes appeared to recover, too, but nodes in New York, New York, exhibited outage conditions. The outage was cleared around 12:45 a.m. EDT. Click here for an interactive view.

On April 19, Oracle experienced an outage on its network affecting Oracle customers and downstream partners interacting with Oracle Cloud services in multiple countries including the US, Japan, the United Arab Emirates, and Canada. The outage was first observed around 3:25 a.m. EDT and appeared to center on Oracle nodes in Ashburn, Virginia; Washington, DC; Toronto, Canada; Dubai, United Arab Emirates; and Tokyo, Japan. Five minutes later the Dubai and Toronto nodes appeared to recover, but nodes located in Montreal, Canada showed outage conditions. A further five minutes later Dubai nodes once again exhibited outage conditions. Fifteen minutes after first being observed, the Ashburn, Washington, Toronto, Dubai and Tokyo nodes exhibited outage conditions again. The outage lasted 19 minutes in total and was cleared around 3:45 a.m. EDT. Click here for an interactive view.

Updated April 17, 2023

Global outages across all three categories last week decreased from 242 to 235, down 3% compared to the week prior. In the US they decreased from 105 to 100, down 5%. Global ISP outages decreased from 157 to 137, down 13%, and in the US, decreased from 72 to 60, down 17%. Global cloud-provider network outages increased from nine to 10, and in the US they increased from four to eight. Global collaboration-app network outages decreased from 12 to seven, and in the US they decreased from four to one.

Two notable outages

On April 14, Cogent Communications experienced a 14-minute outage affecting downstream providers and customers in countries including the US, UK, Germany, Luxembourg, South Africa, India, Israel, Spain, France, Singapore, Ireland, Austria, Australia, Italy, and Brazil. First observed around 7:30 p.m. EDT, the outage centered on Cogent nodes in Washington, DC; London; New York; Paris; Frankfurt, Germany; Boston, Massachusetts; and Houston, Texas. Five minutes later, only Paris, Frankfurt, and Marseille, France, nodes showed outage conditions. Five minutes after that, the Paris, Frankfurt, and Marseille nodes all appeared to clear, but those in London exhibited outage conditions. The outage cleared around 7:45 p.m. EDT. Click here for an interactive view.

On April 15, Oracle experienced an outage affecting its customers and downstream partners interacting with Oracle Cloud services in countries including the US, Canada, the United Arab Emirates, China, and Japan. The outage was first observed around 8:20 p.m. EDT and appeared centered on Oracle nodes in Ashburn, Virginia; Washington, DC; Montreal, Canada; Dubai, United Arab Emirates; and Tokyo, Japan. Fifteen minutes later, all the nodes except those in Montreal appeared to recover. The outage lasted 18 minutes and was cleared around 8:40 p.m. EDT. Click here for an interactive view.

Updated April 10, 2023

Global outages across all three categories last week decreased from 265 to 242, down 9% compared to the week prior. In the US, they increased from 95 to 105, up 11%. Globally, the number of ISP outages decreased from 170 to 157, down 8%, and in the US they increased from 62 to 72, up 16%. Globally, cloud-provider network outages remained the same as the week prior, at nine, and in the US decreased from six to four. Globally, collaboration-app network outages more than doubled, increasing from five to 12, and in the US decreased from five to four. 

Three notable outages

On April 8, Level 3 Communications, a U.S.-based Tier 1 carrier, experienced an outage that impacted multiple downstream partners and customers across the U.S. and India. The outage, lasting 24 minutes, was first observed around 4:05 AM EDT and appeared to initially be centered on Level 3 nodes located in Kansas City, MO and San Francisco, CA. Fifteen minutes after first being observed, the nodes located in Kansas City appeared to clear, leaving just the nodes located in San Francisco exhibiting outage conditions. The outage was cleared around 4:30 AM EDT. Click here for an interactive view.

On April 3 and 4, Virgin Media UK, a British-based ISP, experienced two outages that impacted the reachability of its network and services to the global Internet. The two outages shared similar characteristics and appeared to impact access predominantly for Virgin Media customers in the U.K. The first incident took place between ~8:30 PM EDT and ~3:00 AM EDT, while the second began at ~11:20 AM EDT and was resolved around 1:30 PM EDT. A more detailed explanation of the outage can be found here. Click here for an interactive view.

On April 5, Cogent Communications, a U.S.-based multinational transit provider, experienced an outage that impacted multiple downstream providers as well as Cogent customers in multiple regions, including the U.S., the Netherlands, Germany, and the Czech Republic. The outage, lasting a total of 19 minutes, was divided into two occurrences distributed over a thirty-minute period. The first occurrence was observed around 4:00 PM EDT and appeared to initially be centered on nodes located in Boise, ID. Five minutes after appearing to clear, the nodes located in Boise appeared to exhibit outage conditions once again. Twenty-five minutes after first being observed, the Cogent nodes exhibiting the outage conditions extended to include nodes located in Cleveland, OH. As the number of Cogent nodes impacted increased, so did the number of customer networks and providers impacted. The outage was cleared around 4:30 PM EDT. Click here for an interactive view.

Updated April 3, 2023

Global outages across all three categories last week increased from 181 to 265, up 46% compared to the week prior. In the US, they increased from 65 to 95, also up 46%. Globally, the number of ISP outages increased from 121 to 170, up 40%, and in the US they increased from 44 to 62, up 41%. Globally, cloud-provider network outages increased from six to nine, and in the US from one to six. Globally, collaboration-app network outages increased from one to five and in the US increased from zero to five. 

Two notable outages

On April 1, Cogent Communications experienced an outage affecting multiple downstream providers as well as Cogent customers in countries including the US, Canada, the UK, Israel, China, Mexico, Singapore, Australia, Germany, Portugal, India, Spain, Switzerland, New Zealand, Philippines, Costa Rica, Brazil, and Japan. The nine-minute outage was first observed around 1:25 a.m. EDT, centered on Cogent nodes in Washington, DC; Atlanta, Georgia; and Bilbao, Spain. Five minutes later, nodes in Philadelphia, Pennsylvania; New York, New York; and El Paso and Houston, Texas, also showed outage conditions. The outage was cleared around 1:35 a.m. EDT. Click here for an interactive view.

On April 1, Oracle experienced a nine-minute network outage affecting Oracle customers and downstream partners interacting with Oracle Cloud services in multiple countries including the US, Luxembourg, Finland, Mexico, South Africa, the UK, Germany, and Canada. First observed around 2:55 a.m. EDT, the outage appeared centered on Oracle nodes in Ashburn, Virginia; Washington, DC; Phoenix; and Cleveland, Ohio. Five minutes later, the Cleveland nodes appeared to recover, along with some nodes in Ashburn, Washington, and Phoenix. This reduced the number of affected countries to the US, South Africa, Finland, the UK, and Canada. The outage was cleared around 3:05 a.m. EDT. Click here for an interactive view.

Updated March 27, 2023

Global outages across all three categories last week decreased from 247 to 181, down 27% compared to the week prior. In the US, outages decreased from 82 to 65, down 21%. Globally, the number of ISP outages decreased from 163 to 121, down 26%, while in the US they decreased from 61 to 44, down 28%. Globally, cloud-provider network outages decreased from 11 to six, and in the US they dropped from five to one. Globally, collaboration-app network outages decreased from seven to one, and in the US they dropped from four to zero.

Two notable outages

On March 23, TATA Communications (America) experienced an outage affecting downstream partners and customers in multiple countries including the US, the UK, Canada, Australia, and China. First observed around 12:05 p.m. EDT, the outage lasted 13 minutes in total, divided into two episodes over a 20-minute period. The first, nine-minute episode initially appeared centered on TATA nodes in Los Angeles. Five minutes into the outage, nodes exhibiting outage conditions included those in San Francisco. Five minutes after the first occurrence cleared, the outage reappeared, with San Francisco and Newark, New Jersey, nodes exhibiting outage conditions. The outage was cleared around 12:25 p.m. EDT. Click here for an interactive view.

On March 26, GTT Communications experienced an outage affecting some of its partners and customers across multiple countries, including the US, Spain, and the UK. The eight-minute outage was first observed around 9:30 a.m. EDT and appeared to initially center on GTT nodes in Atlanta and Wisconsin. Five minutes after first being observed, the Wisconsin nodes appeared to clear. The outage was cleared around 9:40 a.m. EDT. Click here for an interactive view.

Updated March 20, 2023

Global outages across all three categories last week decreased from 271 to 247, down 9% compared to the week prior. In the US, outages decreased from 105 to 82, down 22%. Global ISP outages decreased from 191 to 163, down 15%, and in the US they decreased from 83 to 61, down 27%. Global cloud-provider network outages jumped from three to 11, and in the US increased from two to five. Global collaboration-app network outages increased from three to seven, and in the US from one to four.

Two notable outages

On March 14, NTT America experienced an outage affecting customers and downstream partners across multiple countries including the US, Brazil, the UK, Australia, and China. The outage, lasting a total of 18 minutes, was divided into two occurrences over a 35-minute period. Initially observed around 11:40 a.m. EDT, the first occurrence appeared to center on NTT nodes in New York and Dallas. Around five minutes after the New York and Dallas nodes appeared to clear, nodes in San Jose, California, began exhibiting outage conditions. The outage cleared around 12:15 p.m. EDT. Click here for an interactive view.

On March 18, Cogent Communications experienced a series of outages over a period of an hour and 15 minutes affecting multiple downstream providers and Cogent customers across multiple countries including the US, Canada, China, and Mexico. The outage, lasting a total of 32 minutes, was first observed around 1:30 a.m. EDT, apparently centered on Cogent nodes in Oakland, California, and lasted six minutes. The Cogent environment was stable for 10 minutes before experiencing a nine-minute outage observed on Cogent nodes in Atlanta, Georgia. Five minutes after appearing to clear, they began exhibiting outage conditions again. Fifty minutes after the initial Oakland outage was observed, the nodes appeared to exhibit outage conditions again, as did nodes in Phoenix and El Paso, Texas. Ten minutes later, after appearing to clear, the Oakland and Phoenix nodes began exhibiting outage conditions, as did nodes in Denver, Colorado. The outage cleared around 2:45 a.m. EDT. Click here for an interactive view.

Updated March 13, 2023

Global outages in all three categories last week decreased from 337 to 271, down 20% compared to the week prior. In the US they increased from 95 to 105, up 11%. Globally, ISP outages decreased from 205 to 191, down 7%, and in the US they increased from 66 to 83, up 26%. Globally, cloud-provider network outages dropped from 12 to three, and in the US they decreased from six to two. Globally, collaboration-app network outages dropped from 11 to three, and in the US they dropped from nine to one.

Two notable outages

On March 6, Twitter experienced a service disruption affecting users globally. First observed around 11:45 a.m. EST, many users were unable to access service, although the application remained reachable from a network perspective. During the incident, users were receiving 403 forbidden errors, which is indicative of a backend application issue. Around 12:19 p.m. EST, Twitter announced the disruption was caused by an internal system change that had unintentional consequences, and they were working to resolve it. The disruption lasted 60 minutes, with access restored to users around 12:50 p.m. EST. Click here for an interactive view.

On March 12, Okta experienced an issue that disrupted access to its service for a number of users globally. First observed around 11:56 a.m. EDT, connection requests appeared to return HTTP 403 forbidden status codes, indicating an application issue rather than internet or network problems connecting to Okta. The incident appeared to resolve for most users approximately 48 minutes later around 12:45 p.m. EDT. Click here for an interactive view.

Updated March 6, 2023

Global outages across all three categories last week increased from 316 to 337, up 7% from the week prior. In the US, outages increased from 62 to 95, up 53%. Globally, ISP outages decreased from 228 to 205, down 10%, while in the US they increased from 47 to 66, up 40%. Globally, cloud-provider network outages increased from 10 to 12, and in the US they increased from four to six. Globally, collaboration-app network outages jumped from two to 11, and in the US jumped from one to nine.

Two notable outages

On February 27, TATA Communications (America) Inc. experienced an outage affecting many of its downstream partners and customers in multiple countries, including the US, Germany, the UK, France, the Netherlands, India, Canada, Singapore, Switzerland, Norway, China, Mexico, Portugal, and Hong Kong. The outage lasted 53 minutes, divided into three segments over an hour and 55-minute period. The initial period of the outage was observed around 1:50 a.m. EST and appeared centered on TATA Los Angeles nodes. Five minutes after that occurrence cleared, nodes in Los Angeles and San Francisco appeared to exhibit outage conditions and cleared after three minutes. Ten minutes later, the third occurrence was observed, affecting nodes in Los Angeles; Newark, New Jersey; and Paris. Ten minutes into the third occurrence, nodes in London and Amsterdam exhibited outage conditions. Twenty-five minutes later, Newark and Amsterdam nodes appeared to clear. The remainder of the outage was cleared around 3:05 a.m. EST. Click here for an interactive view.

On March 1, Cogent Communications experienced an outage affecting downstream providers and Cogent customers in countries including the US, China, Singapore, Taiwan, Indonesia, Turkey, the Republic of Korea, India, Germany, the UK, and Australia. The 23-minute outage was first observed around 12:05 a.m. EST, initially centered on Cogent nodes in San Francisco and San Jose. Five minutes later, nodes in Oakland, California; Kansas City, Missouri; and Washington, DC, exhibited outage conditions. Twenty minutes into the outage, the Kansas City, Washington, DC, and San Francisco nodes appeared to recover. The outage was cleared around 12:30 a.m. EST. Click here for an interactive view.

Updated Feb. 27, 2023

Global outages across all three categories last week decreased from 339 to 316, down 7% compared to the week prior. In the US, they decreased from 76 to 62, down 18%. Globally, the number of ISP outages decreased from 258 to 228, down 12%, and in the US they decreased from 53 to 47, down 11%. Globally, cloud-provider network outages decreased from 14 to 10, and in the US from eight to four. Globally, collaboration-app network outages decreased from four to two, and in the US they decreased from two to one.

Two notable outages

On February 21, Arelion experienced an outage affecting customers and downstream partners across the US. The disruption, lasting a total of 24 minutes, was first observed around 4:40 p.m. EST and appeared to center on nodes located in San Jose, California. Fifteen minutes later, the number of affected nodes appeared to decrease, and the outage was cleared around 5:05 p.m. EST. Click here for an interactive view.

On February 24, Oracle experienced an outage on its network affecting customers and downstream partners interacting with Oracle Cloud services in the US and Canada. First observed around 10 p.m. EST, it appeared initially to center on Oracle nodes in Ashburn, Virginia, and Washington, DC. Five minutes later they appeared to clear and nodes in Toronto and Montreal, Canada, exhibited outage conditions. Around 10:10 p.m. EST, the Ashburn and Washington nodes exhibited outage conditions again. The outage lasted 17 minutes and was cleared at around 10:20 p.m. EST. Click here for an interactive view.

Updated Feb. 20, 2023

Global outages across all three categories last week increased from 301 to 339, up 13% compared to the week prior. In the US, outages increased from 73 to 76, up 4%. Global ISP outages increased from 215 to 258, up 20%, and in the US they remained the same at 53. Global cloud-provider network outages increased from 10 to 14, and they increased from three to eight in the US. Global collaboration-app network outages decreased from five to four, and increased from zero to two in the US.

Two notable outages

On February 18, Zayo Group experienced an outage affecting partners and customers in the US, Canada, China, Australia, and Malaysia. The 14-minute outage was first observed around 4 p.m. EST, and appeared centered on Zayo Group nodes located in Houston, Texas; New York, New York; and Newark, New Jersey. Five minutes after being observed, the New York and Newark nodes appeared to recover. The outage cleared around 4:15 p.m. EST. Click here for an interactive view.

On February 14, Hurricane Electric experienced an outage affecting customers and downstream partners in the US, Japan, Mexico, and Hong Kong. The outage, first observed at around 8:01 p.m. EST, lasted a total of 19 minutes and was divided into two occurrences over a 29-minute period. The first occurrence, lasting six minutes, appeared centered on Hurricane Electric nodes in Paris, France. Around 8:15 p.m. EST, five minutes after the Paris nodes appeared to clear, the second occurrence was observed. It lasted around 13 minutes and appeared centered on nodes in New York, New York. As the second occurrence progressed, the number of New York nodes exhibiting outage conditions dropped, which coincided with a reduction in the number of affected regions and customers. The outage was cleared around 8:30 p.m. EST. Click here for an interactive view.

Updated Feb. 13, 2023

Global outages across all three categories last week decreased from 331 to 301, down 9% compared to the week prior. In the US, outages decreased from 117 to 73, down 38%. Globally, ISP outages decreased from 231 to 215, down 7%, and in the US they dropped from 85 to 53, down 38%. Globally, cloud-provider network outages decreased from 12 to 10, and in the US they decreased from seven to three. Globally, collaboration-app outages dropped from 10 to five, and in the US dropped from three to none.

Two notable outages

On February 6, Microsoft experienced an outage that affected services in North America, Europe, and Asia. The outage, first observed around 10:55 p.m. EST, appeared to affect user access to Microsoft Outlook services. Microsoft confirmed that a change to some Microsoft 365 systems had contributed to the outage, and used targeted restarts to parts of their infrastructure to restore service. The bulk of the incident lasted about an hour and 39 minutes, although access issues could be seen for about four hours after that. The incident was similar to the January 25th event in terms of global reach and duration, but it did not appear to be network related, as no significant packet loss or latency or unusual routing behavior was observed. Click here for an interactive view.

On February 12, Comcast Communications experienced an outage that affected a number of downstream partners and customers across the US. The 21-minute outage was first observed around 1:30 p.m. EST and appeared to center on Comcast nodes in Denver. The outage was cleared around 1:55 p.m. EST. Click here for an interactive view.

Updated Feb. 6, 2023

Globally, outages across all three categories last week decreased compared to the week prior, from 373 to 331, down 11%. In the US they increased from 102 to 117, up 15%. Globally, the number of ISP outages decreased from 278 to 231, down 17%, and in the US they increased from 81 to 85, up 5%. Globally, cloud-provider network outages increased from 10 to 12 and increased from two to seven in the US. Globally, collaboration-app network outages increased from four to 10 and increased from one to three in the US.

Two notable outages

On January 31, Level 3 Communications experienced an outage affecting multiple downstream partners and customers in countries including the US, Canada, Brazil, the Philippines, China, Mexico, the UK, Japan, India, Singapore, Taiwan, and Australia. The outage lasted a total of 47 minutes divided into three occurrences. First observed around 4:25 a.m. EST, the outage initially appeared centered on nodes in San Jose, California. Five minutes later, nodes in Los Angeles and San Francisco, California; Chicago, Illinois; Denver, Colorado; Dallas, Texas; and São Paulo and Rio de Janeiro, Brazil, also exhibited outage conditions. Fourteen minutes after initially being observed, the outages appeared to clear. Around 10 minutes after that, San Jose, Los Angeles, Chicago, Dallas, Denver, and San Francisco nodes appeared to exhibit outage conditions again. The second occurrence lasted 19 minutes and appeared to clear around 5:10 a.m. EST. Twenty minutes after that, a third occurrence was observed, this time initially appearing to center on nodes located in San Jose, Los Angeles, San Francisco, Rio de Janeiro, and São Paulo. Five minutes into the third occurrence, the Rio de Janeiro and São Paulo nodes appeared to clear, leaving the San Jose, Chicago, Los Angeles, and Denver nodes exhibiting outage conditions. The outage appeared to clear completely around 5:45 a.m. EST. Click here for an interactive view.

On February 3, Okta experienced an issue that disrupted access to its service for a number of users globally. First observed around 1:10 p.m. EST, connection requests appeared to return HTTP 403 forbidden status codes, indicating an application issue rather than internet or network problems connecting to Okta. The incident appeared to resolve for most users approximately 30 minutes later at around 1:40 p.m. EST. Click here for an interactive view.

Updated Jan. 30, 2023

Global outages across all three categories last week increased from 245 to 373, up 52% over the week prior. In the US, they jumped from 57 to 102, up 79%. Globally, ISP outages increased from 187 to 278, up 49%, and in the US they increased from 42 to 81, up 93%. Globally, cloud-provider network outages increased from six to 10, while in the US they decreased from three to two. Globally, collaboration-app network outages increased from one to four, and in the US they remained the same at one.

Two notable outages

On January 25, Microsoft experienced a significant disruption affecting connectivity to many of its services, including Microsoft Teams, Outlook, and SharePoint. First observed around 2:05 a.m. EST, the disruption appeared to impact connectivity for users globally. Around 3:15 a.m. EST, Microsoft announced a potential issue within its network configuration. Around 4:26 a.m. EST, Microsoft announced it had rolled back the network configuration change and was monitoring the services as they recovered. The bulk of the incident lasted approximately 90 minutes, although residual connectivity issues could be seen into the following day. A more detailed analysis of the outage can be found here. Click here for an interactive view.

On January 25, Cogent Communications experienced an outage affecting downstream providers as well as Cogent customers in the US, UK, and South Africa. The outage, lasting an hour and 18 minutes, was first observed around 6:40 p.m. EST and appeared centered on Cogent nodes located in New York, New York. The last 55 minutes of the outage saw a number of the New York nodes appearing to clear and coincided with a reduction in the number of affected downstream partners, customers, and regions. The outage was cleared around 8 p.m. EST. Click here for an interactive view.

Updated Jan. 23, 2023

Global outages across all three categories last week decreased from 252 to 245, down 3% from the week prior. In the US they decreased from 68 to 57, down 16%. Globally, ISP outages decreased from 189 to 187, down about 1%, but in the US they decreased from 53 to 42, down 21%. Globally, cloud-provider outages remained the same at six, but in the US they increased from two to three. Globally, collaboration-app network outages dropped from 12 to one and decreased from three to one in the US.

Two notable outages

On January 18, Time Warner Cable experienced a disruption affecting customers and partners across the US that came in two waves over the course of an hour and 10 minutes. First observed at around 1:40 p.m. EST, the first part of the outage lasted four minutes and appeared centered on Time Warner nodes in Chicago, Illinois. Fifty-five minutes after appearing to clear, the Chicago nodes again exhibited outage conditions for nine more minutes before being cleared around 2:50 p.m. EST. Click here for an interactive view.

On January 17, Qwest Communications experienced an outage affecting downstream partners and customers across the US. The 20-minute outage was first observed around 8:25 a.m. EST and appeared centered on nodes in Atlanta, Georgia. The outage was cleared around 8:50 a.m. EST. Click here for an interactive view.

Updated Jan. 16, 2023

Global outages across all three categories last week increased from 217 to 252, up 16% compared to the week prior. In the US they increased from 57 to 68, up 19%. Globally, ISP outages increased from 166 to 189, up 14%, and in the US they increased from 48 to 53, up 10%. Globally, cloud-provider network outages increased from four to six, and in the US from one to two. Globally, collaboration-app network outages increased from five to 12, and in the US from zero to five.

Three notable outages

On January 10, Cogent Communications experienced an outage affecting multiple downstream providers and Cogent customers in countries including the US, Singapore, Germany, and Canada. The outage, lasting a total of 43 minutes, was first observed around 1:05 a.m. EST, initially centered on Cogent nodes in Seattle, Washington. Five minutes later, the Seattle nodes appeared to clear, and nodes in Oakland, California, exhibited outage conditions. Fourteen minutes after first being observed, the Oakland nodes appeared to clear. Forty minutes after that, those nodes again exhibited outage conditions. Around 2:30 a.m. EST, about an hour and 15 minutes after appearing to clear, Seattle nodes again exhibited outage conditions. The outage was cleared around 2:35 a.m. EST. Click here for an interactive view.

On January 10, Hurricane Electric experienced an outage affecting customers and downstream partners across the US, the Netherlands, Germany, Belgium, the UK, Ireland, Australia, Canada, Sweden, France, and Brazil. The 17-minute outage was first observed around 5:20 a.m. EST, apparently centered on Hurricane Electric nodes in New York, New York. Five minutes later the outage included nodes in San Jose, California. Twenty minutes after first being observed, nodes exhibiting outage conditions expanded to include nodes in Los Angeles and San Jose, California. The outage was cleared around 5:40 a.m. EST. Click here for an interactive view.

On January 13, Spotify experienced an outage that prevented some users globally from streaming songs or using the service. First observed around 7:40 p.m. EST, the disruption lasted around an hour and 55 minutes, with the major period of the disruption lasting about an hour and 11 minutes. During the outage, Spotify service was contactable, with a number of requests either timing out or returning “service unreachable” or “unauthorized” messages, which is indicative of backend system issues. The outage was cleared around 9:35 p.m. EST. Around 11:16 p.m. EST, Spotify announced that all services had been restored. Click here for an interactive view.

Updated Jan. 9, 2023

Global outages across all three categories last week increased from 120 to 217, up 81% compared to the week prior. In the US, outages increased from 31 to 57, up 84%. Globally, ISP outages increased from 76 to 166, up 118%, and in the US they increased from 18 to 48, up 167%. Globally, cloud-provider network outages dropped from eight to four, and in the US they decreased from five to one. Globally, collaboration-app network outages increased from four to five, and in the US they dropped from one to zero.

Two notable outages

On January 4, Time Warner Cable experienced a series of outages over a period of two hours and 15 minutes affecting multiple downstream providers and Time Warner Cable customers in the US and Canada. An outage was first observed around 1:50 a.m. EST and appeared to center on Time Warner Cable nodes in Dallas, Texas. The ensuing series of outages lasted a total of eight minutes and were cleared at around 4:05 a.m. EST. Click here for an interactive view.

On January 6, Qwest Communications experienced an outage affecting downstream partners and customers across the US. The 16-minute outage was first observed around 12:25 a.m. EST and appeared centered on Qwest nodes in Atlanta, Georgia. It was cleared around 12:45 a.m. EST. Click here for an interactive view.

Cloud Computing, Internet Service Providers, Network Management Software, Networking
]]>
https://www.networkworld.com/article/2071381/2023-global-network-outage-report-and-internet-health-check.html 2071381
Cisco, Intel expand Wi-Fi 7 partnership Thu, 21 Mar 2024 22:52:14 +0000

Cisco and Intel have expanded their partnership to focus on jointly developing Wi-Fi 7 technologies and strengthening interoperability between Cisco access points and Intel client devices.

The agreement includes investments by both companies to develop next-gen Wi-Fi 7 solutions and deliver product interoperability through joint labs, early code sharing and testing, according to Thomas Hannaford, a communications manager with Intel, who wrote a blog about the agreement. Additional efforts are focused on technologies to handle latency-sensitive applications and enhanced traffic prioritization for more reliable connectivity, Hannaford stated.  

Wi-Fi 7, also known as 802.11be, is expected to reduce latency, increase network capacity, boost efficiency, and support more connected devices. It’s still early days for Wi-Fi 7, however. The IEEE is expected to agree to the final spec later this year, and the Wi-Fi Alliance has only just begun its official certification program for Wi-Fi 7 devices and products.

Wi-Fi 7 will utilize Extremely High Throughput (EHT) to deliver peak data rates of more than 40Gbps, making it significantly faster than previous generations of the Wi-Fi standard. The technology targets mostly physical (PHY) and medium access control (MAC) improvements capable of supporting a maximum throughput of at least 30Gbps, experts say.

Another feature of Wi-Fi 7 is multi-link operation (MLO), which allows devices to simultaneously send and receive data across different frequency bands and channels, enhancing the efficiency of wireless connections. Additional features such as encryption and authentication over WPA3 Enterprise further strengthen Wi-Fi security, Hannaford stated.

“Through this cooperation, Intel and Cisco will provide a significantly higher level of end-to-end reliability, robust high throughput, low latency and deterministic Wi-Fi 7 performance by optimizing MLO and Quality of Service Management, as well as utilizing 6GHz spectrum on Low Power Indoor and standard power Automated Frequency Coordination,” Hannaford stated.

Intel said Wi-Fi 7 development will complement its recently announced AI PC platforms, which feature a built-in neural processing unit and power-efficient AI acceleration and local inference on the PC. The integration of Wi-Fi 7 technology into Intel’s AI PC platform promises improved network performance, reduced latency, and increased bandwidth – all core requirements to support AI applications that require high-speed data transfer and low-latency connections.

Cisco and Intel have a history in wireless

Cisco and Intel have worked together for many years to enhance wireless services. 

Last year, the vendors announced they would create reference architectures for 5G services that could be used for internet of things (IoT), manufacturing, supply chain, or smart sites, for example. The companies intend to make the architectures available to managed service provider partners.

Cisco’s subscription-based, private 5G managed service draws from its mobile-core technology and its IoT portfolio, which includes Cisco IoT Control Center and Cisco P5G Packet Core as well as IoT sensors and gateways. It also includes device management software and monitoring tools, all available via a single portal, according to Cisco.

Cisco and Intel also have worked with 5G device and radio access network (RAN) manufacturers, as well as enterprise application-software developers, to offer validated and customized services. They will work together on edge AI frameworks based on Intel’s Xeon processors, Intel Smart Edge for multi-access edge computing, and RAN offerings, the companies stated.

The vendors also operate 5G innovation labs in California, Germany, and Japan, where customers can test applications before putting them into production.

Networking, Wi-Fi
]]>
https://www.networkworld.com/article/2071488/cisco-intel-expand-wi-fi-7-partnership.html 2071488
Making bash aliases easy to manage Thu, 21 Mar 2024 14:07:13 +0000

I’ve undoubtedly said this before, but the most effective aliases on Linux are those that save you a lot of time or help you avoid typing errors – especially those errors that might cause problems on your system. Aliases allow you to run both complicated and frequently used commands with minimal effort. If you type a command like alias rec='ls -ltr | tail -10' in your terminal session, you will have created an alias that will display the ten most recently created or updated files in your current directory. This makes it easier to remember what you’ve most recently been working on and to make necessary updates.

To preserve your aliases for future use, you can add them to your .bashrc file. If you do this, it’s a good idea to group them at the end of the file so that they’re easier to find, review and modify as needed. On the other hand, some of my techie friends prefer to store their aliases in a separate file and ensure that their shell will source that file whenever they log in by using a command like source aliases or simply . aliases. Note that the word “source” and the single character “.” do the same thing.
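As a minimal sketch of the separate-file approach, the function below could be placed in .bashrc to source an alias file only when it exists. The function name load_aliases and the default path ~/.bash_aliases are assumptions for illustration, not something the article prescribes; any readable file works.

```shell
# load_aliases: source an alias file only if it exists and is readable.
# The function name and the default path are illustrative assumptions.
# Usage (for example, near the end of .bashrc):  load_aliases
#                                                load_aliases ~/aliases
load_aliases() {
    local file="${1:-$HOME/.bash_aliases}"
    if [ -r "$file" ]; then
        # "." and "source" do the same thing in bash
        . "$file"
    else
        echo "alias file not found: $file" >&2
        return 1
    fi
}
```

Checking for the file first avoids an error message on every login if the alias file is ever moved or renamed.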

The list below shows a group of aliases. Some are extremely simple, like the one that allows you to type just a “c” instead of typing the word “clear”. I actually use that one very frequently and appreciate that I only have to type a single letter to clear my screen. One shows the largest files in the current directory, one shows the most recently updated files, and another installs system updates. There are many good reasons to use aliases to simplify commands without losing track of what those commands do.

alias big5='du -h | sort -h | tail -5'
alias c='clear'
alias install='sudo dnf install'
alias myprocs='ps -ef | grep `whoami`'
alias myps="ps -ef | grep `whoami` | awk '{print \$2}'"
alias recent='history | tail -10'
alias rec='ls -ltr | tail -5'
alias update='sudo dnf upgrade --refresh'

You can list aliases with the alias command and, when it’s helpful, sort them or use a sort or grep command to list only those containing certain strings.

$ alias | sort | head -2
alias big5='du -h | sort -h | tail -5'
alias c='clear'
$ alias | grep rec
alias recent='history | tail -10'
alias rec='ls -ltr | tail -5'

If you want an alias to go away temporarily for some reason, you can use the unalias command. As long as your aliases are included in your .bashrc file, or in a file that you source to make them available, they’ll all be ready when you need to use them again.

$ unalias big5

The aliases below can save you a little time when you need to move up a directory or two. Just remember that aliases will not be available on your next login unless you save them in your .bashrc or separate aliases file.

alias up='cd ..'
alias up2='cd ../..'

When an alias won’t cut it

Aliases are extremely useful, but they have their limitations. When a task that you need to perform periodically is too complex for an alias because of various options that will change from time to time, consider writing a script instead.
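One option that sits between an alias and a full script is a shell function, which can take arguments. The sketch below generalizes the up and up2 aliases into a single command that accepts the number of levels to climb. The name upn is hypothetical, chosen so that it does not collide with the up alias.

```shell
# upn: move up N directory levels (default 1).
# The name "upn" is a hypothetical choice for this example.
# Usage: upn        # same as cd ..
#        upn 3      # same as cd ../../..
upn() {
    local levels="${1:-1}"
    local path=""
    # Reject anything that is not a non-negative integer
    case "$levels" in
        ''|*[!0-9]*) echo "usage: upn [levels]" >&2; return 1 ;;
    esac
    # Build ../ repeated "levels" times
    while [ "$levels" -gt 0 ]; do
        path="../$path"
        levels=$((levels - 1))
    done
    cd "${path:-.}" || return 1
}
```

Like an alias, a function defined this way lasts only for the current session unless it is saved in .bashrc or a file that .bashrc sources.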

Wrap-up

Aliases provide an easy way to reuse complicated commands and those that you use often without much effort.

Linux
]]>
https://www.networkworld.com/article/2071283/making-bash-aliases-easy-to-manage.html 2071283
Data center provider razes 55 homes to make room for Illinois campus Thu, 21 Mar 2024 13:35:15 +0000

The rush to build data centers has become the new Oklahoma land grab, with providers competing for prime real estate. In one instance, a data center provider bought 55 homes only to demolish them to make room for its campus.

Stream Data Centers, a Dallas-based provider of colocation and custom data-center construction services, last November purchased 55 homes in a 34-acre subdivision of Elk Grove Village, Illinois. According to published reports (here and here), Stream paid an average of $950,000 for each house.

After finalizing the purchases, Stream demolished the houses to make room for a 2 million square-foot campus with four-story buildings. When completed, it’s expected to be the largest data center in the city – which already has 11 data centers owned by companies including Equinix and Digital Realty Trust.

Elk Grove Village has become a popular destination for data centers in the state of Illinois. Its appeal comes from a combination of close proximity to Chicago, relatively cheap land, lots of fiber-optic network bandwidth, and access to fresh water from Lake Michigan. The result is a cluster of data centers all in one space.

Stream Data Centers closed the housing deal last November, but demolition didn’t begin until February as families were given time to move out. There are also commercial properties on the site, but they will not be vacated until April 2025. Groundbreaking for the first data center building is scheduled for late 2024, with occupancy projected for early 2026.

“Elk Grove Village is a critical market for major cloud deployments and network services companies, and this project gave us a chance to secure 34 acres and develop a large hyperscale campus in this land- and power-constrained market,” said a company spokesperson via email.

“Beyond getting an outstanding data center site, it was nice to see that the landowners did very well as evidenced by the support we received in our development process (who showed up at our public meetings voicing their support for this project). Craig Johnson, the mayor of Elk Grove Village, and the village staff have been hugely supportive throughout the development process,” the spokesperson added.

Elk Grove is the largest data center submarket in the Chicago area, and it offers several latency-sensitive connection points that developers want to be around, according to Andy Cvengros, managing director, data center markets, with Jones Lang LaSalle, which specializes in building data centers.

Regarding the 55-home purchase, it’s not the first time a data center company paid for and demolished homes, Cvengros said. Data center providers Prologis and CyrusOne have done the same thing.

Real estate generally follows “its highest and best use,” he said. In this case, when the land was used for residences, it was worth $20 per square foot. But if converted to data center use, it’s worth between $45 and $50 per square foot, according to Cvengros.

So, Stream is very likely to make its money back, even with the added expense of buying all those homes. “It is all a function of what can be developed there. Every data center operator is trying to build at scale by bringing massive power with multiple buildings. If they pull off what they intend to, they should be very profitable, I would imagine,” Cvengros said.

Alvin Nguyen, senior analyst with Forrester Research, believes this is something to expect more of in the future. “Data centers need to be located near the users – residential, commercial, and industrial – so competition for this type of space makes sense. If their prospective clients are based nearby, or more importantly, their prospective clients’ clients are nearby, then this is an ideal opportunity,” Nguyen said.

Even considering that Stream paid a premium for the houses, Nguyen says that “unless they are horribly run,” the data center should make a profit. The cost of buildout for a new data center can start at around $1,000 per square foot, and additional costs come from power, cooling, network connectivity, monitoring equipment, management software, facilities staff, and security.

Data Center, High-Performance Computing
]]>
https://www.networkworld.com/article/2071209/data-center-provider-razes-55-homes-to-make-room-for-illinois-campus.html 2071209
Google to invest $1 billion in Kansas City data center Wed, 20 Mar 2024 22:06:30 +0000

Google plans to build a new $1 billion data center in Kansas City, Missouri, marking the company’s first data center in the state. 

The facility is expected to support up to 1,300 jobs and contribute to the region’s workforce and energy infrastructure. It’s one of a growing number of data centers worldwide. 

“With its available land and talented workforce, Kansas City is a great place for us to locate a data center,” Google spokesperson Chris Mussett said. 

Mussett said that AI is driving demand for data centers. “We are always planning for future capacity needs, and we want to be sure that we have options to continue to support the growing demand for our online services,” he added. 

The data center will support Google’s digital services, such as Google Docs, Maps, Search, and Gmail. To power the facility’s operations, Google has partnered with Ranger Power and D E Shaw Renewable Investments (DESRI) to provide 400 megawatts of new, carbon-free power, Missouri Governor Michael Parson’s office announced.

As market for AI heats up, cooling must improve

As data volumes continue to grow rapidly and AI drives further expansion, the data center industry faces new challenges and opportunities, said Sara Martin, associate principal at HED, an architecture and engineering design firm.

One key trend is incorporating more flexible and efficient cooling systems to accommodate higher densities and minimize environmental impact. “Data center leaders are asking engineers for designs that will accommodate higher density cabinet loads as they transition from traditional all-air cooling methods to direct-to-cabinet cooling systems in anticipation of the impact of AI,” Martin said. “As climate concerns grow, there will be continued pressure on data center companies to utilize more efficient cooling methods as a way to reduce their carbon footprint as well.”

Another significant shift in 2024 will be expanding data centers into new markets. “AI will drive the shift as data centers go in search of new locations with available power,” Martin predicted. “The challenge of meeting the power demands of the modern data center became apparent in 2023 and will only get worse in 2024 if firms don’t focus on moving to where power is available.”

Martin expects data center planners and investors to find markets like Denver, Kansas City, Nashville, and Salt Lake City attractive alternatives to power-constrained regions. “If they can’t get the power to the data center, bring the data center to the power,” she said.

Midwestern expansion

The rise of AI and the growing trend of enterprises moving to the cloud have been significant drivers of demand for data centers, said Narayana Pappu, CEO of Zendata, a San Francisco-based provider of data security and privacy compliance solutions.

“Capacity helps meet this demand and prepare for things to come without putting pressure on cost,” Pappu said, emphasizing the benefits of extra capacity in data centers.

When it comes to the best locations for data centers, Pappu noted that multiple factors come into play, including environmental risks, operational costs, and regulatory requirements. “The usual suspects from a location perspective have been northern Virginia, Silicon Valley, Singapore, and Shanghai,” he said. “However, since the pandemic, there have been more data centers breaking ground in the Midwest.”

Pappu also highlighted AI’s impact on data center workloads, saying that “the first impact is on the need for specialized hardware that AI applications need, along with increased demands on electricity and network usage.”

According to Gal Ringel, co-founder and CEO of Mine, a global data privacy management company, the data center industry is set for significant growth in the coming years. This growth will be driven by a combination of regulatory changes and the increasing demand for AI resources.

“Key benefits include the ability to capitalize on opportunity and increased data security,” Ringel said, discussing the advantages of having extra capacity in data centers. “Data localization and residency requirements are in place across Asia and Europe, with the latter due to the US long struggling to be granted an adequacy agreement for data transfers under the GDPR.”

Ringel also noted that recent executive orders by US President Joe Biden, which restrict sensitive data from being sent abroad, are likely to increase American companies’ demand for domestic data center capacity. “There will likely be an uptick in American companies looking for data center capacity domestically rather than utilizing cheaper data centers abroad, so data centers with extra capacity should be able to take advantage of these regulatory swings,” Ringel added.

Data Center, Data Center Design
]]>
https://www.networkworld.com/article/2069474/google-to-invest-1-billion-in-kansas-city-data-center.html 2069474
2020-2022 global network outage report and internet health check Wed, 20 Mar 2024 19:49:36 +0000

Editor’s note: We began watching the performance of cloud providers and ISPs and sharing ThousandEyes’ weekly report of network outages and internet disruptions during the Covid-19 pandemic, when the use of cloud apps and conferencing services, such as unified communications-as-a-service (UCaaS), became even more critical for enterprise companies. This is an archive of 2020-2022 incidents as tracked by Cisco subsidiary ThousandEyes.

For current trends, see the 2024 outage report and internet health check, which is updated weekly. We’ve also archived our 2023 coverage, which can be found here.

Internet report for Dec. 12, 2022

Global outages across all three categories last week increased from 282 to 347, up 23%. In the US, they increased from 78 to 90, up 15%. Globally, ISP outages increased from 210 to 281, up 34%, and in the US they increased from 61 to 76, up 25%. Globally, cloud-provider network outages dropped from eight to four, and in the US remained the same at two. Globally, collaboration-app network outages jumped from one to seven, and in the US dropped from one to zero.

Two notable outages

About 2:30 p.m. EST on December 5, AWS’s Ohio-based us-east-2 region experienced a connectivity issue that appeared to affect some customer internet connectivity to and from the region, characterized by significant packet loss between it and global locations. The loss was seen only between end-users connecting via ISPs and didn’t appear to affect connectivity between instances within the region or between regions. The packet loss continued for more than an hour before resolving around 3:50 p.m. EST. Click here for an interactive view.

On December 7, NTT America experienced an outage affecting customers and downstream partners across the US, the Netherlands, Belgium, the Republic of Korea, and Japan. The 13-minute outage was first observed around 12:15 a.m. EST and appeared to center on NTT nodes in Newark, New Jersey, and Ashburn, Virginia. Ten minutes after being observed, the Ashburn nodes appeared to clear. The outage was cleared around 12:30 a.m. EST. Click here for an interactive view.

Internet report for Dec. 5, 2022

Global outages across all three categories last week increased from 222 to 282, up 27%. In the US, outages increased from 48 to 78, up 63%. Global ISP outages increased from 173 to 210, up 21%, and in the US increased from 32 to 61, up 90%. Global cloud-provider network outages jumped from four to eight, while in the US they remained at two. Global collaboration-app network outages dropped from six to one, and in the US, from five to one.

Two notable outages

On November 29, TATA Communications (America) Inc. experienced an outage affecting many of its downstream partners and customers, including in the US, Canada, the UK, France, the Netherlands, Chile, Peru, Colombia, Saudi Arabia, Argentina, India, Germany, Hong Kong, and Singapore. The 33-minute outage was first observed around 9:20 a.m. EST, and apparently centered on TATA nodes in Newark, New Jersey. Ten minutes later the outage appeared to include nodes in New York, New York; London, England; Marseille, France; and Pune and Bangalore, India. This appeared to coincide with the peak in the number of regions, downstream partners, and customers affected. Around five minutes after the peak, just nodes in Newark, London, and Marseille exhibited outage conditions. The outage was cleared around 9:55 a.m. EST. Click here for an interactive view.

On December 1, Microsoft experienced an issue affecting user access to some Microsoft services, including Office 365, predominantly in the APJC region. First observed around 7:50 p.m. EST and lasting approximately an hour and 18 minutes, the outage appeared to initiate in Microsoft’s Japan infrastructure before affecting other Microsoft servers in the region. Network connectivity to the service did not appear to experience any significant issues throughout the outage. The outage was cleared around 9:10 p.m. EST. Click here for an interactive view.

Internet report for Nov. 28, 2022

Global outages across all three categories last week decreased from 331 to 222, down 33% compared to the week prior. In the US, they dropped from 121 to 48, down 60%. Globally, ISP outages decreased from 246 to 173, down 30%. In the US they dropped from 85 to 32, down 62%. Globally, cloud-provider network outages decreased from five to four and in the US from three to two. Globally, collaboration-app network outages decreased from nine to six outages, while in the US they decreased from six to five.

Two notable outages

On November 23, Astute Hosting experienced an outage affecting multiple downstream providers and customers in the US, Australia, Singapore, Canada, and the UK. The 19-minute outage was first observed around 4:40 a.m. EST and appeared to center on Astute Hosting nodes in Seattle, Washington. Fifteen minutes later, some of the nodes appeared to recover. The outage was cleared around 5:00 a.m. EST. Click here for an interactive view.

On November 21, Embratel experienced a series of outages over a period of an hour and 35 minutes that affected downstream providers and customers in the US and Canada. The 10-minute outage was first observed around 10:30 p.m. EST centered on Embratel nodes in Atlanta, Georgia. An hour and 30 minutes later, nodes located in Sao Paulo, Brazil, also exhibited outage conditions. The outage was cleared around 12:05 a.m. EST. Click here for an interactive view.

Updated Nov. 21

Global outages across all three categories last week decreased from 352 to 331, down 6% from the week prior. In the US, they decreased from 124 to 121, down 2%. Globally, ISP outages decreased from 265 to 246, down 7%, and in the US, they dropped from 93 to 85, down 9%. Globally, cloud provider outages dropped from 14 to five, while in the US they dropped from six to three. Globally, collaboration-app network outages remained the same at nine and increased in the US from five to six.

Two notable outages

On November 17, GTT Communications experienced an outage affecting some of its partners and customers in the US, Australia, Canada, China, Brazil, Republic of Korea, New Zealand, the Netherlands, Singapore, Hong Kong, the Philippines, and Japan. The hour-and-28-minute outage was first observed around 6:10 a.m. EST and appeared to center on GTT nodes in San Jose, California. Twenty minutes later, nodes in San Francisco, California, also exhibited outage conditions. Twenty-five minutes after that, nodes in Seattle, Washington, exhibited outages, too. The outages were cleared around 7:40 a.m. EST. Click here for an interactive view.

On November 16, Cogent Communications experienced a 44-minute outage affecting multiple downstream providers and customers across the US and the UK. The outage was first observed around 12:50 a.m. EST and appeared to center on Cogent nodes in New York, New York. Fifteen minutes later some of the nodes appeared to recover, reducing the impact. The outage was cleared around 1:35 a.m. EST. Click here for an interactive view.

Updated Nov. 7

Global outages across all three categories last week decreased from 381 to 361, down 5% compared to the week prior. In the US, they dropped from 130 to 74, down 43%. Globally, ISP outages decreased from 298 to 289, down 3%, and dropped from 107 to 64 in the US, a 40% decrease. Globally, cloud-provider outages remained the same at nine. In the US they dropped from four to two. Globally, collaboration-app network outages decreased from five to two, and in the US dropped from five to one.

Two notable outages

On November 2, TATA Communications (America) experienced an outage affecting downstream partners and customers in countries including the US, the UK, the Netherlands, Australia, Vietnam, Germany, Poland, France, China, India, and Singapore. The 24-minute outage was first observed around 8:40 a.m. EDT, and initially centered on TATA nodes in London, England. Fifteen minutes later, nodes in Newark, New Jersey, San Francisco, California, and the United Arab Emirates (UAE) also exhibited outage conditions. The outage was cleared around 9:05 a.m. EDT. Click here for an interactive view.

On November 2, AT&T experienced an outage affecting AT&T customers and partners across the US. The nine-minute outage was first observed around 7:15 p.m. EDT, appearing to center on AT&T nodes in San Jose, California. Five minutes later the number of San Jose nodes exhibiting outage conditions appeared to rise. The outage was cleared at around 7:25 p.m. EDT. Click here for an interactive view.

Updated Oct. 31

Global outages across all three categories last week increased from 374 to 381, up 2% compared to the week prior. In the US, outages increased from 94 to 130, up 38%. Globally, ISP outages increased from 293 to 298, up 2%, while in the US they increased from 72 to 107, up 49%. Globally, cloud-provider network outages decreased from 10 to nine, and in the US they remained the same at four. Globally, collaboration-app network outages decreased from seven to five, and in the US they increased from four to five.

Three notable outages

On October 25, Zscaler experienced an outage that impacted customers using Zscaler Internet Access (ZIA) services on the Zscaler Cloud network 2. First observed around 7:46 a.m. EDT, the outage appeared to affect customers’ network connectivity. Around 7:46 a.m. EDT, Zscaler announced it had identified the cause of the issue and begun mitigation. It appeared the majority of customer connectivity had been restored by 11:34 a.m. EDT, with Zscaler announcing the issue resolved around 4:22 p.m. EDT. See here for a more detailed analysis of the outage.

Around 1:30 a.m. EDT on October 27, Salesforce experienced an outage that affected customers globally that appeared to last about an hour and 24 minutes. It manifested itself as a series of server errors and timeouts, which is consistent with a backend service issue. Around 2:14 a.m. EDT, Salesforce announced it was taking steps to alleviate the issue. Around 2:35 a.m. EDT, services appeared to start to return with the major portion of the issue clearing around 3:15 a.m. EDT. Around 7:23 a.m. EDT, the outage was officially cleared. Click here for an interactive view.

On October 28, Facebook experienced a service disruption that rendered the application inaccessible to some users globally. First observed around 3:33 p.m. EDT, the disruption appeared to prevent some users from accessing content and manifested as a combination of HTTP server errors and packet loss at Facebook’s network edge. The incident appeared to clear around 4:45 p.m. EDT. Click here for an interactive view.

Updated Oct. 24

Global outages across all three categories last week increased from 283 to 374, up 32% compared to the week prior. In the US, they increased from 72 to 94, up 31%. Globally, ISP outages jumped from 194 to 293, up 51%, while in the US they increased from 55 to 72, up 31%. Globally, cloud-provider network outages jumped from six to 10, and in the US increased from one to four. Globally, collaboration-app network outages decreased from nine to seven, and in the US decreased from six to four.

Two notable outages

On October 19, LinkedIn experienced a service disruption affecting its mobile and desktop user base. The disruption was first observed around 6:34 p.m. EDT, with users attempting to post to LinkedIn receiving error messages. The total disruption lasted around an hour and a half, during which no network issues were observed connecting to LinkedIn web servers, indicating the issue was application related. The service was restored around 7 p.m. EDT.

On October 22, Level 3 Communications experienced an outage affecting downstream partners and customers in the US, Canada, the Netherlands, and Spain. The outage lasted a total of 18 minutes divided into two occurrences distributed over a 30-minute period. The first occurrence was observed around 12:35 a.m. EDT and appeared centered on Level 3 nodes in Chicago, Illinois. Five minutes later, nodes in St. Louis, Missouri, also exhibited outage conditions. Ten minutes after the outage appeared to clear, the St. Louis nodes began exhibiting outage conditions again. The outage was cleared around 1:05 a.m. EDT. Click here for an interactive view.

Updated Oct. 17

Global outages across all three categories last week decreased from 328 to 283, a 14% decrease compared to the week prior. In the US, outages dropped from 101 to 72, down 29%. Global ISP outages decreased from 239 to 194, down 19%, and in the US decreased from 76 to 55, down 28%. Global cloud-provider network outages dropped from 12 to six, while in the US they dropped from six to one. Global collaboration-app network outages decreased from 10 to nine, and from seven to six in the US.

Two notable outages

On October 10, Microsoft experienced an outage affecting downstream partners and access to services running on Microsoft environments. The outage, which lasted 19 minutes, was first observed around 3:50 p.m. EDT and appeared centered on Microsoft nodes in Des Moines, Iowa. Ten minutes after that, nodes in Los Angeles, California exhibited outage conditions and appeared to clear five minutes later. The Des Moines outage was cleared around 4:10 p.m. EDT. Click here for an interactive view.

On October 12, Continental Broadband Pennsylvania experienced an outage affecting some customers and partners across the US. The outage lasted around 49 minutes in total, divided into four occurrences distributed over a period of an hour and 45 minutes. The first occurrence was observed around 11:10 p.m. EDT, lasted 23 minutes, and appeared to focus on Continental nodes in Columbus, Ohio. The first occurrence appeared to clear around 11:35 p.m. EDT. Five minutes later, Cleveland, Ohio, nodes exhibited outage conditions before clearing after four minutes. Fifteen minutes after that, the Columbus nodes once again exhibited outage conditions. The outage was cleared around 12:55 a.m. EDT. Click here for an interactive view.

Updated Oct. 10

Global outages across all three categories last week increased from 301 to 328, up 9% compared to the week prior. In the US they decreased from 107 to 101, down 6%. Globally, ISP outages increased from 233 to 239, up 3%, and in the US they decreased from 78 to 76, down 3%. Globally, cloud-provider network outages doubled from six to 12, while in the US they remained the same at six. Collaboration-app network outages remained the same, at 10 globally and seven in the US.

Two notable outages

On October 4, Deft experienced an outage affecting some of its customers and downstream partners across the US, Brazil, Germany, Japan, Canada, India, Australia, the UK, France, and Singapore. The outage lasted around an hour and six minutes in total, divided among four occurrences over a period of an hour and 30 minutes. The first occurrence was observed around 5:25 a.m. EDT and appeared to center on Deft nodes in Chicago, Illinois. It lasted 14 minutes and appeared to clear around 5:40 a.m. EDT. Five minutes later, a second occurrence lasting 19 minutes was observed with Chicago nodes exhibiting outage conditions. The third occurrence, lasting 24 minutes, was observed around 6:10 a.m. EDT, again centered on Chicago nodes. Ten minutes later they appeared to clear, but began exhibiting outage conditions again. The outage was cleared around 6:55 a.m. EDT. Click here for an interactive view.

On October 5, TATA Communications America experienced an outage affecting downstream partners and customers in the US, the UK, France, Turkey, the Netherlands, Portugal, India, and Israel. The outage, lasting 9 minutes in total, was first observed around 9:25 a.m. EDT and appeared initially to center on TATA nodes in Newark, New Jersey, and London, England. Five minutes into the outage, the Newark and London node outages were joined by nodes in Marseille, France. The outage was cleared around 9:35 a.m. EDT. Click here for an interactive view.

Updated Oct. 3

Global outages across all three categories last week decreased from 304 to 301 compared to the week prior, while in the US, they increased from 90 to 107, up 19%. Globally, ISP outages increased from 232 to 233, while in the US, they increased from 65 to 78, up 20%. Globally, cloud-provider network outages dropped from 14 to six and remained the same in the US at six. Globally, collaboration-app network outages jumped from three to 10, while those in the US rose from three to seven.

Two notable outages

On September 29, Microsoft experienced an outage affecting some downstream partners and access to services running in Microsoft environments. The 33-minute outage was first observed around 7:10 a.m. EDT and appeared centered on Microsoft nodes in Washington, DC. Around 7:15 a.m. EDT, nodes in Ashburn, Virginia, also exhibited outage conditions. A further 15 minutes later, nodes in New York, New York, also exhibited outage conditions. The outage was cleared around 7:45 a.m. EDT. Click here for an interactive view.

On October 1, Cogent Communications experienced a series of outages over a period of 35 minutes affecting downstream providers in the US, France, Singapore, Germany, the UK, Canada, and Mexico. The outage, lasting a total of 17 minutes, was first observed around 3:20 a.m. EDT centered on Cogent nodes in Oakland, California, and Washington, DC. Around 3:25 a.m. EDT, the Washington, DC, nodes appeared to clear, but nodes in Salt Lake City, Utah, showed outage conditions. This lasted around nine minutes, and the Cogent environment was then stable for 15 minutes before experiencing an eight-minute outage with Oakland and Los Angeles, California, nodes exhibiting outage conditions. Five minutes into the second occurrence, nodes in Washington, DC; New York, New York; Houston, Texas; San Francisco, California; and Bilbao, Spain, exhibited outage conditions. The outage was cleared around 3:55 a.m. EDT. Click here for an interactive view.

Updated Sept. 26

Global outages across all three categories last week decreased from 347 to 304, down 12% compared to the week prior. In the US, outages decreased from 108 to 90, down 17%. Globally, the number of ISP outages decreased from 252 to 232, down 8%, and from 76 to 65 in the US, down 14%. Globally, cloud-provider network outages decreased from 15 to 14, and in the US decreased from eight to six. Collaboration-app network outages occurred only in the US and decreased from nine to three.

On September 23, NTT America experienced an outage affecting some customers and downstream partners across countries including the US, the Netherlands, Hong Kong, Switzerland, and Japan. The 18-minute outage was observed around 3 p.m. EDT and appeared to center on NTT nodes in New York, New York. Around 10 minutes later, nodes in Ashburn, Virginia, also began exhibiting outage conditions. The outage was cleared around 3:20 p.m. EDT. Click here for an interactive view.

On Sept. 25, TierPoint experienced an outage affecting some customers and downstream partners across the US and Canada. First observed around 1 p.m. EDT, the outage, lasting a total of 41 minutes over a 55-minute period, appeared to center on nodes located in Nashville, Tennessee. Fifteen minutes later, outage conditions appeared to exist in nodes in Raleigh and Charlotte, North Carolina, and 10 minutes later, they appeared to recover. Around 1:35 p.m. EDT, the nodes located in Nashville appeared to recover, but exhibited outage conditions five minutes later. This second occurrence lasted four minutes before appearing to clear and then once again began exhibiting outage conditions. The outage was cleared around 1:55 p.m. EDT. Click here for an interactive view.

Updated Sept. 19

Global outages across all three categories last week decreased from 414 to 347, down 16% compared to the week prior. In the US, they decreased from 133 to 108, down 19%. Globally, ISP outages decreased from 304 to 252, down 17%, and in the US increased from 73 to 76, up 4%. Globally, cloud-provider network outages dropped from 27 to 15, and in the US from 18 to eight. Globally, collaboration-app network outages increased from six to nine, and in the US increased from three to nine.

Two notable outages

On September 14, Level 3 Communications experienced an outage affecting downstream partners and customers across the US. The 29-minute outage was first observed around 1:35 a.m. EDT and appeared to be centered on Level 3 nodes in Seattle, Washington. The outage was cleared around 2:05 a.m. EDT. Click here for an interactive view.

On September 15, at 11 a.m. EDT, Zoom Communications experienced an issue affecting users globally for about 24 minutes. The outage appeared to affect users’ ability to start and join meetings, but network connectivity to Zoom did not appear to experience any significant issues. Around 11:22 a.m. EDT, Zoom announced it was aware of the issues and was investigating. Zoom announced the outage was fully resolved around 11:49 a.m. EDT. Click here for an interactive view.

Updated Sept. 12

Global outages across all three categories last week increased from 327 to 414, up 27% compared to the week prior. In the US, outages increased from 87 to 133, up 53%. Globally, ISP outages increased from 260 to 304, up 17%, and in the US they increased from 64 to 73, up 14%. Globally, cloud provider network outages jumped from four to 27, up 575%, while in the US they increased from two to 18, up 800%. Globally, collaboration app network outages decreased from eight to six outages, down 25%, and in the US decreased from five to three.

Two notable outages

On September 6, Hurricane Electric experienced an outage affecting customers and downstream partners across countries including the US, Spain, Hong Kong, Canada, Malaysia, and the United Arab Emirates. The 29-minute outage, first observed at around 8:50 p.m. EDT, initially appeared centered on Hurricane Electric nodes in New York, New York. About five minutes later, the New York nodes appeared to recover, but nodes in Singapore and Marseille, France, exhibited outage conditions. Around 9 p.m. EDT, nodes in New York, Singapore, Marseille, and Paris, France, all exhibited outage conditions, increasing the number of partners and regions impacted. The outage was cleared at around 9:20 p.m. EDT. Click here for an interactive view.

On September 7, Microsoft experienced an issue that affected connections and services leveraging its Azure Front Door (AFD) platform. First observed around 12:20 p.m. EDT, the major portion of the outage lasted around 50 minutes and appeared to affect users’ ability to connect to and access Microsoft cloud services that use AFD. Network connectivity to AFD edge locations did not appear to experience any significant issues throughout the outage. Microsoft’s preliminary analysis reported that the disruption appeared to be the result of an unusual spike in traffic, causing multiple environments managing the traffic load-balancing to go offline. Microsoft remediated the residual impact and announced the outage fully cleared at 3:55 p.m. EDT. Click here for an interactive view.

Updated Sept. 5

Global outages across all three categories last week decreased from 363 to 327, down 10% compared to the week prior. In the US, outages decreased from 91 to 87, down 4%. Globally, ISP outages increased from 250 to 260, up 4%, and in the US from 57 to 64, up 12%. Globally, cloud-provider network outages dropped from 11 to four, and from six to two in the US. Globally, collaboration-app network outages decreased from 10 to eight and from seven to five in the US.

There were two notable outages:

On Sept. 3, Cogent Communications experienced a series of outages over a period of 50 minutes that impacted multiple downstream providers in countries including the US, Spain, Portugal, the UK, Israel, India, Luxembourg, Germany, Singapore, South Africa, Austria, France, Argentina, Denmark, and Australia. The outage, lasting a total of 43 minutes, was first observed around 6:25 p.m. EDT, initially centered on Cogent nodes in New York, New York; Washington, DC; Los Angeles and San Francisco, California; and Bilbao, Spain. After this initial nine-minute outage the Cogent environment was stable for five minutes before experiencing a 34-minute outage involving Cogent nodes located in London, England; Frankfurt and Munich, Germany; and Paris and Marseille, France. After 10 minutes, all the nodes, with the exception of those in Paris, appeared to clear. The outage was cleared around 7:15 p.m. EDT. Click here for an interactive view.

On September 1, AT&T experienced an outage that impacted customers and partners across the US. The seven-minute outage was first observed around 4:55 a.m. EDT, appearing to center on AT&T nodes located in Phoenix, Arizona. Five minutes later, some of the Phoenix nodes appeared to recover. The outage was cleared at around 5:05 a.m. EDT. Click here for an interactive view.

Updated Aug. 29

Globally, outages across all three categories increased from 299 to 363 last week, up 21% from the prior week. In the US, they increased from 82 to 91, up 11%.

Globally, ISP outages increased from 204 to 250, up 23%, and rose from 53 to 57 in the US, an 8% increase. 

Globally, cloud-provider network outages dropped from 21 to 11, and increased from five to six in the US.

Globally, collaboration-app network outages decreased from 14 to 10, and from eight to seven in the US.

Two notable outages:

On August 24, Comcast Communications experienced an outage affecting downstream partners and customers across the US. The 15-minute outage consisted of two occurrences over a two-hour period. The first occurrence was observed around 7:35 a.m. EDT and appeared to center on Comcast nodes in Houston, Texas. An hour and 50 minutes after appearing to clear, the Houston nodes again appeared to exhibit outage conditions. The outage was cleared around 9:35 a.m. EDT. Click here for an interactive view.

On August 27, Verizon Business experienced an outage affecting customers and partners across the US. The outage was first observed around 8:00 a.m. EDT and appeared centered on Verizon Business nodes in San Jose, California. The 17-minute outage was divided into three occurrences spanning an hour and five minutes and was cleared around 9:05 a.m. EDT. Click here for an interactive view.

Updated Aug. 22

Global outages across all three categories last week increased from 256 to 299, a 17% increase compared to the week prior. In the US, outages decreased from 93 to 82, down 12%.

Globally, ISP outages increased from 190 to 204, up 7%, and in the US they dropped from 71 to 53, down 25%.

Globally, cloud provider network outages jumped from nine to 21, while in the US they remained the same at five.

Globally, collaboration app network outages jumped from three to 14, and from three to eight in the US.

A notable outage:

On August 17, Level 3 Communications experienced an outage that impacted multiple downstream partners and customers in countries including the US, Germany, Japan, Taiwan, the Czech Republic, and Switzerland. The outage was first observed around 11:45 p.m. EDT and appeared to be centered on Level 3 nodes in Philadelphia, Pennsylvania. The outage lasted 13 minutes and was cleared around 12:00 a.m. EDT. Click here for an interactive view.

Updated Aug. 15

Global outages across all three categories last week decreased from 260 to 256, down 2% compared to the week prior. In the US, outages decreased from 103 to 93, down 10%.

Globally, the number of ISP outages rose from 173 to 190, up 10%, while in the US they remained steady at 71. 

Globally, cloud-provider network outages increased from three to nine, and in the US increased from two to five.

Globally, collaboration-app network outages dropped from 10 to three and in the US from six to three.

Two notable outages.

On August 8, Google experienced an outage that affected the availability of Search, Maps, and associated services that leverage them. The outage was first observed around 9:15 p.m. EDT, when users were unable to access the services, although the applications remained reachable from a network perspective. Errors seen during the incident were indicative of a back-end application issue. The disruption lasted 41 minutes over a 55-minute period. A Google spokesperson attributed the outage to a software-update issue. The outage was cleared around 10:10 p.m. EDT. Click here for an interactive view.

On August 11, Switch Communications experienced an outage affecting customers and downstream partners across countries including the US, Ireland, Canada, Spain, Greece, the Philippines, the Netherlands, Germany, Mexico, Italy, the UK, and South Africa. First observed around 5:50 a.m. EDT, the outage appeared to center on nodes in Las Vegas, Nevada. The outage lasted a total of 62 minutes over a 105-minute period. The outage was cleared around 7:35 a.m. EDT. Click here for an interactive view.

Updated Aug. 1

Global outages across all three categories last week decreased from 276 to 260, down 6% from the week prior. In the US they increased from 98 to 103, up 5%.

Globally, ISP outages decreased from 182 to 173, down 5%, while in the US they increased from 62 to 71, up 15%.

Globally, cloud-provider network outages dropped from 13 to three and from 11 to two in the US.

Globally, collaboration-app network outages increased from nine to 10 and from two to six in the US.

There were two notable outages.

On August 4, Cogent Communications experienced an outage affecting downstream providers and Cogent customers in countries including the US, Australia, China, Singapore, Turkey, the UK, Canada, Argentina, the Netherlands, Denmark, France, Brazil, Germany, Spain, Republic of Korea, India, and Hong Kong. The 29-minute outage was first observed around 3:35 a.m. EDT centered on Cogent nodes in San Francisco, San Jose, and Sacramento, California. Five minutes later, Kansas City, Missouri, nodes also exhibited outages. During the last 15 minutes of the outage, nodes gradually cleared until just those in San Francisco and San Jose showed outage conditions. The outage was cleared around 4:05 a.m. EDT. Click here for an interactive view.

On August 4, Level 3 Communications experienced an outage affecting downstream partners and customers across countries including the US, Switzerland, and Germany. The 14-minute outage was first observed around 10:20 a.m. EDT and appeared centered on Level 3 nodes in Philadelphia, Pennsylvania. The outage was cleared around 10:35 a.m. EDT. Click here for an interactive view.

Updated July 24

Global outages across all three categories last week increased 5% from 272 to 285 compared to the week prior. In the US, total outages increased from 88 to 122 – an increase of 39%.

Global ISP network outages decreased from 210 to 203. In the US they increased 39% from 64 to 89.

Global cloud-provider network outages decreased from 9 to 8, and in the US they climbed from three to five.

Global collaboration-app network outages decreased from 10 to 7, while in the US, they fell from eight to three, a drop of 63% compared to the week prior.

Two notable outages:

On July 20, Hurricane Electric, a network transit provider headquartered in Fremont, CA, experienced an outage that impacted customers and downstream partners across multiple regions, including the U.S., Canada, United Arab Emirates (UAE), Colombia, Germany, South Africa, Brazil, Malaysia, Japan, and the U.K. The outage, first observed at around 2:10 PM EDT and lasting a total of 13 minutes, appeared to center on Hurricane Electric nodes located in San Jose, CA. Around five minutes into the outage, the number of nodes exhibiting outage conditions in San Jose, CA, appeared to increase. This increase in nodes exhibiting outage conditions appeared to coincide with an increase in the number of partners and regions impacted. The outage was cleared at around 2:25 PM EDT.

On July 20, Microsoft experienced an issue that affected access to Microsoft Teams globally. First observed around 9:15 PM EDT, the outage, lasting around 3 hours, appeared to impact users’ ability to access the service. However, network connectivity to the service did not appear to experience any significant issues throughout the outage. Around 11:00 PM EDT Microsoft announced that they had determined that a recent deployment had resulted in a connectivity issue to an internal storage system and began rerouting traffic to an alternate region in an effort to restore functionality to the service. Microsoft Teams availability appeared to be recovered for most global users around 12:15 AM EDT.

Updated July 17

Global outages across all three categories last week decreased from 281 to 272 compared to the week prior, and in the US decreased from 120 to 88, down 27%.

Global ISP network outages decreased from 217 to 210, and in the US they dropped 34% from 97 to 64.

Global cloud-provider network outages decreased from 19 to 9 – a drop of 53% – and in the US they dropped from five to three.

Global collaboration-app network outages doubled from five to 10, while in the US, they jumped from three to eight, a spike of 167% compared to the week prior.

Two notable outages:

On July 14, Arelion (formerly known as Telia Carrier), a global Tier 1 ISP headquartered in Stockholm, Sweden, experienced an outage that impacted customers and downstream partners across multiple countries including the U.S., the Czech Republic, Hungary, Mexico, Colombia, Brazil, Chile, Peru, Singapore, Canada, France, and Germany. The disruption lasted a total of 23 minutes, divided into two occurrences over a 35-minute period. First observed around 8:45 PM EDT, the first and longest occurrence, lasting 14 minutes, appeared to center on nodes located in Atlanta, GA. Ten minutes after appearing to recover, the nodes located in Atlanta, GA, once again began exhibiting outage conditions. The outage was cleared around 9:20 PM EDT.

On July 14, Twitter experienced a service disruption that impacted users globally. The disruption was first observed around 8:05 AM EDT, when users were unable to access the service although the application remained reachable from a network perspective. Errors seen during the incident were indicative of a back-end application issue. The disruption lasted 40 minutes, with service access restored to a number of users around 8:50 AM EDT and full resolution confirmed by Twitter around 12:37 PM.

Updated July 11

Global outages across all three categories last week decreased from 283 to 281 compared to the week prior, and in the US decreased from 148 to 120, down 19%.

Global ISP network outages increased from 208 to 217, and in the US they dropped from 109 to 97.

Global cloud-provider network outages increased from 18 to 19, and in the US they dropped from 11 to five.

Global collaboration-app network outages dropped from 12 to five, while in the US, they decreased from nine to three.

Two notable outages:

On July 7, AT&T experienced an outage that impacted AT&T customers and partners in the US, Canada, China, Austria, Spain, Bulgaria, Japan, Australia, South Africa, Ireland, India, and the Netherlands. The 4:20 a.m. ET outage lasted around 19 minutes, and appeared initially to center on AT&T nodes in San Jose, California, and Seattle, Washington. Ten minutes later nodes in Chicago, Illinois, and Ashburn, Virginia, were also affected. The outage was cleared around 4:40 a.m. ET.

On July 5, Cogent Communications experienced a series of outages over a period of two hours and 10 minutes affecting multiple downstream providers in the US, China, Ireland, Hong Kong, the Netherlands, New Zealand, Australia, Singapore, and Japan. The outage, lasting a total of 26 minutes, was first observed around 12:15 a.m. ET centered on Cogent nodes in Oakland and San Francisco, California. After three minutes, the outage cleared, and the Cogent environment was stable for 16 minutes. It then experienced a nine-minute outage affecting nodes in Boston, Massachusetts, and Oakland and Los Angeles, California. An hour and 25 minutes after the initial outage, a four-minute outage was observed on nodes in Los Angeles, Oakland, and San Francisco, California. Thirty minutes after that appeared to clear, the Los Angeles nodes exhibited outage conditions again. That final outage was cleared around 2:25 a.m. ET.

Updated July 4

Global outages across all three categories last week increased from 247 to 283, up 15%. In the US, they increased from 92 to 148, up 61%.

Globally, ISP outages increased from 173 to 208, up 20%, and in the US they jumped from 63 to 109, up 73%.

Globally, cloud-provider network outages increased from six to 18, and in the US increased from six to 11.

Globally, collaboration-app network outages increased from 10 to 12, and from seven to nine in the US.

Two notable outages:

On June 28, Hurricane Electric experienced an outage that impacted customers and downstream partners across countries including the US, Malaysia, Turkey, Argentina, Germany, Slovenia, Switzerland, Hong Kong, Australia, Japan, and the UK. The outage was first observed around 10:20 p.m. EDT and lasted a total of 24 minutes. It appeared initially centered on Hurricane Electric nodes in Marseille, France, and around five minutes later nodes in New York, New York, and Vienna, Austria showed outage conditions. The number of affected parties appeared to peak 10 minutes after the outage was observed, and it was cleared around 10:45 p.m. EDT. Click here for an interactive view.

On June 30, Cogent Communications experienced a 31-minute outage affecting multiple downstream providers and customers in countries including the US, Brazil, Singapore, Australia, Hong Kong, New Zealand, Mexico, the UK, Germany, France, Spain, India, and Austria. The outage was first observed around 4:45 a.m. EDT centered on Cogent nodes in Los Angeles and San Jose, California; Phoenix, Arizona; and Houston, Texas. Five minutes later, nodes in San Francisco and Oakland, California; Miami, Florida; El Paso, Texas; and Hong Kong also showed outage conditions. The outage was cleared around 5:20 a.m. EDT. Click here for an interactive view.

Updated June 27

Global outages across all three categories last week decreased from 281 to 247, down 12% compared to the week prior. In the US, outages increased from 89 to 92, up 3%.

Globally, the number of ISP outages decreased from 215 to 173, down 20%, and in the US increased from 62 to 63.

Globally, cloud-provider network outages decreased from seven to six, and in the US increased from four to six.

Globally, collaboration-app network outages jumped from one to 10, and in the US increased from one to seven.

Two notable outages:

On June 21, Cloudflare suffered an interruption that impacted its customers globally. First observed around 2:30 a.m. EDT, the disruption lasted around one hour and 10 minutes and saw Cloudflare nodes exhibiting outage conditions in London, England; Vancouver, Canada; Dallas, Texas; Singapore; Tokyo, Japan; Cadiz, Spain; Frankfurt, Germany; and Sydney, Australia. The interruption appeared to prevent some customer traffic from flowing properly to websites and services that rely on Cloudflare. The company announced that the outage was a result of a network-configuration change that prevented traffic from flowing to the Cloudflare infrastructure. Around 3:00 a.m. EDT, Cloudflare announced it had identified the cause and began rolling back the change. Around 3:10 a.m. EDT, connectivity appeared to be restored, and Cloudflare declared the outage cleared around 3:40 a.m. EDT. Click here for an interactive view.

On June 23, UUNET Verizon experienced an outage that impacted customers and partners across countries including the US, UK, Canada, Italy, Japan, the Netherlands, Portugal, China, Germany, and India. The outage was first observed around 12:40 a.m. EDT and appeared to be centered on Verizon Business nodes located in New York, New York; Newark, New Jersey; San Jose, California; and Seattle, Washington. The outage was divided into three occurrences spanning four hours and 45 minutes. Mainly customers and partners in the US were impacted. The total outage lasted around three hours and 22 minutes and was cleared around 5:25 a.m. EDT. Click here for an interactive view.

Updated June 20

Global outages across all three categories last week decreased from 309 to 281, down 9% from the week prior. In the US, outages decreased from 129 to 89, down 31%.

Globally, ISP outages decreased from 228 to 215, down 6%, and in the US, they decreased from 99 to 62, down 37%.

Globally, cloud-provider network outages increased from four to seven, while in the US they increased from two to four.

Globally, collaboration-app network outages dropped from 10 to one, and in the US dropped from four to one.

Two notable outages:

On June 16, Cogent Communications experienced an outage impacting downstream providers as well as customers in countries including the US, Italy, the UK, Canada, Spain, South Africa, Germany, and Japan. The 19-minute outage was first observed around 8:25 a.m. EDT centered on nodes located in London, England. Ten minutes later, nodes in York, England, were also affected. The outage was cleared around 8:45 a.m. EDT. Click here for an interactive view.

On June 16, NTT America experienced an outage impacting customers and downstream partners across countries including the US, Argentina, Uruguay, Brazil, Panama, and Japan. The 28-minute outage was first observed around 3:50 p.m. EDT and appeared to center on nodes in Miami, Florida. The outage was cleared around 4:20 p.m. EDT. Click here for an interactive view.

Updated June 13

Global outages across all three categories last week jumped from 183 to 309, up 69%, and in the US, increased from 85 to 129, up 52%.

Globally, the number of ISP outages jumped from 132 to 228, up 73%, and in the US increased from 59 to 99, up 68%.

Globally, cloud-provider network outages decreased from seven to four, and from three to two in the US.

Globally, collaboration-app network outages doubled from five to 10, and remained the same at four in the US.

Two notable outages:

On June 7, Hurricane Electric experienced an outage affecting customers and downstream partners across regions including the US, Mexico, Peru, Singapore, China, Canada, Argentina, Costa Rica, and Brazil. The outage, first observed around 5:40 p.m. EDT, was cleared at around 6:10 p.m. EDT. Click here for an interactive view.

On June 8, Cogent Communications experienced an outage affecting downstream providers as well as Cogent customers in countries including the US, Australia, Singapore, Republic of Korea, the UK, Germany, New Zealand, Hong Kong, Japan, Italy, Spain, Israel, Bulgaria, and Canada. First observed around 12:25 a.m. EDT, the outage lasted 36 minutes in total, distributed across six occurrences over a period of three hours and 45 minutes. The outage was cleared around 4:15 a.m. EDT. Click here for an interactive view.

Updated June 6

Global outages across all three categories last week decreased from 228 to 183, down 20% compared to the week before. In the US, outages decreased from 99 to 85, down 14%.

Globally, ISP outages decreased from 165 to 132, down 20%, and in the US dropped from 79 to 59, down 25%.

Globally, cloud-provider network outages increased from five to seven, and in the US remained at three.

Globally, collaboration-app network outages increased from two to five, and in the US jumped from one to four.

There were two notable outages during the week.

On May 30, a Hurricane Electric outage affected customers and downstream partners across the US, Spain, Canada, Italy, Japan, Thailand, Australia, Sweden, Costa Rica, the UK, Malaysia, Singapore, Switzerland, Belgium, India, Brazil, New Zealand, and Hong Kong. The outage, first observed at around 7 a.m. EDT, lasted a total of 34 minutes and initially appeared to center on Hurricane Electric nodes located in Marseille, France, and Frankfurt, Germany. Ten minutes into the outage, the Marseille and Frankfurt nodes appeared to recover, and nodes in New York, New York, exhibited outage conditions. After 20 minutes, nodes in London, England; Paris, France; and Marseille also exhibited outage conditions, representing the peak in terms of the number of partners and customers affected. After about five more minutes, the London and Paris nodes appeared to clear. The rest of the outage was cleared around 7:35 a.m. EDT. Click here for an interactive view.

On June 1, Amazon experienced an interruption affecting some of its downstream partners and customers in the US, Australia, India, and Sweden. The 14-minute outage was first observed around 7:20 a.m. EDT and appeared centered on Amazon nodes in Ashburn, Virginia. Five minutes into the outage, some of those nodes appeared to recover, then returned to outage conditions until the outage was cleared around 7:35 a.m. EDT. Click here for an interactive view.

Updated May 30

Global outages across all three categories last week decreased from 277 to 228, down18%, and from 102 to 99 in the US, a 3% decreaser. 

Globally, ISP outages dropped from 212 to 165, down 22%, and rose slightly in the US from 76 to 79, up 4%.

Globally, cloud-provider network outages dropped from 17 to five and from 11 to three in the US.

Globally, collaboration-app network outages dropped from four to two, and in the US from four to one.

There were two notable outages during the week.

On May 26, TATA Communications (America) Inc. experienced an outage affecting many of its downstream partners and customers in regions including the US, India, Singapore, Hong Kong, United Arab Emirates, and China. The outage lasted 33 minutes and was divided into three segments over a 1 hour and 50-minute period. It was first observed around 4:45 a.m. EDT centered on TATA nodes in Los Angeles, California; Pune, India; and Singapore. That lasted around 23 minutes, but 10 minutes into it the nodes in Pune appeared to clear. Then nodes in Delhi, India, and Paris, France, exhibited outage conditions. Five minutes later the Los Angeles, Delhi, and Paris nodes appeared to clear. Twenty-five minutes after the first occurrence cleared, the outage reappeared, with Los Angeles nodes initially appearing to exhibit outage conditions again. After five minutes they appeared to clear and Singapore nodes exhibited outage conditions. Forty-five minutes after the Singapore nodes appeared to clear, the third occurrence was observed centered on San Francisco, California, nodes. The outage was cleared around 6:35 a.m. EDT. Click here for an interactive view.

On May 24, Microsoft experienced an outage on its network affecting some downstream partners and access to services running on Microsoft environments. The outage, which lasted 14 minutes, was first observed around 12:35 p.m. EDT and appeared to be centered on Microsoft nodes in Des Moines, Iowa. Five minutes later the affected Des Moines nodes appeared to begin to clear, gradually decreasing the number of impacted partners. The outage was cleared around 12:50 p.m. EDT. Click here for an interactive view.

Updated May 23

Global outages across all three categories last week jumped from 204 to 277, up 36%, and in the US, they increased from 85 to 102, up 20%.

Globally, ISP outages increased, from 143 to 212, up 48%, and in the US increased from 64 to 76, up 19%.

Globally, cloud-provider network outages increased from nine to 17, and in the US from six to 11.

Globally, collaboration-app network outages dropped from eight to four, and in the US from six to four.

There were two notable outages.

On May 17, NTT America experienced an outage affecting some customers and downstream partners across countries including the US, the UK, Germany, and the Netherlands. The outage, lasting around 20 minutes, was observed around 11:05 a.m. EDT and appeared to center on NTT nodes in San Jose, California; Dallas, Texas; and Seattle, Washington. Around 15 minutes into the outage, a number of Dallas nodes appeared to recover. The outage was cleared around 11:25 a.m. EDT. Click here for an interactive view.

On May 19, Amazon experienced an interruption affecting some of its partners and customers in countries including the US, Brazil, India, Armenia, and France. The outage, lasting around 10 minutes, was first observed around 2:45 p.m. EDT, apparently centered on Amazon nodes in Ashburn, Virginia. The number of impacted countries appeared at its highest for the first five minutes, decreasing throughout the duration of the outage until the last minutes, when it appeared to affect only the US and India. The outage was cleared around 2:55 p.m. EDT. Click here for an interactive view.

Updated May 16

Global outages across all three categories last week decreased from 237 to 204, down 14% compared to the week prior. In the US they decreased from 98 to 85 (13%).

Globally, ISP outages decreased from 175 to 143 (18%), and in the US declined from 78 to 64 (18%).

Globally, cloud-provider network outages remained at nine, while in the US they doubled from three to six.

Globally, collaboration-app network outages increased slightly last week, from seven to eight, and in the US increased from four to six.

There were two notable outages during the week.

On May 11, Hurricane Electric experienced an outage affecting customers and downstream partners in the US, Hong Kong, Malaysia, Brazil, United Arab Emirates, Canada, India, Germany, the Netherlands, Australia, Costa Rica, and the UK. The outage was first observed at 9:45 p.m. EDT and lasted a total of 16 minutes in two occurrences over a 35-minute period. The first appeared to center on Hurricane Electric nodes in London, England. Five minutes into the first occurrence, the London nodes appeared to clear, but nodes in San Jose, California, exhibited outage conditions. Fifteen minutes after that appeared to clear, nodes in San Jose, California, again exhibited outage conditions, as did nodes in New York, New York, and Chicago, Illinois. The outage was cleared around 10:20 p.m. EDT. Click here for an interactive view.

On May 10, Level 3 Communications experienced an outage that affected downstream partners and customers across the US. The outage was first observed around 1:55 a.m. EDT centered on Level 3 nodes in San Francisco, California, and was cleared around 2:05 a.m. EDT. Click here for an interactive view.

Updated May 9

Global outages across all three categories decreased from 285 to 237 (17%) last week, while in the US they decreased from 107 to 98 (8%).

Globally, ISP outages decreased from 209 to 175 (16%), and from 91 to 78 (14%) in the US.

Globally, cloud-provider network outages rose from five to nine, and in the US increased from two to three.

Globally, collaboration-app network outages increased from four to seven, and in the US from two to four.

There were two notable outages during the week.

On May 5, Hurricane Electric experienced an outage that impacted customers and downstream partners across countries including the US, Canada, Turkey, the Netherlands, Egypt, Australia, Singapore, France, Hong Kong, the UK, and New Zealand. The outage came in 10 occurrences over a period of three hours, 33 minutes starting at 1:55 a.m. EDT. The first initially centered on Hurricane Electric nodes in Chicago, Illinois. Five minutes later, nodes in Dallas, Texas, also showed outage conditions, and eight minutes after that, the nodes appeared to clear. Five minutes later the Chicago nodes again appeared to exhibit outage conditions, then appeared to clear four minutes after that, only to exhibit outage conditions again after 10 minutes. Around 2:30 a.m. EDT, the Chicago nodes appeared to clear, but showed outage conditions again 30 minutes later, as did nodes in San Jose, California. Six minutes after this outage cleared, the fifth outage occurred, again centered in Chicago and lasting six minutes but mainly affecting the US. Around 3:33 a.m. EDT, the Chicago nodes again began exhibiting outage conditions. Seven minutes into this occurrence, the Chicago nodes appeared to clear, but nodes in San Jose exhibited outages. Five minutes after that occurrence cleared, a two-minute outage affected nodes in Denver, Colorado. The eighth and longest occurrence lasted 18 minutes, affecting nodes in Chicago and New York, New York. Five minutes later, Kansas City, Missouri, nodes showed outage conditions. Ten minutes into that occurrence, the New York nodes appeared to clear, but those in San Jose showed outage conditions. Twenty-five minutes after it cleared, a ninth occurrence was observed in nodes in Minneapolis, Minnesota; Portland, Oregon; and Denver. Five minutes into this occurrence, they appeared to clear and the San Jose nodes exhibited outage conditions. They cleared, but then 10 minutes later exhibited outage conditions again, as did nodes in Portland and Denver. That lasted 14 minutes, finally clearing around 5:25 a.m. EDT. Click here for an interactive view.

At 3:40 a.m. EDT on May 4, NTT America experienced a 14-minute outage that impacted some customers and downstream partners across the US. It appeared to center on NTT nodes in San Jose, California, and five minutes later, some of the nodes appeared to clear. The outage was cleared around 3:55 a.m. EDT. Click here for an interactive view.

Updated May 2

Global outages across all three categories last week increased from 219 to 285, up 30%, while in the US they increased from 81 to 107, a 32% jump.

Globally, ISP outages increased from 156 to 209, up 34%, and in the US from 62 to 91, a 47% increase.

Globally, cloud provider network outages decreased from six to five, and in the US rose from one to two.

Globally, collaboration-app network outages decreased from seven to four and in the US from three to two.

There were two notable outages during the week.

On April 27, TATA Communications experienced an outage that impacted many of its downstream partners and customers in countries including the US, the UK, Australia, Sweden, Germany, and Norway. The outage lasted 11 minutes in total and was divided into three incidents over 55 minutes. It was first observed around 5:10 a.m. EDT with an outage lasting four minutes that appeared to be centered on TATA nodes located in Newark, New Jersey, and Marseille, France. Thirty-five minutes after the first occurrence cleared, Newark nodes appeared to exhibit outage conditions again. Ten minutes after appearing to clear, they again appeared to exhibit outage conditions. The outage was cleared around 6:10 a.m. EDT. Click here for an interactive view.

On April 27, Telecom Italia Sparkle experienced an outage that impacted some of its partners and customers in countries including the US, China, New Zealand, Canada, Japan, France, Italy, the UK, Germany, Brazil, Turkey, Nigeria, India, Hong Kong, Sweden, Republic of Korea, and Austria. The 23-minute outage was first observed around 6:25 p.m. EDT and appeared to center on nodes in Ashburn, Virginia. Five minutes into the outage, nodes in Milan, Italy; Frankfurt, Germany; London, England; and Sao Paulo, Brazil, exhibited outage conditions. Twenty minutes after being observed, many of the nodes appeared to recover, leaving just nodes located in Milan and Frankfurt exhibiting outage conditions. The outage was cleared around 6:50 p.m. EDT. Click here for an interactive view.

Update April 18

Global outages across all three categories last week rose from 269 to 297, a 10% increase. In the US, outages increased from 97 to 110, up 13%.

Globally, ISP outages increased from 158 to 209, up 32%, while in the US they increased from 54 to 88.

Globally, cloud-provider outages decreased from 11 to eight, but in the US increased from two to six.

Globally, collaboration-app network outages increased from 22 to 24, up 9%, and in the US they dropped from 15 to six.

On April 13, a Hurricane Electric outage impacted customers and downstream partners across regions including the US, Colombia, Israel, Greece, Costa Rica, Canada, and Germany. The outage, first observed at around 2:51 a.m. EDT, lasted 23 minutes and appeared to center on Hurricane Electric nodes located in Miami, Florida, and Paris, France. Five minutes into the outage, the nodes located in Paris appeared to recover, and nodes located in San Jose, California, exhibited outage conditions. Five minutes after appearing, the San Jose nodes also appeared to clear, and nodes in Ashburn, Virginia, and Chicago, Illinois, exhibited outage conditions. The outage was cleared around 3:15 a.m. EDT. Click here for an interactive view.

On April 12, Oracle experienced an outage that impacted customers and downstream partners interacting with Oracle Cloud services in countries including the US, Japan, Canada, the United Arab Emirates (UAE), India, Germany, the UK, the Netherlands, Brazil, the Republic of Korea, the Philippines, and Australia. First observed at around 8 a.m. EDT and lasting a total of 38 minutes, the outage consisted of two occurrences over a 45-minute period. The first initially appeared centered on Oracle nodes in Toronto, Canada. After 10 minutes, nodes in Montreal and Mississauga, Canada; Chicago, Illinois; Dubai, UAE; Tokyo, Japan; Sydney, Australia; and London, England, exhibited outage conditions. Thirty minutes into the first occurrence, only Toronto, Kitchener, and Sydney still exhibited outage conditions. The outage appeared to clear about 8:35 a.m. EDT, but five minutes later, Toronto nodes once again appeared to exhibit outage conditions. The outage was cleared around 8:45 a.m. EDT. Click here for an interactive view.

Update March 28

Global outages across all three categories last week increased from 216 to 234, up 8%, while in the US they increased from 82 to 99, up 21%.

Global ISP outages increased from 153 to 169, up 10%, and in the US they increased from 61 to 74, up 21%.

Global cloud-provider network outages increased from nine to 15, while in the US they increased from six to nine.

Global collaboration-app network outages increased from nine to 10 outages and in the US they remained at six.

There were two notable outages during the week.

On March 21, Hurricane Electric experienced an outage that impacted customers and downstream partners across multiple regions including the US, Canada, Nigeria, Australia, Malaysia, Germany, China, Denmark, Egypt, Norway, Japan, and Belgium. The outage was observed around 7:10 p.m. EDT, and lasted 37 minutes in two occurrences over an hour and 20-minute period. The first occurrence centered on nodes in Dallas, Texas and appeared to mainly impact the US and Canada. Forty minutes after this first outage cleared, the second outage was observed centered on nodes in London, England. Five minutes into the second occurrence, the London nodes recovered and nodes in San Jose, California, exhibited outage conditions for the next 10 minutes. Twenty minutes into the second occurrence, the San Jose nodes appeared to clear and nodes in New York, New York, exhibited outage conditions. The outage was cleared at around 8:20 p.m. EDT. Click here for an interactive view.

On March 22, Rackspace Technology experienced a series of outages over a period of an hour and 34 minutes that impacted downstream providers and customers in the US. The 51-minute outage was first observed around 2:40 p.m. EDT centered on Rackspace nodes in Chicago, Illinois. Thirty-four minutes later, the Chicago nodes appeared to recover. Forty minutes after first being observed, Chicago nodes once again exhibited outage conditions over a series of five short-duration outages before clearing around 4:25 p.m. EDT. Click here for an interactive view.

Update March 21

Global outages across all three categories last week increased from 210 to 216, a 3% increase compared to the week prior. In the US, outages increased from 66 to 82, a 24% increase.

Globally, ISP outages decreased from 159 to 153, down 4%, while in the US they increased from 49 to 61, up 24%.

Globally, cloud-provider network outages increased from seven to nine and doubled in the US from three to six.

Globally, collaboration-app network outages increased from five to nine, and from three to six in the US.

There were two notable outages during the week.

On March 16, Arelion experienced an outage that impacted customers and downstream partners across multiple countries including the US, Brazil, Australia, Canada, India, and Germany. The disruption lasted a total of 28 minutes, divided into two occurrences over a 35-minute period. First observed around 3:30 a.m. EDT, the first occurrence lasted 24 minutes and appeared to center on nodes in Dallas, Texas. Five minutes later some of the nodes appeared to recover, reducing the number of downstream partners and customers impacted. Around 3:55 a.m. EDT, the remaining nodes appeared to recover. Five minutes later the Dallas nodes again exhibited outage conditions. The outage was cleared around 4:05 a.m. EDT. Click here for an interactive view.

On March 17, Cogent Communications experienced an outage that impacted multiple downstream providers as well as Cogent customers in countries including the US, Argentina, Brazil, Australia, Spain, and Canada. The 14-minute outage was observed around 5:00 a.m. EDT centered on Cogent nodes in San Francisco, California. Five minutes later, nodes in Los Angeles and Rancho Cucamonga, California, also exhibited outage conditions. As the number of impacted nodes increased, so did the number of customer networks and providers impacted. Ten minutes into the outage, the Rancho Cucamonga nodes appeared to recover and nodes in San Jose, California, exhibited outage conditions. The outage was cleared around 5:15 a.m. EDT. Click here for an interactive view.

Update March 14

Globally, the number of ISP outages decreased from 191 to 159, down 17%, and in the US were down from 80 to 49, a 39% decrease.

Globally, cloud-provider network outages remained at seven for the third week in a row but dropped from six to three in the US.

Collaboration-app network outages remained at five worldwide and at three in the US.

There were two notable outages during the week.

On March 8, Google experienced a disruption that affected Google Traffic Director customers. Observed around 1:07 p.m. EST, multiple applications, such as Spotify and Wikipedia, returned HTTP 500 server errors for some users, indicating the presence of a backend issue, and appearing to affect Google customers who used shared Virtual Private Cloud (VPC). Network connectivity to affected applications was clear during the incident, further confirming that the issue was application related. Google later confirmed that they had mitigated the issue by rolling back a recent configuration change and forcing a reprogramming of configurations. The disruption lasted 2 hours and 35 minutes and was cleared around 3:42 p.m. EST. Click here for an interactive view.

On March 9, Microsoft experienced an outage on its network that affected some downstream partners and access to services running in Microsoft environments. The outage, which lasted 19 minutes, was observed around 5:35 p.m. EST and appeared centered on Microsoft nodes in Des Moines, Iowa. Ten minutes later the number of affected Des Moines nodes appeared to rise, temporarily increasing the number of affected partners. The outage was cleared around 5:55 p.m. EST. Click here for an interactive view.

Update March 7

Global outages across all three categories last week dropped from 273 to 256, a 6% decrease compared to the week prior. In the US, outages decreased from 116 to 109, also a 6% decrease.

ISP outages globally decreased from 197 to 191, down 3%, and in the US, decreased from 95 to 80, down 16%.

Cloud-provider network outages remained at seven globally but increased in the US from two to six.

Globally, collaboration-app network outages decreased from 14 to five, down 64%, but in the US they increased from two to three.

There were two notable outages during the week.

On March 3, Oracle experienced an outage that affected Oracle Cloud customers and downstream partners in countries including the US, Malaysia, India, Japan, and Hong Kong. The outage was observed around 6:15 p.m. EST and appeared to center on Oracle nodes in Phoenix, Arizona, and Sweden. Five minutes later, the Sweden nodes appeared to recover, limiting the impact to the US and Hong Kong. The outage lasted 10 minutes in total and was cleared at around 6:30 p.m. EST. Click here for an interactive view.

On March 2, Google experienced an outage affecting its customers and downstream partners in the US and Brazil. The outage was divided into two occurrences over 39 minutes, starting at 6:25 a.m. EST and centered on Google nodes in Omaha, Nebraska. Five minutes later, nodes in Des Moines, Iowa, also exhibited outage conditions, which coincided with an increase in the number of affected customers and partners. Around 6:35 a.m. EST, the Des Moines nodes appeared to recover. About 6:40 a.m. EST, nodes in Des Moines and Sao Paulo, Brazil, exhibited outage conditions. The first occurrence, lasting 28 minutes, cleared around 6:55 a.m. EST. Five minutes after appearing to clear, the Des Moines and Omaha nodes began exhibiting outage conditions again. The outage lasted 32 minutes in total and was cleared at around 7:05 a.m. EST. Click here for an interactive view.

Updated Feb. 28

Global outages across all three categories increased from 205 to 273, up 33%, while in the US they increased from 87 to 116, also up 33%, compared to the week prior.

Globally, ISP outages increased from 156 to 197, up 26%, and in the US they increased from 70 to 95, up 36%.

Global cloud-provider outages dropped from eight to seven, and from four to two in the US.

Collaboration-app network outages increased from 10 to 14 worldwide, but they dropped from five to two in the US.

There were two notable outages during the week.

Around 9 a.m. EST on Feb. 22, Slack experienced disruption to its business-communication platform that lasted around 3 hours and 14 minutes and impacted users accessing its messaging services. During the interruption, a number of application-based errors were observed, indicating that network connectivity to Slack was intact, and the problem resided within the back-end architecture. This was later confirmed by Slack, which identified the cause as a configuration change that inadvertently led to a sudden increase in activity on the Slack database infrastructure. That left some databases unable to serve incoming requests. Slack applied a combination of rate limits and a temporary redirection of requests to replica databases, allowing the system to recover. The outage was cleared around 12:14 p.m. EST.

Around 10:06 p.m. EST on Feb. 24, PCCW experienced an outage impacting some of its ISP customers and networks in countries including the US and China. It appeared to center on PCCW nodes located in Ashburn, Virginia, and was cleared around 10:20 p.m. EST. Click here for an interactive view.

Updated Feb. 21

Global outages across all three categories decreased from 271 to 205, down 24% from the week before, while US outages decreased from 119 to 87, down 27%.

Globally, ISP outages decreased from 191 to 156, an 18% decrease, and US outages dropped from 96 to 70, a 27% decrease.

Cloud provider outages worldwide dropped from 10 to eight, and in the US from six to four. 

Globally, collaboration-app network outages decreased from 13 to 10, while in the US they increased from two to five.

There were two notable outages during the week.

On Feb. 17, Level 3 Communications experienced an outage that impacted multiple downstream partners and customers across the US for a total of 28 minutes, divided into two occurrences distributed over an hour and 35 minutes. The first occurrence was observed around 2:40 p.m. EST centered on Level 3 nodes in Salt Lake City, Utah. An hour and five minutes after appearing to clear, the Salt Lake nodes began exhibiting outage conditions again that lasted nine minutes. The outage was cleared around 4:15 p.m. EST. Click here for an interactive view.

On Feb. 17, Oracle experienced an outage on its network that affected customers and downstream partners interacting with Oracle Cloud services in countries including the US, Hong Kong, Australia, and Brazil. The outage was observed around 3:05 p.m. EST centered on Oracle nodes in Phoenix, Arizona, and Sao Paulo, Brazil. Five minutes later, the Sao Paulo nodes appeared to recover, reducing the number of affected countries to the US, Hong Kong, and Australia. The outage lasted 18 minutes and was cleared around 3:25 p.m. EST. Click here for an interactive view.

Updated Jan. 17

Global outages across all three categories last week increased from 225 to 271, up 20%, while in the US they rose from 104 to 119, a 14% increase.

The number of ISP outages globally increased from 151 to 191, up 26%, and in the US they were up from 79 to 96, a 22% increase.

Cloud-provider network outages decreased from 11 to 10. In the US they decreased from seven to six.

Globally, collaboration-app network outages decreased from 14 to 13 while in the US they dropped from three to two.

On Feb. 9, Oracle experienced an outage affecting Oracle Cloud customers and partners interacting with those services in countries including the US, Japan, Germany, Ireland, India, Canada, Brazil, Belgium, the Netherlands, Malaysia, Finland, the UK, Sweden, Poland, Spain, Australia, New Zealand, South Africa, Hong Kong, Austria, Russia, Turkey, Hungary, Taiwan, Greece, Portugal, Ukraine, and China. The outage came in three waves over a space of 3 hours and 15 minutes. The first occurrence, lasting about 14 minutes, started around 8:20 p.m. EST and centered on Oracle nodes in San Jose, California; Phoenix, Arizona; Jerusalem, Israel; Hyderabad, India; and Tokyo, Japan. An hour and 14 minutes after that, nodes in San Jose; Hyderabad; Amsterdam, the Netherlands; and Sao Paulo, Brazil, exhibited outage conditions. After 15 minutes, nodes in Sydney and Melbourne, Australia; Singapore; Ashburn, Virginia; Washington, DC; Frankfurt, Germany; San Francisco, California; Toronto, Canada; and Phoenix, Arizona, exhibited error conditions as well. That second occurrence lasted an hour and 4 minutes, appearing to clear around 11:20 p.m. EST. After 10 minutes, nodes in San Jose and Hyderabad once again exhibited outage conditions for 4 minutes, affecting customers in the US, Poland, and the UK. That phase of the outage cleared at 11:35 p.m. EST, making the total time of the outages an hour and 22 minutes. Click here for an interactive view.

On Feb. 8, Time Warner Cable experienced a disruption that affected customers and partners in countries including the US, Canada, France, Hong Kong, the UK, India, Singapore, Australia, Germany, Ireland, Malaysia, Brazil, the Netherlands, Italy, Indonesia, Japan, Mexico, and Republic of Korea. Outages occurred in five periods over the course of an hour and 15 minutes. The first period started at 1:30 p.m. EST, lasted four minutes, and appeared to center on nodes in Denver, Colorado. About 20 minutes after that, a second, 24-minute occurrence was observed on nodes in Los Angeles, California, and 10 minutes into that second period, nodes in Las Vegas, Nevada, and Denver, Colorado, also began exhibiting outage conditions. Around 2:15 p.m. EST, the Los Angeles, Las Vegas, and Denver nodes appeared to clear. Five minutes later, at around 2:20 p.m. EST, the Los Angeles and Denver nodes exhibited outage conditions again in three 4-minute bursts over a 40-minute period. It was cleared around 2:45 p.m. EST. Click here for an interactive view.

Updated Jan. 31

Global outages across all three categories last week decreased from 243 to 198, down 19% compared to the week prior. In the US, outages increased from 81 to 86, up 6%.

Globally, the number of ISP outages dropped from 188 to 144, down 23%, while in the US they decreased from 68 to 66, down 3%.

Cloud-provider network outages globally increased from six to seven and from four to five in the US.

Globally, collaboration-app network outages dropped from 11 to three and from three to one in the US.

On Jan. 27, Hurricane Electric experienced an outage that impacted customers and downstream partners across countries including the US, the UK, France, Hong Kong, India, Japan, Germany, Singapore, China, Malaysia, and Canada. The outage was divided into three occurrences over a five-minute period. The first period, lasting around 9 minutes, was observed around 12:25 a.m. EST, centering on Hurricane Electric nodes in New York, New York, and Chicago, Illinois. Around 15 minutes after the first occurrence appeared to clear, the second was observed on nodes in Chicago. Around 1 a.m. EST, the Chicago nodes appeared to recover, but 10 minutes later, they and nodes in New York began exhibiting outage conditions again. This occurrence lasted around a minute before appearing to clear. The total outage lasted around 26 minutes and was cleared around 1:25 a.m. EST.

Updated Jan. 24

Global outages across all three categories increased from 236 to 243 last week, a 3% increase compared to the week prior. In the US, outages decreased from 86 to 81, down 6%.

Globally, the number of ISP outages increased from 173 to 188, up 9%, and in the US they increased from 66 to 68, up 3%.

Global cloud-provider outages decreased from 10 to six, and in the US they dropped from five to four.

Collaboration app network outages worldwide increased from six to 11, while in the US they remained at three for the third week in a row.

There were two notable outages last week.

On Jan. 20, Cogent Communications experienced a series of outages over a period of an hour and 28 minutes that impacted customers globally and multiple downstream providers. The 40-minute outage was first observed around 10:08 p.m. EST and initially centered on Cogent nodes in Denver, Colorado, and Salt Lake City, Utah. This initial outage lasted around a minute and the Cogent environment remained stable for 15 minutes before experiencing a 9-minute outage observed on Cogent nodes in Dallas, Texas, and Phoenix, Arizona. An hour and a half after the initial outage, a 4-minute outage was observed on Denver nodes. Five minutes after that outage appeared to clear, nodes in Denver began exhibiting outage conditions again. Fifty-two minutes after first being observed, the outage reappeared, affecting nodes in Sacramento and Oakland, California; Salt Lake City; and Denver, and affecting more and more customers and providers. This occurrence lasted 17 minutes before appearing to clear around 11:20 p.m. EST. Five minutes later, Salt Lake City nodes exhibited outage conditions before clearing and being replaced by nodes located in Sacramento. The outage was cleared around 11:35 p.m. EST. Click here for an interactive view of the outage.

On Jan. 17, PCCW, a Hong Kong-based Tier 1 ISP, experienced an outage impacting some of its customers and networks in countries including the US, Singapore, Thailand, and China. The outage was first observed around 9:15 p.m. EST and appeared to center on PCCW nodes in Singapore. It lasted 34 minutes and was cleared around 9:50 p.m. EST. Click here for an interactive view of the outage.

Updated Jan. 17

Global outages across all three categories increased from 206 to 236 last week, up 15% compared to the week before. In the US they increased from 78 to 86, up 10%.

Globally, ISP outages increased for the second consecutive week, rising from 150 to 173, up 15%, while in the US they increased from 57 to 66, up 16%.

Cloud-provider network outages worldwide jumped from two to 10, and from two to five in the US.

Globally, collaboration-app network outages doubled from three to six, and remained the same at three in the US.

Notable outages

On Jan. 10, Cogent Communications experienced an outage that affected some of its downstream providers and customers in countries including the US and Mexico. The outage was first observed around 10:20 p.m. EST and appeared to center on nodes in Phoenix, Arizona. The outage lasted 14 minutes and was cleared around 10:35 p.m. EST. Click here for an interactive view of the outage.

On Jan. 12, Hurricane Electric experienced an outage affecting customers and downstream partners across regions including the US, Hong Kong, Japan, Canada, Australia, the Philippines, and Thailand. The 14-minute outage was first observed around 1 p.m. EST and appeared to center on Hurricane Electric nodes in Minneapolis, Minnesota, and Seattle, Washington. Five minutes later, the Minneapolis nodes appeared to recover and those in Tokyo, Japan, exhibited outage conditions. This coincided with an increase in the number of downstream partners and countries impacted. Around 1:10 p.m. EST, the Tokyo nodes appeared to clear. The outage was cleared around 1:15 p.m. EST. Click here for an interactive view of the outage.

Update Jan. 10

Global outages across all three categories last week increased from 180 to 206, up 14% compared to the week prior. In the US, outages increased from 48 to 78, a 63% increase.

Globally the number of ISP outages increased from 121 to 150, up 24%, and in the US they increased from 40 to 57, up 43%.

Cloud-provider network outages dropped from five to two worldwide, but in the US they increased from zero to two.

Globally, collaboration-app network outages decreased from five to three, and rose from two to three in the US.

There were two notable outages during the week.

On Jan. 3, Oracle experienced an outage on its network affecting Oracle Cloud services in the US. The outage was divided into two occurrences over a 25-minute period. The first occurrence was observed around 5:20 a.m. EST centered on Oracle nodes in Austin, Texas, and lasted about a minute. Five minutes after appearing to clear, they began exhibiting outage conditions again that lasted 12 minutes. The outage was cleared around 5:35 a.m. EST.

On Jan. 5, Microsoft experienced an outage on its network that affected access to services running on Microsoft environments. The outage, first observed around 11:20 p.m. EST, lasted nine minutes and appeared to be centered on Microsoft nodes in Chicago, Illinois. Five minutes later, a number of the Chicago nodes appeared to recover, reducing the number of affected partners. The outage was cleared around 11:30 p.m. EST. Given the duration, timing, and location of the nodes the cause is likely to have been a maintenance exercise.

Update Dec. 13

Global outages across all three categories last week increased from 287 to 356, up 24% from the week prior. In the US, outages increased from 103 to 130, up 26%.

Globally, the number of ISP outages increased from 209 to 261, up 25%, while in the US, they increased from 85 to 108, up 27%.

Cloud-provider network outages worldwide dropped from 28 to 27. In the US, they increased from two to five.

Globally, collaboration-app network outages jumped from five to 13 outages, while in the US they increased from three to seven.

On December 7, AWS experienced an outage disrupting users and customers accessing its services in regions across the globe. The outage was first observed around 10:40 a.m. EST and appeared to be centered on infrastructure in the AWS US-EAST-1 region, located in Northern Virginia. It initially affected services relied upon by non-Amazon apps and services across regions including the US, Europe, and APJC. The impact varied depending on the user’s IP address. At around 12:37 p.m. EST, Amazon announced it had identified issues related to an application programming interface (API) and was working on recovering services. At 5:43 p.m. EST, Amazon announced it had mitigated the underlying issue, and services began to return to normal. Around 6:03 p.m. EST, most services had been restored, with some disruption still being experienced on the AWS API gateway service. The disruption can be divided into two occurrences, with the first appearing to be the most prominent and lasting around an hour and four minutes; many of the services were restored around 11:44 a.m. EST. The second, lasting around eight hours, was cleared around 8 p.m. EST.

On December 9, NTT America experienced an outage that affected some customers and downstream partners across the US and Japan. The 54-minute outage was observed around 2:00 a.m. EST and appeared to center on NTT nodes in Ashburn, Virginia. Fifteen minutes into the outage, some of the nodes appeared to clear, leaving just downstream customers and partners located in the US affected. The outage was cleared around 2:55 a.m. EST.

Update Dec. 6

Global outages across all three categories last week decreased from 290 to 287. In the US they increased from 93 to 103.

Globally, ISP outages decreased from 234 to 209, down 11%, while in the US they increased from 79 to 85, up 8%.

Cloud-provider network outages increased from 17 to 28 worldwide, and in the US they dropped from three to two.

Globally, collaboration-app network outages decreased from seven to five while in the US they remained at three.

There were two notable outages during the week. On November 29, Hurricane Electric experienced an outage that affected customers and downstream partners in countries including the US, Hong Kong, the Republic of Korea, Singapore, the Philippines, Malaysia, Switzerland, and Canada. The 12-minute outage was observed around 4:20 p.m. EST centered on Hurricane Electric nodes in Hong Kong. Five minutes later nodes in San Jose, California, also exhibited outage conditions, increasing the number of partners and countries affected. Around 4:30 p.m. EST, the Hong Kong nodes appeared to clear, leaving nodes in San Jose; Minneapolis, Minnesota; and Seattle, Washington, with outage conditions until the outage cleared around 4:35 p.m. EST.

At 5:55 p.m. EST on December 2, GTT Communications experienced an outage centered on GTT nodes in San Jose, California, that affected services in the US and the UK. The outage was cleared around 6:10 p.m. EST.

Update Nov. 29

Globally, outages across all three categories last week decreased from 388 to 290, down 25% and dropped from 153 to 93 in the US, down 39%.

Globally, the number of ISP outages last week decreased from 287 to 234, down 18%, and from 119 to 79 in the US, a 34% drop.

Worldwide, cloud-provider network outages decreased from 22 to 17, and remained at three in the US.

Globally, collaboration-app network outages decreased from eight to seven, while in the US they dropped from six to three.

On Nov. 22, Telia Carrier experienced an outage that impacted customers and partners across countries including the US, Panama, Costa Rica, Brazil, Argentina, and Australia. First observed around 11:20 p.m. EST, the outage appeared to center on nodes in Atlanta, Georgia. Nodes in San Francisco, California, exhibited outage conditions 10 minutes later and appeared to recover around 11:35 p.m. EST. The Atlanta outage continued and was cleared around 11:50 p.m. EST. Click here for an interactive view of the outage.

On Nov. 23, Cogent Communications experienced an outage affecting providers and customers in countries including the US, Spain, New Zealand, Mexico, Canada, and Australia. The outage played out in three occurrences over an hour and five minutes starting around 1:20 a.m. EST centered on Cogent nodes in San Francisco, Oakland, and Sacramento, California, and Washington, DC. The first occurrence lasted four minutes before all nodes appeared to recover. The second occurrence started 25 minutes later around 1:50 a.m. EST and lasted around eight minutes. It appeared to center on nodes in Sacramento. Five minutes into the second occurrence, nodes in San Francisco, Oakland, and Salt Lake City, Utah, exhibited outage conditions. The third occurrence started 10 minutes after the second appeared to clear and affected nodes in Atlanta, Georgia. The outage lasted a total of 21 minutes and was cleared at around 2:25 a.m. EST. Click here for an interactive view of the outage.

Update Nov. 15

Global outages across all three categories increased from 307 to 360, a 17% increase compared to the week prior. In the US, outages increased from 141 to 163, a 16% increase.

Globally, ISP outages increased from 221 to 259, up 17%, and in the US they increased from 111 to 135, up 22%.

Globally, cloud-provider network outages increased from 16 to 27, a 69% increase, and in the US, they rose from one to two.

Globally, collaboration-app network outages rose from six to 11, and from six to nine in the US.

On Nov. 9, Comcast Cable Communications experienced two outages that affected downstream partners and customers across the US. The first outage, lasting over an hour, was observed at approximately 12:45 a.m. EST and appeared to center on Comcast nodes in Sunnyvale, California, primarily affecting customers on the West Coast. After appearing to clear around 1:25 a.m. EST, the nodes again exhibited outage conditions before clearing around 1:35 a.m. EST. The second outage, observed about 8:05 a.m., lasted over an hour and intermittently impacted routes across regions including Chicago, Illinois; Pittsburgh, Pennsylvania; and Ashburn, Virginia. The outage was cleared around 9:20 a.m. EST.

On Nov. 11, NTT America experienced an outage that impacted some of its customers and downstream partners across the US, Brazil, China, Japan, and Canada. Observed around 1 p.m. EST, the 19-minute outage appeared to center on NTT nodes in Osaka, Japan. Five minutes later, nodes in San Jose, California, also began to exhibit outage conditions. Around 1:10 p.m. EST the San Jose nodes appeared to recover, and the outage was cleared around 1:20 p.m. EST.

Update Nov. 8

Global outages in all three categories increased from 278 to 307, a 10% increase compared to the week prior. In the US, however, they decreased from 147 to 141, a 4% dip.

Globally, ISP outages increased from 197 to 221, up 12%, and also increased in the US from 109 to 111, up 2%.

Cloud-provider network outages globally decreased from 22 to 16 outages, and in the US, dropped from five to one.

Globally, collaboration-app network outages, all of them in the US, decreased from eight to six.

Two notable outages:

On Nov. 4, Telia Carrier experienced an outage affecting customers and downstream partners across the US. The disruption lasted a total of 43 minutes, divided into two occurrences over a 55-minute period. First observed around 7:40 a.m. EDT, the first occurrence lasted 39 minutes centered on nodes in Phoenix, Arizona, and Los Angeles, California. Five minutes later, the Phoenix nodes appeared to recover, before appearing to exhibit outages again 10 minutes later. The Los Angeles nodes appeared to recover 35 minutes after first being observed, leaving the Phoenix nodes the only ones exhibiting outage conditions. The outage was cleared around 8:35 a.m. EDT.

On Nov. 4, Cogent Communications experienced an outage affecting some downstream providers and customers in the US, Mexico, Hong Kong, Canada, the UK, Germany, and Singapore. The 33-minute outage was observed around 10:15 p.m. EDT centered on nodes in Kansas City, Missouri, and Chicago, Illinois. After 15 minutes, nodes in Cincinnati and Cleveland, Ohio; San Francisco, California; and Denver, Colorado, also showed outage conditions. Five minutes later the Cleveland and Cincinnati nodes appeared to recover, leaving the US as the only affected country. Thirty minutes after the initial occurrence, only the Chicago nodes showed outage conditions. The outage was cleared around 10:50 p.m. EDT.

Update Nov. 1

Outages globally across all three categories decreased from 354 to 278, a 21% drop compared to the week before. In the US they decreased from 154 to 147, down 5%.

ISP outages globally decreased from 254 to 197, down 22%, and from 112 to 109 in the US, down 3%.

Both globally (22) and in the US (5), cloud-provider network outages remained the same.

Collaboration-app network outages increased from one to eight, all of them in the US.

There were two notable outages during the week.

On Oct. 27, Hurricane Electric experienced an outage that affected customers and partners across countries including the US, the UK, Russia, Brazil, France, Hong Kong, New Zealand, Finland, Japan, Germany, the Netherlands, and Canada. The outage was divided into eight occurrences over two hours and 10 minutes. The first, lasting around five minutes, was observed at around 9:29 p.m. EDT, centering on Hurricane Electric nodes located in New York, New York, and Chicago, Illinois. Around 15 minutes after it appeared to clear, the second one started. Five minutes into it, nodes in Ashburn, Virginia, exhibited outages. Around 10 p.m. EDT, all the nodes appeared to recover, but 10 minutes later the nodes in Chicago began exhibiting outage conditions again for about two minutes. Fifteen minutes later, nodes in New York and Chicago again exhibited outage conditions. Nodes in those locations repeatedly appeared to recover then exhibited outage conditions for the next five occurrences. The total outage lasted around 40 minutes and was cleared at around 11:40 p.m. EDT.

On Oct. 25, Rackspace Technology experienced an outage affecting some customers and partners across countries including the US, France, the UK, Spain, Singapore, Hong Kong, Vietnam, Mexico, Chile, Switzerland, Canada, and the Netherlands. The outage, lasting around 28 minutes, was observed around 10:46 a.m. EDT and appeared to center on Rackspace nodes in Chicago, Illinois. Twenty minutes later, some of the nodes apparently began to recover before clearing at around 11:15 a.m. EDT.

Update Oct. 25

Global outages across all three categories dropped from 387 to 354, a 9% decrease compared to the week prior. In the US, outages decreased from 185 to 154, down 17%.

The number of ISP outages globally dropped from 281 to 254, down 10%, and US outages dropped from 150 to 112, down 25%.

Globally, cloud-provider outages decreased from 30 to 22, down 27%, while in the US, they were cut in half, from 10 to five.

There was just one collaboration-app network outage and that was in the US. The previous week there were two outages globally.

There were two notable outages during the week.

On Oct. 19, Hurricane Electric experienced an outage affecting customers and downstream partners across the US, Australia, the United Arab Emirates, Thailand, Malaysia, and Canada. The outage was divided into three occurrences over 34 minutes, starting at 1:51 a.m. EDT, centering on Hurricane Electric nodes in Chicago, Illinois. Around five minutes after the first occurrence appeared to clear, outage conditions appeared at nodes located in Los Angeles, California, and lasted around eight minutes. The outage was cleared around 2:25 a.m. EDT.

On Oct. 21, Level 3 Communications experienced an outage affecting multiple downstream partners and customers. The outage lasted around 11 minutes total, divided between two occurrences distributed over a 25-minute period. The first lasted around eight minutes and was observed around 5:30 a.m. EDT and appeared centered on Level 3 nodes in Phoenix, Arizona. It appeared to clear, but 10 minutes later the Phoenix nodes exhibited outage conditions again. The outage was cleared around 5:55 a.m. EDT.

Update Oct. 18

Global outages across all three categories increased from 352 to 387, a 10% increase, and from 184 to 185 in the US.

The number of ISP outages increased from 270 to 281, and decreased in the US from 163 to 150.

Globally, cloud-provider network outages rose from 18 to 30, while in the US, they increased from three to 10.

Collaboration-app network outages worldwide increased from one to two, and stayed the same at one in the US.

There were two notable outages during the week. On October 13, GTT Communications experienced an outage that affected partners and customers across countries including the US, China, the UK, Ireland, India, and Japan. The 44-minute outage was first observed around 11:10 p.m. EDT, initially centered on GTT nodes in London, England, and San Jose, California. Twenty-five minutes later nodes in San Jose appeared to recover. The outage was cleared around 11:55 p.m. EDT.

On October 11, Microsoft experienced an outage on their network that affected downstream partners and access to services running on Microsoft environments. The 34-minute outage was first observed around 5:05 p.m. EDT centered on Microsoft nodes in Amsterdam, the Netherlands. Five minutes later, nodes in Frankfurt, Germany, began exhibiting outage conditions, affecting more partners. Around 5:15 p.m. EDT, the nodes located in Frankfurt appeared to recover. The outage was cleared around 5:40 p.m. EDT. Given the duration and timing, relative to the location of the nodes at the center of the outage, it is likely to have been a maintenance exercise.

Update Oct. 11

Outages across all three categories during the past week increased from 323 to 352, up 9%. In the US, outages increased from 161 to 184, a 14% increase.

Globally, ISP outages went from 239 to 270, up 13%, while in the US, ISP outages increased from 132 to 163, up 23%.

Cloud-provider network outages dropped from 26 to 18, down 31%. In the US, they dropped by half, from six to three.

There was one collaboration-app network outage, and that occurred in the US, dropping the worldwide and US levels from six the week before.

On Oct. 4, Facebook’s backbone network suffered an outage that disconnected its data centers globally, making Facebook, Instagram, and WhatsApp unavailable to all users for more than seven hours. Initially observed around 11:40 a.m. EDT, the backbone network outage triggered a second issue: Facebook’s authoritative name servers, detecting that the network connection was “unhealthy,” stopped advertising routes to Facebook’s servers, rendering the services inaccessible. Facebook identified the root cause of the outage as a faulty configuration change and implemented a fix. Around 6:20 p.m. EDT, connectivity to services began to return, and the outage cleared around 6:45 p.m. EDT. A more detailed analysis of the outage can be found here.

On Oct. 7, Telia Carrier, a Tier 1 ISP headquartered in Stockholm, Sweden, experienced an outage affecting customers and downstream partners in the US, Spain, Germany, Austria, the UK, the Philippines, Japan, New Zealand, South Africa, the Czech Republic, Egypt, the Netherlands, India, and Canada. First observed around noon EDT, the outage initially centered on nodes in Ashburn, Virginia, and Sweden. Five minutes later the nodes in Sweden appeared to recover, while nodes in Newark, New Jersey, and London, England, began exhibiting outage conditions, increasing the number of countries and downstream partners affected. All except the London nodes appeared to recover 35 minutes after first being observed. The outage lasted an hour and 14 minutes and was cleared around 1:15 p.m. EDT.

Update Oct. 4

Global outages across all three categories last week decreased from 367 to 323, a 12% dip compared to the week prior. US outages remained the same at 161.

Globally, the number of ISP outages increased from 233 to 239, and in the US they increased from 115 to 132, up 15%.

Cloud-provider network outages worldwide dropped from 41 to 26, down 37%, while in the US they increased from four to six.

Both globally and in the US, collaboration-app network outages decreased by two, from eight to six.

There were two notable outages during the week.

On September 30, Oracle experienced an outage that affected customers and downstream partners interacting with Oracle Cloud services in the US. The outage was observed around 4:01 p.m. EDT and appeared to center on Oracle nodes in Frankfurt, Germany, and Arlington and Ashburn, Virginia. Five minutes later, all nodes except those in Arlington appeared to recover. The outage lasted seven minutes and was cleared around 4:10 p.m. EDT.

On September 29, NTT America experienced an outage that affected some customers and downstream partners across the US, Japan, and Hong Kong. It lasted around 18 minutes total, divided between two occurrences over a 25-minute period. The first was observed at 7:10 p.m. EDT and appeared centered on NTT nodes in New York, New York. It lasted four minutes and appeared to clear at around 7:15 p.m. EDT. Five minutes later, the second occurrence, lasting 14 minutes, affected the New York nodes plus nodes in Ashburn, Virginia. The outage to nodes in New York appeared to clear 10 minutes later, and the outage in Ashburn was cleared around 7:35 p.m. EDT.

Update Sept. 27

Global outages across all three categories last week increased from 276 to 367, a 33% increase compared to the week prior. In the US, outages increased from 116 to 161, a 39% increase.

Globally, ISP outages increased from 186 to 233, a 25% increase, and in the US they increased from 87 to 115, up 32%.

Cloud-provider network outages jumped from 24 to 41, up 71%, while in the US they increased from three to four.

Collaboration-app network outages reached their highest number this year. Both globally and in the US, they were up by eight, increasing from two to 10 worldwide, and from one to nine in the US.

There were two notable outages this week.

On September 21, Oracle experienced an outage affecting customers and downstream partners in the US, Germany, Finland, Singapore, Hong Kong, Australia, and Thailand. The outage was observed around 10:25 a.m. EDT and appeared to center on Oracle nodes in Frankfurt, Germany. Fifteen minutes later some of the nodes appeared to recover, reducing the affected countries to Finland, Hong Kong, and Australia. The outage lasted 24 minutes and was cleared at around 10:50 a.m. EDT.

On September 20, Comcast Communications experienced an outage that impacted downstream partners and customers in the US, Switzerland, China, and Hong Kong. The 18-minute outage was observed around 4:50 p.m. EDT and appeared to center on Comcast nodes in Sunnyvale and Los Angeles, California, and Houston, Texas. Five minutes after it started, the disruption expanded to nodes in Santa Clara, California; New York, New York; Richmond, Virginia; Pittsburgh, Pennsylvania; and Denver, Colorado. Around 5:05 p.m. EDT, all nodes except those in New York appeared to recover. The outage was cleared around 5:10 p.m. EDT.

Update Sept. 20

Global outages across all three categories during the past week increased from 209 to 276, a 32% increase compared to the week prior. In the US, outages increased from 85 to 116, a 36% increase.

Globally, the number of ISP outages increased from 149 to 186, up 25%, and in the US, they increased from 72 to 87, a 21% increase.

Cloud-provider network outages increased from 20 to 24, while in the US they increased from two to three.

Collaboration-app outages worldwide remained the same at two, and also the same in the US at one.

On September 14, Zayo Group experienced an outage that affected some of its partners and customers in countries including the US, Hong Kong, Switzerland, and the UK. The outage lasted around 15 minutes, was first observed around 3 p.m. EDT, and appeared to center on Zayo Group nodes located in Philadelphia, Pennsylvania. Ten minutes later some of the nodes appeared to recover, reducing the number of impacted parties before the outage was cleared around 3:20 p.m. EDT.

On September 13, Microsoft experienced a network outage that impacted downstream partners and access to services running on Microsoft environments. The 29-minute outage was first observed around 1:25 a.m. EDT and appeared to be centered on Microsoft nodes in Des Moines, Iowa. Five minutes later, nodes located in Chicago, Illinois, showed outage conditions, resulting in an increase in the number of impacted partners. Around 1:35 a.m. EDT, nodes in Cleveland, Ohio, also showed outage conditions. Twenty-five minutes after the outage was first observed, the nodes in Des Moines and Cleveland appeared to recover, leaving the Chicago nodes as the only ones still out. The outage was cleared around 1:55 a.m. EDT. Given the duration and timing relative to the location of the nodes at the center of the outage, it is likely to have been a maintenance exercise.

Update Sept. 13

Global outages across all three categories last week decreased by one, from 210 to 209. In the US, they decreased from 90 to 85.

ISP outages grew worldwide from 140 to 149 and grew in the US from 68 to 72.

Globally, the number of cloud-provider network outages stayed the same at 20, while in the US they dropped from seven to two.

Collaboration-app network outages remained the same at two, and in the US the number rose from zero to one.

There were two notable outages.

At 1:45 a.m. EDT on Sept. 7, Amazon’s network experienced a disruption affecting downstream partners and customers in the US, Japan, South Korea, Ireland, and the Philippines. It lasted around 18 minutes distributed across two occurrences over a 35-minute period. The first instance appeared to center on Amazon nodes in Incheon, South Korea, and lasted around nine minutes, affecting users in the US, the Philippines, and South Korea. Around 2:10 a.m. EDT, approximately 15 minutes after the first occurrence cleared, a second occurrence was observed. It also lasted around nine minutes and again appeared to center on Amazon nodes located in Incheon, South Korea. It affected service in South Korea, Japan, and Ireland. The outage was cleared around 2:20 a.m. EDT.

On Sept. 6, Cogent Communications experienced a series of outages over a half-hour period that affected downstream providers and customers in multiple countries, including the US, Spain, Greece, Germany, Luxembourg, Ukraine, and Portugal. It lasted 13 minutes divided between two occurrences over a 25-minute period. The first occurrence was observed around 7:15 p.m. EDT centered on Cogent nodes in Sacramento, California, and Bilbao, Spain, and lasted around nine minutes. The Cogent environment was stable for 15 minutes then experienced a series of four-minute outages observed on Cogent nodes in Bilbao, Spain. The outage was cleared around 7:45 p.m. EDT.

Updated Sept. 6

Global outages across all three categories last week decreased from 286 to 210, down 26%, with most of the decline coming from improved performance in the US where the total outages dropped from 145 to 90, down 37%.

ISP outages globally dropped from 214 to 140, a 35% decrease. In the US they decreased from 119 to 68, a 43% drop.

Globally, cloud-provider outages increased from 17 to 20, while in the US they increased from five to seven.

Worldwide, collaboration-app network outages dropped from four to two, with the decrease coming from improved performance in the US where outages dropped from two to zero.

There were two notable outages during the week.

At 2:15 a.m. EDT on Sept. 1, Microsoft experienced a 29-minute outage affecting downstream partners and services running in Microsoft environments that appeared centered on nodes in Des Moines, Iowa; Chicago, Illinois; and Spokane, Washington. Five minutes in, nodes in New York, New York; Los Angeles, California; Amsterdam, the Netherlands; Sydney, Australia; Portland, Oregon; and Cleveland, Ohio, also began exhibiting outages. Around 2:25 a.m. EDT, the nodes in New York, Sydney, Portland, Chicago, Spokane, and Cleveland appeared to recover. The outage was cleared around 2:45 a.m. EDT. Given the duration, timing, and the location of the nodes, it is likely to have been a maintenance exercise.

At 12:05 a.m. EDT on Sept. 2, Cogent Communications experienced an outage affecting multiple downstream providers and customers in the US, Brazil, Australia, Canada, Singapore, South Africa, and the UK. It lasted around 14 minutes and appeared to be centered on Cogent nodes in New York, New York. Five minutes into it some of the nodes appeared to recover, reducing the number of countries affected to just the US. The outage was cleared around 4:20 a.m. EDT.

Updated Aug. 30

Global outages across all three categories remained fairly steady last week, increasing from 284 to 286. In the US, outages bumped up from 99 to 145 outages, a 46% increase.

Globally, the number of ISP outages increased from 193 to 214, up 11%, while in the US, they increased from 68 to 119, a 75% jump.

Worldwide, cloud-provider outages dropped from 33 to 17, down 48%. In the US they also dropped, from 19 to five, a 74% decrease.

Collaboration-app network outages worldwide remained at four, and decreased from three to two in the US.

There were two notable outages during the week.

On August 24, Cogent Communications experienced an outage that impacted both downstream providers as well as customers in countries including the US, New Zealand, Ukraine, Spain, Mexico, Luxembourg, Hong Kong, Singapore, the UK, Republic of Korea, Portugal, Japan, Germany, Greece, China, Philippines, Brazil, Australia, France, and Argentina. First observed around 5:35 a.m. EDT, it appeared initially centered on Cogent nodes in Los Angeles, California. Five minutes later, the location of nodes terminating traffic increased to include Phoenix, Arizona; Las Vegas, Nevada; San Francisco, San Jose, and Fullerton, California; El Paso and Houston, Texas; Atlanta, Georgia; and Hong Kong. The outage lasted 59 minutes and appeared to coincide with Cogent planned-maintenance that involved code upgrades on their infrastructure in the Los Angeles area. The outage was cleared around 6:35 a.m. EDT.

Also on August 24, Zayo Group experienced an outage affecting some of its partners and customers in countries that included the US, Canada, Singapore, India, and Hong Kong. The outage was first observed around 3 a.m. EDT and lasted around 18 minutes. Initially it appeared centered on Zayo Group nodes in Seattle, Washington, and Los Angeles, California. Five minutes later the location of nodes exhibiting outages expanded to include San Diego, California, and Phoenix, Arizona. For the next 10 minutes, the number of nodes terminating traffic began to clear until around 3:15 a.m. when the only outage conditions were in Phoenix. That outage cleared around 3:20 a.m. EDT.

Updated Aug. 23

Worldwide outages across all three categories last week increased from 275 to 284 compared to the previous week. In the US, outages increased from 85 to 99.

Globally, the number of ISP outages decreased from 200 to 193, but in the US they increased 11% from 61 to 68.

Cloud-provider network outages worldwide increased from 29 to 33, and in the US they jumped from three to 19.

Collaboration-app network outages decreased globally from five to four and in the US from four to three.

There were two notable outages during the week.

At 4 a.m. EDT on August 18, Rackspace Technology experienced an outage affecting some of its customers in countries including the US, Switzerland, Canada, and Germany. The outage lasted around 30 minutes in total, divided among six occurrences distributed over a period of 75 minutes. The first appeared to center on Rackspace nodes in Washington, DC, and lasted 14 minutes before appearing to clear at around 4:15 a.m. EDT. Five minutes later a series of four outages was observed, each lasting around four minutes with a period of five minutes between each, still appearing to center on nodes in Washington. The final occurrence was observed around 5:10 a.m. EDT, before clearing around 5:15 a.m. EDT.

At 6 p.m. on August 18, NTT America experienced an outage affecting customers and downstream partners across the US and Canada. It lasted about six minutes and was divided into two occurrences over a 15-minute period, appearing to center on NTT nodes in Seattle, Washington. It was cleared around 6:15 p.m. EDT.

Updated Aug. 16

The total number of outages worldwide across all three categories increased from 201 to 275 during the past week, up 37%, while in the US they rose from 65 to 85, up 31%.

Globally, the number of ISP outages increased from 149 to 200, a 34% rise, and in the US it went up from 51 to 61, a 20% increase.

Worldwide, cloud-provider network outages jumped from 11 to 29, a 163% increase, and in the US they increased from two to three.

Globally, collaboration-app network outages increased from one to five, while the number in the US went from zero to four.

There were two significant outages during the week. At 7:40 p.m. on Aug. 11, Telia Carrier experienced an outage affecting customers and downstream partners across countries including the US, Germany, France, the UK and Canada. It appeared to center on nodes in London, England. Five minutes later, a number of those nodes appeared to recover, reducing the number of countries impacted by the outage to the UK, Germany, the US, and France. The outage lasted 21 minutes and was cleared around 8:05 p.m. EDT.

At 5:35 a.m. on August 12, GTT Communications experienced an outage that affected partners and customers across countries including the US, India, Japan, Spain, the UK, and the Netherlands. It lasted around 11 minutes and appeared to center on GTT nodes in Seattle, Washington. It was cleared around 5:50 a.m. EDT.

Updated Aug. 9

Global outages across all three categories decreased from 278 to 201, a 28% decrease compared to the week prior. In the US, they decreased from 131 to 65 outages, a 50% drop.

Globally, the number of ISP outages decreased from 191 to 149, down 22%, and in the US, they decreased from 103 to 51, a 50% drop.

Worldwide cloud-provider network outages dropped from 22 to 11, a 50% decrease compared to the week prior, and in the US decreased from 5 to 2.

Globally, collaboration-app network outages decreased from four to one, and in the US from two to zero.

There were two significant outages during the week. At 2:15 a.m. EDT on August 3, Microsoft experienced an outage on their network that affected some downstream partners and access to services running on Microsoft environments. The 29-minute outage appeared to be centered on Microsoft nodes in Des Moines, Iowa. Ten minutes later, nodes in Chicago, Illinois and Cleveland, Ohio also began exhibiting outage conditions. Around 2:35 a.m. EDT, the Chicago and Cleveland nodes appeared to recover, leaving those in Des Moines and Portland, Oregon, as the only ones exhibiting outage conditions. The outage was cleared around 2:45 a.m. EDT. Given the duration and timing relative to the location of the nodes involved, it is likely to have been a maintenance exercise.

At 1:20 a.m. EDT on August 6, Hurricane Electric experienced an outage affecting customers and downstream partners in countries including the US, Ireland, the UK, Finland, the Netherlands, France, Russia, South Africa, Germany, India and Canada. The outage was divided into two occurrences over a 55-minute period. The first period lasted around 8 minutes centering on nodes in New York, New York. Around 10 minutes after it appeared to clear, the second occurrence was observed again in New York but also in Los Angeles, California, and Ashburn, Virginia. Five minutes into the second occurrence, nodes in Atlanta, Georgia, and Paris, France, also began exhibiting outage conditions. Around 2 a.m. EDT, the nodes in Paris appeared to recover, but Chicago, Illinois, and San Jose, California exhibited outage conditions for the next five minutes. All the nodes except those in New York and Ashburn appeared to recover. This second occurrence lasted around 33 minutes and had the biggest impact in terms of countries affected. The total outage lasted around 55 minutes and was cleared around 2:15 a.m. EDT.

Updated Aug. 2

Globally the total of outages in all three categories increased from 251 to 278, an increase of 11% compared to the previous week. In the US, they increased from 129 to 131.

The number of ISP outages globally remained the same at 191, while in the US they dropped from 105 to 103.

Cloud-provider network outages jumped from 12 to 22 worldwide, an 83% increase, and increased from two to five in the US.

Globally, collaboration-app network outages decreased from six to four, and in the US they dropped from three to two.

There were two notable outages during the week.

At 4:15 p.m. EDT on July 27, NTT America experienced an outage that impacted some of its customers and partners across countries including the US, Ireland, Canada, France, South Africa, Germany, UK, Singapore, Japan, Spain, Sweden, Italy, Brazil, Republic of Korea, and the Netherlands. After about 15 minutes, NTT America nodes located in Paris, France, began exhibiting outage conditions. About 4:35 p.m. EDT, the node in Paris appeared to recover. Around 4:40 p.m. EDT, nodes located in New York, New York also began exhibiting outage conditions, but appeared to recover five minutes later. The number of countries and nodes hit by the outage continued to decrease until it appeared to clear around 5:25 p.m. EDT. Five minutes later the nodes in London began exhibiting outage conditions again. The issue was cleared around 5:35 p.m. EDT.

At 8:50 a.m. EDT on July 29, NetActuate experienced an outage affecting multiple downstream partners and customers in the US. The outage lasted around 18 minutes and appeared to center on NetActuate nodes in Dallas, Texas. Fifteen minutes later the number of affected nodes in Dallas appeared to drop and with it the number of affected customers and partners. The issue was cleared around 9:10 a.m. EDT. 

Updated July 19

Global outages across all three categories during the past week decreased from 309 to 295, down 5%. In the US they were up one from 135 to 136.

ISP outages globally decreased from 225 to 212, down 6%, while in the US they decreased from 113 to 106, also a 6% drop.

Cloud-provider network outages overall increased from 31 to 36. In the US, they jumped from four to 11.

Collaboration-app network outages dropped from seven to one worldwide, and from two to one in the US.

At 3:40 a.m. EDT on July 12, AT&T experienced an outage that affected customers in the US, the UK, Japan, Germany, Canada, Australia, India, Brazil, Republic of Korea, Switzerland, and the Netherlands. The outage centered on AT&T nodes located in Phoenix, Arizona, and lasted 14 minutes.

Updated June 28

Global outages in all three categories last week dropped from 427 to 212, a 50% decrease compared to the week before. In the U.S., outages dropped from 275 to 86, a 69% decrease.

Globally, the number of ISP outages decreased from 352 to 147, a 58% drop, while in the US they fell from 250 to 59, a 76% decrease.

Cloud-provider network outages dropped from 23 to 15, down 35% worldwide compared to the week prior. In the US, they dropped from four to two.

Globally, collaboration-app network outages increased from six to eight, and from four to seven in the US.

There were three notable outages this week.

At 12:50 a.m. EDT on June 22, Internap experienced an outage affecting downstream partners and customers in countries including the US, the UK, Japan, Australia, Singapore, Germany, India, Israel, Italy, and Hong Kong. The outage lasted 24 minutes and centered on Internap nodes in New York, New York, and peaked during the first five minutes. It was cleared around 1:15 a.m. EDT.

At 3:10 p.m. EDT on June 24, Amazon experienced an interruption that impacted downstream partners and customers in countries including the US, the UK, Australia, South Africa, India, Japan, Mexico, Germany, and the Philippines. The 17-minute outage appeared to center on Amazon nodes in Columbus, Ohio. The number of countries affected was at its highest for the first 10 minutes, then decreased steadily until the last seven minutes, when the outage appeared to affect only the US, India and the Philippines. The outage was cleared around 3:35 p.m. EDT.

At 7:20 p.m. EDT on June 23, TATA Communications (America) experienced an outage that impacted downstream partners and customers in countries including the US, Australia, India, Japan, Brazil, the UK, Germany, Canada, the Netherlands, and Switzerland. The outage lasted around 12 minutes and appeared to center on TATA nodes in Montreal, Canada, and Chicago, Illinois. It was cleared around 7:35 p.m. EDT.

Updated June 21

Global outages across all three categories last week increased from 332 to 427, up 29% from the week before. In the US, total outages jumped from 173 to 275, a 59% increase.

Worldwide the number of ISP outages increased from 263 to 352, a 34% increase. In the US they increased from 146 to 250, a 71% increase.

Cloud provider network outages globally more than doubled for the second week in a row, from 10 to 23. In the US, cloud-provider network outages decreased from five to four.

Globally, collaboration-app network outages decreased from seven to six outages, while in the US they increased from three to four.

There were two significant outages during the week.

About 12:20 a.m. EDT on June 17, Akamai’s DDoS mitigation service, Prolexic Routed, experienced a service disruption that made its customers’ websites, including major financial services firms and airlines, unreachable. The outage affected many of the approximately 500 Akamai Prolexic customers that use the service. During the incident, there appeared to be a massive surge in network outages that also coincided with application availability issues. Akamai identified the cause as the Prolexic routing process. The outage was most severe in its initial minutes, but lasted until about 4:22 a.m. EDT.

About 2:40 p.m. EDT on June 15, Cogent Communications experienced an outage affecting downstream providers as well as Cogent customers in the US. The outage lasted around 35 minutes divided into three occurrences over the period of an hour. The first occurrence appeared centered on Cogent nodes in Chicago, Illinois, and Atlanta, Georgia. The outage appeared to clear around 2:45 p.m. EDT but reappeared five minutes later. Fifteen minutes into the outage the nodes in Atlanta, Georgia, appeared to recover, leaving only the Cogent nodes located in Chicago exhibiting outage conditions. This continued for another four minutes before appearing to clear. The third occurrence of the outage was observed around 3:10 p.m. EDT centered at Cogent Chicago nodes. This third occurrence was the longest of the three, lasting around 24 minutes. The outage was cleared around 3:35 p.m. EDT.

Updated June 14

Total outages across all three categories last week jumped from 222 to 332, a 50% increase compared to the week prior. In the US, outages more than doubled from 80 to 173, a 116% increase compared to the week prior.

The number of ISP outages worldwide went from 182 to 263, a 45% increase. In the US they increased from 71 to 146, a 106% increase.

Cloud-provider network outages globally more than doubled from four to 10, and in the US they grew from zero to five.

Globally, collaboration-app network outages jumped from one to seven, and in the US, from zero to three.

There were two notable outages during the week. Around 5:50 a.m. EDT on June 8, Fastly suffered a major outage that impacted the sites and applications of many of its customers. The outage, lasting about an hour, caused users to have issues loading content and accessing sites around the globe. Not all customers were affected for the full hour because they were able to use alternative services to deliver content to users. Around 6:27 a.m. EDT, Fastly announced it had identified the source of the outage, and around 6:50 a.m. EDT announced that all services had been restored and the outage was cleared. An in-depth view of the outage can be found here.

About 1:10 a.m. EDT on June 9, Zayo Group experienced an outage that affected some of its partners and customers in countries including the US, Germany, the Netherlands, Canada, the UK, Austria, Hong Kong, Australia, Brazil, Japan, Russia and Malaysia. The outage lasted around 54 minutes and appeared to center on Zayo nodes in Denver, Colorado, and Salt Lake City, Utah. Five minutes later, the Salt Lake City nodes appeared to recover but outage conditions started in nodes in Seattle, Washington, and London, UK. Thirty minutes into the outage it grew to include nodes in Chicago, Illinois, before being cleared around 2:10 a.m. EDT.

Updated June 7

Global outages across all three categories last week decreased from 265 to 222, a 16% decrease. In the US they dropped from 128 to 80, a 38% decrease.

ISP outages globally last week decreased from 211 to 182, a 14% decrease, while in the US they decreased from 105 to 71, a 32% decrease.

Globally, cloud provider network outages decreased from 9 to 4 and from two to zero in the US.

Collaboration-app network outages worldwide dropped from six to one and in the US dropped from five to none.

About 3 a.m. EDT on June 2, the ISP PCCW experienced a 19-minute outage impacting some of its customers and networks in the US. It appeared to center on PCCW infrastructure located in Ashburn, Virginia, and was cleared around 3:25 a.m. EDT.

Around 6:45 p.m. on June 1, Microsoft experienced a 29-minute outage that impacted some downstream partners and access to services running on Microsoft environments. It appeared to be centered on Microsoft nodes located in Dublin, Ireland and was cleared around 7:15 p.m. EDT. Given the duration and timing relative to the location of the nodes at the center of the outage, it is likely to have been a maintenance exercise.

Around 1:05 a.m. EDT on June 1, Flag Telecom Global Internet experienced an outage on its network that lasted around an hour and 51 minutes over a three-hour period. It affected customers and downstream partners in countries including the US, Australia, India, France, the Netherlands, Singapore, the Philippines, Hong Kong, Germany, Brazil, and Taiwan. It appeared to be centered on Flag Telecom nodes located in Singapore. Five minutes after the initial outage, Flag Telecom nodes in Hong Kong also exhibited outage conditions, which coincided with an increase in the number of impacted countries, customers, and partners. After a further five minutes, the nodes located in Hong Kong appeared to recover for 10 minutes before exhibiting outage conditions again for five more minutes. Flag Telecom nodes located in Singapore also appeared to recover about 50 minutes after the initial outage. Around 2 a.m. EDT, the nodes located in Singapore again began exhibiting outage conditions. A series of varying-duration outages, all centered on Singapore nodes, was observed for the next two hours. The outage was cleared around 4:05 a.m. EDT.

Update May 31

Global outages across all three categories last week decreased from 363 to 265, a 27% drop, while in the US they decreased from 197 to 128, a 35% decline.

ISP outages globally decreased from 284 to 211, down 26%. In the US they decreased from 175 to 105, a 40% drop.

Worldwide cloud-provider network outages decreased from 12 to 9 outages, and remained the same in the US with two.

Globally, collaboration-app network outages increased from five to six, and in the US they increased by two, from three to five.

There were two major outages during the week. At 12:15 a.m. EDT on May 26, Verizon Business experienced an outage affecting customers and partners across countries including the US, Ireland, Poland, the Netherlands, Canada, the UK, Germany, and India. The outage appeared to be centered on Verizon Business nodes in New York, New York, and was divided into two occurrences spanning 45 minutes. The first lasted around nine minutes and initially appeared to be clearing, with the number of affected parties dropping, but about 20 minutes later the outage returned and lasted about 23 minutes, again centered on nodes in New York. 

Around 1:35 p.m. EDT May 26, Cogent Communications experienced a series of outages totalling 48 minutes over the span of an hour and 10 minutes that impacted downstream providers and customers globally. The initial outage centered on Cogent nodes in Las Vegas, Nevada, and lasted around 12 minutes. Then the Cogent environment was stable for 10 minutes before experiencing a second occurrence on nodes in Dallas and Houston, Texas. Five minutes later, the Cogent node located in Dallas appeared to recover, but nodes in Kansas City, Missouri, experienced outages. After five more minutes the Kansas City nodes recovered, but nodes in Denver, Colorado, experienced outages.  Forty-five minutes after the initial outage was observed, a 24-minute outage was observed on nodes in Dallas, Houston, and Kansas City. Ten minutes into the third occurrence, the number of locations exhibiting outage conditions expanded to include Salt Lake City, Utah, Oklahoma City, Oklahoma, and Denver. As the Cogent nodes involved increased, so did the number of customer networks that were affected. The outage was cleared around 2:45 p.m. EDT.

Update May 24

Global outages across all three categories jumped from 252 to 363 last week, a 44% increase. In the US, outages increased from 123 to 197, a 60% increase.

Globally, ISP outages jumped from 180 to 284, up 58%, while in the US, ISP outages increased from 98 to 175, up 79%.

Worldwide, cloud-provider network outages declined slightly from 14 to 12 outages, but in the US, they increased from one to two. 

Collaboration-app network outages worldwide increased from three to five, and in the US increased from one to three.

There were three notable outages during the week.

Around 1:30 p.m. on May 20, Slack experienced an interruption to its business-communication platform that lasted about 25 minutes and affected users accessing the services. A number of internal server errors were observed. Slack identified the cause as a code change that inadvertently affected some workspaces. Slack reverted the change and restored services by 1:55 p.m. EDT.

About 8:55 a.m. EDT on May 19, Coinbase experienced an interruption that lasted about two hours and affected global access to the Coinbase site and application. Connectivity and access across the network appeared to be unimpaired during the interruption, with initial requests simply timing out with system errors indicating system congestion. An hour and a half after the outage was first observed, services began to be restored, with access in APAC and EMEA still affected. The outage was cleared around 10:45 a.m. EDT.

On May 17, Hurricane Electric experienced an outage that was divided into three instances over an hour and a half that affected users across countries including the US, Australia, Singapore, Hong Kong, the UK, Brazil, Germany, South Africa, the Netherlands, and Canada. The first period was observed around 1:43 p.m. EDT centered on Hurricane Electric nodes in San Francisco and San Jose, California. After five minutes, the San Francisco nodes appeared to recover, reducing the scope of the outage. But five minutes after that, San Francisco nodes exhibited outage conditions again. Five minutes after this occurrence cleared, a second one lasting three minutes was observed centered on the San Jose and San Francisco nodes. Around 2:15 p.m. EDT, the nodes appeared to recover, temporarily clearing the outage, but an hour later, those two nodes exhibited outage conditions again before clearing after eight minutes. The total outage lasted around 26 minutes and was cleared around 3:25 p.m. EDT.

Update May 17

Global outages across all three categories last week increased from 237 to 252, up 6%, while in the US, they increased from 95 to 123, a 29% jump.

ISP outages globally increased from 168 to 180, and in the US they increased from 76 to 98.

Cloud-provider network outages worldwide increased from 13 to 14 but dropped from four to one in the US.

Globally, collaboration app network outages dropped from four to three, and in the US from two to one.

There were three significant outages this week.

About 3:30 p.m. on May 12, NetActuate experienced an outage affecting downstream partners and customers in the US. It lasted around 13 minutes overall, divided into two occurrences spanning a 30-minute period. The first lasted four minutes and appeared to center on NetActuate nodes located in Raleigh, North Carolina. The outage reappeared 15 minutes later and lasted nine minutes and centered on the Raleigh nodes and nodes in Durham, North Carolina, increasing the number of customers affected. The nodes in Durham cleared five minutes into the second period of the outage. The outage was cleared around 4:05 p.m. EDT.

About 10:15 p.m. EDT on May 13, TATA Communications (America) experienced an outage affecting downstream partners and customers in countries including the US, the UK, Australia, India, China, Germany, the Netherlands, Japan, Hong Kong, Brazil, Switzerland, Republic of Korea, and Canada. The outage lasted 35 minutes and was divided into two periods over 55 minutes. The first period lasted around nine minutes and appeared to be centered on TATA nodes in Tokyo, Japan. About 10 minutes after it cleared, the outage reappeared, centering on TATA nodes located in Hong Kong and after another five minutes expanding to TATA nodes in Los Angeles, California. The nodes in Hong Kong appeared to clear after 15 minutes, leaving just the Los Angeles nodes exhibiting outage conditions. This second period of the outage lasted 26 minutes and was cleared around 11:10 p.m. EDT.

About 5:10 p.m. EDT on May 11, Salesforce experienced an interruption that left users able to reach the Salesforce front-end, but experiencing issues logging on and navigating to the Salesforce Sales Cloud, Marketing Cloud, Commerce Cloud, and Experience Cloud. That indicated internet and network connectivity for end users was functioning. Salesforce identified issues with its domain name system that had a cascading effect on its services. A fix was implemented around 7:45 p.m. EDT, and the outage was cleared about 10:15 p.m. EDT.

Update May 10

Global outages across all three categories last week decreased from 243 to 237, while in the US they decreased from 117 to 95.

Globally, the number of ISP outages increased by one, from 167 to 168. In the US they dropped from 100 to 76, a 24% decrease compared to the week prior.

Cloud provider network outages dropped 52% from 27 to 13 worldwide. In the US, they doubled from two to four.

Collaboration app network outages increased from three to four globally and stayed at two in the US.

There were two notable outages during the week.

Around 6 p.m. EDT on May 3, Cloudflare experienced a disruption to its Magic Transit service when some customers began experiencing significant packet loss at Cloudflare’s network edge. The outage appeared to impact Cloudflare’s infrastructure across the globe, with packet loss occurring at varying levels for approximately two hours. At around 8 p.m. EDT, Cloudflare began implementing a fix for the issue, and announced that it was resolved just after 9 p.m. EDT.

Around 3:35 a.m. EDT on May 7, PCCW experienced an outage impacting some of its customers and networks in multiple countries including the US, Australia, Brazil, and China. The outage lasted around 22 minutes and was divided into two periods over a half-hour span, the first of which appeared to center on PCCW infrastructure located in Ashburn, Virginia. It lasted about 18 minutes. Five minutes later it recurred, again centered in Ashburn, but with the outage condition including infrastructure in New York, New York. It lasted around 4 minutes and was cleared around 4:05 a.m. EDT.

Update May 3

The number of outages globally across all three categories decreased slightly last week from 246 to 243. In the US, outages decreased from 123 to 117.

The number of ISP outages worldwide increased from 162 to 167, while in the US they increased from 92 to 100.

Cloud-provider network outages remained at 27 overall and went down in the US from five to two.

Globally, collaboration-app network outages remained at three and dropped from three to two in the US.

There were three notable outages during the week.

At 7:36 a.m. EDT on April 27, TATA Communications (America) experienced an outage affecting many of its downstream partners and customers in countries including the US, Australia, India, Japan, and the Philippines. The outage affected TATA nodes in Ashburn, Virginia, and appeared to clear after five minutes, but came back around 8 a.m. EDT centered on TATA nodes in Chicago, Illinois, and Los Angeles, California. The Chicago nodes appeared to recover 10 minutes later, leaving only the nodes in Los Angeles exhibiting outage conditions. About 10 minutes after that, nodes in Hong Kong began exhibiting outage conditions. In total, the outage lasted around 29 minutes, divided into two occurrences over the course of an hour and was cleared around 8:25 a.m. EDT.

At 12:40 a.m. EDT on April 29, Hurricane Electric experienced an outage affecting users across countries including the US, Spain, Russia, and Ireland. It affected Hurricane Electric nodes in Ashburn, Virginia, and New York, New York. After five minutes, the nodes in New York appeared to recover, reducing the impact to US users only. Around 12:50 a.m. EDT, the nodes in Ashburn appeared to recover, temporarily clearing the outage. But five minutes later the nodes located in New York began exhibiting outage conditions again before clearing after three minutes. The total outage lasted around 11 minutes, consisting of two periods over half an hour. The issue was cleared around 1 a.m. EDT.

Around 6 a.m. EDT on April 27, Microsoft experienced an outage that affected its Teams users globally for about an hour and a half. The outage occurred outside of business hours for much of the Americas, but its global nature resulted in service disruption for users connecting from Asia and Europe. During the outage other Microsoft services continued to be reachable and available, but Teams services appeared unable to authenticate connection requests. Check out the ThousandEyes Internet Report for a deeper dive into the outage.

Update April 26

Globally the number of ISP outages moved from 160 to 162. In the US they moved from 85 to 92, an 8% increase.

Cloud provider network outages overall dropped by one, from 28 to 27. In the US they increased from two to five.

Collaboration-app network outages globally dropped from eight to three, and in the US from five to three.

There were two notable outages during the week.

At 11:10 p.m. EDT on April 20, Internap experienced an outage that hit many of its downstream partners and customers in countries including the US, the UK, Japan, Australia, Singapore, Germany, and Hong Kong. It lasted 18 minutes and centered on Internap nodes in New York, New York. The outage peaked 10 minutes later and was cleared around 11:30 p.m. EDT.

At 11:35 a.m. EDT on April 21, Zayo Group experienced an outage affecting partners and customers in countries including the US, China, Mexico, Canada, Hong Kong, Germany, Sweden, Brazil, India, and Singapore. It lasted around 24 minutes over a one-hour period and appeared to initially center on Zayo Group nodes in Atlanta, Georgia; Salt Lake City, Utah; and Denver, Colorado. A second occurrence started about 25 minutes later and lasted about four minutes. The outage expanded to include Zayo nodes located in Toronto, Canada, and that coincided with an expansion of affected countries and partners. Ten minutes after that, a third, three-minute occurrence centered on Zayo nodes in San Francisco, California, and affected a handful of countries. The final period of the outage was observed around 12:20 p.m. EDT centered on Zayo nodes in Phoenix, Arizona, and lasted 15 minutes. It appeared to affect only US-based customers and partners. The outage was cleared around 12:40 p.m. EDT.

Update April 19

Global outages in all three categories rose from 214 to 245, up 14% over the previous week, and from 88 to 106, up 20%, in the US.

The number of ISP outages worldwide increased from 137 to 160, a 17% increase, and in the US from 73 to 85, a 16% increase.

Cloud-provider network outages globally went from 12 to 28, a 133% jump, and in the US they increased from 1 to 2.

Worldwide collaboration-app network outages increased from two to eight, a 300% increase, while in the US, outages jumped from zero to five.

There were two notable outages during the week. Around 4:46 p.m. EDT on April 12, TATA Communications (America) experienced an outage that impacted many of its downstream partners and customers in countries including the US, the UK, Australia, India, Brazil, Germany, the Netherlands, Japan, Switzerland, Republic of Korea, and Canada. The outage, lasting around nine minutes, appeared to be centered on TATA nodes located in Tokyo, Japan. The outage was cleared around 5:10 p.m. EDT.

Around 8:45 a.m. EDT on April 14, Zayo Group experienced an outage that affected some of its partners and customers in multiple countries. The outage, which lasted around 36 minutes, was first observed in Zayo nodes in Atlanta, Georgia. Five minutes later the outage expanded to include nodes in Seattle, Washington, and Chicago, Illinois, which expanded the area affected from just the US to include the UK, Russia, Singapore, India, and Canada. Five minutes after that, nodes in Houston, Texas, became involved and customers in Australia were affected. Around 9:10 a.m. EDT, nodes located in Denver, Colorado, were affected. This appeared to be the peak of the overall effects of the outage. Thirty minutes into the outage, the Denver node appeared to recover, reducing the number of affected countries and downstream partners. The outage was cleared around 9:25 a.m. EDT.

Update April 12

The number of outages last week across all three categories increased slightly from 210 to 214, up 2% compared to the week prior. In the US they decreased from 93 to 88, down 5%.

Globally, the number of ISP outages decreased from 143 to 137, a 4% decrease, and in the US they decreased from 74 to 73.

Cloud-provider network outages worldwide increased from nine to 12, up a third, while in the US they decreased from three to one, down two thirds.

Globally, collaboration-app network outages increased from one to two. In the US they dropped from one to zero.

There were two notable outages during the week. At 2:35 a.m. EDT on April 8, NTT America experienced a 34-minute outage that affected some customers and downstream partners across countries including the US, Australia, Canada, France, India, Germany, UK, Switzerland, Japan, Hong Kong, and the Netherlands. The outage appeared initially to be centered on NTT America nodes in Newark, New Jersey, and Paris, France. The issue was cleared around 3:10 a.m. EDT.

About 10 p.m. EDT on April 6, AT&T experienced an outage on its network affecting customers in countries including the US, UK, Japan, Germany, Canada, Australia, India, Brazil, Republic of Korea, Switzerland, and the Netherlands. The outage centered on AT&T nodes in Phoenix, Arizona, lasted 24 minutes, and was cleared around 10:25 p.m. EDT.

Update April 5

Global outages across all three categories decreased over the last week from 282 to 210, down 26%, and in the U.S., fell from 119 to 93, a 22% decrease.

The number of ISP outages globally dropped from 204 to 143, a 30% decrease, and in the US they decreased from 96 to 74, a 23% drop.

Globally, cloud-provider outages went from 20 to 9, a 55% decrease. In the US outages went from four to three.

Worldwide, collaboration-app network outages dropped from seven to one and decreased from 2 to 1 in the US.

There were three notable outages during the week.

At 7 a.m. EDT on March 30, Cogent Communications experienced a 44-minute outage that affected multiple downstream providers, as well as Cogent customers globally. The outage appeared to be centered on Cogent nodes in El Paso, TX, Washington DC, and Phoenix, AZ. Five minutes in, the number of Cogent nodes exhibiting outage conditions increased to include nodes located in Salt Lake City, UT, Houston, TX, San Francisco, CA, and Los Angeles, CA. Fifteen minutes in, just those in Los Angeles, CA, San Francisco, CA, and Washington DC still exhibited outage conditions. Twenty minutes in, nodes in San Francisco, CA, and Los Angeles, CA recovered, but the Washington DC nodes remained out for a further 24 minutes.

Around 9:45 p.m. EDT on March 31, the AT&T network experienced an outage that impacted AT&T customers in multiple countries, including the US, UK, Japan, Germany, Canada, Australia, India, Brazil, Republic of Korea, Switzerland, and the Netherlands. It centered on AT&T nodes in Phoenix, AZ, and lasted 18 minutes.

On April 1, Microsoft experienced an interruption that impacted customers in multiple countries, including the US, UK, Germany, Poland, Belgium, the Netherlands, Australia, Sweden, Japan, France, Ireland, China, Turkey, and Ukraine. First observed around 5:30 p.m. EDT, the outage appeared to impact availability of Microsoft Azure DNS services. The outage lasted 24 minutes, with full availability being restored around 6:00 p.m. EDT.

Update March 29

Global outages across all three categories decreased from 300 to 282, down 6% from the previous week, and in the US they dropped from 143 to 119, a 17% decrease.

ISP outages globally increased from 197 to 204, a 4% increase. In the US they dropped from 106 to 96, a 9% dip.

Cloud-provider network outages went down from 26 to 20, a 23% decrease, and in the US decreased from five to four.

Globally, collaboration-app network outages increased from four to seven, a 75% increase. In the U.S., they moved up from one to two.
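The week-over-week percentages quoted in these updates are consistent with the standard percent-change calculation: the difference between the current and previous counts, divided by the previous count. As a minimal sketch (the function name `pct_change` is hypothetical and not part of any tool mentioned in this article), the March 29 figures can be checked like this:

```python
def pct_change(previous: int, current: int) -> int:
    """Return the week-over-week change as a whole-number percentage.

    Negative values are decreases, positive values are increases.
    """
    if previous == 0:
        # A rise from zero has no meaningful percentage.
        raise ValueError("percent change is undefined when the previous count is zero")
    return round((current - previous) / previous * 100)


# Figures taken from the March 29 update
print(pct_change(300, 282))  # global outages, all categories
print(pct_change(143, 119))  # US outages, all categories
print(pct_change(4, 7))      # global collaboration-app outages
```

Dividing by the previous week's count rather than the current one matters: using the wrong base is an easy way to end up with a percentage that does not match the raw numbers.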

There were two notable outages during the week. On March 23, Level 3 Communications experienced an outage that impacted multiple downstream partners and customers in multiple countries including the US, Malaysia, the UK, the Netherlands, Brazil, India, the Czech Republic, Canada, France, Japan, and Australia. The 18-minute outage was first observed around 11:30 a.m. EDT and appeared centered on Level 3 nodes located in London, UK. During the outage, the number of affected nodes in London incrementally decreased, with the outage cleared around 11:50 a.m. EDT. Click here for an interactive view of the outage.

On March 24, Zayo Group experienced a 24-minute outage that affected some of its partners and customers in the US. It was observed around 2:35 p.m. EDT and appeared to center on Zayo Group nodes located in Los Angeles, CA. The outage was cleared around 3 p.m. EDT. Click here for an interactive view of the outage.

Update March 22

Globally, outages in all three categories increased from 281 to 300, up 7%. In the US they increased from 137 to 143, up 4%.

The number of ISP outages decreased from 203 to 197, a 3% decrease, while in the US they dropped from 108 to 106, a 2% decrease.

Cloud-provider network outages went up from 11 to 26, a 136% increase, but in the US they decreased from six to five.

Collaboration-app network outages increased from two to four, and in the US remained at one.

A notable outage occurred on March 17 when Cloudflare suffered an interruption that impacted its customers in the Pacific Northwest region of the US and Canada. The 33-minute outage, spread over a one-hour period, was first observed around 10:20 a.m. EDT and appeared to center on Cloudflare nodes located in Kansas City, MO. This first portion of the outage lasted around three minutes. Fifteen minutes later there was a 22-minute incident centered on Cloudflare nodes located in Seattle, WA. Forty minutes after the outage was first observed, two more were observed, again centering on Cloudflare nodes in Seattle, WA. It was cleared around 11:25 a.m. EDT. Click here for an interactive view of the outage.

Update March 14

Outages in all three categories worldwide during the previous week were down from 385 to 281, a 27% decrease. In the US, they dropped from 168 to 137, an 18% change.

Globally the number of ISP outages decreased from 281 to 203, down 28%, and from 132 to 108 in the US, down 18%.

Cloud-provider outages fell from 26 to 11 worldwide, a decrease of 58%. In the US, they rose from four to six.

Collaboration-app network outages worldwide fell from five to two, and in the US from five to one.

On March 10, Dynamic Network Services experienced an interruption that resulted in DNS-resolution degradation on their Dyn Managed DNS service. The disruption affected users in countries including the UK, South Africa, Singapore, Australia, Ireland, France, Spain, and Portugal. The 55-minute outage was first observed around 6:40 p.m. EST and appeared to be centered on Dyn nodes located in London, UK. Twenty minutes later, a second Dyn node in Manchester, NH, showed outage conditions. The appearance of this second Dyn node coincided with a Dyn notification that their engineers had identified the issue and had implemented a fix. Twenty-five minutes into the outage, only the Dyn node located in London, UK, was exhibiting outage conditions and the number of affected services began to reduce, indicating the service was recovering. The outage was cleared around 7:35 p.m. EST. Click here for an interactive view of the outage.

On March 11, NTT America experienced an outage affecting some of its customers and downstream partners across countries including the US, Australia, Canada, France, India, Germany, UK, Switzerland, and the Netherlands. The 20-minute outage was first observed around 3:05 p.m. EST and appeared to be centered on NTT America nodes located in Ashburn, VA, and Los Angeles, CA. Five minutes into the outage, the countries affected were reduced to just the US, UK, the Netherlands, and Germany, accessing downstream NTT networks. Five minutes later the outage cleared at the Ashburn, VA, node, leaving just the node in Los Angeles, CA, exhibiting outage conditions. That outage was cleared around 3:30 p.m. EST. Click here for an interactive view of the outage.

Update March 7

Outages in all three categories worldwide fell from 393 to 385, a 2% decrease compared to the week before. In the US, they decreased from 184 to 168, 9% fewer.

Globally, the number of ISP outages decreased from 311 to 281, a 10% decline, while in the US they decreased from 166 to 132, down 20%.

Cloud provider outages worldwide increased from 22 to 26, an 18% increase, and the change in the US was an increase from two to four.

The number of collaboration-app network outages jumped from two to five, all of them in the US.

There were two notable outages during the week. On March 3, UUNETVerizon experienced an outage that impacted many of its peers and customers, including Bank of America, JP Morgan Chase, Nomura, Samsung, and Zoom. The outage, lasting around 36 minutes over a 75-minute period, was first observed around 9:00 a.m. EST and appeared to center on UUNETVerizon nodes in Philadelphia, PA, and Ashburn, VA. This initial part of the outage lasted around four minutes and appeared to have a cascading impact on Cogent infrastructure located in New York, NY, and affected Cogent’s path to the JP Morgan Chase network. Approximately five minutes after the initial outage cleared, a second was observed that lasted around 13 minutes. It was observed on UUNETVerizon nodes located in Seattle, WA, and Dallas, TX, and appeared to have a cascading impact on Level 3 Communications infrastructure located in Seattle, WA, affecting Level 3 customers and partners in Canada. Five minutes into this second period, the Level 3 infrastructure outage cleared and after another five minutes, the only UUNETVerizon nodes exhibiting the issue were located in Dallas, TX. Around 9:50 a.m. EST, the third occurrence of the outage was observed 20 minutes after the second. This outage lasted around 19 minutes and was initially focused on UUNETVerizon infrastructure in Dallas, TX. Five minutes into the third period of the outage, UUNETVerizon infrastructure exhibiting problems expanded to include Seattle, WA. Approximately 10 minutes into this third period of the outage, UUNETVerizon infrastructure located in San Jose, CA, was added to those in Seattle, WA, and Dallas, TX. At around 10:10 a.m. EST, the UUNETVerizon infrastructure located in San Jose, CA, was the only infrastructure exhibiting issues. The outage was cleared around 10:15 a.m. EST. Click here for an interactive view of the outage.

On March 3, PCCW experienced an outage affecting some of its US customers and networks, including Flagstar Bank, Target, Bloomberg, Morgan Stanley, and Dell. The outage lasted around 31 minutes and was divided into three periods over an hour and 20 minutes. The outage was first observed around 8:45 a.m. EST and appeared to center on PCCW infrastructure located in Ashburn, VA. The first period of the outage lasted around nine minutes, before recurring 15 minutes later, again centered on PCCW infrastructure located in Ashburn, VA. This second outage lasted around 19 minutes. The third period was observed 30 minutes after the second ended and lasted around nine minutes. The outage was cleared around 10:05 a.m. EST.

Click here for an interactive view of the outage.

Update March 1

Global outages across all three categories jumped from 279 to 393, a 41% increase over the week before. In the US outages went from 138 to 184, up 33%.

ISP outages rose from 233 to 311, a 33% increase worldwide, and from 123 to 166 in the US, a 35% increase.

Cloud-provider network outages globally jumped from 5 to 22, a 340% increase. The US accounted for two of them, up from one the week before.

Collaboration-app network outages dropped from four to two globally and from two to zero in the US.

There were three notable outages this week.

On Feb. 23, LinkedIn experienced a service disruption affecting its mobile and desktop global user base. The outage was first observed around 1:50 p.m. EST, with users attempting to connect to LinkedIn receiving server-unavailable error messages. Around 45 minutes later, services to some regions began to return, although others were still unable to use the services. After another 45 minutes, the server-unavailable messages were replaced with content-not-available errors. The total disruption lasted around two hours, during which no network issues were observed connecting to LinkedIn web servers, further indicating the issue was application related. Service was restored around 3:40 p.m. EST. Click here for an interactive view of the outage.

On Feb. 25, Hurricane Electric experienced an outage that affected users across the US, UK, Australia, South Africa, New Zealand, Germany, Canada, Japan, Spain, and Brazil. The outage, lasting around 36 minutes over a 45-minute period, was divided into two events. The first was observed around 1:40 a.m. EST on Hurricane Electric infrastructure in Singapore and Hong Kong. It lasted 24 minutes and initially cleared around 2:05 a.m. EST, but five minutes later, around 2:10 a.m. EST, a node in Marseille, France, was affected and created issues for around 12 minutes. The outage affected access to customer networks including Credit Suisse, Procter and Gamble, DBS, Shell, and Bank of America. The issue was cleared around 2:25 a.m. EST. Click here for an interactive view of the outage.

On Feb. 24, Comcast Communications experienced an outage that affected peers and customers in the US, Canada, and the Netherlands. The outage, lasting around 14 minutes, was first observed around 11 p.m. EST and appeared to be centered on Comcast nodes in Newark, NJ, and affected access to customers including CBS, NBC, Bloomberg, and JP Morgan Chase. Ten minutes into the outage, the radius of the disruption expanded to include Comcast nodes in New York, NY, Dallas, TX, Chicago, IL, and Boston, MA. It affected customers in more countries including the US, UK, Australia, Canada, France, and the Netherlands. The outage was cleared around 11:15 p.m. EST. Click here for an interactive view of the outage.

Update Feb. 22

Outages overall were up 35%, from 206 to 279, compared to the week before. In the US they were up 53%, from 90 to 138.

Globally, the number of ISP outages jumped from 154 to 233, a 51% increase, and in the US, they increased from 78 to 123, up 58%.

Cloud-provider network outages dropped from 24 to five, a 79% decrease worldwide. In the US, they dropped from five to one.

Collaboration-app network outages doubled from two to four globally and from one to two in the US.

On Feb. 16, Level 3 Communications experienced a notable outage that affected multiple downstream partners and customers in countries including the US, Canada, Argentina, Mexico, and the UK. First observed around 11:40 a.m. EST, the outage lasted around 36 minutes over a one-hour period and affected access to customers including Bank of America, TiVo, and Lending Tree. The first minutes of the outage appeared to center on Level 3 nodes in San Francisco, CA, affecting only the U.S. and Canada. Ten minutes into the outage, nodes in Boston, MA, became involved, and at this point the impact spread to include the other countries. Five minutes after that, the outage in the San Francisco node cleared leaving just the Boston infrastructure in an outage condition. Twenty-four minutes after the outage was first observed it appeared to clear. Fifteen minutes later the outage reappeared, this time centered on nodes in San Francisco, Salt Lake City, UT, and Portland, OR. This second episode lasted for four minutes and was followed by two more four-minute outages each five minutes apart. The outage was cleared around 12:45 p.m. EST. Click here for an interactive view of the outage.

On Feb. 18, GTT Communications experienced an outage that affected some of its partners and customers in the US. It lasted around 14 minutes, was first observed around 4:30 a.m. EST, and appeared to center on GTT nodes in Los Angeles, CA, affecting customer networks including Ford Motor Company, Guaranteed Rate, and Loanet. The outage was cleared around 4:45 a.m. EST. Click here for an interactive view of the outage.

Update Feb. 15

Worldwide outages across all three categories decreased from 267 to 206, a 23% drop from the week before. They dipped 4% in the US, from 94 to 90.

ISP outages decreased 20%, from 192 to 154, worldwide, and they stayed the same in the US at 78.

Globally cloud-provider network outages increased from 21 to 24, up 14%, and in the US bumped up from three to five.

Global collaboration app network outages remained at two for the second week in a row, while US outages dropped from two to one.

Around 4:30 a.m. EST on Feb. 11, AT&T suffered an outage centered in Washington, DC, followed by issues in Tulsa, OK, and San Antonio, TX, that affected customers including some in the US, Germany, and Australia. Five minutes later only Washington, DC, and Tulsa, OK, nodes were involved and five minutes after that, just those in Washington, DC. Customer networks affected included J.P. Morgan Chase, Jefferies Group, Travelers Property Casualty, and ConocoPhillips. The outage lasted 14 minutes and was cleared at around 4:45 a.m. EST. Click here for an interactive view of the outage.

On Feb. 11, Cogent Communications experienced a series of outages over a period of 3 hours and 38 minutes that affected downstream providers as well as Cogent customers globally. The outage lasted a total of 43 minutes and was first observed around 1:07 a.m. EST centered on Cogent nodes in San Francisco, CA. This initial outage lasted around two minutes, and the Cogent environment was then stable for 43 minutes before experiencing a series of four-minute outages observed on Cogent nodes in Dallas, TX, and Phoenix, AZ. An hour and a half after the initial outage was observed, a four-minute outage was observed centering on Cogent nodes located in Newark, NJ. The outage reappeared 25 minutes later, extending the list of affected nodes to locations including San Francisco, CA, Atlanta, GA, Cleveland, OH, New York, NY, and Washington, DC. Less than an hour later, a 19-minute outage affected nodes including Miami, FL, Austin, TX, El Paso, TX, Houston, TX, San Jose, CA, Los Angeles, CA, and Las Vegas, NV. Customer networks affected included Ford Motor Company, Oracle, Home Depot, and TikTok. The outage was cleared at around 4:45 a.m. EST. Click here for an interactive view of the outage.

Update Feb. 8

Globally outages in all three categories decreased from 278 to 267, a 4% decrease. In the US, outages decreased from 119 to 94, a 21% decrease.

ISP outages worldwide decreased from 214 to 192, a 10% drop. In the US, ISP outages decreased from 102 to 78, down 24%.

Cloud-provider network outages worldwide dropped from 30 to 21, down 30%. In the US they dropped from 10 to three.

Collaboration-app network outages globally decreased from five to two and stayed the same, at two, in the US.

There were two notable outages this week. Around 1:40 a.m. ET on Feb. 1, Hurricane Electric experienced an outage affecting countries including the US, New Zealand, and Brazil. The outage was centered on Hurricane Electric infrastructure in Los Angeles, CA. After five minutes, the number of interfaces affected there fell, and the outage appeared to affect users in the US only. The outage lasted around nine minutes and affected customers including Disney Streaming and LinkedIn. The issue was cleared around 1:50 a.m. ET. Click here for an interactive view of the outage.

On Feb. 2, TATA Communications (America) experienced an outage that affected some of its downstream partners and customers in countries including the US. It was first observed around 1:30 a.m. ET as TATA nodes located in Los Angeles, CA, appeared to show outage conditions. After five minutes, other nodes located in Seville, Spain, and Singapore were affected. As the number of affected nodes increased, so did the number of customer networks affected, including Wells Fargo, Reuters, Twitter, and Salesforce. The outage lasted around 18 minutes across a half-hour period and was cleared around 2 a.m. ET. Click here for an interactive view of the outage.

Update Feb. 1

Outages in all three categories globally ticked up by three, from 275 to 278, during the week and dropped from 132 to 119 in the US, a 10% decrease.

Globally, the number of ISP outages decreased from 217 to 214. In the US, levels slightly decreased too, from 106 to 102.

Cloud-provider network outages jumped from 18 to 30 worldwide, a 67% increase. US outages increased from nine to 10.

Collaboration-app network outages remained flat both globally and in the US, with five worldwide and two in the US.

Two notable outages affecting Comcast Cable and Verizon occurred during the week.

On Jan. 26, Comcast suffered a 24-minute outage first observed around 12 a.m. ET that appeared to be centered at Comcast nodes in Newark, NJ, affecting access to customer networks including Amazon, Bloomberg, and CBS. At 20 minutes into the outage, disruption was also observed in a New York, NY, node. The outage affected multiple Comcast peers and customers, and it was cleared around 12:25 a.m. ET. Click here for an interactive view of the outage.

On Jan. 12, Verizon experienced an outage that affected East Coast customers' ability to access services including Slack, Zoom, Amazon, and Google. Traffic disruption was observed around 11:30 a.m. ET across multiple nodes concentrated along the US Verizon backbone. During the outage Verizon indicated that a fiber cut affected service delivery in the Brooklyn, NY, area, but the cut is not believed to be directly related to the larger outage. Network services started to stabilize around 12:30 p.m. ET. Click here for an interactive view of the outage.

Update Jan. 26

Outages in all three categories worldwide rose from 215 to 275, a 28% increase compared to the week before. In the US they rose from 105 to 132, a 26% increase.

Globally, the number of ISP outages increased from 160 to 217, a 36% increase, and in the US they increased from 88 to 106, a 20% increase.

Cloud-provider network outages increased from 14 to 18 globally, a 29% increase, while in the US they jumped from two to nine, a 350% increase.

Collaboration-app network outages worldwide increased from two to five and from one to two in the US.

There were two notable outages during the week. On Jan. 20, Level 3 Communications experienced an outage affecting downstream partners and customers in countries including the US, South Africa, the UK, Turkey, Russia, New Zealand, and Australia. The outage was first observed around 12:20 p.m. ET and lasted about 34 minutes. It appeared centered on Level 3 nodes in Washington, DC, and affected customers including J.P. Morgan Chase, Visa International, and Oracle. It was cleared around 12:55 p.m. ET. Click here for an interactive view of the outage.

On Jan. 18, TATA Communications (America) experienced an outage affecting many of its downstream partners and customers in multiple countries including the US, the UK, Australia, India, Singapore, Germany, Hong Kong, Japan, and Canada. The outage was first observed around 7:40 a.m. ET. TATA nodes located in Newark, NJ; New York, NY; Frankfurt, Germany; London, England; Singapore; Paris, France; and Buckinghamshire, England all appeared to show outage conditions. After five minutes, TATA nodes located in Chicago, IL; San Jose, CA; Seville, Spain; Tokyo, Japan; and Hong Kong were affected. As the number of TATA nodes affected increased, so did the number of customer networks affected, including Wells Fargo, Reuters, Oracle, and Amazon. The outage lasted around 34 minutes and was cleared around 8:15 a.m. ET. Click here for an interactive view of the outage.

Update Jan. 19

Worldwide outages in all three categories increased from 157 to 215 over the previous week, an increase of 37%, and were up from 88 to 105 in the US, a 19% increase.

The total number of ISP outages increased from 122 to 160, up 31%, while in the US the increase was 19%, from 74 to 88.

Cloud-provider network outages doubled from 7 to 14 worldwide and remained the same in the US at two.

Globally there were two collaboration-app network outages, up from one, and they remained at one in the US.

There were two notable outages. On Jan. 13, AT&T experienced an outage that affected customers in multiple countries, including the US, UK, Japan, Germany, Canada, Australia, India, and the Netherlands. The outage started around 9:25 p.m. ET, centered on AT&T nodes located in Phoenix, AZ, lasted 23 minutes, and was cleared around 9:50 p.m. ET. Click here for an interactive view of the outage.

On Jan. 13, Microsoft experienced an outage that affected some downstream partners and access to services running in Microsoft environments. The 12-minute outage was first observed around 12:15 a.m. ET and occurred in three four-minute occurrences over a 30-minute period. It centered on Microsoft nodes in Des Moines, IA. It was cleared around 12:45 a.m. ET. Given the timing and uniform pattern of the outage, it is likely to have been an automated maintenance process. Click here for an interactive view of the outage.

Update Jan. 12

Global outages in all three categories increased from 96 to 157, up 64% from the week before, and in the US they jumped from 33 to 88, up 167%.

ISP outages worldwide went from 71 to 122, up 72%, and from 30 to 74 in the US, up 147%.

Globally, cloud provider network outages increased from 2 to 7, a 250% increase, and from zero to two in the US.

There was just one collaboration-app network outage and that occurred in the US. There were none the week before.

There were two notable outages during the week. On Jan. 4, Slack experienced an outage at 10 a.m. EST that lasted until after 1:40 p.m. It affected customers worldwide, with many users unable to log in, send or receive messages, or place or answer calls. Slack identified the cause as insufficient router capacity in its cloud-provider network to meet customer demand. Starting at 11:15 a.m. EST Slack implemented a fix, and many customers could use the service again by 12:15 p.m. Slack announced messaging service restoration at 1:40 p.m. EST, although its calendar integration features took longer to restore.

On Jan. 7, Cogent Communications experienced an outage at 4:40 p.m. that lasted just under an hour and that affected downstream providers and Cogent customers globally. It consisted of four outage occurrences over a two-hour period, the first of which centered on Cogent nodes in Amsterdam, the Netherlands, mainly affecting European countries. Five minutes later, Cogent nodes in Washington, DC, also exhibited outage conditions. At this point the Amsterdam nodes recovered, but the Washington, DC, nodes stayed down for another 35 minutes. Thirty-five minutes after the first outage cleared, the second outage was observed, centering on nodes in Oakland, CA. It lasted four minutes and affected only customers in the US. This was repeated five minutes later, this time lasting around three minutes. Following a five-minute break, a final four-minute outage was observed, this time centering on Cogent nodes in Las Vegas, NV, and Oakland, CA. The outage affected access to services including Amazon, Yandex (a Russian-based search engine), Oracle, and Sberbank (a state-owned Russian banking and financial services company). The outage was cleared around 6:35 p.m. EST. Click here for an interactive view of the outage.

Update Jan. 5

Outages in all three categories decreased from 172 to 96, a 44% decrease compared to the week prior. In the US, they decreased from 80 to 33, a 59% decrease.

Globally, ISP outages decreased from 135 to 71, down 47%. In the US, they dropped from 74 to 30, a 59% decrease.

Cloud-provider network outages decreased from five to two, and in the U.S., from two to zero.

There were no collaboration app network outages the previous two weeks.

Update Dec. 21

Total outages across all three categories dropped from the previous week, from 252 to 193, a 23% decrease. In the US, outages decreased from 115 to 89, also a 23% decrease.

Globally, the number of ISP outages decreased from 180 to 145, a 19% decrease, and in the US they decreased from 97 to 75, a 23% drop.

Cloud-provider network outages worldwide decreased from 11 to four, down 64%, while in the US they fell from 2 to 1.

There were three collaboration-app network outages during the week, all in the US. The week before there were four outages, none of them in the US.

There were two notable outages. On Dec. 14, between 6:50 a.m. and 7:30 a.m. EST, Google experienced a global outage. ThousandEyes tests measured elevated server wait times, indicating the application was taking longer to respond to service requests. During the service disruption, network paths connecting to Google’s edge servers did not show any traffic loss.

The other notable outage hit NTT America and affected some downstream providers and NTT networks in multiple countries including the US, Germany, Brazil, the UK, and Canada. The outage was first observed around 8:30 a.m. EST and appeared to be centered on NTT infrastructure in Los Angeles, California, and Seattle, Washington. The outage lasted just over 19 minutes and was cleared around 8:50 a.m. EST. Click here for an interactive view of the outage.

Update Dec. 14

Total outages in all three categories were up 26%, from 200 to 252, over the previous week, and they were up 39% in the US, from 83 to 115.

ISP outages worldwide increased from 129 to 180, a 40% increase. In the US, ISP outages increased from 66 to 97, a 47% increase.

Worldwide cloud-provider network outages increased from eight to 11, up 38%. In the US they increased from one to two.

Globally, there were four collaboration-app network outages, up from zero. None of them were in the US.

There were two notable outages during the week. On Dec. 10, Hurricane Electric experienced a 17-minute outage that hit users in the US, Canada, Germany, Egypt, Sweden, France, and the UK. The outage was first observed around 2:11 p.m. EST centered on Hurricane Electric infrastructure in Atlanta, Georgia, and 10 minutes later just in Dallas, Texas. The last two minutes affected Hurricane Electric interfaces in both Atlanta and New York, New York. The issue was cleared around 2:38 p.m. EST. Click here for an Interactive view of the outage.

On December 8, Cogent Communications experienced an outage that, though only lasting four minutes, affected multiple downstream providers, as well as Cogent customers globally. The outage was first observed around 4:50 p.m. PST across Cogent’s global infrastructure, with Cogent nodes in the US, Germany, UK, Spain, France, Switzerland, and Ireland all reflecting the outage. The outage affected access to services including Microsoft, Amazon, SAP, Disney Streaming, and Wells Fargo. The outage was cleared around 4:55 p.m. PST. Click here for an Interactive view of the outage.

Update Dec. 7

Worldwide outages in all three categories were up compared to the week before, from 159 to 200. In the US they were up from 48 to 83, a 73% increase.

Globally ISP outages increased from 119 to 129, up 8%. They were up 65% in the US from 40 to 66.

Overall cloud-provider network outages increased from five to eight, but in the US they dropped from four to one.

For the first time since late September, there were zero collaboration app network outages anywhere in the world. 

A notable outage occurred Dec. 2 when Level 3 Communications experienced a 14-minute outage that affected several downstream providers as well as Level 3 customers in the UK and Canada. It was first observed around 1:10 a.m. PST, centered on Level 3 nodes in Seattle, WA. Service was restored to many of the customers and providers after five minutes, and the outage was cleared at 1:25 a.m. PST.

Update Nov. 30

During the last week outages worldwide across all three categories decreased from 306 to 159, a 48% drop. In the US, they decreased 75%, from 193 to 48.

Globally, the number of ISP outages decreased by 54%, from 256 to 119. In the US they dropped 77%, from 176 to 40.

Cloud-provider network outages decreased overall from eight to five, a 38% decrease. In the US, they went up one, from three to four.

Collaboration-app network outages decreased from 4 to 1 worldwide, and in the US, the number dropped from 3 to 1.

A notable outage occurred on Nov. 25 when Kinesis, a key AWS service, suffered a day-long outage that affected other AWS services and many of its customers who rely on these services to run their businesses (including iRobot’s Roomba vacuum cleaner app). The outage was not network related, and ThousandEyes tests did not detect an elevation in packet loss during the incident. AWS later described the root cause as related to an operating system configuration in a detailed incident post-mortem.

Update Nov. 23

Total outages in all three categories were up 20% globally over the week before, from 256 to 306. In the US, the total rose 60%, from 121 to 193.

ISP outages globally were up 28%, from 200 to 256, and up 71%, from 103 to 176 in the US.

Globally, cloud-provider network outages decreased from 12 to 8, a 33% decrease. In the US the number remained at three for the fourth week in a row.

Collaboration app network outages worldwide increased from three to four. The US number was three, just like the week before.

There were two notable outages during the week. On Nov. 17, Cogent Communications experienced an outage that lasted over two hours, affecting several downstream providers, as well as Cogent customers globally. The outage was made up of two incidents over a three-hour period. The first was observed just after 3 a.m. EST and lasted around 48 minutes. It was observed in Cogent nodes in San Francisco, California, and Oakland, California, as well as Seattle, Washington. It affected access to organizations including Microsoft and ON24. Five minutes into the outage it expanded to nodes in locations including Salt Lake City, Utah; Denver, Colorado; Chicago, Illinois; Portland, Oregon; Los Angeles, California; and Cleveland, Ohio. This in turn affected a number of networks in the US and other countries. The number of Cogent nodes displaying symptoms decreased until around the 48-minute mark, when the only nodes displaying outages were those in San Jose, California.

The second outage was observed at around 4:15 a.m. EST, 20 minutes after the first one cleared. Though it lasted 90 minutes, the second outage centered on San Francisco and Oakland nodes in California. Networks affected included the California-based unified communications provider 8×8, as well as TikTok (Bytedance) and Microsoft. Click here for an interactive view of the outage.

Another notable outage occurred Nov. 18 about 3:25 a.m. affecting PCCW Global and some of its US East customers and partners using its network to access services including Twitter, TiVo, Ellie Mae and Verizon’s AirTouch. The outage lasted around 40 minutes and occurred over two incidents across an 80-minute period. Both incidents appeared to be focused on PCCW Ashburn, Virginia, nodes. The first incident began at around 3:25 a.m. EST and lasted 13 minutes. The second incident was observed 35 minutes later and lasted around 24 minutes. The outage was cleared at around 4:45 a.m. EST. Click here for an Interactive view of the outage.

Update Nov. 16

Global outages in all three categories increased 2%, from 251 to 256, and in the US they jumped 26% from 96 to 121.

ISP outages globally were up 1% from 198 to 200, while in the US they increased 23% from 84 to 103.

Outages in public cloud provider networks worldwide decreased 20% from 15 to 12 and stayed the same in the US at three.

Collaboration app network outages decreased from four to three globally, all of them in the US.

There were two noteworthy outages during the week. Microsoft suffered an outage at 1:20 p.m. EST Nov. 10 that lasted five minutes. It affected users in countries including the US, Mexico, Ireland, Russia, and China, and it was centered in Microsoft infrastructure in Des Moines, Iowa, and Cleveland, Ohio. Click here for an interactive view of the outage.

A Verizon outage centered at facilities in Kansas City, Kansas, and Newark, New Jersey, started at 2:25 a.m. EST, causing slow page loads for users. The New Jersey outage lasted five minutes and affected users on the U.S. East Coast, and the Kansas City outage lasted 10 minutes. Click here for an interactive view of the outage.

Update Nov. 2

Globally, outages observed across all three categories decreased from 227 to 214, a 6% decrease compared to the week prior. In the US, total outages decreased from 121 to 85, a 30% decrease compared to the week prior.

ISP outages worldwide decreased from 184 to 151, an 18% decrease. In the US, the number dropped from 107 to 67, a 37% decrease.

Cloud-provider network outages increased from 9 to 29, a 222% increase. In the US, outages increased from one to three.

Collaboration-app network outages globally dropped from five to one, an 80% decrease compared to the week prior, and in the US they dropped from three to zero.
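The week-over-week percentages quoted in these updates appear to follow the standard percent-change calculation, measured against the previous week's count. A minimal sketch, assuming that convention (the function name `pct_change` is illustrative and not part of any ThousandEyes tooling), using totals reported in this update:

```python
def pct_change(previous: int, current: int) -> float:
    """Return the change from previous to current as a percentage of previous."""
    if previous == 0:
        # A rise from zero has no defined percentage.
        raise ValueError("percent change is undefined when the previous count is zero")
    return (current - previous) / previous * 100


# Counts taken from the figures reported in this update.
print(round(pct_change(227, 214)))  # global outages, all categories
print(round(pct_change(184, 151)))  # global ISP outages
print(round(pct_change(9, 29)))     # global cloud-provider outages
```

Because the change is measured against the previous week's count, a decrease can never exceed 100%, while an increase has no upper bound.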

A notable outage was suffered by Cogent Communications on Oct. 24, affecting downstream providers and Cogent customers worldwide. The outage took place in two incidents over a 60-minute period, the first lasting 24 minutes and affecting Cogent nodes across the US including those in Washington, DC; New York, NY; Atlanta, GA; Dallas, TX; Los Angeles, CA; and San Francisco, CA. The second started about 15 minutes after the first ended and lasted about eight minutes, hitting the same locations. Click here for an interactive view of the outage.

Update Oct. 26

Globally, the number of total outages in all three categories decreased 4%, while US outages overall increased 9% compared to the week prior.

The number of ISP outages worldwide decreased by one, dropping from 185 to 184. In the U.S., the number increased from 93 to 107, a 15% jump. ISP outages accounted for 81% of all outages observed this week.

Globally, cloud-provider outages increased from seven to nine, and there was one outage in the US, where there were none the week before.

Collaboration-app networks suffered five outages worldwide, up two, and three in the US, one more than the week before.

Cogent Communications suffered a two-stage outage on Oct. 20 at 2:10 a.m. EDT. It affected Cogent customers globally as well as downstream providers connected to the Cogent network.

The first outage lasted about eight minutes and involved Cogent nodes in Chicago, Illinois; Denver, Colorado; Cleveland, Ohio; Salt Lake City, Utah; Dallas, Texas; and San Francisco, California. The second, 10-minute outage began around 15 minutes after the first ended and centered on Cogent infrastructure in San Jose, California. It affected fewer countries and no downstream providers. Click here for an interactive view of the outage.

Update Oct. 21

Globally, outages in all three categories fell 10%, from 261 the previous week to 236. In the US, outages fell 13%, from 128 to 111.

ISP outages worldwide declined 7%, from 199 to 185. The US saw a 15% drop, from 110 to 93.

Public cloud provider outages plunged 70% from 23 to 7 globally, while in the US they bottomed out, with zero reported outages.

Collaboration app network outages increased from two to three worldwide, with two of them occurring in the US, the same number as the week before.

A notable outage occurred about 3:30 a.m. EDT on Oct. 13, affecting the Zayo telecom network for more than 90 minutes and having an impact on other downstream providers. It started in Denver and spread to Zayo infrastructure in San Francisco, San Jose, Salt Lake City and parts of Australia and Europe. Click here for an interactive view of the outage.

Update Oct. 12

Globally, the number of outages observed in all three categories increased by 12% vs. the week before, from 233 to 261. In the US they increased 15%, from 111 to 128.

ISP outages worldwide increased 18%, from 168 to 199. In the US they rose 15%, from 96 to 110.

The number of outages in public cloud networks globally dropped from 28 to 23, a decrease of 18%. In the US they stayed steady at four.

In total there were two collaboration app outages, both in the US.

A notable outage for the week started Oct. 5 about 6 a.m. PDT and affected the collaboration app Slack. ThousandEyes tests returned 503 server errors, indicating the service was unavailable, as well as timeouts, suggesting that the application was running slower than normal. These problems were intermittent. No network issues were observed connecting to Slack’s edge servers, which are hosted within AWS. Slack confirmed issues within their backend systems and that they were resolved at 10 p.m. PDT. Click here for an interactive view of the outage.

Update Oct. 5

Globally, the total number of outages observed across all three categories increased by 21% from the week before, from 193 to 233. The increase was reflected in the US, where outages rose from 84 to 111, a 32% increase.

The number of ISP outages worldwide increased by 15%, rising from 146 to 168, accounting for 72% of all outages observed. In the US, the number rose from 72 to 96, a 33% increase.

Globally, cloud provider outages more than doubled from 11 to 28, a 155% increase. In the US, the number rose from one to four.

For the first time in three weeks there were no collaboration-app-network outages observed globally. The week before there were two.

A notable outage occurred about 3 a.m. EDT on Sept. 30 when Cogent, a US-based multinational transit service provider, experienced a service disruption that affected users around the world attempting to access Microsoft, Amazon, Facebook, and Google services. The outage lasted 41 minutes spread over three hours and affected multiple parts of Cogent’s US network. The timing and pattern of the outage indicate traffic-engineering activity as the cause. The service was restored about 5:50 a.m. EDT. Click here for an interactive view of the outage.

Update Sept. 28

The number of outages observed worldwide in all three categories decreased by 16% from the week prior, from 230 to 193. In the U.S., the number of outages increased by 11, a 15% increase.

Global ISP outages dropped from 175 to 146, down 17%. But in the U.S., outages rose from 63 to 72, an increase of 14%.

Public cloud outages worldwide increased from eight to 11, up 38%. U.S. public cloud outages remained stable at one.

Collaboration app network outages jumped 300% globally from one to four. Most of that was due to the 200% increase in the U.S. from one to three.

Google suffered a notable disruption about 9 p.m. EDT Sept. 24 that prevented many users around the world from accessing services including Gmail, YouTube, Google Calendar and Google Meet. Front-end servers remained reachable during the outage, but requests to access services returned errors. Google confirmed that a pool of servers that handled application traffic on the backend had crashed. Service was restored about 9:30 p.m. EDT. Click here for an interactive view of the outage.

Update Sept. 21

The number of outages reported globally in all three categories was 230 for the week Sept. 14-20, up 50% from 153 the week before. In the U.S., the count was 73 for the latest week, up two from the week before.

ISP outages rose 62%, from 108 to 175 worldwide, and from 57 to 63 in the U.S., an increase of 11%.

Public-cloud provider outages were down a third, from 12 to eight, with the count in the U.S. dropping from five to just one.

Collaboration-app network providers suffered a single outage this week, with that one occurring in the U.S. The week before, there were none.

Instagram and Amazon suffered notable outages during the week.

About 11:10 a.m. PDT on Sept. 17, Instagram experienced a service disruption that prevented many users worldwide from using the application. With no network or reachability issues with its front-end servers, and users receiving HTTP 502 error notifications, the cause appeared to be an application back-end issue. Service began to return about 11:15 a.m. PDT, with full service restored by 11:45 a.m. PDT. Click here for an interactive view of the outage.

About 2:45 p.m. EDT Sept. 14, Amazon suffered a 29-minute outage centered on nodes in Columbus, Ohio, and affecting Amazon cloud-compute instances at its Hilliard, Ohio, data center. The outage affected 99 interfaces and was contained to the one location. As a result, some users experienced non-responsive or slow EC2 instances. The outage was cleared just past 3 p.m. EDT.

Update Sept. 14

Globally the number of outages observed between Sept. 7 and 13 in all three categories decreased by 40% from the week before, from 256 to 153, the lowest figure observed since early February. In the U.S., the number of outages dropped from 134 to 71, a 47% decrease.

The number of ISP outages worldwide dropped 50% from 216 to 108. In the U.S. the decrease was 54% from 123 to 57, the lowest weekly number since early April.

Cloud-provider outages globally decreased by 25% from 16 to 12. In the U.S. they remained at five for the second consecutive week.

Globally and in the U.S. for the first week since early August, no collaboration app network outages were recorded.

Cogent Communications suffered three outages starting about 11:45 a.m. EDT on Sept. 11, lasting a total of 36 minutes. The three lasted 13 minutes, 4 minutes and 19 minutes, spread across just over an hour. All three centered on a Cogent node in Newark, N.J., and affected customers across the U.S. and also in the U.K., Netherlands, Canada, Mexico and India. The customers were using the network to access services such as Visa online services, Microsoft Office, and Shopify. The fact that the outages occurred during business hours, together with their focus, indicates that some form of control-plane condition was the cause. The problem was cleared about 12:50 p.m. EDT.

Update Sept. 7

Globally the number of outages observed in all three categories decreased by 33% from the week prior, from 381 to 256. In the U.S., the number of outages dropped by 46, decreasing from 180 to 134, a 26% decrease from the week prior.

Worldwide, the number of ISP outages decreased by 103, dropping from 319 to 216, a 32% decrease and accounting for 84% of all outages observed this week. In the U.S., the number of ISP outages decreased by 27, dropping from 150 to 123, an 18% decrease.

Cloud provider outages globally increased by a third from 12 to 16. In the U.S., outages more than doubled, rising from two to five.

There was just one collaboration app provider outage worldwide — not in the U.S. — down from two.

PCCW Global suffered two outages starting about 12:40 a.m. EDT Sept. 3, one lasting 20 minutes and the other lasting six. The first centered on PCCW nodes located in Atlanta, Ga., and affected services using the Charlotte Colocation and Affiliated Computer Services networks. The second started about half an hour after the first cleared, centered on PCCW nodes in Ashburn, Va., and affected access to Oracle Cloud services. All outages were cleared by 1:45 a.m. EDT. The cause was likely a traffic-engineering exercise.

About 6 p.m. PDT, Comcast suffered a four-minute outage that affected users in the western U.S. It centered on Comcast core devices in Sunnyvale, Calif., and mainly affected services across Comcast Xfinity networks (Comcast Cable Communications). The outage would likely have caused internet connectivity slowdowns and disruption for users.

Update Aug. 31

Globally the number of outages observed across all three categories increased by 29% from the week prior, rising from 296 to 381. This was the largest number of outages recorded in a single week this year. In the U.S. outages increased 70% compared to the week prior from 106 to 180.

The vast majority of outages were due to ISP problems. Worldwide the number jumped from 214 to 319, with the count in the U.S. growing from 80 to 150.

Public cloud outages declined worldwide from 27 to 12 and from four to two in the U.S.

Collaboration app network outages stayed steady at two worldwide, with both of them occurring in the U.S., where the count rose from zero to two.

CenturyLink suffered a major outage just after 6 a.m. EDT Aug. 30 that hit a broad range of providers and businesses including Twitter, Microsoft (Xbox Live), Discord, Reddit, Cloudflare, OpenDNS, and Hulu. Shortly after the outage began, providers started rerouting traffic from CenturyLink to alternate providers in an effort to alleviate the impact. However, given the size and distribution of CenturyLink’s network, many services were still unreachable, ThousandEyes said. At 8:13 a.m. EDT, CenturyLink announced it was investigating issues affecting some services within its Mississauga, Ontario, Canada data center. Having identified the cause as an incorrect flowspec announcement from the Mississauga data center, CenturyLink requested that its Tier 1 Internet provider partners de-peer and ignore any traffic coming from its network. (BGP flow specification (flowspec) is a feature that allows you to rapidly deploy and propagate filter policies among a large number of BGP peer routers.) In order to resolve the issue, CenturyLink reset all the equipment and started with clean BGP routing tables, a process that took almost five hours to complete. Just before 3 p.m. EDT, CenturyLink announced that the issue had been resolved and all services had been restored.

Update Aug. 24

Globally the total number of outages observed across all three categories during the week Aug. 17-23 increased by 21% compared to the week prior, rising from 245 to 296. In the U.S., outages rose from 90 to 106, an increase of 18% from the week prior.

ISP outages worldwide rose from 166 to 214 and from 72 to 80 in the U.S.

Public cloud network outages dropped worldwide from 28 to 27, and stayed the same in the U.S. at four.

Collaboration app network outages rose from zero to two globally, but remained at zero in the U.S.

ThousandEyes flagged three notable outages during the week.

Just after 8 a.m. EDT on Aug. 18, Spotify suffered an outage that prevented users from streaming songs from the service. The outage lasted just over an hour; during it, the service would play songs for a few seconds, then pause and return an error. The outage is believed to be associated with an expired TLS certificate. Click here for an explanation on the impact of certificate expiration.

About 11:30 p.m. EDT on Aug. 17, Equinix suffered a power outage to a colocation center in Docklands, London. About 2 a.m. the failure of an output static switch from a UPS system triggered a fire alarm, resulting in loss of power for multiple customers. At 3:50 a.m. services started to be restored and were fully restored by 4:50 p.m. EDT. Affected customers included BT, Sky, Virgin Media, Giganet, Epsilon, SiPalto, EX Networks, Fast2Host, ICUK.net, and Evoke Telecom.

About 10:50 p.m. PDT on Aug. 19 Cogent Networks suffered a 36-minute outage affecting U.S. users’ access to Microsoft networks and associated services, as well as CDN content for services such as TikTok and ESPN. The outage affected nodes across the U.S. and apparently resulted from a configuration adjustment. A second outage two hours later at 11:26 p.m. PDT lasted 24 minutes and likely was connected to the first outage’s configuration adjustment. It affected users in the U.S., Asia-Pacific and Europe, Mid-East and Africa. Click here for an interactive view of the outages.

Update Aug. 17

Global outages across all three categories fell between the weeks of Aug. 3-9 and Aug. 10-16 from 294 to 245 (-17%) and in the U.S. from 123 to 90 (-27%).

ISP outages dropped worldwide from 227 to 166 and from 109 to 72 in the U.S.

Public cloud outages worldwide fell from 30 to 28 and from five to four in the U.S.

Collaboration app network outages worldwide remained at 0 for the second week in a row.

Cogent Networks suffered a notable outage at about 10:30 p.m. EDT on Aug. 13 that lasted about 40 minutes and affected its Atlanta, Ga., network. It affected access to Microsoft networks and associated services, such as Sharepoint, Office, Azure services and hosting, and appeared to be located in the Cogent data center in Atlanta. Based on the affected interfaces and nodes it appears it was a result of configuration adjustments rather than a control-plane issue.

Separately, BT incurred an outage on its European backbone about 7:30 p.m. EDT affecting customers and partners in the U.K., U.S., Sweden, and Germany. The outage came in three four-minute intervals spanning 25 minutes, indicating an automated restoration process, and likely was for maintenance. The outage cleared at 7:55 p.m. EDT.

Update Aug. 10

Globally, there were no collaboration app network provider outages observed this week. In the U.S., this is the second consecutive week of zero outages.

Overall the number of outages in all three categories increased from 248 to 294, the highest tally since late April. In the US the total was up from 99 to 123.

ISP outages globally increased from 181 to 227. In the U.S. the increase was from 88 to 109.

Cloud-provider outages worldwide rose from 18 to 30, and in the U.S. increased from three to five.

Collaboration app network outages dropped from 1 to 0. U.S. outages remained at 0.

About 8:25 p.m. PDT on Aug. 4, Cogent Networks experienced a 15-minute network disruption affecting parts of its San Francisco network and its infrastructure in the U.K., Germany and the Netherlands. It affected nearly 70 network interfaces. The scope and timing of the disruption indicate the provider was making service adjustments or performing maintenance. An interactive visualization of the outage is here.

About 3:25 a.m. CDT on Aug. 5, GTT had a 10-minute network disruption affecting parts of their infrastructure in Dallas, Chicago, Los Angeles, and London. The timing and scope of the disruption are consistent with service-adjustment activity. Interactive visualization of the outage is here.

Update Aug. 3

During the week of July 27-August 2, the number of outages globally in all three categories decreased by 6% from the week prior, from 263 to 248. In the U.S., outages rose from 90 to 99, a 10% increase from the week prior.

The number of ISP outages globally decreased by 1%, dropping from 183 to 181. In the U.S., ISP outages rose from 73 to 88, a 21% increase compared to the week prior.

Worldwide cloud provider outages decreased by 38% when compared to the week prior. In the U.S., there were three public cloud network outages for the third consecutive week.

Globally, collaboration app network provider outages decreased from 3 to 1, a drop of 66% when compared to the week prior. In the U.S., no collaboration app network outages were recorded this week.

There were two noteworthy outages during the period:

Verizon Business suffered an outage within its network that affected users accessing services such as Zoom, Bloomberg Professional and Flagstar Bank. The outage centered on former UUNET nodes located in San Jose, Calif., and Seattle. The outage occurred just before 11 a.m. PDT on July 27 and lasted a total of 27 minutes over a 55-minute period. The outage cleared around 11:55 a.m. PDT.

Reddit users began to experience errors when accessing Reddit’s site around 10:30 a.m. EDT on July 29. During the incident, the Reddit site was reachable, but many of the page components produced errors, either failing to load or simply not responding to requests, all of which is indicative of an application issue as opposed to a network disruption. A fix was implemented by Reddit at 1:32 p.m. EDT, and Reddit announced that the issue had been resolved at 3:24 p.m. EDT.

Update July 27

During the week July 20-26, the number of outages globally in all three categories increased by 14% from the week prior, from 231 to 263. In the U.S., outages rose from 70 to 90, a 29% increase from the week prior.

The number of ISP outages globally increased by 5%, from 175 to 183. In the U.S., ISP outages rose from 60 to 73, a 22% increase and a return to late June levels.

Cloud-provider outages worldwide were almost double, increasing 93%, from 15 to 29. In the U.S., there were three public cloud network outages for the second consecutive week.

Globally, collaboration-app network provider outages increased from 1 to 3, a rise of 200%, with all outages attributed to a single provider in the U.S. These were the first collaboration outages seen domestically since mid-June.

The most noteworthy outage of the week occurred just after 3:15 a.m. EDT on July 23 when services on Garmin.com and Garmin Connect became interrupted. The outage – which at the time of this writing is ongoing – also affects Garmin call centers, which were unable to receive calls and emails or participate in online chats. The network connectivity to Garmin services remains active, but syncing data and accessing functions on Garmin Connect remain down. Since Thursday, users attempting to access these functions have been met with a “Server Maintenance” message. In a press release on the 27th, Garmin confirmed it suffered a cyber attack that encrypted some of their systems, resulting in many of their online services being interrupted.

Update July 20

During the week of July 13-19 global outages of all three kinds dropped 19% from the week before, from 285 to 231. The drop in U.S. outages was even greater – 28% – from 97 to 70.

ISP outages dropped globally from 215 to 175 or 19%. In the U.S. they dropped 34%, from 91 to 60.

Cloud provider outages dropped 58%, from 36 to 15, and most of those occurred in South America. U.S. outages rose from two to three, or 50%.

Globally, collaboration-app network outages decreased from four to one, a drop of 75%, with the outage attributed to a single provider in the U.K. There were no outages in the U.S. for the fifth week in a row.

GitHub suffered an outage just after 2:30 a.m. EDT July 13 that lasted until 4:31 a.m. EDT. Users were affected worldwide. GitHub hasn’t provided details about what caused the outage, but ThousandEyes said there are indications that the source was within GitHub services.

WhatsApp suffered an outage for about an hour on July 14 starting about 6:45 p.m. EDT that prevented users globally from sending and receiving messages on the service. During the outage, users could connect to the service, but once it loaded they were unable to execute any functions. WhatsApp confirmed to ThousandEyes that the cause was an internal update to servers.

Update July 6

For the week June 29-July 5, the number of global outages across all three categories increased from 199 to 208, a 5% increase. In the U.S., however, outages dropped from 83 to 63, a 24% decrease from the week prior.

Globally, the number of ISP outages decreased 5%, from 160 to 152. The number of U.S. ISP outages decreased as well, from 77 to 55 outages. Both drops represent the lowest numbers of ISP outages since February.

Worldwide, cloud-provider outages decreased by 11%, from 28 to 25. The lone cloud-provider outage recorded in the U.S. this week was a decrease of 80% from five outages the week before.

Globally, collaboration-app network provider outages increased from 0 to 2, the first outages recorded since early June. The U.S. had zero collaboration app outages this week, recording just two outages in all of June.

There were two noteworthy outages during the period:

On June 29 at 8:15 a.m. PDT a power failure affected the Google Compute Engine in service zones us-east1-c and us-east1-d. Customers experiencing the service interruption would not have been able to reach existing virtual machines or create new ones. Other zones in the region were not impacted, so a redundant architecture, where workloads are hosted in multiple zones within a region, would have mitigated user impact. Google announced that all services had been restored and the issues resolved at 1:06 p.m. PDT.

On July 4 about 5 p.m. PDT Comcast suffered a 33-minute outage affecting U.S. users and those in multiple other countries trying to access services using the Comcast network. The outage was caused by two events over a 40-minute period and affected Comcast nodes on the U.S. East and West coasts and in the central region. The outage was cleared at 5:45 p.m. PDT.

Update June 29

The total number of global outages for the week of June 22-28 decreased by 29% from the week prior, reaching the lowest number of outages observed since early April. In the U.S., the number of outages was down by 20%.

ISP outages were also down to the lowest levels recorded in the past eight weeks. Globally, the number was down by 26% this week, dropping from 216 to 160. In the U.S., ISP outages were down by 20% compared to last week, from 96 to 77.

Globally, cloud-provider outages decreased by 39% this week from 46 to 28, with the bulk being attributed to South America. In the U.S. cloud provider outages were down by 55% from 11 to five compared to the week prior.

Globally, last week saw zero collaboration app network provider outages for the second week in a row.

Comcast Cable Communications suffered a 24-minute outage affecting users across the U.S. accessing services including Zoom, Visa and Bank of America. The outage was focused on Comcast infrastructure located in Seattle, Wash., and was cleared just after 2:30 a.m. PDT.

Update June 22

Cloud provider outages spiked to new record-level highs for the week of June 15-21. Globally, the number of cloud provider outages increased from 20 to 46, a 130% increase. In the U.S., the number of outages increased 175%, from 4 to 11.

Last week also saw record-level lows. For the first time since the week of February 24, there were zero collaboration app network provider outages both globally and in the U.S.

Globally, the number of ISP outages decreased marginally last week, dropping from 221 to 216. In the U.S., however, the number of ISP outages increased by 22%, compared to the week prior.

From a total outage perspective, the number decreased marginally globally, from 287 to 282. The U.S., however, saw a 14% increase in outages relative to the week prior, from 99 to 113.

An outage of note occurred June 18 at 2:45 PDT and lasted 23 minutes, affecting multiple countries including Australia, France, Germany and the U.K. The outage affected access to Microsoft services including some identity systems and appeared to originate in Microsoft nodes in Des Moines, Iowa. The outage was divided into two outages over two hours, concluding just after 5 p.m. PDT. Click here for an interactive view of the outage.

Update June 15

During the week June 8-14, worldwide the number of total outages in all three categories rose 35%, and jumped 34% in the U.S.

ISP outages globally increased 32% from 168 to 221 and rose 16% in the U.S. from 68 to 79.

Cloud-provider outages decreased from 23 to 20 (-13%) and doubled from two to four in the U.S.

Collaboration-network outages quadrupled from one to four worldwide and remained flat in the U.S. with one outage.

Verizon Business suffered a two-minute outage June 10 at 2:50 p.m. PDT that affected users in multiple countries trying to access services from Microsoft, Zoom and Amazon, among others. The outage centered on infrastructure in Seattle. The brevity of the problem indicates the systems were reacting automatically to address a fault.

Update June 8

During the week of June 1-7 the number of outages across all three categories of providers decreased 15% (249 to 212) worldwide vs. the week before, and dropped 29% in the U.S. (104 to 74).

ISP outages worldwide fell from 196 to 168, and in the U.S. the decline was from 91 to 68.

Public-cloud outages globally dropped from 26 to 23, and from six to two in the U.S.

Outages for collaboration-app networks all took place in the U.S. over the past two weeks and fell from seven to one.

Cable operator Spectrum sustained an outage June 2 that affected multiple channels across the U.S., with East Coast viewers suffering the worst of it for about three hours from just after 9 p.m. Eastern to around midnight.

Update June 1

The total number of outages worldwide among all three categories decreased 11% from 280 to 249 during the week of May 25-31, and declined 10% in the U.S. from 115 to 104.

The number of ISP outages decreased 13% globally from 225 to 196. In the U.S. they dropped from 109 to 91, a 17% reduction.

Public-cloud outages dropped from 35 to 26 worldwide but increased in the U.S. from two to six.

Collaboration app network provider outages totaled seven worldwide, up from four the prior week, and in the U.S. jumped seven-fold from one to seven.

Amazon suffered an outage May 28 starting about 12:12 p.m. PDT when its website became unreachable for some users globally due to a DNS failure, making it impossible for those users to reach Amazon’s servers. The DNS servers were available at the time, so the problem was likely a misconfiguration. The issue was resolved by 12:38 p.m. PDT.

Update May 25

ISP outages jumped by more than a third in the U.S. during the week ending May 24, while outages among all three categories of provider registered a small increase.

Total outages among all categories rose from 263 to 280 globally, and from 86 to 115 in the U.S.

ISP outages rose from 223 to 225 worldwide, with most of the increase due to outages in the U.S., which jumped from 80 to 109.

Public cloud outages overall rose from 24 to 35, with U.S. outages only ticking up from one to two.

Collaboration app network outages dropped from five to four compared to the week before, with a drop in U.S. outages from five to one accounting for the improvement.

There were two noteworthy outages during the week:

  • Just after 3 a.m. EDT on May 20, Google suffered an outage in the East Coast part of its network that affected users accessing sites such as Uber and Shopify that are hosted by the public cloud provider. The outage lasted nine minutes and was located in the New York City metro area, and since it was during off-peak hours, impact on users was likely minimal. Click here for an interactive visualization of the outage.
  • About 8 a.m. EDT on May 22, Hurricane Electric suffered an outage that lasted more than an hour and affected several countries. The worst part lasted 44 minutes. The outage was observed at Hurricane Electric nodes across multiple global locations, and affected users reaching sites including Microsoft, Amazon, Workday and Credit Suisse. Click here for an interactive visualization of the outage.

Update May 18

The total outages globally leapt up 22% between the week of May 4-10 and the week of May 11-17, from 216 to 263.

ISP outages worldwide grew from 183 to 223, while those outages in the U.S. moved up from 74 to 80.

Public cloud outages grew from 13 to 24 worldwide, but dropped from three to one in the U.S.

Collaboration app provider outages dropped worldwide from six to five, while for the third week in a row the U.S. outages totaled five.

One noteworthy outage left users around the world unable to load content hosted by YouTube for about half an hour on May 14 starting about 4 p.m. PDT. Users could connect to YouTube servers, but the service itself didn’t load properly. ThousandEyes said a critical object on the site was responding erroneously, indicating a web-application issue, perhaps due to a site update or change.

Update May 11

Overall outages dropped for the second week in a row, from 282 the week before last to 216 (23%) last week. In the U.S. outages fell from 98 to 83 (15%).

ISP outages worldwide were down 22% (from 236 to 183) and in the U.S. they fell 18% (from 90 to 74).

Cloud networking outages dipped from 13 to 12 (7%) worldwide, and in the U.S. dropped from 3 to 1 (66%).

Collaboration-application network outages fell worldwide from seven to six (14%) and stayed steady in the U.S. at five.

ISP Cogent Communications suffered a notable 38-minute outage starting on May 5 about 12 a.m. PDT, affecting its infrastructure in the U.S., U.K., Canada and France. Traffic terminated in Cogent’s network, and users were unable to reach sites such as Amazon, Microsoft, WorldPlay and Oracle Cloud.

Update May 3

Overall, the number of outages dropped across the board last week, from 313 to 282, a decrease of 31 worldwide, mainly due to a drop in overall U.S. outages from 132 to 98.

Most of that was driven by a decline by at least half of outages for both ISPs and collaboration providers.

Total ISP outages dipped from 250 down to 236 globally, while ISP outages in the U.S. went from 124 to 90. That’s the second week in a row of declines.

Worldwide, collaboration provider outages dropped from 14 to seven, but rose from three to five in the U.S.

Public-cloud outages also declined week-over-week from 26 to 12 globally and from four to one in the U.S.

Virgin Media suffered a noteworthy outage during the evening of April 27 in the UK, Ireland and the Netherlands. It started at 5:15 p.m. local time and lasted 15 minutes, then occurred again at 6:15 for another 15 minutes. The pattern repeated several more times, transitioning to briefer outages that ended by 1:30 a.m. April 28. Based on the pattern, ThousandEyes suggests that an automation issue could have been the root cause, though no official reason had been made public.

Update April 27

Globally, outages hit a record high during the week of April 20-26 – 313 – up 11% from the week prior and up 77% from the temporary decrease the week of April 6. The number of outages is the most since the end of March, but two issues – fiber cuts in CenturyLink’s network and a broad Tata Communications outage – helped push that number up. Outages in the U.S. hit record numbers, too, at 132.

The ISP outages worldwide tallied 250, and in the U.S. spiked to 124, a new high including the CenturyLink and Tata problems.

After a two-week downward trend, cloud provider outages globally increased to 26, the same levels registered in late March and early April. In the U.S. they went down slightly from six to four. Overall, these numbers are still in the normal range and in general, cloud providers continue to hold up well.

Collaboration-application-network outages increased slightly over the previous week from 11 to 14 but continue to remain low relative to the peak of 29 observed in the week of March 30-April 5. U.S. outages dipped from four to three.

The major outage in the Tata Communications network on April 20 affected its infrastructure in the U.K., France, Germany, and India. Around 11 a.m. local U.K. time, traffic attempting to reach services such as Amazon, ServiceNow, and Oracle Cloud began terminating in its network, affecting local users, users in the U.S. and elsewhere. The outage lasted about 20 minutes, and affected more than 80 network interfaces across multiple regions and cities.

Another far reaching outage occurred the next day in the U.S., when at least one fiber cut within CenturyLink’s network in Southern California affected enterprises and consumer users up and down the West Coast, and as far away as Raleigh, NC. It affected the Level 3 part of CenturyLink’s network, a transit provider it acquired in 2017. Merrill Lynch reported a disruption to its business as a result of the network outage, during which its brokers were intermittently unable to access their workstations. The incident started hitting enterprises and their users around 10 a.m. ET, with most disruption resolved by around 11:30 a.m. ET.

Update April 20

Total outages spiked 59% during the week of April 13-19, fueled by one prolonged outage that had a significant effect on multiple ISPs.

That one outage affected TeliaNet, Level 3, AT&T, and other ISPs on April 13. TeliaNet was the most affected of the group, and it’s not clear whether it was the cause of the outage, ThousandEyes says. During the downtime, at least one application provider withdrew routes through the TeliaNet network until the next day.

Had that one outage not occurred, the total number of outages for the week would have been in the low 200s, which ThousandEyes says is in the normal range. As it turned out, total outages rose 59% from 177 to 282.

ISP outages jumped from 141 to 243 week over week, up 72% worldwide, and from 56 to 98 (75%) in the U.S.

Public cloud outages dropped off from the week before, from 19 to 14 (down 26%) worldwide, and stayed steady at six outages in the U.S.

Global application-provider networks had a slight increase in outages worldwide, up from 9 to 11 (22%), but dropped from 9 to 4 in the U.S., down 55%.

In other major outages, ThousandEyes said it appeared that several banks effectively suffered denial of service conditions when customers apparently flooded their sites seeking to find out whether they’d received their pandemic-related stimulus checks. Content-delivery networks serving the banks didn’t have network issues yet were unable to return Web content for many banking sites, “likely due to bank origin servers unable to handle the high volume of requests,” ThousandEyes says.

Update April 13

During the week of April 6-April 12, service outages for ISPs, cloud providers, and conferencing services dropped overall. They went from 298 down to 177 globally (40%, a six-week low), and in the U.S. dropped from 129 to 72 (44%).

Globally, ISP outages were down from 229 to 141 (38%), and in the U.S. were down from 100 to 56 (44%).

Cloud provider outages were also down overall from 25 to 19 (24%), ThousandEyes says, but jumped up from one to six (500%) in the U.S., which saw the highest rate of increase in seven weeks. Even so, the U.S. total was relatively low. “Again, cloud providers are doing quite well,” ThousandEyes says.

Conferencing services recovered from a spike the week before, and all of the outages – nine – were in the U.S. Globally outages dropped from 29 to nine (68.9%), and in the U.S. from 25 to nine (64%).

Update April 6

Outages for ISPs globally were down 9.13% during the week of March 30 from the week before, falling from 252 to 229, while U.S. outages were down 16.7%, dropping from 120 to 100. Public cloud outages rose worldwide from 22 to 25, and in the U.S. there was one outage, up from zero the previous week.

Outages for collaboration apps rose dramatically, increasing more than 260% globally and more than 500% in the U.S. over the week before. The actual numbers were an increase from eight to 29 worldwide, and up from 4 to 25 in the U.S.
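The percentages quoted here follow directly from the raw counts. As a minimal sketch of the week-over-week calculation, using the collaboration-app figures from this update (the function name `pct_change` is illustrative and does not come from ThousandEyes):

```python
def pct_change(previous: int, current: int) -> float:
    """Return the week-over-week change as a percentage of the previous value."""
    if previous == 0:
        # A rise from zero has no meaningful percentage.
        raise ValueError("percentage change is undefined when the previous value is zero")
    return (current - previous) / previous * 100


# Collaboration-app outage counts reported for the week of March 30
worldwide = pct_change(8, 29)  # 262.5, reported as "more than 260%"
us = pct_change(4, 25)         # 525.0, reported as "more than 500%"
```

The zero check matters for cases such as the U.S. public cloud count, which rose from zero to one: no percentage can be stated for that change, which is why only the raw numbers are given.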

ISP Cogent Communications suffered what ThousandEyes called a significant outage April 1 from 12:30 p.m. to 12:35 p.m. Pacific time that affected the ability of users to connect to sites and services such as Office 365. Because Cogent peers with other providers, the customers of those providers might have experienced disruption to some services as well.

Yelp and some applications and sites hosted by AWS and Cloudflare were unreachable between 12:35 and 12:40 p.m. Pacific time on April 1 when Russian ISP Rostelecom leaked illegitimate IP address prefixes to its ISP peers, including Level 3. Such leaks lead to incorrect or less than optimal routing, according to ThousandEyes.

In this case, the leak improperly inserted Rostelecom into the network path between users and the affected providers. Level 3 propagated those improperly advertised routes to its peers, setting off a chain of events that led to massive traffic drops during the outage time.
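To make the mechanism concrete, here is a simplified sketch of how an unexpected network shows up in a route's AS path after a leak. It is an illustration only, not how ThousandEyes or any carrier detects leaks, and production monitoring relies on far richer data. The AS numbers are placeholders from the range reserved for documentation (64496-64511), not the real numbers of the networks named above, and `find_unexpected_hops` is a hypothetical helper.

```python
def find_unexpected_hops(as_path, expected_asns):
    """Return the ASNs in an observed path that are not in the expected set, in path order."""
    return [asn for asn in as_path if asn not in expected_asns]


# Placeholder ASNs from the documentation-reserved range (assumption, not real networks)
USER_ISP, TRANSIT, LEAKER, PROVIDER = 64496, 64497, 64498, 64499

# Networks that normally sit between the user and the provider
expected = {USER_ISP, TRANSIT, PROVIDER}

normal_path = [USER_ISP, TRANSIT, PROVIDER]
leaked_path = [USER_ISP, TRANSIT, LEAKER, PROVIDER]  # leaker inserted into the path

normal_result = find_unexpected_hops(normal_path, expected)  # empty list
leaked_result = find_unexpected_hops(leaked_path, expected)  # contains only LEAKER
```

Once a transit network accepts and re-advertises the leaked route, every peer that prefers it sees the longer path, which is how a single leak can spread widely.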

Update March 31

Looking at data over the past six weeks, ThousandEyes finds that the combined worldwide service outages among ISPs, public cloud providers, conferencing services and edge networks (content-delivery networks, DNS, and security as a service) have risen 42%.

Cloud-provider performance hasn’t been affected much at all, and in fact multiple weeks last year had a much higher number of outages.

Week of March 23

Between the week of March 16 and March 23, the outages suffered by ISPs worldwide went down from 230 to 203, nearly 12% lower. In the U.S., the number of outages rose from 100 to 107, up 7%.

Public cloud outages were down both worldwide and in the U.S. Worldwide, they dropped from 21 to 15 (down 28%), and in the U.S. dropped from six to zero. A router failure in Atlanta did disrupt Google traffic, but it did not meet ThousandEyes’ definition of an outage, and it wasn’t related to COVID-19.

Collaboration applications also showed a decline in outages from the week before, dropping from 15 to six worldwide, and down from seven to three in the U.S., reductions of 60% and 57%, respectively.

ThousandEyes highlighted what it considered significant outages:

  • “Cogent Communications suffered yet another significant outage this week — its fifth major outage this month. The outage occurred within parts of Cogent’s network in Northern California and Oregon and impacted users connecting to sites and services in those regions, including projectbaseline.com, the website of Verily’s much-publicized COVID-19 testing program.”
  • “For approximately 20 minutes on March 25th, ThousandEyes observed that some users located on the East Coast may not have been able to reach Google services due to 100% traffic loss. A short time later, Google’s SVP of Engineering tweeted that the incident was due to a router failure in Atlanta, Georgia. US users outside of the Northeast were also impacted intermittently, although they would have experienced the incident as site errors when trying to reach some Google sites, such as google.com. The HTTP server errors seen during this period are consistent with an inability to reach the backend systems necessary to correctly load various services. Any traffic traversing the affected region — connecting from Google’s front-end servers to backend services — may have been impacted and seen the resulting server errors.”

With the increased use of remote-access VPNs, major carriers are reporting dramatic increases in their network traffic – with Verizon reporting a 20% week-over-week increase, and Vodafone reporting an increase of 50%.

While there has been no corresponding spike in outages in service provider networks, over the past six weeks there has been a steady increase in outages across multiple provider types both worldwide and in the U.S., all according to ThousandEyes, which keeps track of internet and cloud traffic.

This includes “a concerning upward trajectory” since the beginning of March of ISP outages worldwide that coincides with the spread of COVID-19, according to a ThousandEyes blog by Angelique Medina, the company’s director of product marketing. ISP outages worldwide hovered around 150 per week between Feb. 10 and March 19, but then increased to between just under 200 and about 225 during the following three weeks.

In the U.S. those numbers were a little over 50 in the first time range and reaching about 100 during the first week of March. “That early March level has been mostly sustained over the last couple of weeks,” Medina writes.

Cogent Communications was one ISP with nearly identical large scale outages on March 11 and March 18, with “disruptions for the fairly lengthy period (by Internet standards) of 30 minutes,” she wrote.

Hurricane Electric suffered an outage March 20 that was less extensive and shorter than Cogent’s but included smaller disruptions that altogether affected hundreds of sites and services, she wrote.

Public-cloud provider networks have withstood the effects of COVID-19 well, with slight increases in the number of outages in the U.S., but otherwise relatively level around the world. The possible reason: “Major public cloud providers, such as AWS, Microsoft Azure, and Google Cloud, have built massive global networks that are incredibly well-equipped to handle traffic surges,” Medina wrote. And when these networks do have major outages it’s due to routing or infrastructure state changes, not traffic congestion.

Some providers of collaboration applications – the likes of Zoom, Webex, MSFT Teams, RingCentral – also experienced performance problems between March 9 and March 20. ThousandEyes doesn’t name them, but does list performance numbers for what it describes as “the top three” UCaaS providers. One actually showed improvements in availability, latency, packet loss and jitter. The other two “showed minimal (in the grand scheme of things) degradations on all fronts — not surprising given the unprecedented strain they’ve been under,” according to the blog.

Each provider showed spikes in traffic loss over the time period that ranged from less than 1% to more than 4% in one case. In the case of one provider, “outages within its own network spiked last week, meaning that the network issues impacting users were taking place on infrastructure managed by the provider versus an external ISP.”

“Outage incidents within large UCaaS provider networks are fairly infrequent; however, the recent massive surge in usage is clearly stressing current design limits. Capacity is reportedly being added across the board to meet new service demands,” according to the blog.

Meanwhile, ThousandEyes has introduced a new feature on its site, a Global Internet Outages Map that is updated every few minutes. It shows recent and ongoing outages.

Google outage unrelated to COVID-19

On March 26 Google suffered a 20-minute outage on the East Coast of the U.S., apparently from a router failure in Atlanta, ThousandEyes said, agreeing with a statement put out by Google to that effect.

That problem affected other regions of the U.S. as evidenced by Google sites such as google.com intermittently returning server errors. “These 500 server errors are consistent with an inability to reach the backend systems necessary to correctly load various services,” ThousandEyes said in a statement. “Any traffic traversing the affected region — connecting from Google’s front-end servers to backend services — may have been impacted and seen the resulting server errors.”

ThousandEyes posted interactive results of tests it ran about the outage here and here.


Networking
]]>
https://www.networkworld.com/article/968540/covid-19-weekly-health-check-of-isps-cloud-providers-and-conferencing-services.html 968540
Cisco aims AI advancements at data center infrastructure Wed, 20 Mar 2024 19:17:21 +0000

It wasn’t that long ago that ideas about revamping data-center networking operations to handle AI workloads would have been confined to a whiteboard. But conditions have changed drastically in the past year.

“AI and ML were on the radar, but the past 18 months or so have seen significant investment and development – especially around generative AI. What we expect in 2024 is more enterprise data-center organizations will use new tools and technologies to drive an AI infrastructure that will let them get more data, faster, and with better insights from the data sources,” said Kevin Wollenweber, senior vice president and general manager of Cisco’s networking, data center and provider connectivity organization. Enterprises also will be able to “better handle the workloads that entails,” he said.

A flurry of recent Cisco activity can attest to AI’s growth at the enterprise level.

Cisco’s $28 billion Splunk acquisition, which closed this week, is expected to drive AI advancements across Cisco’s security and observability portfolios, for example. And Cisco’s newly inked agreement with Nvidia will yield integrated software and networking hardware that promises to help customers more easily spin up infrastructure to support AI applications.

As part of the partnership, Nvidia’s newest Tensor Core GPUs will be available in Cisco’s M7 Unified Computing System (UCS) rack and blade servers, including UCS X-Series and UCS X-Series Direct, to support AI and data-intensive workloads in the data center and at the edge, the companies stated. The integrated package will include Nvidia AI Enterprise software, which features pretrained models and development tools for production-ready AI.

“The Nvidia alliance is actually an engineering partnership, and we are building solutions together with Nvidia to make it easier for our customers – enterprises and service providers – to consume AI technology,” Wollenweber said. The technologies they deliver will enable AI productivity and will include toolsets to build, monitor and troubleshoot the fabrics so they run as efficiently as possible, Wollenweber said. “Driving this technology into the enterprise is where this partnership will grow in the future.”

AI accelerates network investments

Greater network capacity will be a requirement for AI deployments, industry watchers note.

According to research firm IDC, revenues in the data-center portion of the Ethernet switching market rose 13.6% in 2023 as enterprises and service providers required ever-faster Ethernet switches to support rapidly maturing AI workloads. “To illustrate this point, revenues for 200/400 GbE switches rose 68.9% for the full year in 2023,” IDC analyst Brandon Butler said in a Network World article.

“The Ethernet switching market in 2023 was dominated by the impact of AI, with the overall market rising 20.1% in 2023 to reach $44.2 billion,” Butler said.

The Dell’Oro Group also wrote recently about how AI networks will accelerate the transition to higher speeds. “For example, 800 Gbps is expected to comprise the majority of the ports in AI back-end networks by 2025, within just two years of the latest 800 Gbps product introduction,” wrote Sameh Boujelbene, vice president at Dell’Oro Group.

“While most of the market demand will come from Tier 1 Cloud Service Providers, Tier 2/3 and large enterprises are forecast to be significant, approaching $10 B over the next five years. The latter group will favor Ethernet,” Boujelbene stated.

Ethernet as a technology gets tons of investment, and it evolves quickly, Wollenweber said. “We’ve gone from 100G to 400G to 800G, and now we’re building 1.6 terabit Ethernet now, and it’s also the predominant networking technology for the rest of the data center,” Wollenweber said.

The 650 Group reported this week that networking speeds will continue to increase at a rapid pace to keep up with AI and machine learning (ML) workloads. Early 2024 demonstrations of 1.6 terabit Ethernet (1.6 TbE) show that Ethernet is keeping pace with AI/ML networking requirements, and 650 Group projects that 1.6 TbE solutions will be the dominant port speed by 2030.

Ethernet plus AI

Ethernet is the foundation for most enterprise data-center networks today. So, when enterprises want to add GPU-based systems for AI workloads, it makes sense to stick with Ethernet; the IT and engineering staffs understand Ethernet, and they can get consistent performance out of Ethernet technologies and integrate these AI compute nodes, Wollenweber said.

“An AI/ML workload or job – such as for different types of learning that use large data sets – may need to be distributed across many GPUs as part of an AI/ML cluster to balance the load through parallel processing,” Wollenweber wrote in a blog about AI networking. 

“To deliver high-quality results quickly – particularly for training models – all AI/ML clusters need to be connected by a high-performance network that supports non-blocking, low-latency, lossless fabric,” Wollenweber wrote. “While less compute-intensive, running AI inferencing in edge data centers will also involve requirements on network performance, scale and latency control to help quickly deliver real-time insights to a large number of end-users.”

Wollenweber cited the remote direct memory access (RDMA) over Converged Ethernet (RoCE) network protocol as a means to improve throughput and lower latency on compute and storage traffic; RoCEv2 is used to enable access to memory on a remote host without CPU involvement.

“Ethernet fabrics with RoCEv2 protocol support are optimized for AI/ML clusters with widely adopted standards-based technology, easier migration for Ethernet-based data centers, proven scalability at lower cost-per-bit, and designed with advanced congestion management to help intelligently control latency and loss,” Wollenweber wrote.

Cisco’s AI infrastructure

What customers will need are better operational tools to help schedule AI/ML workloads across GPUs more efficiently. In Cisco’s case, those tools include its Nexus Dashboard. 

“How do we actually make it simpler and easier for customers to tune these Ethernet networks and connect this massive amount of compute as efficiently as possible? That’s what we are looking at,” Wollenweber said.

Cisco’s recent spate of news builds on earlier work to shape its AI data center directions. Last summer, for example, Cisco published a blueprint defining how organizations can use existing data center Ethernet networks to support AI workloads. 

A core component of that blueprint is its Nexus 9000 data center switches, which “have the hardware and software capabilities available today to provide the right latency, congestion management mechanisms, and telemetry to meet the requirements of AI/ML applications,” Cisco wrote in its Data Center Networking Blueprint for AI/ML Applications. “Coupled with tools such as Cisco Nexus Dashboard Insights for visibility and Nexus Dashboard Fabric Controller for automation, Cisco Nexus 9000 switches become ideal platforms to build a high-performance AI/ML network fabric.”

Another element of Cisco’s AI network infrastructure is its high-end programmable Silicon One processors, which are aimed at large-scale AI/ML infrastructures for enterprises and hyperscalers.

Networking
]]>
https://www.networkworld.com/article/2067591/cisco-aims-ai-advancements-at-data-center-infrastructure.html 2067591
Cato adds AI-driven XDR to SASE to reduce network outages Wed, 20 Mar 2024 18:00:04 +0000

Cato Networks announced the availability of AI-powered tools that aim to more quickly identify outages and conduct root-cause analysis as part of its extended detection and response (XDR) and cloud-based secure access service edge (SASE) solution.

Network Stories for Cato XDR, which is part of the Cato SASE Cloud platform, uses AI algorithms that are trained to analyze network signals and detect threats and security anomalies. The AI-powered tools evaluate the alerts to identify the root cause behind network blackouts, downed links, BGP session disconnects, and SLA-related incidents. Cato AI prioritizes network incidents to help IT teams focus their efforts on the most critical incidents first, reducing the impact of potential security threats. Using generative AI, Network Stories can summarize the analysis of network events and incidents into human-relatable explanations.

“With our converged security and networking platform, we leverage advances in one domain, in this case security, to help another domain – networking,” said Shlomo Kramer, CEO and co-founder of Cato Networks, in a statement. “Our security-trained AI has been expanded to help NOC [Network Operations Center] teams become smarter, faster, and more proactive than ever.”

According to Uptime Institute’s latest outages analysis, network and connectivity issues accounted for 31% of IT outages and 53% of third-party IT provider outages last year. By identifying the true source of incidents, network teams can more quickly fix the problems and mitigate security risks with Cato Network Playbooks, a set of workflows that include step-by-step instructions on how to resolve specific issues. Examples of Network Playbooks include “Socket Link Down” and “BGP Session is Disconnected.”

Internally, Cato Support’s team used Network Stories and found that the process of last-mile packet loss identification “became nearly instantaneous” rather than it taking several days to report an outage, according to Cato. “The average root-cause analysis time dropped by 30% to under 35 minutes.”

Cato SASE Cloud runs on a private global backbone of more than 75 points of presence (PoPs) connected via multiple SLA-backed network providers. The PoP software continuously monitors the providers for latency, packet loss, and jitter to determine in real time the best route for every packet. Cato applies optimization and acceleration to all traffic going through the backbone to enhance application performance and the user experience. To ensure all locations benefit, Cato optimizes traffic from all the edges and toward all destinations, on premises and in the cloud.
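As a rough illustration of metric-based path selection, the sketch below ranks candidate provider paths by a weighted combination of latency, packet loss, and jitter. Cato has not published its selection logic, so the scoring formula, the weights, and all names here (`PathMetrics`, `score`, `best_path`, the provider labels) are assumptions made for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathMetrics:
    provider: str
    latency_ms: float
    loss_pct: float
    jitter_ms: float


def score(m: PathMetrics, loss_weight: float = 50.0, jitter_weight: float = 2.0) -> float:
    """Lower is better. The weights are arbitrary illustrative values."""
    return m.latency_ms + loss_weight * m.loss_pct + jitter_weight * m.jitter_ms


def best_path(paths):
    """Return the candidate with the lowest score."""
    if not paths:
        raise ValueError("no candidate paths to choose from")
    return min(paths, key=score)


# Hypothetical measurements for three provider paths
candidates = [
    PathMetrics("provider-a", latency_ms=40.0, loss_pct=0.0, jitter_ms=2.0),  # score 44
    PathMetrics("provider-b", latency_ms=30.0, loss_pct=1.0, jitter_ms=5.0),  # score 90
    PathMetrics("provider-c", latency_ms=55.0, loss_pct=0.0, jitter_ms=1.0),  # score 57
]
chosen = best_path(candidates)  # provider-a: slower than provider-b but loss-free
```

In this toy model the lowest-latency path loses because of its packet loss, which shows why monitoring all three metrics together can pick a different route than latency alone.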

The additional capabilities in Cato’s platform align with the growing trend of network and security teams tasked to collaborate more closely to improve performance and reduce security risk. The company conducted research that shows more companies are converging their network and security efforts. According to a Cato survey of 1,694 IT leaders worldwide, 44% of respondents said networking and security teams “must work together,” another 30% said they “must have shared processes,” and 8% reported that they were working to create one networking and security group.

Industry watchers have also recorded the trend in research. Enterprise Management Associates (EMA) surveyed 304 IT professionals in October 2023 and found that 86% of enterprises are seeing increased collaboration between their network and security teams, while 49% of those surveyed have either fully or partially converged networking and security groups into one group.

“We also saw in the research that successful partnerships drive reduced security risk, operational efficiency, and fast resolutions of problems both on the networking side and the security side, which are all good arguments for doing this systematically, carefully, and effectively,” said Shamus McGillicuddy, vice president of research at EMA, in an EMA webinar sharing the research.

Network Monitoring, Networking, SASE
]]>
https://www.networkworld.com/article/2069379/cato-adds-ai-driven-xdr-to-sase-to-reduce-network-outages.html 2069379
IBM buys Pliant for network automation Wed, 20 Mar 2024 15:26:00 +0000

Looking to bolster its network and IT infrastructure capabilities, IBM said today it has acquired IT automation vendor Pliant for an undisclosed amount.

Founded in 2017, Pliant is known for its IT automation and orchestration software that works to streamline communications among platforms, services and applications and simplify network and IT operations. Pliant offers a library of out-of-the-box integrations with third-party vendors and can work with technologies that have an API or command line interface (CLI).

IBM has partnered with Pliant in the past, integrating its workflow engine and other technologies in its Cloud Pak for Automation package. The idea behind that integration is to advance service delivery speed and ensure the network maintains the customer’s desired state, IBM stated.  

Pliant’s technology will let customers simplify automation with a tool that securely integrates services and applications within their network and infrastructure environments, according to Andrew Coward, IBM’s general manager of software defined networking, who wrote a blog about the acquisition.

“Pliant adds essential capabilities to automate network and IT infrastructure tasks and abstract these functions to the application layer, enabling applications (and developers) maximum control for simplified provisioning and management of infrastructure directly within applications themselves,” Coward wrote. “These optimizations include infrastructure resource provisioning and management, traffic management and configuration management for both traditional network and IT infrastructure and public clouds.”

The acquisition will extend IBM’s software portfolio, which today includes SevOne, Cloud Pak for Network Automation, Hybrid Cloud Mesh, Edge Application Manager and IBM NS1 Connect.

IBM last month rolled out a new NS1 Connect service that uses DNS to help enterprise customers more effectively load balance highly distributed application and multicloud workloads. The IBM NS1 Connect Global Server Load Balancing (GSLB) service ties together the company’s NS1 DNS technology with real-time user data in a package that promises to bring faster connectivity along with improved failover and resiliency.

Network Management Software, Networking
]]>
https://www.networkworld.com/article/2069363/ibm-buys-pliant-for-network-automation.html 2069363
Samsung opens lab dedicated to next-gen AI chips Tue, 19 Mar 2024 18:05:04 +0000

Samsung has built a research lab dedicated to developing a new type of chip for the next iteration of artificial intelligence: AI that is even smarter than humans.

The Samsung Semiconductor AGI Computing Lab will be under the direction of Dong Hyuk Woo, Samsung senior vice president, and will be located both in the US and South Korea.

The lab’s sole purpose will be to build a semiconductor designed to meet the compute-intensive processing demands of what Samsung calls “artificial general intelligence,” or AGI, according to a LinkedIn post by Kye Hyun Kyung, CEO of Samsung Semiconductor.

Samsung’s definition of AGI is a type of AI in which the models have intelligence capabilities even greater than, or at least equal to, humans and learn on their own without the need to be trained on human data first. This type of technology is significantly more computationally intensive than the current large language models (LLMs) associated with today’s AI technology, which must be trained by humans on various data sources.

From LLM to AGI

Samsung’s lab will first focus on developing chips for LLMs, specifically on AI inference and service applications, Kyung said. This will set up the development of more sophisticated chips for AGI down the road, he said.

“To develop chips that will dramatically reduce the power necessary to run LLMs, we are revisiting every aspect of chip architecture, including memory design, light-weight model optimization, high-speed interconnect, advanced packaging, and more,” Kyung wrote in the post.

Eventually, Samsung plans to continuously release new versions of AGI Computing Lab chip designs in “an iterative model that will provide stronger performance and support for increasingly larger models at a fraction of the power and cost,” he said.

“Through the creation of the AGI Computing Lab, I am confident that we will be better positioned to solve the complex system-level challenges inherent in AGI, while also contributing affordable and sustainable methods for the future generation of advanced AI/ML models,” Kyung wrote.

Untapped market potential

Samsung’s move in part appears to be an effort to find new revenue streams in an as-yet untapped market, as its core business, which is memory, has become a commodity, noted Gaurav Gupta, VP analyst, emerging trends and technologies, at Gartner.

“They are looking for another opportunity to grow,” he said. “This is where chips for inference come in.”

Indeed, most companies that currently build components for computer processing and memory are trying to keep pace with the rapid evolution of AI in various individual strategies to provide cost-effective computing resources.

Currently, the generative AI chip market for training models is dominated largely by Nvidia, with AMD having some share in the space, Gupta said. But these are for models running on GPUs, which can be scarce and costly and thus aren’t a long-term solution for running AI models.

Companies are looking for alternative resources to run inference-driven AI, and Samsung hopes to be an early supplier of semiconductors for this technology, Gupta said.

“When it comes down to AI deployment at edge or endpoint for use-cases, it is expected that inference will be done on custom chips [that are] designed specifically to run fine-tuned models,” he said. “This is what Samsung wants to do – get into design of these chips and enter this market, which is yet to really take off.”

CPUs and Processors, Generative AI
https://www.networkworld.com/article/2067586/samsung-opens-lab-dedicated-to-next-gen-ai-chips.html 2067586
Trailblazing heroes lead the charge in secure digital transformation Tue, 19 Mar 2024 16:18:26 +0000

As a way to bring real-world cybersecurity transformation stories to life and showcase the achievements of leaders from the world’s top brands, Zscaler published the IT Heroes Comic Series. By taking a lighthearted approach to customer use cases, readers have the opportunity to consume short snapshots of customer journeys on their path to zero trust in a format normally reserved for leisure. The IT Heroes Comic Series is a launching point to then dive into a complementary set of in-depth content in both written and video layouts.

Supporting rapid expansion by simplifying security and achieving greater operational efficiency


MOL Group is an energy company headquartered in Budapest, Hungary with 25,000 employees. It owns around 2,000 service stations scattered across nine countries in Central and Southeast Europe, in addition to three refineries and two petrochemical plants.

The company has been rapidly growing through new lines of business such as electric vehicle charging stations and recycling plants, as well as through mergers and acquisitions. Head of Cybersecurity Strategy and Architecture Tamás Kapócs recognized the need for a new approach to security that would better support the company’s accelerated growth.

Kapócs deployed Zscaler to standardize and simplify the company’s security technologies, policies, and operations, and to provide users with a consistent experience no matter where they work. Zscaler’s single pane of glass relieved the security team of many day-to-day tasks, streamlining operations and giving them more time to work on strategic initiatives.

Not only is the security team happy with the changes, users across the company are delighted that they no longer have to rely on clunky VPNs. Zscaler significantly improved the user experience and decreased help desk tickets. “Zscaler was the logical choice,” Kapócs says. “No other vendor could even come close to the Zscaler Zero Trust Exchange.”

Image: Zscaler comic strip 1 (credit: Zscaler)



Saving millions over legacy solutions and laying the groundwork for future growth

VP and CISO of United Airlines Deneen DeFiore is charged with protecting the world’s third-largest airline company by fleet size and routes, with 80,000 employees working across 350 locations worldwide and serving 143 million customers annually. If that doesn’t sound challenging enough, DeFiore started the job just six weeks before the pandemic began.

Virtually overnight, she helped take the company’s workforce from in-the-office to completely remote by deploying the Zscaler Zero Trust Exchange with Zscaler Internet Access (ZIA), Zscaler Private Access (ZPA), and Zscaler Digital Experience (ZDX).

The full zero trust transformation took only six months to complete and saved the organization more than $3 million compared with legacy on-premises solutions. But that was only the beginning of what DeFiore achieved.

“As we grow our fleet of connected aircraft and hire more employees, we need to design our security architecture to deliver the outcomes for the next phase of our business—across people, physical infrastructure, and systems—that will enable us to enhance and automate our operation in the years ahead,” says DeFiore.

The United Airlines Zscaler deployment shifts the organization’s security model from network- to cloud-based and lays the groundwork for IoT/OT device security in future projects. “With an evolving threat environment, we need to continuously adapt and advance our detection and defense-in-depth capabilities with intelligence to remain ahead of the attackers,” says DeFiore. “Zscaler gives us peace of mind that traffic will be secure, regardless of the underlying network, for our employees, customers, and partners.”

Image: Zscaler comic strip 2 (credit: Zscaler)

Streamlining an overly complex environment while boosting security

Mindbody is the leading cloud-based online scheduling and business management software for health and fitness service businesses such as gyms, yoga studios, salons, and spas. Since its founding in 2001, Mindbody has acquired multiple organizations, and its environment has become overly complex to manage.

Deputy CISO Michael Jacobs felt he could no longer rely on the organization’s legacy firewalls and network intrusion detection technologies to keep its users safe. “We needed a solution that provided modern, cloud-native security capabilities and a less complex, easier experience for both users and administrators,” he says.

He deployed the Zscaler Zero Trust Exchange platform to streamline the environment, improve operational efficiency, increase productivity, boost security, and reduce overhead. The 100% cloud-based company is now in the process of transitioning to a fully zero trust model.

Since deploying Zscaler, it has taken half as long as it did before to onboard companies newly acquired through mergers and acquisitions. Another example of how Zscaler has sped things up is that, previously, when a user requested access to a particular application, granting permission took days or even weeks—even with the assistance of senior technical resources. Now any Zscaler administrator can do it in minutes. “Zscaler is helping us get where we want to be,” Jacobs asserts.

Image: Zscaler comic strip 3 (credit: Zscaler)

Rapidly enabling tens of thousands of remote workers

When confronted with the sudden challenge of enabling 50,000 municipal employees to work from home in 2020 at the onset of the COVID-19 pandemic, the City of Los Angeles Information Technology Agency’s CIO Ted Ross quickly devised a plan. What was it? Deploy the Zscaler Zero Trust Exchange across the city’s 44 departments.

Thanks to Ross’s fast thinking, the city was able to keep critical city services—such as emergency and health services, trash collection, and infrastructure repairs—fully operational for its 4 million residents. It took Ross and his team less than two weeks to deploy a remote work platform to 18,000 employees.

These days, city employees are enjoying an improved work-life balance and are spending less time commuting on Los Angeles’ notoriously crammed highways. And, long term, the city is well positioned for potential disruptions with built-in resiliency, thanks to Ross deploying Zscaler Internet Access (ZIA) and Zscaler Private Access (ZPA).

Image: Zscaler comic strip 4 (credit: Zscaler)

Rescuing rural field offices from spotty internet

As a major natural gas supplier to customers in Arizona, Nevada, and California, Southwest Gas was struggling with spotty internet at many of its remote field offices. Although the company’s operational technology (OT) network is air-gapped, its IT network still needs to provide employees with fast internet connections to critical data and applications.

The company’s legacy system relied on VPNs and backhauling traffic to a data center to apply on-premises security controls. The result was high latency, poor user experience, and questionable security. Senior Infrastructure Architects Robert Woodfin and David Petroski, along with Network Services Manager Larry Rosenbusch, decided to do something about it.

The trio scoured industry analyst reports and evaluated multiple vendors before ultimately choosing the Zscaler Zero Trust Exchange. The deployment process was smooth and seamless, requiring only four to six weeks to achieve positive results. “The proof is in the reduction of help-desk tickets,” notes Petroski.

In addition to a vastly improved user experience, this team of IT heroes brought their company a robust integration platform. They integrated Zscaler with Duo for multi-factor authentication (MFA) to streamline administration and confirm user identities and with Splunk to log data and provide rich telemetry into real-time policy violations, vulnerabilities, and potential threats. Thanks to these IT Heroes, natural gas deliveries will be more secure and reliable in the Southwest.

Image: Zscaler comic strip 5 (credit: Zscaler)

See the Zscaler IT Heroes Comic Series and read more digital transformation stories of Zscaler customers protecting their organizations.

Cloud Architecture
https://www.networkworld.com/article/2066582/trailblazing-heroes-lead-the-charge-in-secure-digital-transformation.html 2066582
Nvidia expands partnership with hyperscalers to boost AI training and development Tue, 19 Mar 2024 16:02:11 +0000

Nvidia is extending its existing partnerships with hyperscalers Amazon Web Services (AWS), Google Cloud Platform, Microsoft Azure, and Oracle Cloud Infrastructure, to make available its latest GPUs and foundational large language models (LLMs), and to integrate its software across their platforms.

AWS, for instance, will offer Nvidia’s Blackwell GPU platform, featuring the latest GB200 NVL72 server rack that comes with 72 Blackwell GPUs and 36 Grace CPUs interconnected by Nvidia’s high-speed GPU connecting framework NVLink, as part of its cloud. 

“When connected with Amazon’s powerful networking (EFA), and supported by advanced virtualization (AWS Nitro System) and hyper-scale clustering (Amazon EC2 UltraClusters), enterprises can scale to thousands of GB200 Superchips,” the companies said in a joint statement.

Further, the companies said they expect the availability of Nvidia’s Blackwell platform on AWS to speed up inference workloads for multi-trillion parameter LLMs.

Nvidia will also make the Blackwell GB200 GPUs available in the AWS cloud via its own DGX Cloud AI training service, which hosts in other vendors’ clouds. DGX was initially only available in Microsoft Azure and Oracle Cloud Infrastructure, but last November AWS said it would begin offering it too.

Another feature of the expanded partnership is that Nvidia will offer its NIM microservices inside Amazon SageMaker, AWS’ machine learning platform, to help enterprises deploy foundational LLMs that are pre-compiled and optimized to run on Nvidia GPUs. This will reduce the time-to-market for generative AI applications, the companies said.

Other collaborations between AWS and Nvidia include the use of Nvidia’s BioNeMo foundational model for generative chemistry, protein structure prediction, and understanding how drug molecules interact with targets via AWS’ HealthOmics offering. The two companies’ healthcare teams are also working together to launch generative AI microservices to advance drug discovery, medtech, and digital health, they said.

Google Cloud to get Blackwell-powered DGX Cloud

Google Cloud Platform, like AWS, will be getting the new Blackwell GPU platform and integrating Nvidia’s NIM suite of microservices into Google Kubernetes Engine (GKE) to speed up AI inferencing and deployment. In addition, Nvidia DGX Cloud is now generally available on Google Cloud A3 VM instances powered by Nvidia H100 Tensor Core GPUs, Google and Nvidia said in a joint statement.

The two companies are also extending their partnership to bring Google’s JAX machine learning framework for transforming numerical functions to Nvidia’s GPUs. This means that enterprises will be able to use JAX for LLM training on Nvidia’s H100 GPUs via MaxText and Accelerated Processing Kit (XPK), the companies said.

In order to help enterprises with data science and analytics, Google said that its Vertex AI machine learning platform will now support Google Cloud A3 VMs powered by Nvidia’s H100 GPUs and G2 VMs powered by Nvidia’s L4 Tensor Core GPUs.

“This provides MLops teams with scalable infrastructure and tooling to manage and deploy AI applications. Dataflow has also expanded support for accelerated data processing on Nvidia GPUs,” the companies said.

Oracle and Microsoft too

Other hyperscalers, such as Microsoft and Oracle, have also partnered with Nvidia to integrate the chipmaker’s hardware and software to beef up their offerings.

Not only are both companies adopting the Blackwell GPU platform across their services, they are also expected to adopt Blackwell-powered DGX Cloud.

IBM, on the other hand, said nothing about Nvidia hardware — but its consulting team will integrate Nvidia software components such as the NIM microservices suite to help enterprises on their AI development journeys.

Cloud Computing, Generative AI, GPUs
https://www.networkworld.com/article/2067515/nvidia-expands-partnership-with-hyperscalers-to-boost-ai-training-and-development.html 2067515
Nile boosts NaaS offering with AI, customizable services Tue, 19 Mar 2024 15:51:13 +0000

Network-as-a-Service vendor Nile has reinforced its secure wired and wireless package with features that aim to let customers more easily buy integrated components and manage them with AI-based software.

Nile was founded by former Cisco CEO John Chambers and Pankaj Patel, Cisco’s former chief development officer. The NaaS startup developed a subscription-based cloud offering, called Nile Access Service, for setting up and managing campus network operations without the need for customers to purchase and maintain their own networking infrastructure and security hardware.

Nile Access Service includes a core package of wired and wireless campus infrastructure components and sensors; they’re controlled by Nile AI software, which automates installation and other steady-state controls and enables management and observability features that are tailored for customer installations.

Since launching in 2023, Nile has been focused on bolstering its NaaS package to include better analytics, automation tools and network monitoring features.

Nile’s service is based on zero-trust security principles and enforces strict user verification and access controls to minimize cyberattack risks, according to Austin Hawthorne, vice president of solution architects at Nile. It features a number of extensions to third-party vendors, including Palo Alto, Zscaler, Splunk (now Cisco), Infoblox, AWS, Google Cloud and more, he said.

The Nile package is akin to the cloud services a customer would get with AWS, for example, Hawthorne said. “If you think about an AWS EC2 instance, you don’t worry about anything that’s going on underneath the covers as it relates to the servers, the storage and top-of-rack networking, the data centers. Amazon takes care of all that,” Hawthorne said.

“Our cloud-native networking stack – we configure it and install it, we handle the lifecycle management, network optimization, the patching, and we guarantee performance, availability, capacity, and coverage. But then we essentially hand the keys to the customer,” Hawthorne said.

With its latest upgrade, Nile added Service Blocks to its NaaS architecture. Nile Service Blocks make up the foundation of the Nile Access Service, and each block represents a collection of physical Wi-Fi sensors, Wi-Fi access points, access switches, or distribution switches. The blocks are customizable and can be delivered in different-size packages for customers, depending on their needs.

The idea is that instead of requiring manual configuration and separate software release management for different network elements, service blocks are supported with cloud-native delivery and a microservices-based architecture, Hawthorne said.

Also part of the Service Blocks architecture is a digital twin feature that lets customers set up a virtual replica of their environment to simulate and troubleshoot network operations and help manage proposed changes and additions.

On the AI side, Nile has added Copilot applications that simplify equipment installation and intent-based provisioning of Nile Service Blocks. In addition, new Nile Autopilot applications can offload what are today manual network functions, such as software maintenance, and automate manual workflows for infrastructure performance monitoring and troubleshooting, Hawthorne said.  

Core to Nile’s AI software is its ability to constantly evaluate network conditions and traffic patterns and align them with available resources to keep traffic flowing as well as identify and fix trouble spots in real time. This closed-loop automation ensures that the network maintains itself and is in the “best practice state” at all times, Hawthorne said. 

“Our AI data set allows us to start now making really quick decisions,” said Özer Dondurmacıoğlu, vice president of services marketing at Nile. “We collect telemetric data from network elements, environmental user experience, device experience, application performance, device availability, network design, log files, detailed CPU, memory information, all validated via a proper design of the network during installation and after installation, in software – that is our design pipeline, that’s our foundation,” Dondurmacıoğlu said.

“We have this idea that when something goes wrong in the physical space, we have to be responsible for detecting [it] in software, and driving ticket generation, and driving automation to resolve it. And if we cannot resolve it automatically, we need to notify the customer to help us resolve the issue,” Dondurmacıoğlu said.

Nile’s enhancements, particularly the Service Blocks and enhanced AI applications, will accelerate network designs and deployments and help improve the provisioning and ongoing management of the network, according to Brandon Butler, research manager, enterprise networks, at IDC.

“Nile’s overall offering is representative of a broader trend of enterprise Network as a Service (NaaS) offerings that continue to mature in the market,” Butler said.

“IDC defines Enterprise NaaS as enterprise network infrastructure that is consumed via a flexible consumption operating expense (opex) model, inclusive of: hardware, software, management tools, licenses and lifecycle services,” Butler said. “Typically, enterprise NaaS is used for WLAN, access switching and routing/SD-WAN.”

Enterprise NaaS offerings are appealing for midsize organizations and larger enterprises that value predictability in their networking costs, as well as those that are looking to embrace opex models for networking, Butler said.

“Other benefits include faster access to new technology and software capabilities compared to traditional network management models, and cloud-based management of enterprise network infrastructure,” Butler said. “Fundamentally, NaaS offerings allow customers to outsource the undifferentiated parts of their network to a NaaS provider so their IT and networking teams can focus on more strategic, business-enabling tasks.”

Nile’s competition in the NaaS market includes HPE Aruba, through its Greenlake offering, as well as startups such as Meter, Ramen and Join Digital, according to Butler.

Network Security, Network Virtualization, Networking, Networking Devices
https://www.networkworld.com/article/2067455/nile-boosts-naas-offering-with-ai-customizable-services.html 2067455
Nvidia debuts massive Blackwell-powered systems Mon, 18 Mar 2024 23:33:49 +0000

Along with its new Blackwell architecture, Nvidia is unveiling new DGX systems that offer significant performance gains compared to the older generation.

There are several iterations of Nvidia’s existing DGX servers, ranging from 8 Hopper processors to 256 processors and with prices that start at $500,000 and scale to several million. Nvidia is following a similar configuration structure for the Blackwell generation, but no prices are available yet.

At the high end of the new lineup is the Nvidia GB200 NVL72 system. It’s a 72-node, liquid-cooled, rack-scale system for the most compute-intensive workloads. Each DGX GB200 system features 36 Grace Blackwell Superchips — which include 72 Blackwell GPUs and 36 Grace CPUs — connected by the newest generation NVLink interconnect. The platform acts as a single GPU with 1.4 exaflops of AI performance and 30TB of fast memory.
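The rack-level figures above can be broken down with some simple arithmetic. The sketch below is illustrative only: it assumes the quoted rack totals divide evenly across GPUs and superchips, which the article does not state, and the variable names are made up for this example.

```python
# Figures quoted in the article for one GB200 NVL72 rack.
SUPERCHIPS = 36
GPUS = 72
CPUS = 36
RACK_AI_EXAFLOPS = 1.4
RACK_FAST_MEMORY_TB = 30

# Each Grace Blackwell Superchip pairs GPUs with a Grace CPU.
gpus_per_superchip = GPUS // SUPERCHIPS
cpus_per_superchip = CPUS // SUPERCHIPS

# Assumption: the rack totals split evenly (an illustration, not a spec).
pflops_per_gpu = RACK_AI_EXAFLOPS * 1000 / GPUS
memory_per_superchip_gb = RACK_FAST_MEMORY_TB * 1000 / SUPERCHIPS

print(f"GPUs per superchip: {gpus_per_superchip}")
print(f"CPUs per superchip: {cpus_per_superchip}")
print(f"AI performance per GPU: about {pflops_per_gpu:.1f} petaflops")
print(f"Fast memory per superchip: about {memory_per_superchip_gb:.0f} GB")
```

On these assumptions each superchip carries two Blackwell GPUs and one Grace CPU, and each GPU accounts for roughly 19.4 petaflops of the rack total.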

The new DGX systems are about more than just speeds and feeds; they offer a whole new form of interchip communication, said Charlie Boyle, vice president of DGX systems at Nvidia. “On a very large AI training job … you might spend 60% of your time just talking to each other on the GPU. If I can increase that network speed dramatically by putting that over NVLink, which is a memory-based network not a traditional database network, I can get that work done much more efficiently,” he said.
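Boyle’s point can be illustrated with an Amdahl’s-law style estimate. The sketch below is hypothetical: the 60% communication share comes from his quote, but the interconnect speedup factors are made-up inputs chosen for illustration, not Nvidia measurements, and the function name `overall_speedup` is an assumption.

```python
def overall_speedup(comm_fraction: float, comm_speedup: float) -> float:
    """Amdahl's-law estimate: only the communication share of the job
    gets faster; the compute share is unchanged."""
    if not 0.0 <= comm_fraction <= 1.0:
        raise ValueError("comm_fraction must be between 0 and 1")
    if comm_speedup <= 0:
        raise ValueError("comm_speedup must be positive")
    compute_fraction = 1.0 - comm_fraction
    return 1.0 / (compute_fraction + comm_fraction / comm_speedup)


# 60% of job time spent on GPU-to-GPU communication, per the quote.
COMM_FRACTION = 0.6

# Hypothetical interconnect speedups, chosen only for illustration.
for factor in (2, 4, 10):
    job_speedup = overall_speedup(COMM_FRACTION, factor)
    print(f"{factor}x faster interconnect -> {job_speedup:.2f}x faster job")
```

Under this simple model the job can never run more than 2.5 times faster, however quick the interconnect becomes, because the remaining 40% of the time is spent on compute.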

A DGX rack is a 44U cabinet with 18 compute trays, nine switch trays, a couple of power distribution units, a management switch, a liquid cooling manifold, and an NVLink backplane. Up until now, DGX systems have been air-cooled; this is the first unit with liquid cooling, a tacit admission that these things run hot. Boyle declined to comment on rumors that the Blackwell processor would run at over 1000 watts of power.

“It was done for efficiency and density,” said Boyle. “To get 72 GPUs in a rack and get that NVLink all together, they have to be very dense in there. We make this technology available to our OEM and ODM partners. They could choose to bring out different configurations, different densities. But for the product that I’m selling as DGX, it’s liquid cooled because of the high density in the system.”

The new version of the DGX SuperPOD with DGX GB200 systems won’t make other versions obsolete, but it does have unique capabilities that are only in this system. For example, RAS (reliability, availability, serviceability) features are built into the chip and extend into the server with capabilities such as predictive maintenance, system health, and monitoring thousands of data points at all times.

Nvidia has developed what it calls the DGX Ready data center program; it has worked with its data center partners to be ready to host these systems with minimal set up effort, and that includes the liquid cooling.

“When these systems ship to customers, and I believe most of these will wind up in colocation data centers, some customers do have native liquid and some are building next gen data centers, but we make it easy for customers that want to adopt this,” he said.

The new DGX systems are set to ship later this year.

CPUs and Processors, Data Center, High-Performance Computing
https://www.networkworld.com/article/2066554/nvidia-debuts-massive-new-blackwell-powered-systems.html 2066554
Nvidia launches Blackwell GPU architecture Mon, 18 Mar 2024 21:14:10 +0000

Nvidia kicked off its GTC 2024 conference with the formal launch of Blackwell, its next-generation GPU architecture due at the end of the year.

Blackwell uses a chiplet design, to a point. Whereas AMD’s designs have several chiplets, Blackwell has two very large dies that are tied together as one GPU with a high-speed interlink that operates at 10 terabytes per second, according to Ian Buck, vice president of HPC at Nvidia.

Nvidia will deliver three new Blackwell data center and AI GPUs: the B100, B200, and GB200. The B100 has a single processor, the B200 has two GPUs interconnected, and the GB200 features two GPUs and a Grace CPU.

Buck says the GB200 will deliver inference performance that’s seven times greater than the Hopper GH200 can deliver. It delivers four times the AI training performance of Hopper, 30 times better inference performance overall, and 25 times better energy efficiency, Buck claimed. “This will expand AI data center scale to beyond 100,000 GPUs,” he said on a press call ahead of the announcement.

Blackwell has 192GB of HBM3E memory with more than 8TB/sec of bandwidth and 1.8 TB of secondary link. Blackwell also supports the company’s second-generation transformer engine, which tracks the accuracy and dynamic range of every layer of every tensor and the entire neural network as it proceeds in computing.

Blackwell has 20 petaflops of FP4 AI performance on a single GPU. FP4, with four bits of floating point precision per operation, is new to the Blackwell processor. Hopper had FP8. The shorter the floating-point string, the faster it can be executed. That’s why as floating-point strings go up – FP8, FP16, FP32, and FP64 – performance is cut in half with each step. Hopper has 4 Pflops of FP8 AI performance, which is less than half the performance of Blackwell.
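The halving rule described above can be sketched as a short calculation. This is a rough illustration only: the function name `scale_to_precision` is hypothetical, and the assumption that throughput halves exactly with each doubling of bit width is the article’s rule of thumb, not a published Nvidia specification.

```python
import math

# Peak figures quoted in the article, in petaflops.
BLACKWELL_FP4_PFLOPS = 20.0
HOPPER_FP8_PFLOPS = 4.0


def scale_to_precision(pflops: float, from_bits: int, to_bits: int) -> float:
    """Estimate throughput at another precision, assuming it halves
    each time the bit width doubles (the article's rule of thumb)."""
    steps = math.log2(to_bits / from_bits)
    return pflops / (2 ** steps)


# Estimated Blackwell throughput at each wider format.
for bits in (4, 8, 16, 32, 64):
    estimate = scale_to_precision(BLACKWELL_FP4_PFLOPS, 4, bits)
    print(f"FP{bits}: {estimate:g} Pflops (estimated)")

# Compare the two chips at the same precision.
blackwell_fp8 = scale_to_precision(BLACKWELL_FP4_PFLOPS, 4, 8)
ratio = HOPPER_FP8_PFLOPS / blackwell_fp8
print(f"Hopper FP8 / estimated Blackwell FP8 = {ratio:.2f}")
```

Under that rule of thumb, Blackwell’s 20 Pflops at FP4 would correspond to about 10 Pflops at FP8, so Hopper’s 4 Pflops at FP8 would be roughly 40% of that, consistent with the “less than half” comparison.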

Blackwell also has a new transformer engine to automatically detect what layers of the model can deal with what precision, ranging from FP4 to FP64. The higher the precision, the longer it takes to process and the more energy it uses. This new transformer engine automatically switches to a lesser or greater precision as it is needed. Previous generations required programming the processor to switch math precision.

“Our big innovation here is you don’t need to hand code that as a user. You can let the system take care of that for you,” said Charlie Boyle, vice president of DGX systems at Nvidia. “And it does it safely, meaning it stores the weights at higher precision than it needs to to maintain that accuracy, and in areas where you don’t need that level of precision to get the same amount of accuracy.”

The high-speed interconnect, NVLink, is as significant as the GPU technology itself. This is the fifth generation of NVLink designed to provide efficient scaling for a trillion-parameter mixture of disparate models, said Buck. This allows Blackwell to deliver 18 times faster throughput and performance in multi-node interconnects.

In addition to new GPUs, Nvidia is announcing its next generation InfiniBand, the Quantum-X800 QDR, an AI-dedicated infrastructure with advanced feature sets crucial for multi-tenant generative AI clouds and large enterprises.

The X800 includes the Nvidia Quantum Q3400 switch and the Nvidia ConnectX-8 SuperNIC, which together achieve end-to-end throughput of 800Gb/s. This is five times the bandwidth capacity and a nine-fold increase to 14.4Tflops of in-network computing compared to the previous generation.

Blackwell products are planned for release later this year, while Quantum-X800 and Spectrum-X800 will be available next year. GTC runs this week in San Jose, Calif.

CPUs and Processors, Data Center, High-Performance Computing
https://www.networkworld.com/article/2066534/nvidia-launches-blackwell-gpu-architecture.html 2066534