Cloudflare explains Tuesday’s outage that temporarily took down ChatGPT

Cloudflare offers bot management controls to deal with problems caused by web crawlers, including those that scrape data for generative AI applications. The company also recently announced the “AI Labyrinth,” a system that uses generative AI to create content that slows down non-compliant AI crawlers and bots and wastes their resources.

However, Cloudflare attributes Tuesday’s outage to a change in its database permissions system, not to its generative AI technology, DNS issues, or a cyberattack such as a large-scale distributed denial-of-service (DDoS) attack, as initially speculated.

According to a representative, the machine learning model that scores bot activity relies on a configuration file that is continually updated to help identify automated requests. A change in the underlying ClickHouse query behavior caused a large number of duplicate entries to appear in this configuration file. The file grew past its preset memory limit, which disrupted the core proxy system that processes traffic for Cloudflare’s customers who rely on the bot module.
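
The failure mode described above can be illustrated with a minimal Python sketch. This is not Cloudflare’s code: the names FeatureEntry, build_feature_config, and load_or_keep_previous, along with the max_entries limit, are hypothetical. Deduplicating rows and falling back to the last known-good configuration are defensive choices assumed here for illustration, not steps the article attributes to Cloudflare.

```python
from dataclasses import dataclass


class ConfigTooLargeError(Exception):
    """Raised when a generated config exceeds the preset entry limit."""


@dataclass(frozen=True)
class FeatureEntry:
    # Hypothetical shape of one row in the generated configuration file.
    name: str
    dtype: str


def build_feature_config(rows, max_entries):
    """Drop duplicate rows (keeping first-seen order) and enforce a hard limit."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    if len(unique) > max_entries:
        raise ConfigTooLargeError(
            f"{len(unique)} entries exceeds limit of {max_entries}"
        )
    return unique


def load_or_keep_previous(rows, previous, max_entries):
    """Return the new config if it fits, otherwise keep the last known-good one."""
    try:
        return build_feature_config(rows, max_entries)
    except ConfigTooLargeError:
        # Assumed policy: reject the oversized file instead of failing the caller.
        return previous


if __name__ == "__main__":
    base = [FeatureEntry("feature_a", "f64"), FeatureEntry("feature_b", "f64")]
    duplicated = base + base  # simulates a query that returns every row twice
    print(len(duplicated), "rows in,", len(build_feature_config(duplicated, 2)), "rows out")
```

In this sketch, a query that suddenly returns every row twice still yields a file within the limit, and a file that is genuinely too large is rejected in favor of the previous one rather than propagating an error to the component that consumes it.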

As a result, companies that used Cloudflare’s bot management rules saw false positives that blocked legitimate traffic. Customers that did not use the generated bot scores in their traffic rules were unaffected and continued to operate normally.

Source: https://www.theverge.com/news/823711/cloudflare-outage-postmortem
