Cloudflare revealed a serious bug in its software today that caused sensitive data like passwords, cookies, authentication tokens to spill in plaintext from its customers’ websites. The announcement is a major blow for the content delivery network, which offers enhanced security and performance for more than 5 million websites.
This could have allowed anyone who noticed the error to collect a variety of very personal information that is typically encrypted or obscured.
Remediation was complicated by an additional wrinkle. Some of that data was automatically cached by search engines, making it particularly difficult to clean up the aftermath as Cloudflare had to approach Google, Bing, Yahoo and other search engines and ask them to manually scrub the data.
The leak may have been active as early as Sept. 22, 2016, almost five months before a security researcher at Google’s Project Zero discovered it and reported it to Cloudflare.
However, the most severe leakage occurred between Feb. 13 and Feb. 18, when around 1 in every 3,300,000 HTTP requests to Cloudflare sites would have caused data to be exposed. Attackers could have accessed the data in real-time, or later through search engine caches.
Cloudflare notes in its announcement of the issue that even at its peak, data only leaked in about 0.00003% of requests. It doesn’t sound like much, but Cloudflare’s massive customer base includes categories like dating websites and password managers, which host particularly sensitive data.
“At the peak, we were doing 120,000 leakages of a piece of information, for one request, per day,” Cloudflare chief technology officer John Graham-Cumming told TechCrunch. He emphasized that not all of those leakages would have contained secret information. “It’s random stuff in there because it’s random memory,” he said.
Bug was found in an HTML parser that Cloudflare uses to increase performance
The bug occurred in an HTML parser that Cloudflare uses to increase website performance — it preps sites for distribution in Google’s publishing platform AMP and upgrades HTTP links to HTTPS. Three of Cloudflare’s features (email obfuscation, Server-side Excludes and Automatic HTTPS Rewrites) were not properly implemented with the parser, causing random chunks of data to become exposed.
Ultimately, even Cloudflare itself was affected by the bug. “One obvious piece of information that had leaked was a private key used to secure connections between Cloudflare machines,” Graham-Cumming wrote in Cloudflare’s announcement. The encryption key allowed the company’s own machines to communicate with each other securely, and was implemented in 2013 in response to concerns about government surveillance.
Graham-Cumming emphasized that Cloudflare discovered no evidence that hackers had discovered or exploited the bug, noting that Cloudflare would have seen unusual activity on their network if an attacker were trying to access data from particular websites.
“It was a bug in the thing that understands HTML,” Graham-Cumming explained. “We understand the modifications to web pages on the fly and they pass through us. In order to do that, we have the web pages in memory on the computer. It was possible to keep going past the end of the web page into memory you shouldn’t be looking at.”
Cloudflare’s teams in San Francisco and London handed off shifts to one another, working around the clock to fix the bug once it was reported. They had stopped the most severe issue within seven hours. It took six days for the company to completely repair the bug and to work with search engines to scrub the data.
Tavis Ormandy, an engineer at Google, first noticed the bug, which he jokingly called “Cloudbleed” in reference to the Heartbleed vulnerability. He said in a blog post that he encountered unexpected data during a project and wondered at first if there was a bug in his own code. Upon further testing, he realized the leak was coming from Cloudflare.
Photo by TechCrunch