Unisami AI News

How OpenAI’s bot crushed this seven-person company’s web site ‘like a DDoS attack’

January 10, 2025 | by AI


Unveiling the AI Scraping Dilemma: A Cautionary Tale from Triplegangers

Picture this: It’s a regular Saturday, and Oleksandr Tomchuk, CEO of Triplegangers, is jolted by an alarming notification — their ecommerce site has gone offline. The culprit? A relentless bot from OpenAI attempting to scrape their extensive database.

  • Over 65,000 products on the site
  • Each product page features at least three photos

“OpenAI was sending tens of thousands of server requests, trying to download hundreds of thousands of photos,” Tomchuk shared with TechCrunch.

— Oleksandr Tomchuk, CEO of Triplegangers

Imagine having your digital storefront — built meticulously over a decade — suddenly besieged. Triplegangers, specializing in selling 3D image files of human models to artists and game developers, found itself in this precarious situation.

The Technical Oversight

The root of the issue was a configuration oversight. Triplegangers' terms of service explicitly prohibited unauthorized bots from scraping its images, but the site lacked a properly configured robots.txt file, the mechanism that tells well-behaved crawlers such as OpenAI's GPTBot to stay out.

Without those directives, crawlers treat a site as fair game. It is an opt-out system, not an opt-in one: unless a site explicitly says no, the default answer is yes.
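For context, OpenAI documents that its crawler identifies itself with the GPTBot user agent and says it respects standard robots.txt rules. A minimal robots.txt that turns GPTBot away from an entire site looks like the sketch below; how broadly to block, and whether to also list other AI crawlers, is the site owner's call.

```
# robots.txt - refuse OpenAI's crawler site-wide
User-agent: GPTBot
Disallow: /
```

Crucially, robots.txt is a convention rather than an enforcement mechanism: it only works against crawlers that choose to honor it.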

The Financial and Operational Impact

The damage went beyond downtime. The crawl not only knocked the site offline during U.S. business hours but also drove up Triplegangers' AWS bill through the spike in CPU and download activity.

“It’s scary because there seems to be a loophole that these companies are using to crawl data,” Tomchuk noted, emphasizing the burden on business owners to block these intrusions.

— Oleksandr Tomchuk

A Wider Industry Challenge

Tomchuk’s ordeal is not isolated. Other businesses have reported similar disruptions and inflated costs due to AI crawlers. In fact, a recent study by DoubleVerify highlighted an 86% increase in invalid traffic caused by AI bots in 2024.

“Now we have to daily monitor log activity to spot these bots,” Tomchuk warns.

— Oleksandr Tomchuk
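That kind of monitoring can start as simply as scanning the web server's access log for known AI crawler user agents. Below is a rough Python sketch, assuming an Nginx- or Apache-style access log; the list of bot names and the log path are illustrative assumptions to adapt to your own setup.

```python
from collections import Counter

# Illustrative user-agent substrings of well-known AI crawlers
AI_BOTS = ["GPTBot", "ClaudeBot", "CCBot", "Bytespider", "PerplexityBot"]

# Adjust to wherever your server writes its access log
LOG_PATH = "/var/log/nginx/access.log"

def count_ai_bot_hits(log_path: str) -> Counter:
    """Count requests per AI crawler by matching user-agent strings in each log line."""
    hits = Counter()
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for line in log:
            for bot in AI_BOTS:
                if bot in line:
                    hits[bot] += 1
                    break
    return hits

if __name__ == "__main__":
    for bot, count in count_ai_bot_hits(LOG_PATH).most_common():
        print(f"{bot}: {count} requests")
```

Run daily, for example from cron, a report like this makes it obvious when a single crawler suddenly accounts for tens of thousands of requests.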

Lessons Learned and Path Forward

The incident underscores a pressing need: businesses must proactively shield their digital assets. Understanding and configuring tools like robots.txt is crucial in this tech-driven era.

This tale serves as a wake-up call for small online businesses everywhere — vigilance against data scraping is essential. As Tomchuk poignantly puts it, “They should be asking permission, not just scraping data.”


Image Credit: Sanket Mishra on Pexels
