Author Topic: GPTBot crawl  (Read 483 times)

ergophobe

  • Inner Core
  • Hero Member
  • *
  • Posts: 9631
    • View Profile
GPTBot crawl
« on: November 10, 2024, 12:26:10 AM »
I was just looking at a few raw server logs and noticed that GPTBot was crawling like mad in October. A site with between 100 and 200 pages got these GPTBot hit counts:

5,700 so far in November
97,000 in October
2,068 in September
1,700 in August

I looked at some other mini sites with maybe a dozen pages and they got 37,000 to 47,000 hits in October and just a handful in September.
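
For anyone who wants to pull similar per-month numbers out of their own raw logs, here is a rough sketch in Python. It assumes Apache/Nginx "combined" format log lines and simply looks for the string "GPTBot" anywhere in the line. The function name, the sample lines, and the IP addresses are made up for illustration and are not taken from the logs above.

```python
import re
from collections import Counter

# Timestamp as written in "combined" format logs, e.g. [03/Oct/2024:10:00:00 +0000]
TIMESTAMP = re.compile(r"\[(\d{2})/([A-Za-z]{3})/(\d{4}):")


def count_bot_hits_by_month(lines, bot_token="GPTBot"):
    """Count log lines containing bot_token, grouped by 'Mon YYYY'."""
    counts = Counter()
    for line in lines:
        if bot_token not in line:
            continue  # not a hit from the bot we care about
        match = TIMESTAMP.search(line)
        if match:
            _day, month, year = match.groups()
            counts[f"{month} {year}"] += 1
    return counts


# Illustrative sample lines only (documentation-range IPs, invented requests).
GPTBOT_UA = '"Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"'
sample = [
    f'192.0.2.1 - - [03/Oct/2024:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" {GPTBOT_UA}',
    f'192.0.2.1 - - [04/Oct/2024:10:00:00 +0000] "GET /about HTTP/1.1" 200 512 "-" {GPTBOT_UA}',
    '192.0.2.9 - - [04/Oct/2024:11:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
    f'192.0.2.1 - - [01/Nov/2024:09:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" {GPTBOT_UA}',
]

hits = count_bot_hits_by_month(sample)
for month, total in hits.items():
    print(month, total)
```

To run it on a real file, pass the open file object instead of the sample list, for example `count_bot_hits_by_month(open("access.log"))`, where `access.log` is a placeholder path. It only matches on the user-agent string, so it counts anything that claims to be GPTBot.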

I can't imagine the crawl budget OpenAI must have.
« Last Edit: November 10, 2024, 12:43:22 AM by ergophobe »

rcjordan

  • I'm consulting the authorities on the subject
  • Global Moderator
  • Hero Member
  • *****
  • Posts: 16840
  • Debbie says...
    • View Profile
Re: GPTBot crawl
« Reply #1 on: November 10, 2024, 01:34:42 AM »
Debbie says they may be running full blast before the copyright lawsuits throttle them.

https://www.shacknews.com/article/141313/openai-needs-copyright-material

OpenAI insists it can't sufficiently train AI models without copyrighted material | Shacknews

---

Take a look....

https://www.google.com/search?q=ai+bots+courts+copyright

ai bots courts copyright - Google Search



ergophobe

Re: GPTBot crawl
« Reply #2 on: November 10, 2024, 02:22:02 AM »
Well... they have my content. I wonder how hard it would be to seed an AI, like the SEO contests where people would compete to rank some previously unique phrase.

rcjordan

Re: GPTBot crawl
« Reply #3 on: November 10, 2024, 11:56:52 AM »
>seed

related:

Annoyed Redditors tanking Google Search results illustrates perils of AI scraper

https://th3core.com/talk/traffic/annoyed-redditors-tanking-google-search-results-illustrates-perils-of-ai-scraper/msg86053/#msg86053

ergophobe

Re: GPTBot crawl
« Reply #4 on: November 10, 2024, 07:04:41 PM »
Ah yes. I thought we had had some discussion of that.