AI bots everywhere. Does anyone have a good whitelist for robots.txt?

My niche little site, http://golfcourse.wiki, seems to be very popular with AI bots. They have basically become most of my traffic. Most of them follow robots.txt, and that's nice and all, but they are costing me non-trivial amounts of money.

I don't want to block most search engines. I don't want to block legitimate institutions like archive.org. Is there a whitelist that I could crib instead of pretty much having to update my robots file every damn day?
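
For illustration, here is a rough sketch of what an allowlist-style robots.txt could look like, generated and checked with Python's standard `urllib.robotparser`. The user-agent tokens in `ALLOWED_AGENTS` are assumptions picked as examples, not a vetted list, and the helper names are hypothetical; check each crawler's own documentation for the token it actually sends. It also only helps with bots that honor robots.txt in the first place.

```python
from urllib.robotparser import RobotFileParser

# Assumed example tokens, not a vetted allowlist: verify each one
# against the crawler operator's documentation before relying on it.
ALLOWED_AGENTS = ["Googlebot", "Bingbot", "DuckDuckBot", "ia_archiver"]


def build_allowlist_robots(allowed_agents):
    """Return robots.txt text that allows the named agents and disallows all others."""
    lines = []
    for agent in allowed_agents:
        # One group per allowed crawler, with full access.
        lines += [f"User-agent: {agent}", "Allow: /", ""]
    # Catch-all group: any crawler not named above is asked to stay out.
    lines += ["User-agent: *", "Disallow: /"]
    return "\n".join(lines) + "\n"


ROBOTS_TXT = build_allowlist_robots(ALLOWED_AGENTS)


def is_allowed(user_agent, url="http://golfcourse.wiki/"):
    """Check how a robots.txt-respecting crawler would interpret the generated file."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(ROBOTS_TXT)
```

With a default-deny file like this, a new bot is blocked without touching the file; the list only needs an edit when a crawler that should be let in shows up.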


Comments URL: https://news.ycombinator.com/item?id=42861047

Points: 15

# Comments: 8

Created 6mo | Jan 29, 2025, 4:50:08 AM

