/ads.txt
Eli the Bearded
2018-08-24 23:14:05 UTC
Anyone here?

I'm curious about /ads.txt. I've read some background material on it
that outlines how it is supposed to be used for validating ad sales
inventory or something like that.

https://digiday.com/marketing/wtf-ads-txt/

I do not put any ads on my site, I do not run any ads for my site, and I
do not sell or host ads for anyone else.

So why am I seeing so many hits to ads.txt?

Some are from Google, some are from who knows where (35.224.0.0/12 is
"Google Cloud"; 165.227.0.0/16 is Digital Ocean):

35.229.103.78 - - [24/Aug/2018:12:29:22 -0400] "GET /ads.txt HTTP/1.1" 404 398 "-" "bidswitchbot/1.0"
66.249.70.24 - - [24/Aug/2018:12:52:38 -0400] "GET /ads.txt HTTP/1.1" 404 398 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
165.227.100.219 - - [24/Aug/2018:13:22:09 -0400] "GET http://qaz.wtf/ads.txt HTTP/1.1" 404 398 "-" "lua-resty-http/0.11 (Lua) ngx_lua/10010"
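A rough way to tally hits like these from an access log, sketched in Python. It assumes the Apache combined log format seen in the three lines above. The names LOG_LINE, NETWORKS, provider_for and ads_txt_hits are made up for illustration, and the two CIDR labels are only the ones given in this post, not an authoritative list of provider ranges.

```python
import ipaddress
import re
from collections import Counter
from urllib.parse import urlsplit

# Combined log format, as in the sample lines above (assumption).
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+) (?P<proto>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Only the two ranges named in the post; labels are the poster's.
NETWORKS = {
    ipaddress.ip_network("35.224.0.0/12"): "Google Cloud",
    ipaddress.ip_network("165.227.0.0/16"): "Digital Ocean",
}


def provider_for(ip):
    """Return the label of the listed network containing ip, else 'unlisted'."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:  # the log field may hold a hostname instead
        return "unlisted"
    for net, label in NETWORKS.items():
        if addr in net:
            return label
    return "unlisted"


def ads_txt_hits(lines):
    """Count requests for /ads.txt, keyed by (provider label, user agent)."""
    hits = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that are not in the expected format
        # The target may be absolute (http://host/ads.txt), so keep the path only.
        if urlsplit(m.group("target")).path == "/ads.txt":
            hits[(provider_for(m.group("ip")), m.group("agent"))] += 1
    return hits


SAMPLE = [
    '35.229.103.78 - - [24/Aug/2018:12:29:22 -0400] "GET /ads.txt HTTP/1.1" 404 398 "-" "bidswitchbot/1.0"',
    '66.249.70.24 - - [24/Aug/2018:12:52:38 -0400] "GET /ads.txt HTTP/1.1" 404 398 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '165.227.100.219 - - [24/Aug/2018:13:22:09 -0400] "GET http://qaz.wtf/ads.txt HTTP/1.1" 404 398 "-" "lua-resty-http/0.11 (Lua) ngx_lua/10010"',
]

for (provider, agent), count in ads_txt_hits(SAMPLE).items():
    print(count, provider, agent)
```

Reducing the request target to its path keeps the third line, which uses a full URL in the request, in the same bucket as the other two.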

I've created a blank ads.txt now. Will that work to tell all of these
bots that no ads should exist for my site?

Elijah
------
organic traffic only
Doc O'Leary
2018-08-25 18:45:57 UTC
For your reference, records indicate that
Post by Eli the Bearded
So why am I seeing so many hits to ads.txt?
Malicious scans. Do you also see a lot of bogus WordPress URLs?
Same thing.
Post by Eli the Bearded
I've created a blank ads.txt now. Will that work to tell all of these
bots that no ads should exist for my site?
It’s better to block their IP address completely. Even better, block
entire ranges by those “cloud” providers. Stop the abuse rather than
the notification of the problem.
--
"Also . . . I can kill you with my brain."
River Tam, Trash, Firefly
Eli the Bearded
2018-08-26 02:47:14 UTC
In comp.infosystems.www.misc,
Post by Doc O'Leary
For your reference, records indicate that
Post by Eli the Bearded
So why am I seeing so many hits to ads.txt?
Malicious scans. Do you also see a lot of bogus WordPress URLs?
Same thing.
These days the biggest malicious scan offender is the D-Link one
(tries to use /login.cgi to wget and run a shell script). I don't
have any reason to think Googlebot doing a GET on a .txt file is
a malicious scan.
Post by Doc O'Leary
It’s better to block their IP address completely. Even better, block
entire ranges by those “cloud” providers. Stop the abuse rather than
the notification of the problem.
Advice like this I can get from any hypochondriac webmaster forum.
I'm perfectly capable of deciding what to block or not block on my own.
My question was just about how ad agencies use ads.txt.

Elijah
------
doesn't think the dlink scan has yet repeated a source IP address
Doc O'Leary
2018-08-26 16:20:48 UTC
For your reference, records indicate that
Post by Eli the Bearded
I don't
have any reason to think Googlebot doing a GET on a .txt file is
a malicious scan.
Since you aren’t in a business relationship with them for AdSense or
any other advertising service, there’s really no legitimate reason for
them to be scanning unpublished URLs like that. Save, of course, for
the fact that they’re looking to hoover up any and all information
about everyone they can get their hands on. I see them probing under
/.well-known/ and random 404 URLs as well. Google stopped playing
nice a long time ago.
Post by Eli the Bearded
Post by Doc O'Leary
It’s better to block their IP address completely. Even better, block
entire ranges by those “cloud” providers. Stop the abuse rather than
the notification of the problem.
Advice like this I can get from any hypochondriac webmaster forum.
I'm perfectly capable of deciding what to block or not block on my own.
My question was just about how ad agencies use ads.txt.
There’s plenty of information online about legitimate uses for that
file. What should concern you, since you apparently don’t buy or show
ads, is the improper uses. Same as for any other scans for invalid
URLs on your site. Serve up a blank file if you like. I personally
issue a 204 response for things like that, saving the bans for probes
that are directly going after exploit URLs.
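For illustration, the 204 approach might look roughly like this with Python's standard http.server. This is a sketch under assumptions, not the setup actually used by anyone in this thread: QUIET_PATHS, QuietHandler and make_server are hypothetical names, and a real site would more likely do this in the web server's own configuration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit

# Paths answered with an empty 204 instead of a 404 (hypothetical list).
QUIET_PATHS = {"/ads.txt"}


class QuietHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Requests may carry a full URL as the target, so compare the path only.
        path = urlsplit(self.path).path
        if path in QUIET_PATHS:
            self.send_response(204)  # No Content: success, nothing to send
            self.end_headers()
        else:
            self.send_error(404)


def make_server(port=0):
    """Build a server bound to localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), QuietHandler)
```

Calling make_server(8080).serve_forever() would run it locally. A 204 carries no body, so the probe gets a successful answer with nothing to index, and the error log stays free of 404 noise for that path.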
--
"Also . . . I can kill you with my brain."
River Tam, Trash, Firefly