Handling of abusive DNSBL/WL clients

Discussion:

d***@chaosreigns.com

2011-12-22 15:12:08 UTC

This SpamAssassin bug comment describes what happens when various
methods of blocking abusive DNS queries are attempted.
The tests were conducted on dnswl.org, a public email whitelist that is
enabled by default in SpamAssassin and presumably in other software.

There seems to be a surprising lack, in RFCs and BCPs, of statements
that clients and forwarding DNS servers should stop querying if they
receive an NXDOMAIN, REFUSED, or an answer with the TLD of "invalid".
Such statements seem likely to help if they were widely implemented.

It also seems like it would be good to define best practices for handling
this situation, quite possibly based on the information below.

SpamAssassin is currently asking DNS black/white list providers
to indicate that a client is being blocked by returning a specified
IP value for all queries; in the case of DNSWL, this is 127.0.0.255.
There has been some debate on what would be an ideal value.
This is not in line with the blacklist BCP's suggestion to check
the values of 127.0.0.1 and 127.0.0.2, I guess because the SA devs
feel it's easier to implement. Some related discussion was here:
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6724
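
To make the idea of a dedicated "blocked" return value concrete, here is a
minimal sketch of how a client might classify the A records it gets back.
It is not SpamAssassin's actual code. The function name and the result
labels are made up for illustration, and the only value taken from this
thread is DNSWL's 127.0.0.255.

```python
import ipaddress

# Assumption: 127.0.0.255 is the "blocked" value DNSWL uses in this thread;
# other lists may choose a different value.
BLOCKED_VALUE = ipaddress.ip_address("127.0.0.255")
LOOPBACK_NET = ipaddress.ip_network("127.0.0.0/8")


def classify_answer(addresses):
    """Classify the A records returned for one DNSxL query.

    `addresses` is a list of dotted-quad strings; an empty list stands for
    "no record" (e.g. NXDOMAIN). Returns one of the hypothetical labels
    'blocked', 'listed', 'not_listed' or 'invalid'.
    """
    parsed = [ipaddress.ip_address(a) for a in addresses]
    if not parsed:
        return "not_listed"
    if BLOCKED_VALUE in parsed:
        # The operator is signalling that this query source is blocked.
        return "blocked"
    if all(p in LOOPBACK_NET for p in parsed):
        return "listed"
    # Anything outside 127.0.0.0/8 is not a meaningful DNSxL answer.
    return "invalid"
```

The important part is the ordering: the blocked value has to be checked
before the generic "inside 127.0.0.0/8 means listed" rule, otherwise a
blocked client would treat every query as a hit.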

(My involvement - I'm not a SA dev, I've been participating on the dev
mailing list for a number of months. I've been helping DNSWL on and
off for about 5 years.)

----- Forwarded message from bugzilla-***@bugzilla.spamassassin.org -----

Date: Thu, 22 Dec 2011 10:32:43 +0000
From: bugzilla-***@bugzilla.spamassassin.org
To: ***@ChaosReigns.com
Subject: [Bug 6728] DNSBLs need a way to turn off queries based on BLOCKED
rules triggering

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6728

--- Comment #13 from Matthias Leisi <***@leisi.net> 2011-12-22 10:32:43 UTC ---
I did some additional tests on how best to block abusive query sources. "Best"
is defined as three goals:

1) Reduce the overall traffic on the parent (dnswl.org) and data (list.dnswl.org)
zones
2) Avoid or minimize collateral damage on root and gTLD servers
3) Make it easy for operators of abusive query sources to find out what is
happening

We have built the mechanism to redirect defined IPs to a special view of the
dnswl.org zone as part of bug 6724 (using BIND views). I wanted to do actual
tests to base at least the decision on the first goal on hard facts. We tested
three combinations:

A. Explicit nameserver in "nowhere land"
| list.dnswl.org. 21600 IN NS blockedview.dnswl.org.
| blockedview.dnswl.org. 21600 IN A 127.0.0.255

B. Explicit nameserver for data zone in .invalid
| list.dnswl.org. 21600 IN NS _
| you.are.blocked.from.using.dnswl.org.thorugh.public.nameservers.invalid.

C. No zone apex
(no NS records for list.dnswl.org)

In all cases, we returned 127.0.0.255 for *.list.dnswl.org in this view. Also
in all cases, we returned 127.0.0.255 for the nameservers of the original data
zone (a through l.ns.dnswl.org), which affected clients should never actually
have seen. Also, if an affected client queried a through l.ns.dnswl.org
directly, it would always receive 127.0.0.255 as an answer.

A. and B. showed no measurable difference in traffic levels on the parent and
the data zone.

With C., the traffic on the parent zone nameservers grew by about 30%; traffic
on the data zone shrank by only about half the amount that was added on the
parent zone.

This rules out C. as a viable option and makes the choice depend only on goals
2 and 3 above: minimize collateral damage (on root servers) and maximize
identifiability for operators.

It can be expected that some resolvers will ask the roots for invalid., and it
can also be expected that not all resolvers will do proper negative caching for
B.

This leaves A. as the most efficient option with the least collateral damage
(except for the timeouts on the affected DNS resolver / forwarder when trying
to reach 127.0.0.255).

It should be remembered that this only applies to query sources that generate
excessive amounts of traffic over some period of time, and that do not react to
reasonable attempts at communication.

The first line of defense would be to return 127.0.0.255 (or other BLOCKED
triggering value, to be defined) from the regular data zone nameservers, as
discussed in this bug.

----- End forwarded message -----

--
"Force, my friends, is violence; the supreme authority
from which all other authority is derived."
- Michael Ironside, Starship Troopers
http://www.ChaosReigns.com

Chris Lewis

2011-12-22 16:06:34 UTC

The BCP is fairly clear on this, especially when the DNSBL is being shut
down. As the BCP is this >< close to publication (I'm expecting
official notification any minute now), it's way too late to make any
substantive changes to it. However, I was able to slip in a phrase
to further clarify how an answer outside of 127.0.0.0/8 should be
treated - that was an oversight entirely aside from this issue.

Having a DNSBL return a value _inside_ 127.0.0.0/8 for _all_ queries in
the "go away" case is an extremely bad idea, because many DNSBL clients,
even strict RFC/BCP compliant ones (including, from my recollection of a
brief glance at SA a few versions ago, SA itself - I even seem to recall
SA treating TXT records back from SBL-XBL as "listed"), will treat it as
"listing the world".
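
As a rough illustration of the "listing the world" failure mode, the sketch
below pairs a simulated list that answers every query with 127.0.0.255 with
a naive client that treats any A record as a hit. Both functions and the
`list.example.` names are hypothetical and no real DNS traffic is involved.

```python
def blocked_view_lookup(name):
    """Stand-in for a DNS lookup against a list that answers every query
    with 127.0.0.255 (hypothetical; no network access)."""
    return ["127.0.0.255"]


def naive_is_listed(name, lookup):
    # Naive rule: any A record at all means "listed".
    return len(lookup(name)) > 0


# Two unrelated, made-up query names (reversed documentation addresses).
queries = ["1.2.0.192.list.example.", "7.113.0.203.list.example."]
results = {q: naive_is_listed(q, blocked_view_lookup) for q in queries}

for name, listed in results.items():
    print(name, "listed" if listed else "not listed")
```

Every name comes back as "listed". For a blacklist that means rejecting
all mail; for a whitelist it means trusting all of it.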

The DNSBL shutdown process would _also_ be perfectly appropriate for
blocking abusive DNS queries, _without_ listing the world, _and_ by its
very nature shedding the abusive queries.

Matthias Leisi

2011-12-22 21:24:55 UTC

Post by Chris Lewis
The DNSBL shutdown process would _also_ be perfectly appropriate for
blocking abusive DNS queries, _without_ listing the world, _and_ by its very
nature shedding the abusive queries.

Note that the case referred to by the OP is not about shutting down a
DNSxL, but about signaling to client applications (and
resolvers/forwarders) that their use is considered not acceptable by
the operator of the service.

Unfortunately, a straightforward REFUSED rcode results in a three-fold
increase in queries due to retries in most cases. A dedicated return
value which would cause at least certain applications to at least
temporarily suspend queries is helpful.
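
A minimal sketch of what "temporarily suspend queries" could look like on
the client side, assuming the blocked value is 127.0.0.255 as used by DNSWL
in this thread. The class name, the doubling back-off and the default delays
are illustrative assumptions, not anything SpamAssassin or DNSWL specifies.

```python
class QuerySuspender:
    """Tracks whether queries to one DNSxL should currently be sent."""

    def __init__(self, blocked_value="127.0.0.255",
                 base_delay=3600.0, max_delay=86400.0):
        # Delays are in seconds; the defaults are arbitrary assumptions.
        self.blocked_value = blocked_value
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.consecutive_blocks = 0
        self.resume_at = 0.0

    def should_query(self, now):
        """Return True if a query may be sent at time `now` (seconds)."""
        return now >= self.resume_at

    def record_answer(self, addresses, now):
        """Update state from the A records (list of strings) of one answer."""
        if self.blocked_value in addresses:
            # Double the pause on each consecutive blocked answer, capped.
            self.consecutive_blocks += 1
            delay = min(self.base_delay * 2 ** (self.consecutive_blocks - 1),
                        self.max_delay)
            self.resume_at = now + delay
        else:
            # Any normal answer lifts the suspension.
            self.consecutive_blocks = 0
            self.resume_at = 0.0
```

A caller would check `should_query(time.time())` before each lookup and pass
every answer to `record_answer`. Time is passed in explicitly so the logic
can be exercised without waiting.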

-- Matthias

Chris Lewis

2011-12-25 19:21:30 UTC

Post by Matthias Leisi

I realized that before I commented.

The point is that the "shutdown procedure" has the right result -
shedding load and trying to signal to client applications that they
should stop querying it. All without touching any client code
whatsoever. A more sophisticated client could check the name server
returned and thereby identify immediately that the DNSBL is in shutdown
mode (whether for an individual querier or in general).

Post by Matthias Leisi
Unfortunately, a straightforward REFUSED rcode results in a three-fold
increase in queries due to retries in most cases. A dedicated return
value which would cause at least certain applications to at least
temporarily suspend queries is helpful.

The problem is that, with the installed base, returning any A record
(whether 127/8 or not) risks causing "list the world" behaviour in
the client.