Report-as-Spam header
Brendan Hide
2012-06-10 16:26:12 UTC
Hi all

Has a spam reporting header been considered, similar to the
List-Unsubscribe header in RFC2369? Already many Bulk Mail service
providers provide readable headers directing recipients where to send
spam complaints. I don't think this is standardised in any way however.

One of my specialities is handling abuse complaints and monitoring
outbound spam blocking systems. My team's performance, and my own, has a
direct bearing on network reputation and on the RBL listings that result
from compromised end-user accounts. Automated (or even semi-automated)
suspension of compromised accounts would aid in this field and would
probably outperform any army of abuse-complaint "specialists".

The notion of being able to report back to the responsible ISP/mail
server directly without going through a long process is attractive. In
some environments a relay cluster with outbound anti-spam scanning could
even report high-spam-scoring mails with little to no human intervention.

Differences between List-Unsubscribe and "Report-as-Spam":
List-Unsubscribe is added by List Maintainers or Mailing-List
Distribution servers. Report-as-Spam could be added by the mail or relay
server independently of the sender's MUA when the server receives the
mail. The implication is that every "MAIL FROM" could result in a unique
Report-as-Spam ID, similar to where Message-Id would be generated. If a
relay server implicitly trusts a mail server that already adds a
Report-as-Spam header then the relay server would not bother to add
another header.
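
A minimal sketch of how a relay might stamp such a header, using
Python's standard email package. Only the header name comes from this
proposal; the value syntax (an angle-bracketed URL, borrowed from
List-Unsubscribe), the endpoint abuse.example.net and the function name
are assumptions made for illustration.

```python
import uuid
from email.message import EmailMessage

REPORT_HEADER = "Report-as-Spam"                   # name proposed in this thread
REPORT_BASE = "https://abuse.example.net/report"   # hypothetical endpoint


def stamp_report_header(msg, trusted_upstream=False):
    """Add a unique report header to msg and return the new report ID.

    If the message came from a server we implicitly trust and it already
    carries the header, leave it alone and return None.
    """
    if trusted_upstream and msg[REPORT_HEADER] is not None:
        return None
    report_id = uuid.uuid4().hex       # one ID per accepted message
    del msg[REPORT_HEADER]             # drop any untrusted copies (no-op if absent)
    msg[REPORT_HEADER] = f"<{REPORT_BASE}?id={report_id}>"
    return report_id


if __name__ == "__main__":
    mail = EmailMessage()
    mail["From"] = "user@example.org"
    mail["To"] = "someone@example.com"
    mail.set_content("hello")
    print(stamp_report_header(mail))
    print(mail[REPORT_HEADER])
```

The returned ID is what the server would store against the sending
account, in the same way it already logs the Message-Id.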

Drawbacks:
Broken Header/Header Abuse? - My guess is that we will probably end up
with RBLs targeting servers that relay with invalid headers
Performance - Mail servers implementing an automated suspension system
would need to maintain a database (or would need to be able to trawl its
own logs) for the account responsible for the abuse. This does not
necessarily have to be a large database, however it is conceivable that
some mail systems will want to trim this database to only include recent
outgoing mail.
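
A rough sketch of such a lookup database, using SQLite from the Python
standard library. The table layout, column names and function names are
hypothetical; the point is only that mapping a report ID to an account
and trimming old rows takes very little machinery.

```python
import sqlite3
import time


def open_store(path=":memory:"):
    """Open (or create) the table mapping report IDs to sending accounts."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS sent ("
        "report_id TEXT PRIMARY KEY, account TEXT NOT NULL, sent_at REAL NOT NULL)"
    )
    return db


def record_sent(db, report_id, account, sent_at=None):
    """Remember which account sent the mail carrying this report ID."""
    when = time.time() if sent_at is None else sent_at
    db.execute("INSERT INTO sent VALUES (?, ?, ?)", (report_id, account, when))
    db.commit()


def account_for(db, report_id):
    """Return the responsible account, or None if the ID is unknown or trimmed."""
    row = db.execute(
        "SELECT account FROM sent WHERE report_id = ?", (report_id,)
    ).fetchone()
    return row[0] if row else None


def trim(db, max_age_seconds, now=None):
    """Delete entries older than max_age_seconds; return how many were removed."""
    cutoff = (time.time() if now is None else now) - max_age_seconds
    cur = db.execute("DELETE FROM sent WHERE sent_at < ?", (cutoff,))
    db.commit()
    return cur.rowcount
```

An automated suspension system would call account_for() when a report
arrives and run trim() periodically to keep only recent outgoing mail.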

Advantages:
Better First Response against compromised accounts
Standardisation of as-of-yet incompatible/incongruent features

Any comments, positive or negative, would be appreciated, unless it's
simply a link to those lists of "why your idea won't stop spam". This
isn't supposed to stop spam; it's supposed to make fighting spam more
effective for responsible ISPs.
--
__________
Brendan Hide
http://swiftspirit.co.za/
Web Africa - Internet Business Solutions
http://www.webafrica.co.za/?AFF1E97
SM
2012-06-10 19:41:23 UTC
Hi Brendan,
Post by Brendan Hide
Has a spam reporting header been considered, similar to the
List-Unsubscribe header in RFC2369? Already many Bulk Mail service
providers provide readable headers directing recipients where to
send spam complaints. I don't think this is standardised in any way however.
[snip]
Post by Brendan Hide
The notion of being able to report back to the responsible ISP/mail
server directly without going through a long process is attractive.
In some environments a relay cluster with outbound anti-spam
scanning could even report high-spam-scoring mails with little to no
human intervention.
Have you considered using ARF (RFC 5965)? Please also see some of
the work from the IETF MARF working group.

Regards,
-sm
Brendan Hide
2012-06-11 03:36:59 UTC
On 2012/06/10 09:41 PM, SM wrote:
[snip]
Have you considered using ARF (RFC 5965)? Please also see some of the
work from the IETF MARF working group.
I hadn't considered utilising ARF here. While incoming complaints do
typically contain the whole mail or at least all headers, an automated
system would not require so much data. In my case relayed mails are kept
in a retrievable format for at least a few days. In theory, I only need
very few of the headers of a mail in order to find the full content,
perhaps even only one header. A full report in ARF format is far too
much information when all I really want is enough information to
know/determine:
a) A recipient reckons the mail was spam
b) The account responsible for having sent the mail

Having gone through all of the IETF MARF working group's draft
documents, I see that ARF is a MIME format for reports and does not
define how the reports are to be sent. I like the concept; however, per
the above, it seems inappropriate, perhaps even superfluous. The working
group is looking towards utilising DNS in order to determine reporting
mechanisms and protocols. Again, this appears far too complicated for
reporting spam.
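
For comparison, a rough sketch of what a minimal ARF report contains,
built with Python's standard email package. The three-part
multipart/report layout and the required Feedback-Type, User-Agent and
Version fields come from RFC 5965; the addresses, the User-Agent string
and the function name are made up for illustration.

```python
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_arf_report(original_headers, reporter, abuse_addr):
    """Build a minimal ARF-style report about one message (illustrative only)."""
    report = MIMEMultipart("report")
    report.set_param("report-type", "feedback-report")
    report["From"] = reporter
    report["To"] = abuse_addr
    report["Subject"] = "Abuse report"

    # Part 1: human-readable description.
    report.attach(MIMEText("A recipient marked the attached message as spam.\n"))

    # Part 2: machine-readable fields (the three that RFC 5965 requires).
    fields = MIMEBase("message", "feedback-report")
    fields.set_payload(
        "Feedback-Type: abuse\n"
        "User-Agent: ExampleReporter/0.1\n"   # hypothetical agent name
        "Version: 1\n"
    )
    report.attach(fields)

    # Part 3: the headers of the reported message.
    headers = MIMEBase("text", "rfc822-headers")
    headers.set_payload(original_headers)
    report.attach(headers)
    return report


if __name__ == "__main__":
    sample = "From: spammer@example.org\nMessage-Id: <1@example.org>\n"
    print(build_arf_report(sample, "fbl@example.com", "abuse@example.net").as_string())
```

Even this stripped-down report carries three MIME parts, which is the
weight being compared against a single header and ID.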

Legitimate bulk mail services are already successfully using simple
headers and unique IDs. Typically their "unsubscribe" and
"report-as-spam" links embedded in the headers and in the mail itself
all use the same ID with only the URL being slightly different, for
example /unsubscribe?id=xyz and /report?id=xyz. This has required
minimal investment and research for the providers to implement yet, in
theory, it already achieves a reporting mechanism that can be automated.
The only hindrance with this type of reporting is critical mass and
standardisation. I'm not aware of any two bulk mail services that use
the same format or header.
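
The shared-ID pattern can be sketched as follows. The host
bulk.example.com and the function names are hypothetical; the
/unsubscribe and /report paths and the id parameter follow the example
above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE = "https://bulk.example.com"   # hypothetical bulk-mail provider


def action_urls(message_id):
    """Build the unsubscribe and report URLs that share one per-mail ID."""
    query = urlencode({"id": message_id})
    return {
        "unsubscribe": f"{BASE}/unsubscribe?{query}",
        "report": f"{BASE}/report?{query}",
    }


def parse_action(url):
    """Return (action, id) for a recognised URL, or None otherwise."""
    parts = urlsplit(url)
    action = parts.path.strip("/")
    ids = parse_qs(parts.query).get("id", [])
    if action not in ("unsubscribe", "report") or len(ids) != 1:
        return None
    return action, ids[0]


if __name__ == "__main__":
    urls = action_urls("xyz")
    print(urls["unsubscribe"])
    print(urls["report"])
    print(parse_action(urls["report"]))
```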

A concern I'm looking at is development time and achievable results. How
many days and lines of code will it take to implement a server-side
report-as-spam header (and corresponding support in MUAs) vs
implementing the reporting mechanisms the IETF MARF Working Group are
working on?
--
Brendan Hide

http://swiftspirit.co.za/
SM
2012-06-11 09:05:48 UTC
Hi Brendan,
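Post by Brendan Hide
In theory, I only need very few of the headers of a mail in order to
find the full content, perhaps even only one header. A full report in
ARF format is far too much information when all I really want is
enough information to know/determine:
a) A recipient reckons the mail was spam
b) The account responsible for having sent the mail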
[snip]
Post by Brendan Hide
Legitimate bulk mail services are already successfully using simple
headers and unique IDs. Typically their "unsubscribe" and
"report-as-spam" links embedded in the headers and in the mail
itself all use the same ID with only the URL being slightly
different, for example /unsubscribe?id=xyz and /report?id=xyz. This
has required minimal investment and research for the providers to
implement yet, in theory, it already achieves a reporting mechanism
that can be automated. The only hindrance with this type of
reporting is critical mass and standardisation. I'm not aware of any
two bulk mail services that use the same format or header.
A concern I'm looking at is development time and achievable results.
How many days and lines of code will it take to implement a
server-side report-as-spam header (and corresponding support in
MUAs) vs implementing the reporting mechanisms the IETF MARF Working
Group are working on?
There are systems which add a reporting URL for users to provide
feedback. That can be used for point (a). A server-side header can
be added in less than an hour if you already have the backend in
place to process the report. The corresponding support in MUAs would
take years. The likelihood of that happening is very small due to legacy.
Post by Brendan Hide
Broken Header/Header Abuse? - My guess is that we will probably end
up with RBLs targeting servers that relay with invalid headers
There will be abuse.
Post by Brendan Hide
Performance - Mail servers implementing an automated suspension
system would need to maintain a database (or would need to be able
to trawl its own logs) for the account responsible for the abuse.
This does not necessarily have to be a large database, however it is
conceivable that some mail systems will want to trim this database
to only include recent outgoing mail.
It's possible to track even on a large system.
Post by Brendan Hide
Better First Response against compromised accounts
That only works if both sides are proactive. That's rare in practice
as abuse handling is an expense people would prefer not to have.
Post by Brendan Hide
Standardisation of as-of-yet incompatible/incongruent features
That can take a year or more. Someone could write a draft as a starting
point for the discussion.

Regards,
-sm
Brendan Hide
2012-06-11 18:51:59 UTC
Hi, SM

You've made some good points I'll be considering. Thank you for your
comments thus far. :)
Post by SM
Post by Brendan Hide
Better First Response against compromised accounts
That only works if both sides are proactive. That's rare in practice
as abuse handling is an expense people would prefer not to have.
In my case we're proactive and the current process is expensive in
man-hours. Automation would be more cost-effective.
--
__________
Brendan Hide
Web Africa - Internet Business Solutions
http://www.webafrica.co.za/?AFF1E97
Alessandro Vesely
2012-06-11 14:27:32 UTC
Post by Brendan Hide
Legitimate bulk mail services are already successfully using simple
headers and unique IDs. Typically their "unsubscribe" and
"report-as-spam" links embedded in the headers and in the mail itself
all use the same ID with only the URL being slightly different, for
example /unsubscribe?id=xyz and /report?id=xyz. This has required
minimal investment and research for the providers to implement yet, in
theory, it already achieves a reporting mechanism that can be
automated. The only hindrance with this type of reporting is critical
mass and standardisation. I'm not aware of any two bulk mail services
that use the same format or header.
Having such links is required by law in some countries. However, some
of them just don't work. Some of them seem to work and tell
recipients they're unsubscribed from stream "xyz", but then they get
spam with /unsubscribe?id=xyzbis, /unsubscribe?id=xyzter, and so
forth. In addition, spammers can add this kind of link, pretending to
be a reputable originator, in the same way that they fake "From:" and
"Return-Path:" header fields. Thus, getting at least a part of the
header and body of reported messages would seem to be appropriate in
order to reliably determine the originator's identity.
Post by Brendan Hide
A concern I'm looking at is development time and achievable results.
How many days and lines of code will it take to implement a
server-side report-as-spam header (and corresponding support in MUAs)
vs implementing the reporting mechanisms the IETF MARF Working Group
are working on?
A disadvantage of reporting spam directly, from final recipients to
senders, is that each end user would have to keep track of the
complaints she sent. The reporting entity needs to assess the
trustworthiness of each sender, at least to the extent of learning
whether abuse reporting has any effect. For example, in some cases it
may be better to send reports to the sender's network provider. Thus,
it may be convenient to delegate abuse reporting to a trusted central
service, such as the recipient's mailbox provider.

In the latter scenario, MUA support can be limited to flagging
messages to be reported as spam. It would work much like "spam"
buttons on webmail sites. That strategy requires fewer updates to MUA
functionality, as it is sufficient to upgrade server software on both
sides. However, server software itself is not updated as often as
needed, since there are still MXes that understand HELO but not EHLO.
Brendan Hide
2012-06-11 18:52:58 UTC
Hi Alessandro
Post by Alessandro Vesely
Post by Brendan Hide
Legitimate bulk mail services are already successfully using simple
headers and unique IDs.
Having such links is required by law in some countries. However, some
of them just don't work. Some of them seem to work and tell
recipients they're unsubscribed from stream "xyz", but then they get
spam with /unsubscribe?id=xyzbis, /unsubscribe?id=xyzter, and so
forth.
I hadn't thought about the links being required by law. Re the differing
IDs, it *has to* be unique for every mail sent, not just per recipient.
This aids in tracking down a specific offence.
Post by Alessandro Vesely
In addition, spammers can add this kind of link, pretending to
be a reputable originator, in the same way that they fake "From:" and
"Return-Path:" header fields. Thus, getting at least a part of the
header and body of reported messages would seem to be appropriate in
order to reliably determine the originator's identity.
I don't see this part being an issue at all. If my report-handling
server responds saying the report is invalid then the relaying IP
Post by Alessandro Vesely
A disadvantage of reporting spam directly, from final recipients to
senders, is that each end user would have to keep track of the
complaints she sent. The reporting entity needs to assess the
trustworthiness of each sender.
This IS a very good point (even if it is made indirectly). If reports
are sent directly to the spammer, the spammer is not going to do
anything about it while the end-recipient believes he has properly
reported the issue. This would not be such an issue if the ISP is
pro-active and aware of the Report-as-Spam header (the ISP might insert
their own header above the spammer's). Additionally, if the spammer has
a dedicated server (i.e. the ISP does not intercept/relay the mail
directly) then, again, the ISP won't be able to insert its own header.

Ultimately, there's no way for the end-user to place any trust in the
header's origin. DNS is probably the only saving grace but, regardless,
it's back to the drawing board.

Thank you, Alessandro. :)
--
__________
Brendan Hide
Web Africa - Internet Business Solutions
http://www.webafrica.co.za/?AFF1E97
John Levine
2012-06-11 15:00:04 UTC
Post by Brendan Hide
Has a spam reporting header been considered, similar to the
List-Unsubscribe header in RFC2369?
Yes, but I doubt it would be very useful.

The problem is that bad guys can add the same headers as good guys, so
it would likely be mostly used as a way to deflect complaints away
from the ISP to the spammer. There are web control panels that add
X-Anti-Abuse headers that I find mostly useful as a high scoring
indicator in SpamAssassin.

You could envision some way to check and see if a reporting header was
valid, but if you do that, whatever authority you contacted to check
for validity could as easily provide the reporting address directly.

The vast majority of ARF reports are sent as part of privately
arranged feedback loops, where a sender tells a recipient what its
outgoing IPs or DKIM signatures are, and the recipient sends a report
when a user marks a message as spam. I also use ARF for unsolicited
reports, finding the contact address through a combination of
abuse.net and a modest (3000 entry) private list of IP ranges. That
works reasonably well, but I'm not sure how well it scales.
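
The private-list half of that lookup could be sketched roughly as
below, using Python's ipaddress module. The ranges are documentation
prefixes and the contact addresses are invented; the real list and the
abuse.net query are not shown here, and the longest-prefix rule is an
assumption about how overlapping ranges would be resolved.

```python
import ipaddress

# Hypothetical entries: (network, abuse contact). Overlaps are allowed.
ABUSE_CONTACTS = [
    (ipaddress.ip_network("192.0.2.0/24"), "abuse@isp-a.example"),
    (ipaddress.ip_network("198.51.100.0/25"), "abuse@isp-b.example"),
    (ipaddress.ip_network("198.51.100.0/24"), "abuse@isp-c.example"),
]


def abuse_contact(ip):
    """Return the contact for the most specific matching range, or None."""
    addr = ipaddress.ip_address(ip)
    matches = [(net, contact) for net, contact in ABUSE_CONTACTS if addr in net]
    if not matches:
        return None   # caller would fall back to another source
    # Prefer the longest prefix when ranges overlap.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


if __name__ == "__main__":
    for ip in ("192.0.2.10", "198.51.100.5", "198.51.100.200", "203.0.113.1"):
        print(ip, abuse_contact(ip))
```

A linear scan is adequate for a few thousand entries; a much larger
list would want a prefix tree or sorted ranges.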

The question of how to figure out where to send abuse reports has
come up many times in the past. You might want to look through the
archives and the wiki.

R's,
John
Brendan Hide
2012-06-11 18:57:07 UTC
Post by John Levine
The problem is that bad guys can add the same headers as good guys, so
it would likely be mostly used as a way to deflect complaints away
from the ISP to the spammer.
You've hit the nail on the head. I suspect a reverse DNS scheme *could*
be the key here - but, per my other mail, it's back to square one.

Thank you for your input.
--
__________
Brendan Hide
Web Africa - Internet Business Solutions
http://www.webafrica.co.za/?AFF1E97