Discussion:
Unique innovations made to anti-spam system
(too old to reply)
Michael Kaplan
2006-01-22 06:15:06 UTC
Permalink
For those that don't already know I am the ASRG's anti-spam kook.



After a great deal of thought, multiple innovations, and revision I believe
I have perfected my anti-spam system. Umm, well, at least I can assure you
that in my own mind the system is perfected.



Among the innovations is the introduction of trusted domains. Trusted
domains, in combination with sub-addresses, exceed current email
authentication proposals in terms of efficacy and utility… at least as far
as I am able to perceive.



Once again I only approach the members of this board because I believe I
have a contribution to make and I value your knowledge. I hope that you
will find that I am, at the very least, sincere in my efforts.



http://home.nyc.rr.com/spamsolution/An%20Effective%20Solution%20for%20Spam.htm




Thank you,



Michael G. Kaplan
Frank Ellermann
2006-01-22 12:29:07 UTC
Permalink
Post by Michael Kaplan
I am the ASRG's anti-spam kook.
No, you're a CAPTCHA kook, and I'm an SPF kook, we got enough
kooks here.
Post by Michael Kaplan
I believe I have perfected my anti-spam system. Umm, well,
at least I can assure you that in my own mind the system is
perfected.
It's apparently more complex than "reject anything that's not an
SPF PASS, and challenge any SPF PASS from unknown strangers."

As long as you don't send challenges to me for any SPF FAILs,
do whatever you like. If your CAPTCHAs come as PNGs, I hope you
offer a sign-up database for the blind. And finally, your memo
claims:

| The spammer realizes the tremendous cost/futility associated
| with attempting to design software to crack the 3-D CAPTCHA.
| The spammer gives up.

So far they haven't given up sending hundreds of mails per day
to my Message-IDs, just as an example. At least they have given up
abusing them as Return-Path, either an SPF FAIL effect or luck.

Bye, Frank
--
I think we've seen that forcing spammers to send more spam
hasn't been an effective way to make them stop sending spam.
[John L. <http://archive.iecc.com/article/spamtools/20030521001>]
John Levine
2006-01-22 14:29:51 UTC
Permalink
Post by Michael Kaplan
After a great deal of thought, multiple innovations, and revision I believe
I have perfected my anti-spam system. Umm, well, at least I can assure you
that in my own mind the system is perfected.
I took a look. It has the same broken threat model and unrealistic
assumptions of every other C/R and CAPTCHA system.

The worst thing about it is that like every other C/R system, it's a
spam amplifier. The vast majority of mail arriving from unknown
addresses is spam, all of which has forged return addresses. I know
that nearly all of the challenges I get are due to spam I didn't
send.

It makes the naive assumption that real mail is all sent by people,
and that people will behave the way the designer of the C/R system
wants. Neither is true. There is, for example, at least one comment
on my weblog that nobody will ever see because the guy's broken C/R
system is waiting for my blog system to answer his challenge.
I get vast amounts of real mail from machines, including a lot of mail that
I would be rather dismayed to lose, such as confirmations for airplane
tickets that I bought. C/R advocates seem to assume that recipients
will all know how to whitelist this mail in advance, but experience
offers little support for that theory.

Re user behavior, I ignore challenges from mail I did send on the
theory that if they wanted to hear from me, they would have read my
mail. But I tend to respond to the ones due to spam when I have time,
since that's the quickest way to get into those users' whitelists so
the C/R system will stop bothering me. (If this isn't what you want,
perhaps you should reconsider the wisdom of asking me to sort your
mail.) I also routinely observe that I send mail, I get a challenge
that I ignore, and a few minutes later I get a live response because
C/R users know that their systems are broken and read their challenged
mail anyway.

There are other more arguable model failures, like the assumption
that brute force is the only way to break challenges, but these two
are plenty to kill any C/R system and always will be.

The final flaw with C/R is that it is a retrospective authentication
system that is inferior in every way to a forward system like DKIM.
C/R takes incoming mail and attempts to go back to the sender and ask
"did you really send this?" while a forward system like DKIM includes
"look! we really sent this!" in the message itself. It's true, the
DKIM signatures are easily added to mail sent by machines, but for
those of us who would prefer to know when our plane leaves, that's a
good thing.

R's,
John
Michael Kaplan
2006-01-22 15:33:23 UTC
Permalink
Wow. I have to admit that I am completely baffled. The system that you are
critiquing doesn't seem to match my system at all.

With ISACS, fully functional email addresses are distributed. Receiving mail
from machines or unknown third parties will occur with the same ease that
it does today.

In the example on my website the character "Joe" just has to treat the
address Joe^***@domain.com as his normal address. When he enters it on
an airline web page the airline computer will email him back without any
problem. No C/R is involved. If Joe had initiated his correspondence with
the airline via email then the airline would have the unique address
Joe^***@domain.com and this would be used.

An ISACS user could leave a comment on your weblog without any C/R issue
coming up since ISACS only uses fully functional addresses.
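The per-correspondent sub-address idea (Joe^***@domain.com) could be sketched like this. The HMAC-based tag derivation, the secret, and all names here are my own illustration, not anything ISACS actually specifies:

```python
import hmac
import hashlib

SECRET = b"joe-private-key"  # per-user secret; purely illustrative

def make_subaddress(user: str, correspondent: str, domain: str) -> str:
    """Derive a per-correspondent sub-address like Joe^a1b2c3@domain.com."""
    tag = hmac.new(SECRET, correspondent.lower().encode(),
                   hashlib.sha256).hexdigest()[:6]
    return f"{user}^{tag}@{domain}"

def is_valid(address: str, correspondent: str) -> bool:
    """Check that an incoming sub-address matches the claimed correspondent."""
    local, _, _domain = address.partition("@")
    _user, _, tag = local.partition("^")
    expected = hmac.new(SECRET, correspondent.lower().encode(),
                        hashlib.sha256).hexdigest()[:6]
    return hmac.compare_digest(tag, expected)

# The airline can mail this address back with no C/R involved; a harvested
# copy of the sub-address also identifies the source of any leak.
addr = make_subaddress("Joe", "reservations@airline.example", "domain.com")
```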

If anyone out there knows how I can amend my web page so that this confusion
with C/R doesn't occur then let me know.


On a side note: For greater clarity I just adjusted my web page by putting
the description of the 3-D CAPTCHA on an entirely different page since I no
longer believe that this innovation is essential and since an earlier poster
on this board quoted something from that section out of context.

Michael Kaplan
Bart Schaefer
2006-01-22 18:26:22 UTC
Permalink
On Jan 22, 10:33am, Michael Kaplan wrote:
}
} With ISACS fully functional email addresses are distributed. Receiving
} mail from machines or unknown third parties will occur with that same
} ease that it does today.

What happens when the address that was handed out to a machine becomes
compromised and has to be revoked? Who/what keeps track of all of the
addresses that have ever been handed out and to whom, and who provides
the means to update all of them to new valid subaddresses?

And, minor quibble, but how does this become widely deployed at little
or no cost, given that AT&T has apparently patented the subaddressing
technique? (For that matter, how does Reflexion get away with it?)

} If anyone out their knows how I can amend my web page so that this
} confusion with C/R doesn't occur then let me know.

I don't think it's possible, because ISACS employs C/R techniques. In
"phase 1" everyone in the whitelist will receive an automated reply,
which is equivalent to a challenge even though they aren't required to
respond.

In "phase 2" you say that "Email from white-listed correspondents will
continue to always be received ... only non-white-listed correspondents
who attempt to use a deactivated sub-address need" [to react to the
challenge]. However, unless you combine ISACS with something similar
to DKIM, spam that forges the address of the whitelisted senders will
also continue to be received.

You also say "The entire content of the stranger's message is now being
stored in the stranger's email inbox. It is waiting to be resent."
This presents an obvious recipe for a spammer attack:

- Discover the "base" address (***@domain.com) for a fairly large
number of ISACS users by watching for ISACS autoresponses to an
ordinary spam run or a deliberate probe
- Launch a spam run of several hundred messages to each of the
known-valid base addresses, forging the real desired targets of
the spam as the senders
- ISACS challenges containing the spam pour into the mailboxes of the
intended victims

This can also be mitigated by using DKIM-style authentication to avoid
sending challenges for forged mail. (I'm not normally much of a DKIM
advocate, but any "mailbox protection" system that employs automatic
responses without some kind of forgery detection is flatly unworkable.)

I disagree with John Levine that "retrospective authentication" is a
flaw in this system -- DKIM is an assertion that the sender is who he
claims to be; the ISACS subaddress is an assertion that the sender has
been given permission to contact the recipient, whether or not the
sender is who he claims to be. However, neither of those alone is
sufficient.

A few other remarks on the manifesto:

- "The stranger decodes the sub-address. The stranger then copies and
pastes his bounced message into a newly composed email message. The
email is successfully dispatched using the sub-address."

This is a naive statement on several levels. First, the assumption
that the stranger is willing to bother decoding the subaddress, and
won't simply delete the challenge and take his business elsewhere.
Second, that the bounced message is something that can be copied and
pasted, i.e., not a complex multipart. Third, that all transports
involved can successfully transmit the original message, undamaged,
in both directions, without running afoul of various filtering and
size limitations.

I suppose this is all meant to be worked out during "phase 1" also.

- "Eventually the stranger's email provider can update its system so
that these bounces are even easier to deal with. ... The server
alters the email" [to turn it into an interactive form].

This seems to assume a webmail system, because otherwise you'd be
talking about putting this capability into email user agents, not
servers. If the UA is assumed capable of dealing with this, why
not build it in to the system that sends the challenge?

- "A central database of trusted email domains will be established."

If you postulate a global reputation system for trusted domains,
where is the added advantage of requiring senders at those domains
to use subaddresses? Would there be some odd class of domains
that allows spammers to send mail but not to harvest?

- "The address book of ***@TrustedDomain.com is automatically
updated"

Another apparent assumption that all email management, down to the
user's personal information, resides on some kind of centralized
server at every interesting domain.
--
Bart Schaefer Brass Lantern Enterprises
http://www.well.com/user/barts http://www.brasslantern.com

Zsh: http://www.zsh.org | PHPerl Project: http://phperl.sourceforge.net
Michael Kaplan
2006-01-22 20:17:48 UTC
Permalink
Thank you for your very insightful commentary.
Post by Bart Schaefer
}
} With ISACS fully functional email addresses are distributed. Receiving
} mail from machines or unknown third parties will occur with that same
} ease that it does today.
What happens when the address that was handed out to a machine becomes
compromised and has to be revoked?
Under the current email system you can't revoke your address, you just have
to suck it up when you get spam. If you desperately don't want to revoke a
sub-address that you gave to a machine months ago then you don't have to.
The multiplicity of sub-addresses makes it less likely that any one would
become compromised. Before revoking the sub-address I would notify the
"machine." If the machine belongs to a trusted domain that has upgraded its
software then the sub-address can be deactivated without fear of being cut
off.

If a trusted domain was not established and the machine email came from a
reputable business that was sending an important email then I would hope
that the reputable business would spend the absolutely trivial amount of
expense needed to decode the CAPTCHA and return the email.

I'll also add that I think it will be a relatively rare event for a
sub-address to be revoked. The person who created this sub-address system:
http://www.vsta.org/spam/Traveler.html
reports that in 14 months of use he has only cancelled 2 sub-addresses.
Almost all of my web site is devoted to discussing challenges but I believe
it will be a very rare occurrence for anyone to actually encounter one.

Post by Bart Schaefer
Who/what keeps track of all of the
addresses that have ever been handed out and to whom, and who provides
the means to update all of them to new valid subaddresses?
You'll have to look at how existing sub-address systems such as Zoemail,
Reflexion, and Traveler do it and extrapolate from there. Different types
of email systems will need different types of software upgrades.
Post by Bart Schaefer
And, minor quibble, but how does this become widely deployed at little
or no cost, given that AT&T has apparently patented the subaddressing
technique? (For that matter, how does Reflexion get away with it?)
I don't know, but I'm sure that AT&T ain't gettin' rich off of Zoemail.

Post by Bart Schaefer
} If anyone out there knows how I can amend my web page so that this
} confusion with C/R doesn't occur then let me know.
I don't think it's possible, because ISACS employs C/R techniques. In
"phase 1" everyone in the whitelist will receive an automated reply,
which is equivalent to a challenge even though they aren't required to
respond.
It's more of a Vacation message than a Challenge.

Post by Bart Schaefer
In "phase 2" you say that "Email from white-listed correspondents will
continue to always be received ... only non-white-listed correspondents
who attempt to use a deactivated sub-address need" [to react to the
challenge]. However, unless you combine ISACS with something similar
to DKIM, spam that forges the address of the whitelisted senders will
also continue to be received.
How does the spammer figure out who is on your white-list? The white-list
only contains the addresses of your personal contacts.
I also stated that ISACS would work synergistically with filters, and DKIM is
a tool used to enhance a filter.

Post by Bart Schaefer
You also say "The entire content of the stranger's message is now being
stored in the stranger's email inbox. It is waiting to be resent."
- Discover the "base" address (***@domain.com) for a fairly large
number of ISACS users by watching for ISACS autoresponses to an
ordinary spam run or a deliberate probe
- Launch a spam run of several hundred messages to each of the
known-valid base addresses, forging the real desired targets of
the spam as the senders
- ISACS challenges containing the spam pour into the mailboxes of the
intended victims
The spammer will spam ***@domain.com knowing that Joe will not receive a
single piece of spam? 95% of this spam will be filtered immediately and 5%
will go towards victims. If the victims' filters are set to filter out ISACS
bounces that don't correspond to recently sent emails then 0% will reach the
victims. If the victims' filters have not been updated for ISACS then the
filters will detect the words "Cheap Viagra!" in the bounce and another 95%
of the remaining 5% will be filtered. I don't see the motivation.
Post by Bart Schaefer
- "The stranger decodes the sub-address. The stranger then copies and
pastes his bounced message into a newly composed email message. The
email is successfully dispatched using the sub-address."
This is a naive statement on several levels. First, the assumption
that the stranger is willing to bother decoding the subaddress, and
won't simply delete the challenge and take his business elsewhere.
Existing sub-address systems (as best as I can determine from outside
sources) rarely need to deactivate sub-addresses, and ISACS is designed to
make this an even rarer occurrence. Any sub-address recently distributed to
the stranger should be active. But yes, you are right - people can still
refuse to resend their email.

Post by Bart Schaefer
Second, that the bounced message is something that can be copied and
pasted, i.e., not a complex multipart. Third, that all transports
involved can successfully transmit the original message, undamaged,
in both directions, without running afoul of various filtering and
size limitations.
I will defer to other knowledgeable members as to whether these are
insurmountable issues. I suspect not, but I will defer to the group
consensus.
Post by Bart Schaefer
- "Eventually the stranger's email provider can update its system so
that these bounces are even easier to deal with. ... The server
alters the email" [to turn it into an interactive form].
This seems to assume a webmail system, because otherwise you'd be
talking about putting this capability into email user agents, not
servers. If the UA is assumed capable of dealing with this, why
not build it in to the system that sends the challenge?
My description was more appropriate for webmail systems, but yes, email
user agents will need to be updated to use ISACS.
Post by Bart Schaefer
- "A central database of trusted email domains will be established."
If you postulate a global reputation system for trusted domains,
where is the added advantage of requiring senders at those domains
to use subaddresses?
Senders at trusted domains that have been updated to automatically resend
ISACS bounces (see Figure 5 from my website) will never need to use a
sub-address. I consider this a tremendous advantage. Please see the last
section of my web page where I talk about the evolution of ISACS.

Post by Bart Schaefer
Would there be some odd class of domains
that allows spammers to send mail but not to harvest?
Every trusted domain can still send spam. A trusted domain is simply one
that prevents harvesting.
Post by Bart Schaefer
- "The address book of ***@TrustedDomain.com is automatically
updated"
Another apparent assumption that all email management, down to the
user's personal information, resides on some kind of centralized
server at every interesting domain.
For webmail this is true. Email user agents will need to be reprogrammed to
recognize the bounce seen in Figure 4 and act on this info to automatically
update the address book.

Your input is greatly appreciated,
Michael Kaplan
Bart Schaefer
2006-01-22 22:06:29 UTC
Permalink
On Jan 22, 3:17pm, Michael Kaplan wrote:
}
} Thank you for your very insightful commentary.

You're welcome.

} If a trusted domain was not established and the machine email came
} from an reputable business that was sending an important email then
} I would hope that the reputable business would spend the absolutely
} trivial amount of expense need to decode the CAPTCHA and return the
} email.

Many reputable businesses send very large volumes of email. If it is
economically infeasible for spammers to decode the CAPTCHAs, why do you
believe it will be feasible for other businesses?

It's true that ideally the reputable businesses start out with a valid
subaddress and only have to deal with revocations. (Never mind that
this will NOT be true during "phase 1".) On the other hand, even for
reputable businesses, one primary motivation for using email is its
extremely low cost compared to postal mail and other communications.
I can't predict the economic consequences, but I would predict that the
need to process CAPTCHAs will be a disincentive for deployment.

} > "phase 1" everyone in the whitelist will receive an automated reply,
} > which is equivalent to a challenge even though they aren't required
} > to respond.
}
} It's more of a Vacation message than a Challenge.

It requires that the recipient take action, or the notification has not
served its purpose. That's much closer to a challenge than to a mere
out-of-office response.

} > spam that forges the address of the whitelisted senders will
} > also continue to be recieved.
}
} How does the spammer figure out who is on your white-list?

By raiding the address books of the people to whom you send mail. This
happens *all* the time, usually (I suspect) via virus or worm or other
compromise of the correspondent's system. I'd expect that subaddress
compromise will most often occur this way as well. I've seen spam that
was directed to VERPed addresses generated by an automated system for
confirmation of hotel reservations; the only possible ways for those
VERPs to have been obtained by the spammer would be network sniffing or
infection of a recipient's PC.

I've also seen "your mail was not delivered" responses sent to those
VERPs from virus filters, sometimes months after the VERP was created.
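For context, a VERP (Variable Envelope Return Path) encodes each recipient into a unique envelope sender, so any bounce identifies exactly which address failed. A generic sketch; the `+` and `=` delimiters follow common practice rather than any particular MTA's syntax:

```python
def verp_return_path(sender_local: str, sender_domain: str,
                     recipient: str) -> str:
    """Encode the recipient into the envelope sender, one per recipient."""
    encoded = recipient.replace("@", "=")
    return f"{sender_local}+{encoded}@{sender_domain}"

def verp_decode(return_path: str) -> str:
    """Recover the original recipient from a bounced VERP address."""
    local = return_path.split("@", 1)[0]
    encoded = local.split("+", 1)[1]
    return encoded.replace("=", "@")

rp = verp_return_path("confirm", "hotel.example", "guest@home.example")
# rp is "confirm+guest=home.example@hotel.example"; a bounce (or spam) sent
# to it pinpoints which recipient's copy of the address leaked.
```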

I have little faith in the statistics that have been collected so far
for systems like zoemail/reflexion/traveler, because I have no evidence
that they are yet in use by the general public. Things that appear to
work sensibly when tested on techies go wrong in all sorts of unexpected
ways when loosed on the less-educated masses and their poorly-secured
home computers.

} > - ISACS challenges containing the spam pour into the mailboxes of the
} > intended victims
}
} The spammer will spam ***@domain.com knowing that Joe will not receive a
} single piece of spam?

Yep. "Bounce spamming" is less common now than it was a couple of years
ago, if the examples in my trapped spam archives are representative, but
it's not unheard-of. (Yes, I'm one of those weird people who have the
past month's worth of spam sitting around in gzip'd folders, just in
case my filters went wrong.)

} 95% of this spam will be filtered immediately

So despite the claim of near-perfect performance for ISACS, all domains
are expected to continue using and maintaining their adaptive filters?
Why would I take on the added cost of ISACS for only that remaining 5%
of the problem, if I can't get rid of any other costs?

} If the victims filters are set to filter out ISACS
} bounces that don't correspond to recently sent emails

I'll direct you to the archives of this list for discussions of the
problems of keeping track of recently-sent email and matching it to
arriving bounces. You can't handwave this away.

} If the victims filters have not been updated for ISACS then the
} filters will detect the words "Cheap Viagra!" in the bounce and
} another 95% of the remaining 5% will be filtered. I don't see the
} motivation.

It's already fairly well accepted that the response of spammers to
having a smaller fraction of their mail get through is to send larger
amounts of it.

Further, I'd dispute that applying two 95%-effective spam filters has
a net 99.75% success rate. It's much more likely that the same 5% of
spam that makes it through the first filter will also make it through
the second filter -- the things that both filters are looking for must
be pretty similar, almost by definition.
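The point about correlated filters can be shown with a toy simulation; the "detectability score" model is purely illustrative, but it captures why two filters keying on the same features don't multiply their miss rates:

```python
import random

random.seed(1)
N = 100_000
# Toy model: each spam gets a "detectability" score. Filters that key on
# similar features catch largely the same messages.
spams = [random.random() for _ in range(N)]

caught_a = {i for i, s in enumerate(spams) if s < 0.95}           # filter A, 95%
caught_b_indep = {i for i in range(N) if random.random() < 0.95}  # independent B
caught_b_corr = {i for i, s in enumerate(spams) if s < 0.95}      # correlated B

missed_indep = N - len(caught_a | caught_b_indep)  # ~0.25% slips through
missed_corr = N - len(caught_a | caught_b_corr)    # still ~5% slips through
```

Only if the second filter's misses were independent of the first's would the combined miss rate drop to 0.05 x 0.05 = 0.25%; a fully correlated second filter adds nothing.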
Michael Kaplan
2006-01-22 23:05:54 UTC
Permalink
Post by Bart Schaefer
Many reputable businesses send very large volumes of email. If it is
economically infeasible for spammers to decode the CAPTCHAs, why do you
believe it will be feasible for other businesses?
On my website I assume that the spammer would spend a tenth of a cent to
manually decode a CAPTCHA and I demonstrate how this would be a crippling
expense.

Let's assume that over the course of a year Amazon.com emails 10 million
customers. I'll say that 5% of these sub-addresses are deactivated without
the customers bothering to notify amazon. I'll say that it costs Amazon 5
cents to decode a CAPTCHA (fifty times as expensive as what I assumed the
spammer would have to pay!). It would cost Amazon $25,000 over the course
of the entire year - and that is for an enormous company.
Not a great example because I'm sure Amazon.com would be a trusted domain
and they would have the software upgrade to automatically resend the
bounces. The same calculations for a small company with 20,000 customers
would be $50 a year.
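Spelled out, using the post's own assumed figures (5% of sub-addresses deactivated per year, 5 cents per manual decode):

```python
def annual_captcha_cost(customers: int, deactivation_rate: float = 0.05,
                        cost_per_decode: float = 0.05) -> float:
    """Yearly cost of manually decoding CAPTCHAs for deactivated
    sub-addresses, in dollars, rounded to the cent."""
    return round(customers * deactivation_rate * cost_per_decode, 2)

print(annual_captcha_cost(10_000_000))  # large retailer: 25000.0
print(annual_captcha_cost(20_000))      # small company: 50.0
```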

And another point: You have to purchase Adobe Acrobat but you can get Adobe
Acrobat Reader for free. Likewise you may have to pay to use ISACS to rid
yourself of spam but I'm sure that the software to appropriately process
ISACS bounces will be distributed freely and aggressively for web mail and
email user agents.
Post by Bart Schaefer
} It's more of a Vacation message than a Challenge.
It requires that the recipient take action, or the notification has not
served its purpose. That's much closer to a challenge than to a mere
out-of-office response.
Ultimately once the software upgrade to process bounces is installed (free
of charge I should add) the recipient will take no action of any kind.


Post by Bart Schaefer
} How does the spammer figure out who is on your white-list?
By raiding the address books of the people to whom you send mail. This
happens *all* the time, usually (I suspect) via virus or worm or other
compromise of the correspondent's system.
The following is taken from my website:
"*People will have malware infesting their computers, raiding their
address book and constantly supplying spammers with valid addresses.*
This is an argument *for*, not against, ISACS. All of the
contacts of the person infected with malware will be able to identify the
source of the security breach based on the sub-address. In this case this
system is a true blessing since the situation will become readily apparent
and it can be remedied, saving anyone who would later be added to that
address book. Almost no other anti-spam system aids in the identification
of such malware."

Post by Bart Schaefer
I have little faith in the statistics that have been collected so far
for systems like zoemail/reflexion/traveler, because I have no evidence
that they are yet in use by the general public.
I quote some outside reviews and even a comment from this board supporting
Reflexion on my web page. I know of a lot of anecdotal evidence of email
accounts being spam free for months until one little security breach
resulted in endless spam. You are right, I don't have absolute proof, but
what evidence I do have is suggestive.
Post by Bart Schaefer
Yep. "Bounce spamming" is less common now than it was a couple of years
ago, if the examples in my trapped spam archives are representative, but
it's not unheard-of.
I see how this is possible, but I don't see how this is advantageous to the
spammer. Use of the free ISACS bounce filtering software upgrade will make
this completely futile for the spammer.
Post by Bart Schaefer
} 95% of this spam will be filtered immediately
So despite the claim of near-perfect performance for ISACS, all domains
are expected to continue using and maintaining their adaptive filters?
Why would I take on the added cost of ISACS for only that remaining 5%
of the problem, if I can't get rid of any other costs?
Because ISACS will result in near total elimination of spam (I'll
guesstimate that you'll still get 3 or 4 spams a year - a speculative but I
think reasonable estimate).

Post by Bart Schaefer
} If the victim's filters are set to filter out ISACS
} bounces that don't correspond to recently sent emails
I'll direct you to the archives of this list for discussions of the
problems of keeping track of recently-sent email and matching it to
arriving bounces. You can't handwave this away.
Most email systems I have interacted with have a list of sent messages
immediately available. If this is a problem then ISACS bounces can be
cached for one hour or ten hours or one day or for whatever amount of time
is needed to correlate the bounce with the sent email list.
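A sketch of that correlation check, assuming bounces carry the original Message-ID; the TTL value and class names here are illustrative, not part of ISACS:

```python
import time

class SentLog:
    """Remember Message-IDs of outgoing mail for a limited window so that
    incoming ISACS bounces can be checked against genuinely sent mail."""

    def __init__(self, ttl_seconds: float = 24 * 3600):
        self.ttl = ttl_seconds
        self._sent = {}  # message-id -> send time

    def record(self, message_id: str) -> None:
        self._sent[message_id] = time.time()

    def bounce_is_plausible(self, message_id: str) -> bool:
        now = time.time()
        # drop entries older than the TTL, then check membership
        self._sent = {m: t for m, t in self._sent.items()
                      if now - t < self.ttl}
        return message_id in self._sent
```

A bounce referencing a Message-ID that was never sent (or that has aged out) can be discarded without reaching the user.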


Post by Bart Schaefer
Further, I'd dispute that applying two 95%-effective spam filters has
a net 99.75% success rate.
Very well, but I still don't see why bounce spamming is preferable to
directly spamming users. It only adds a barrier, even if you feel it is not
a great barrier.

Thank you once again,

Michael Kaplan
Bart Schaefer
2006-01-23 00:07:49 UTC
Permalink
On Jan 22, 6:05pm, Michael Kaplan wrote:
}
} Let's assume that over the course of a year Amazon.com emails 10
} million customers. I'll say that 5% of these sub-addresses are
} deactivated without the customers bothering to notify amazon. I'll
} say that it costs Amazon 5 cents to decode a CAPTCHA (fifty times as
} expensive as what I assumed the spammer would have to pay!).

I suspect that all of those estimates are low and will vary based on
the size and type of business, but there's no way to know.

} I'm sure that the software to appropriately process ISACS bounces will
} be distributed freely and aggressively for web mail and email user
} agents.

Perhaps ... once you overcome the chicken-and-egg problem of getting
such a system widely enough deployed to be interesting to OSS developers.

} ** [Malware] is an argument *for*, not against ISACS. All of the
} contacts of the person infected with malware will be able to identify
} the source of the security breach based on the sub-address.

Again, perhaps. This assumes that all contacts have "personalized"
subaddresses.

I expect you to say that this will be true because the correspondent
receives an automatically-generated subaddress in the first contact
between the ISACS user and the correspondent -- but if the correspondent
is also using ISACS, what address was originally used for that first
contact? At least half of the users of the system must begin in the
state where they have either the "Joe^lucky" subaddress or one obtained
from a third party -- or where they have no subaddress at all and must
undertake to update their address book upon challenge.

} I still don't see why bounce spamming is preferable to directly
} spamming users.

Using bizarre HTML encodings to spell out words vertically or diagonally
or to represent pornographic images as ASCII art isn't preferable to
directly displaying the words or images to the user, either, but both
are techniques that spammers have employed. You're postulating a
spam-prevention system and then wondering why spammers would attempt
to circumvent it?
Michael Kaplan
2006-01-23 01:40:06 UTC
Permalink
Post by Bart Schaefer
} ** [Malware] is an argument *for*, not against ISACS. All of the
} contacts of the person infected with malware will be able to identify
} the source of the security breach based on the sub-address.
Again, perhaps. This assumes that all contacts have "personalized"
subaddresses.
I expect you to say that this will be true because the correspondent
receives an automatically-generated subaddress in the first contact
between the ISACS user and the correspondent -- but if the correspondent
is also using ISACS, what address was originally used for that first
contact? At least half of the users of the system must begin in the
state where they have either the "Joe^lucky" subaddress or one obtained
from a third party -- or where they have no subaddress at all and must
undertake to update their address book upon challenge.
Soon after you've used ISACS many of your contacts will have personalized
sub-addresses, so deactivating Joe^lucky will be intrinsically less
disruptive than when ***@domain.com was deactivated. Joe^lucky can also be
deactivated in the same stepwise manner over a month or two just like
***@domain.com was deactivated. Some of the correspondents that were given
Joe^lucky will have already transitioned to personalized sub-addresses as
all subsequent emails from the ISACS user will contain personalized
sub-addresses in the "From" field.

Also of course if you've corresponded with this account then that account is
white-listed.

The total neophyte may rely on one custom sub-address, "lucky," but many
slightly more experienced ISACS users will likely have 2 or 3 custom
sub-addresses, once again decreasing the impact if one is compromised.
With ISACS neophytes get a chance to make sloppy mistakes like posting their
address on the internet; they have a chance to learn.

If two correspondents are using ISACS then their domains are certainly
trusted. Use of a sub-address would be completely unnecessary in that case.

I will concede that disruptions will occur, but the severity of the
disruption will decrease over time.

And I'll repeat my mantra of how challenges will vanish as trusted domains
and the relevant software upgrades proliferate.
Post by Bart Schaefer
} I still don't see why bounce spamming is preferable to directly
} spamming users.
Using bizarre HTML encodings to spell out words vertically or diagonally
or to represent pornographic images as ASCII art isn't preferable to
directly displaying the words or images to the user, either, but both
are techniques that spammers have employed. You're postulating a
spam-prevention system and then wondering why spammers would attempt
to circumvent it?
Sorry, I still don't follow what you are describing. I don't follow how
sending bounce-generating spam to an ISACS account circumvents the ISACS
security. Or is this somehow helping to spam non-ISACS users?


When judging this system one should take its annoyances into account, but
please balance that against its benefits. I often hear the refrain "We
won't end spam until we completely rework the SMTP architecture" followed
by the standard response "yes, but there is no way to do that."
In the short term I believe that ISACS will provide an impressive degree of
immediate benefit to its users with a tolerable level of annoyance. But I
also ask you to look towards the long term. ISACS *can* be superimposed
upon the current email architecture. ISACS can proliferate to more and more
users, and nearly every domain can be entered into the database of trusted
domains. As ISACS grows it becomes more effective and less annoying.
Now picture the day when ISACS is almost universal and nearly every domain
is trusted. No one will ever have to deal with a sub-address or a
challenge. Efficacy would be extreme, and annoyance would be almost
non-existent. At that point could ISACS be the FUSSP? Is there any other
way to achieve the FUSSP?

Yes we should consider its annoyances, but we should balance it out against
its short term and long term promise.

Thank you,
Michael Kaplan
Bart Schaefer
2006-01-23 19:42:06 UTC
Permalink
On Jan 22, 8:40pm, Michael Kaplan wrote:
}
} Yes we should consider its annoyances, but we should balance it out
} against its short term and long term promise.

It's not really a matter of annoyance.

I don't think ISACS has any single major flaw. Rather, I think it faces
a number of impediments to implementation, any one of which you might
argue is not insurmountable, but which taken together leave me doubting
its viability. In no particular order (and all in my opinion) ...

- You've underestimated the frequency with which users still type email
addresses manually. If random subaddresses are to become universal yet
"invisible," it will require an unprecedented degree of automated email
address management in every UA.

- It'll be a slow and difficult process to establish automatic handling
of challenges among "trusted" domains. Your assessment of the simplicity
of identifying challenges resulting from forgery is too optimistic. Also,
automated management requires extensive cooperation between servers and
UAs -- who decides which one processes the challenge and handles address
book updates?

- The extra traffic from both real and spurious challenges is an unknown
cost. Your proposal for forgery detection upon receipt of the challenge
has ramifications for both network and storage costs.

- The costs to non-spammers of managing rejections may be higher than you
predict, especially in "phase 1".

- It'll be very difficult to establish a global reputation system, for
political reasons if not for technical ones.

- The typical end user is lazy and apathetic. The average non-techie
will not properly manage/use his own subaddresses, and is likely to be
unwilling to expend extra effort to manage his contacts' subaddresses,
even if he doesn't simply find the process too confusing. (The number
of people who still think they need to prefix their email address with
"www." is mind-boggling.)

- You've overestimated the simplicity of resending rejected messages, and
underestimated the number of mailboxes that will never be "protected"
because, e.g., the owner feels he can't afford to make it more difficult
for new contacts to get their messages through.

I'll probably think of a few more, but I've spent long enough on this.
Richard Clayton
2006-01-23 20:29:08 UTC
Permalink
Post by Michael Kaplan
Many reputable businesses send very large volumes of email. If it is
economically infeasible for spammers to decode the CAPTCHAs, why do you
believe it will be feasible for other businesses?
is it infeasible ?

Where is the evidence ? I suggest spammers don't decode CAPTCHAs
because they are not yet widely employed... so there's no point.

As it happens, I think they are missing out unnecessarily... I think
the main difficulty in dealing with CAPTCHAs is more the wide range of
systems offering them, rather than an inherent difficulty in solving
what is on offer today.

I've recently been receiving a lot of C-R response email (the Pharmacy
guys seem to like using my domain for their junk)... and so I have
started looking at how easy it would be to process automatically.

I'm currently corresponding with a handful of C-R users who object to
my responding to the challenges ... apparently they don't think I'm
behaving myself in arranging for them to read Pharmacy spam which
they are too lazy to filter for themselves :( One even reported me to
my own abuse@ address !

Anyway, a lot of the C-R's I am currently receiving merely require 3rd
Grade reading skills and the ability to reply to the email. These could
be trivially automated since there is no perceptible variation in the
text that is presented :(

Some websites provide the challenge as text embedded in the page -- and
that is ever so easy to move to the POST response.

Most websites provide simple images that are trivial to process (there's
several other researchers breaking the trivial ones on a regular basis,
try Google -- at least one of the breakers is selling a service to soup
up your CAPTCHAs using the knowledge they've got from breaking others)

There was a paper at last year's CEAS showing that the hard part of
breaking text CAPTCHAs was the glyph separation -- after that computers
were better than humans at distinguishing mangled shapes!

Strong CAPTCHAs are currently the exception. However, if I was going to
process a lot of them I think I'd automate as far as possible and then
spend my money in the Third World...

But first don't forget to allow for stupidity -- a large ISP [from which
I have received several hundred C-R emails] has some pretty strong
looking text-based CAPTCHAs ... unfortunately they only have 30 of them!
so it's easy to provide a dictionary of responses :-( I'm currently
trying to reverse-engineer how they select amongst the 30 because that
would make it even quicker to respond !

[BTW Kaplan's website has most of this information (though not the story
about the ISP with only 30 images). I also note that his CAPTCHAs are
not text based. I'd need to do some more work to comment as to whether
his stick-figures are genuinely harder to solve. They looked as if they
made some cultural assumptions that might not travel well.]
Post by Michael Kaplan
On my website I assume that the spammer would spend a tenth of a
cent to manually decode a CAPTCHA and I demonstrate how this would
be a crippling expense.
Just to be clear -- the tenth of a cent is the right sort of number.

A primary (grades 1-4) school headmistress in Tamil Nadu (in rural
India) earns about $15 a week and is solidly in the middle classes.

The particular person I was told of (a colleague's relation) owned a nice
home (worth maybe $6000) in a salubrious leafy suburb.

So one could get appropriate skills for about $10 or so a week [labour
rates are higher for towns with broadband]. For a 50 hour week that
means you're paying about 20 cents an hour.

I've never tried solving CAPTCHAs at speed, so I couldn't predict how
fast I could do them for hours on end. But it looks to me that the cost
is definitely going to be in fractions of a cent/solution.

Of course you need to add in the cost of the connectivity and the kit
($100 laptops anybody?) but people who think of CAPTCHAs in terms of the
hourly charging rate of their attorney (or plumber!) are entirely
missing the point.


OK... so let's look at Kaplan's analysis which initially assumes that it
doesn't matter if all CAPTCHAs are broken for free:


His assumptions are:

1 The email service provider can filter 95% of spam.
2 The CAPTCHA is broken 100% of the time.
3 The spammer has a 3:1 ratio of bogus to real email addresses.
4 A spammer sends 100 million emails using a valid return address.
5 Users click on a "This Is Spam" button when spam arrives.

He then shows that the spammer has to send 1600 emails to get one
spam to its destination.

This sum is based on a key assumption which I think is incorrect.

He assumes that the spammer sends 1600 emails, and just 80 get through
the filter. This is not inconsistent with measured values for filters in
the real world. So far so good.

He then assumes that the spammer solves the 80 CAPTCHAs and resends.
This then results in a further attrition of 95% (ie 4 get through) and
only then is it discovered that 3 are bogus addresses and the final 1 is
delivered.

However, this is dumb by the spammer (and/or magical by the filter).

Why does the filter suddenly improve when the email is sent for the
second time (viz: it starts to discard 95% of the email that it approved
earlier ?). Or -- same idea but different: why does the spammer send
something that is filterable at the first stage ?

It seems to me that the scheme (which is just filtering and nothing to
do with CAPTCHAs at this point) only ensures that the spammer must send
80 emails to get one delivered. (ie: it's 20x worse than Kaplan
proposes).
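The two attrition models can be compared in a few lines of arithmetic; this is a sketch of my own, using only the figures as stated in the thread (95%-effective filtering and a 3:1 bogus-to-real address ratio), not part of either party's proposal:

```python
# Sketch of the two attrition models discussed above, using the thread's
# assumptions: 95%-effective filtering, 3 bogus addresses per real one.

FILTER_PASS = 0.05   # 5% of spam survives a 95%-effective filter
VALID_FRAC = 0.25    # 1 in 4 harvested addresses is real (3:1 bogus)

def sent_per_delivery_kaplan():
    # Kaplan's model: the filter is applied independently on BOTH rounds,
    # so sent * 0.05 (round 1) * 0.05 (round 2) * 0.25 valid = 1 delivered.
    return 1 / (FILTER_PASS * FILTER_PASS * VALID_FRAC)

def sent_per_delivery_clayton():
    # Clayton's correction: a message that passed the filter once is not
    # filtered again on resend, so only one 95% attrition applies.
    return 1 / (FILTER_PASS * VALID_FRAC)

print(round(sent_per_delivery_kaplan()))   # 1600
print(round(sent_per_delivery_clayton()))  # 80
```

The factor-of-20 gap between the two answers is exactly the disputed second application of the filter.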

Kaplan has a second sum

Assumptions:

6 A spammer must pay $0.001 per manually solved CAPTCHA
7 The spammer wants to successfully deliver one million spam per day

He then calculates a cost of $80,000/day for getting the one million
spam emails delivered.

However, with the adjustment to the sums that I suggest is more
reasonable [not assuming that the filtering of the two stages is
independent] then to deliver one million spams then 80 million emails
must be sent and 4 million CAPTCHAs must be solved, costing $4,000/day

Kaplan multiplies his number by 365 to make it sound even bigger, but
this just obscures things....

... the question is whether the expense of solving the CAPTCHAs can be
afforded by the spammer.

Note that sending the emails is essentially free -- the spammer will use
zombies to send the emails via innocent (insecure) end users, so there's
no costs for electricity or bandwidth to worry about.

There's some consensus around a response rate (up to a couple of years
ago) of about 0.003% for spam (these figures come via journalists from
Laura Betterley and the Iraqi playing cards interviews).

So the 1 million delivered maps to about 30 customers a day. This means
that you'd need a profit margin of about $133 per sale to make spamming
worthwhile. That's quite a lot (though if you're selling fake pills or
Rolls Royces then you might still press on).

However, even in Betterley's day there was some filtering and spam
discarding going on -- so we're not comparing like with like. The one
million spams are DELIVERED SPAM -- ie: they have got through the
filters and are sitting there waiting to be opened by the gullible.

If we assume the 0.003% came from a time when filters were 50% effective
(ie only about 50% of people had any) then the profit margin necessary
drops to $62/sale.

That's still a lot -- but if the spammer weeds out their list better
(only 25% valid addresses isn't too brilliant) then the required profit
margin would drop again.
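The margin-per-sale sums above can be checked quickly; this is my own arithmetic on the thread's figures (the exact $62 quoted presumably reflects slightly different rounding than the simple doubling used here):

```python
# Back-of-envelope check of the margin-per-sale figures in the thread.

captcha_cost = 0.001            # $ per manually solved CAPTCHA
delivered_per_day = 1_000_000   # target: one million DELIVERED spams
captchas_needed = 4_000_000     # 80M sent, 5% pass the filter
daily_cost = captchas_needed * captcha_cost    # ~$4,000/day

response_rate = 0.00003         # ~0.003%, the Betterley-era figure
customers = delivered_per_day * response_rate  # ~30 sales/day
print(round(daily_cost / customers))           # 133: $ profit needed per sale

# If the 0.003% dates from when only ~50% of spam was delivered, the
# per-DELIVERED response rate doubles and the required margin roughly halves.
print(round(daily_cost / (customers * 2)))     # 67: $ per sale, same ballpark
```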

Also [and this is key to spammer success], if they improved their
message (hire a Madison Avenue executive to teach them how to make their
advert more compelling) then the abysmal response rate would rise [[for
example, the Iraqi playing cards were 4 times more likely to be ordered
than the spammers' usual fare of pills and toner cartridges.]].

BUT this is the spammer working the way the system wants him to. Why on
earth would he do that [even though he may do OK that way with high
profit margin goods, it's still eating into his lifestyle]

There's a much simpler approach that the clever spammer would take.

Instead of solving CAPTCHAs to send spam, he would solve CAPTCHAs to
acquire a valid sub-address. Once he had this, he would then send as
many different pieces of spam as possible as fast as possible to this
sub-address. He'd advertise pills and mortgages and anatomy enhancers
and lotto winnings and poker sites and... etc. ((or he could just sell
it to fellow spammers and they would send the spam...))

Viz: he'd get more than one email delivered per validated sub-address

Clearly there are things that could be done to improve end-user software
to counter this, but in the meantime, profitability would be restored.

Bottom line is that I agree that the CAPTCHAs raise spammers costs, but
I don't agree that they do anything more than freeze out low profit
margin spam (and make the pills more likely to be fake).

Even if challenge-response systems were perfect, I'd not be in favour
because of the damage to innocent third parties. But they are not (on
these assumptions) as effective as claimed :(

Plus of course there are other objections as put forward by others, but
I wanted to concentrate on the economics because I've written about
these before (in the context of proof-of-work schemes)

http://www.cl.cam.ac.uk/~rnc1/proofwork2.pdf

and most of the analysis carries over just fine.
Post by Michael Kaplan
Let's assume that over the course of a year Amazon.com emails 10
million customers.  I'll say that 5% of these sub-addresses are
deactivated without the customers bothering to notify amazon.  I'll
say that it costs Amazon 5 cents to decode a CAPTCHA (fifty times
as expensive as what I assumed the spammer would have to pay!).
actually Amazon are experimenting with the Mechanical Turk... so they
might be able to manage Third World rates :)

http://www.mturk.com

[ ah I see that Bart also spotted the double application of the
filtering stage ]
Post by Michael Kaplan
Post by Bart Schaefer
Further, I'd dispute that applying two 95%-effective spam filters has
a net 99.75% success rate.
Very well
hmm... I think it needs more than that as a reply :(

- --
richard Richard Clayton

Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety. Benjamin Franklin 11 Nov 1755
Justin Mason
2006-01-23 21:09:00 UTC
Permalink
Post by Richard Clayton
So one could get appropriate skills for about $10 or so a week [labour
rates are higher for towns with broadband]. For a 50 hour week that
means you're paying about 20 cents an hour.
I've never tried solving CAPTCHAs at speed, so I couldn't predict how
fast I could do them for hours on end. But it looks to me that the cost
is definitely going to be in fractions of a cent/solution.
Of course you need to add in the cost of the connectivity and the kit
($100 laptops anybody?) but people who think of CAPTCHAs in terms of the
hourly charging rate of their attorney (or plumber!) are entirely
missing the point.
By the way, I heard a few years back of a case of large-scale
CAPTCHA-decode farming, utilising teenagers in internet cafes in Thailand.

Thailand and India both typically feature at least one internet cafe in
every large town, and having spent time travelling in both countries and
visiting those facilities, I can tell you that the local teens are regular
and willing inhabitants. ;) Equipment really isn't a problem.

The UI apparently was that account-signup bots would proxy the CAPTCHA
images to a kid's screen; the kid, otherwise occupied IM'ing pals,
playing online games, etc., would solve a CAPTCHA and get back to IM'ing.

(This was all anecdotal, nothing on the record etc., unfortunately.)
Post by Richard Clayton
Post by Michael Kaplan
Let's assume that over the course of a year Amazon.com emails 10
million customers.  I'll say that 5% of these sub-addresses are
deactivated without the customers bothering to notify amazon.  I'll
say that it costs Amazon 5 cents to decode a CAPTCHA (fifty times
as expensive as what I assumed the spammer would have to pay!).
actually Amazon are experimenting with the Mechanical Turk... so they
might be able to manage Third World rates :)
Agreed, I was just thinking how much the Mechanical Turk system
reminded me of the Thai CAPTCHA farming ;)

- --j.
Michael Kaplan
2006-01-24 03:42:10 UTC
Permalink
Post by Richard Clayton
I also note that his CAPTCHAs are
not text based. I'd need to do some more work to comment as to whether
his stick-figures are genuinely harder to solve. They looked as if they
made some cultural assumptions that might not travel well.]
The 3-D CAPTCHA is not text based, but as I explain on my site I believe
that existing text-based CAPTCHAs such as the Microsoft CAPTCHA provide
more than enough security.
Post by Richard Clayton
So one could get appropriate skills for about $10 or so a week [labour
rates are higher for towns with broadband]. For a 50 hour week that
means you're paying about 20 cents an hour.
I've never tried solving CAPTCHAs at speed, so I couldn't predict how
fast I could do them for hours on end. But it looks to me that the cost
is definitely going to be in fractions of a cent/solution.
Try solving a few of the Microsoft CAPTCHAs. An experienced person should
take about 3 seconds each. Working nonstop 12 hours a day would get you
14,400 solved CAPTCHAs. I'll use my figure of 80 million CAPTCHAs solved in
order to deliver one million spam. That means that every day the spammer is
employing 5,556 workers using 5,556 computers that use electricity and may
need air conditioning. And the third-world owner of this business needs a
cut, and you'll need security guards so that the computers won't get stolen
and...

However you crunch the numbers this is a major expense.
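The labour figures above check out arithmetically; a sketch, where the 20 cents/hour wage is Clayton's earlier estimate rather than a measured number:

```python
# Checking the CAPTCHA-farm labour arithmetic in the post above.

seconds_per_captcha = 3
hours_per_day = 12
solved_per_worker = hours_per_day * 3600 // seconds_per_captcha
print(solved_per_worker)                    # 14400 solved per worker per day

captchas_per_day = 80_000_000               # Kaplan's harvest figure
workers = captchas_per_day / solved_per_worker
print(round(workers))                       # 5556 workers (and machines)

# At Clayton's ~20 cents/hour estimate, wages alone come to:
print(round(workers * hours_per_day * 0.20))  # 13333: $ per day in wages
```

Note that $13,333/day over 80 million CAPTCHAs is roughly a sixtieth of a cent each, so the tenth-of-a-cent piece rate assumed elsewhere in the thread already includes a healthy margin for overheads.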
Post by Richard Clayton
Why does the filter suddenly improve when the email is sent for the
second time (viz: it starts to discard 95% of the email that it approved
earlier ?). Or -- same idea but different: why does the spammer send
something that is filterable at the first stage ?
Post by Michael Kaplan
Post by Bart Schaefer
Further, I'd dispute that applying two 95%-effective spam filters has
a net 99.75% success rate.
Very well
hmm... I think it needs more than that as a reply :(
During the harvesting phase the spammer must do what spammers never do: use
a real and functional return address. We can speculate about how crippling
this would be for the spammer. I'll assume that spammers will be forced to
send poorly filterable material during the first round but the incredible
burden of using a real return address may still allow for a degree of
filtering.

So we will say that it is on the second round that real spam is sent and
that 95% of this will be filtered. Almost every commonly used domain is
trusted, but this spam is using a sub-address that was sent to an untrusted
domain; a stronger filter can be applied to sub-addresses sent to untrusted
domains.

But also remember that it is very obvious which domains are sending harvest
spam. An ISACS utilizing email service provider may normally get only 50
bounce generating emails a day from the little known untrusted domain
Sleazy.com. Now over the last 30 minutes 100,000 bounce generating emails
come in from Sleazy.com.

Now the second round of spam comes in using real sub-addresses but spoofed
"From" fields. The email service provider can reject and send ISACS bounces
to all of these extremely suspicious sub-addresses if they do not use the
Sleazy.com domain. Legitimate correspondents usually would resend the
bounce from the same domain, but ISACS usually allows them to use any
domain. Extra restrictions can be placed on these extraordinarily
suspicious sub-addresses. Or these extra-suspicious sub-addresses can just
have a ridiculously strong filter applied to them.
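The harvest-spike detection described above can be as simple as a baseline-and-threshold check; in this sketch the domain name, the baseline table, and the 20x factor are all illustrative assumptions, not part of the proposal:

```python
from collections import Counter

# Toy version of the anomaly described above: a provider that normally sees
# ~50 bounce-generating mails/day from an obscure untrusted domain suddenly
# sees 100,000 in half an hour, and flags the domain for extra scrutiny.

baseline_per_day = {"sleazy.com": 50}   # hypothetical observed baselines

def is_harvest_spike(counts: Counter, domain: str, factor: int = 20) -> bool:
    # Flag the domain when recent bounce-generating volume exceeds a
    # multiple of its historical baseline (default baseline for unknowns).
    return counts[domain] > factor * baseline_per_day.get(domain, 100)

recent = Counter({"sleazy.com": 100_000})
print(is_harvest_spike(recent, "sleazy.com"))  # True: treat its bounces as suspect
```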

There are endless ways to play with the numbers, but I'll stick with the
estimate of 1.6 billion spam emails with real return addresses sent in order
to deliver one million spam (And I repeat the question - Is this even
possible?)

Thank you,
Michael Kaplan
Bart Schaefer
2006-01-24 04:56:07 UTC
Permalink
On Jan 23, 10:42pm, Michael Kaplan wrote:
}
} During the harvesting phase the spammer must do what spammers never
} do: use a real and functional return address. We can speculate about
} how crippling this would be for the spammer.

Not especially crippling. Spammers already use dozens (sometimes more)
of throwaway domains. [In fact I believe one of hotmail or yahoo has
plans to use the registration lifetime of a domain as a crude measure of
its reputation.] A lot of mail can be sent before the volume emanating
from any given domain draws attention.

Further, if an army of zombie spam senders can be organized, so can an
army of bounce collectors. Use the mailbox of the hijacked PC as the
return address, scan mail as it's downloaded, and snatch the bounces
out of the stream before the user sees them (perhaps by masquerading
as (gasp!) a spam filter). ISACS subaddresses are the perfect VERPs;
the bounces can be flawlessly identified without looking at the content,
and the address will look perfectly normal to all outside observers.

And hey, that zombie PC is in a trusted domain, so there's no CAPTCHA
to decode. OK, so that domain doesn't stay trusted forever ... but
there's always another PC somewhere else, hiding behind a POP download
from someplace you don't expect.
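Schaefer's point that the bounce collector needs no content inspection is easy to see; a minimal sketch, assuming the thread's caret-style "Joe^lucky" sub-address format (the specific local-part pattern and domain here are hypothetical):

```python
import re

# A zombie bounce collector can pick its harvested ISACS bounces out of
# the victim's POP stream by recipient address alone, VERP-style, without
# ever reading message content. Pattern and domain are illustrative.

SUBADDR = re.compile(r"^joe\^[a-z0-9]+@example\.com$")

def is_harvested_bounce(rcpt_addr: str) -> bool:
    # Match only the machine-generated sub-addresses this zombie forged.
    return SUBADDR.match(rcpt_addr.lower()) is not None

print(is_harvested_bounce("Joe^x7k2q@example.com"))  # True: siphon off
print(is_harvested_bounce("joe@example.com"))        # False: pass through
```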

} So we will say that it is on the second round that real spam is sent
} and that 95% of this will be filtered.

I'm still nonplussed by that assertion. I'd like to see some analysis
comparing the costs of having 5% of spam get through, to the costs of
operating ISACS; since you don't propose that ISACS will eliminate any
of the costs of managing the other 95%.

} Almost every commonly used domain is trusted, but this spam is using a
} sub-address that was sent to an untrusted domain

Was it?
Michael Kaplan
2006-01-24 06:12:23 UTC
Permalink
Post by Bart Schaefer
}
} During the harvesting phase the spammer must do what spammers never
} do: use a real and functional return address. We can speculate about
} how crippling this would be for the spammer.
Not especially crippling. Spammers already use dozens (sometimes more)
of throwaway domains. [In fact I believe one of hotmail or yahoo has
plans to use the registration lifetime of a domain as a crude measure of
its reputation.] A lot of mail can be sent before the volume emanating
from any given domain draws attention.
Spammer domains that exist for the sole purpose of collecting ISACS bounces
by sending out spam with real return addresses would be discovered almost
instantly and could be placed on a blacklist. A spammer will likely need to
register well in excess of a thousand new domains a day to successfully
collect the bounces needed to spam a million ISACS accounts. How expensive
is this? And he still needs to deal with the CAPTCHA.


Post by Bart Schaefer
Further, if an army of zombie spam senders can be organized, so can an
army of bounce collectors. Use the mailbox of the hijacked PC as the
return address, scan mail as it's downloaded, and snatch the bounces
out of the stream before the user sees them (perhaps by masquerading
as (gasp!) a spam filter). ISACS subaddresses are the perfect VERPs;
the bounces can be flawlessly identified without looking at the content,
and the address will look perfectly normal to all outside observers.
And hey, that zombie PC is in a trusted domain, so there's no CAPTCHA
to decode. OK, so that domain doesn't stay trusted forever ... but
there's always another PC somewhere else, hiding behind a POP download
from someplace you don't expect.
A single zombie spam sender can send out many thousands of spams a
day. A zombie bounce collector can only spam the small number of people who
correspond with the owner of the hijacked PC. This is unlikely to be enough
spam to take anything but a very small domain off of the trusted domain
list. The owner of the zombie bounce collector PC will soon get a lot of
angry emails; rectifying action will quickly be taken.

Under the current email system, zombies that snoop email addresses can
exist for prolonged periods of time without being discovered. Yet even
today they are not perfectly efficient, since a number of users actually
are able to conceal their addresses from spammers. With ISACS these
snooping zombies will be readily discovered, and the damage that they do
will be readily repaired when the compromised sub-addresses are deactivated.

Thank you for your input,
Michael Kaplan
Bart Schaefer
2006-01-24 16:20:22 UTC
Permalink
On Jan 24, 1:12am, Michael Kaplan wrote:
}
} A single zombie spam sender can send out multiples of thousands of
} spams a day. A zombie bounce collector can only spam the small number
} of people who correspond with the owner of the hijacked PC.

You miss the point. The bounce collector isn't spamming the people who
correspond with the hijacked PC -- in fact, it isn't necessary for it
to spam anyone. All it's doing is filtering the [forged] reply mailbox
for an ordinary spam run, or more likely for a special harvesting run
that doesn't [other than the forgery] appear to be spam. The victims
of the harvesting don't have to be existing correspondents of anyone, as
the whole point is to induce their ISACS system to emit a challenge.

The bounce collector then phones home and drops off the ISACS bounces
for decoding, and a second spam run later ensues using the harvested
addresses, possibly with different and this time likely undeliverable
return addresses. And remember that this is initially occurring from
to-this-point trusted domains, so there aren't any CAPTCHAs to decode.

} The owner of the zombie bounce collector PC will soon get a lot of
} angry emails

Which he'll never see, because the bounce collector is filtering out
the email that is sent to the ISACS-format subaddress that it created!
It can silently discard them (maybe after phoning home to report that
it has found a live person whose filters didn't block the spam).

} With ISACS these snooping zombies will be readily discovered, and the
} damage that they do will be readily repaired when the compromised
} sub-addresses are deactivated.

With ISACS, the harvester can, in an automated fashion, "grow" (by
inducing challenges) an unlimited number of subaddresses to target,
and (by direct infection) create a nearly unlimited number of them to
use as bounce traps; and can efficiently filter for known-good base
addresses. One could even, with the appearance of innocence, collect
several subaddresses for each known-good target before beginning the
first real spam run against any of them, then continue the harvesting
process while rolling over to the next such subaddress when the first
becomes disabled. It'd be a long time before he ran out of ammo; the
registry of trusted domains would be emptied first, except for a few
one-user vanity domains who could carry on their private conversations.

This imaginary arms race we're conducting is kind of amusing, but I
think it's time to stop.
Michael Kaplan
2006-01-25 03:55:25 UTC
Permalink
Post by Bart Schaefer
}
} A single zombie spam sender can send out multiples of thousands of
} spams a day. A zombie bounce collector can only spam the small number
} of people who correspond with the owner of the hijacked PC.
You miss the point. The bounce collector isn't spamming the people who
correspond with the hijacked PC -- in fact, it isn't necessary for it
to spam anyone. All it's doing is filtering the [forged] reply mailbox
for an ordinary spam run, or more likely for a special harvesting run
that doesn't [other than the forgery] appear to be spam. The victims
of the harvesting don't have to be existing correspondents of anyone, as
the whole point is to induce their ISACS system to emit a challenge.
The bounce collector then phones home and drops off the ISACS bounces
for decoding, and a second spam run later ensues using the harvested
addresses, possibly with different and this time likely undeliverable
return addresses. And remember that this is initially occurring from
to-this-point trusted domains, so there aren't any CAPTCHAs to decode.
I will concede that you can describe a mechanism whereby bots can be used to
attack ISACS, and I have no doubt that this will happen, but I think that
the impact is greatly exaggerated. This is the myth of the spam zombie's
omnipotence/omnipresence/infinite ease of deployment. I say this because
even today, under the current email system, many users have little or no
spam while others who use the identical email system get flooded. If
spammers had limitless access to everyone's email address then everyone
would get flooded. The currently existing sub-address email systems would
be of no benefit whatsoever (even the head of our group has stated that one
of the sub-address email systems "apparently has lots of happy users").

These bots are a problem, but their ability to snoop is still limited.
The problems involved in snooping ISACS sub-addresses can *only* be more
difficult.
The damage done to these accounts is reversible, and the sub-address helps
in tracking down the zombie.
Post by Bart Schaefer
With ISACS, the harvester can, in an automated fashion, "grow" (by
inducing challenges) an unlimited number of subaddresses to target,
and (by direct infection) create a nearly unlimited number of them to
use as bounce traps; and can efficiently filter for known-good base
addresses. One could even, with the appearance of innocence, collect
several subaddresses for each known-good target before beginning the
first real spam run against any of them, then continue the harvesting
process while rolling over to the next such subaddress when the first
becomes disabled. It'd be a long time before he ran out of ammo; the
registry of trusted domains would be emptied first, except for a few
one-user vanity domains who could carry on their private conversations.
What you describe can happen but it is treatable. The spam bot will reveal
itself as soon as some of those collected sub-addresses are used; the spam
bot can then be killed.

But what if this bounce trap collected an unlimited number of CAPTCHA-free
bounces? A time limit can be placed on sub-addresses during which they must
be used at least once or they will expire. I'll say seven days. This means
that an ISACS user who fell victim to this spam bot would get as much spam
for seven days as a normal email user. After that the "limitless" number of
bounces collected will be garbage.
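The seven-day activation window described above can be sketched as follows; the class and method names are hypothetical illustrations, not part of ISACS as published:

```python
import time

# Hypothetical sketch of the activation window: a sub-address that is
# never used within seven days of being handed out simply expires, so
# harvested-but-unused sub-addresses become worthless.
ACTIVATION_WINDOW = 7 * 24 * 3600  # seconds

class SubAddressRegistry:
    def __init__(self):
        self._issued = {}  # sub-address -> (issued_at, activated)

    def issue(self, sub_address, now=None):
        now = time.time() if now is None else now
        self._issued[sub_address] = (now, False)

    def accept_mail(self, sub_address, now=None):
        """Return True if mail to this sub-address should be accepted."""
        now = time.time() if now is None else now
        entry = self._issued.get(sub_address)
        if entry is None:
            return False
        issued_at, activated = entry
        if not activated and now - issued_at > ACTIVATION_WINDOW:
            del self._issued[sub_address]  # harvested but unused: expired
            return False
        self._issued[sub_address] = (issued_at, True)  # first use activates
        return True
```

Once a sub-address has been used within the window it stays valid, which matches the claim that a victim gets at most seven days of normal-level spam from a harvested batch.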
Post by Bart Schaefer
I'll use my figure of 80 million
CAPTCHA solved in order to deliver one million spam.
hmm... I did try to explain that 4 million might be wiser :(
I think you miscalculated. How about I say the spammer harvests bounces by
sending 80 million emails using a real return address. I'll be generous and
say that the spammer has a 100% return because he sent the mail without any
kind of filterable material.

Now the spammer has 80 million CAPTCHA that he pays people to solve. He now
sends 80 million pieces of good old-fashioned spam with spoofed addresses.
95% of this is filtered and 4 million gets through. 75% of these addresses
are bogus accounts, so only one million pieces of spam hit their targets.

The spammer has paid to decode 80 million CAPTCHA.
But there is another big expense: the spammer has sent 80 million emails
with a real return address. The spammer may have registered a bunch of
personal domains for this purpose. Honeypots can be used to detect these
newly created spammer domains. Let's say that the spammer can use each domain
100,000 times before this untrusted domain actually makes it onto a
blacklist and becomes useless to the spammer. At just $5 to register a
domain this single mailing has cost the spammer $4,000.
Post by Bart Schaefer
Almost every commonly
used domain is trusted, but this spam is using a sub-address that
was sent to an untrusted domain; a stronger filter can be applied
to sub-addresses sent to untrusted domain.
Unless that stronger filter is "drop all" then I don't accept that
somehow there are better filters :(
By "stronger" I meant one with a greater true positive rate but worse false
positive rate compared to what would usually be tolerated.
Post by Bart Schaefer
You seem to be redesigning your system :(
Yes, constantly. The interaction on this board has been invaluable. I
appreciate the criticism because I eventually use it to improve the
system.

I believe that I have found a way around a certain MAJOR criticism that has
come up over the past few days. As soon as I get some time I am going to
completely redo my site to reflect the solution.

Thank you all for your input,
Michael Kaplan
Richard Clayton
2006-01-24 10:11:01 UTC
Permalink
 
Post by Richard Clayton
I've never tried solving CAPTCHAs at speed, so I couldn't predict how
fast I could do them for hours on end. But it looks to me that the cost
is definitely going to be in fractions of a cent/solution.
 
Try solving a few of the Microsoft CAPTCHA.  An experienced person
should take about 3 seconds.  Working nonstop 12 hours a day would
get you 14,400 solved CAPTCHA. 
Whether it is 1 second or 3 is to some extent in the noise compared with
the other assumptions
I'll use my figure of 80 million
CAPTCHA solved in order to deliver one million spam.
hmm... I did try to explain that 4 million might be wiser :(
  That means
that every day the spammer is employing 5,556 workers using 5,556
computers that use electricity and may need air conditioning. 
http://laptop.media.mit.edu/
And
the third world owner of this business needs a cut and you'll need
security guards so that the computers won't get stolen and...
 
However you crunch the numbers this is a major expense.
I agree it is an expense. It's just that you think it is 20 x what I do.

Unfortunately for the scheme design, that 20 moves it into an area where
spammers could continue to operate efficiently.

For major disruption I'd like to see schemes where spammers had to
achieve savings of 100 or 1000 times what legitimate businesses had to.

Sadly, proof-of-work (however dressed up) does not have that property :(

Hence the only way to get it into the ballpark is to tack onto it some
sort of whitelisting scheme [or an equivalent blacklisting one]

The sub-addresses in the Kaplan scheme are whitelisting. However, I
don't think (hence my sums) that this proposal has a sufficient
multiplier effect to quite make it :( [there are other issues as well,
but that's sufficient to kill it in my mind]
Post by Richard Clayton
Why does the filter suddenly improve when the email is sent for the
second time (viz: it starts to discard 95% of the email that it approved
earlier ?).  Or -- same idea but different: why does the spammer
send
something that is filterable at the first stage ?
 
Post by Richard Clayton
       Further, I'd dispute that applying two 95%-effective
spam filters has a net 99.75% success rate.
    
    Very well
hmm... I think it needs more than that as a reply :(
 
During the harvesting phase the spammer must do what spammers never
do:  use a real and functional return address. 
they "never do" it because it isn't necessary in 2006

Once upon a time spammers did have return addresses ... which is why
"public.com" is nailed into codebases all over the planet :(
We can speculate
about how crippling this would be for the spammer. 
I'd prefer some figures based on analysis. I'd note that receiving 4
million emails a day is less than a rack of kit... think of it as being
equivalent to handling the incoming email for an ISP with about 50K
customers -- so not trivial, but not rocket science either
I'll assume
that spammers will be forced to send poorly filterable material
during the first round but the incredible burden of using a real
return address may still allow for a degree of filtering.
I don't see that -- the "real return address" will continue to function
just fine until it gets onto a blacklist. That will not happen until the
spam is sent -- which can be a long time after the sub-address was
handed out.
So we will say that it is on the second round that real spam is
sent and that 95% of this will be filtered. 
I'm accepting your figure there. Over time I expect that to get worse
rather than better (as spam morphs to be more like real email) but at
the moment that's realistic.
Almost every commonly
used domain is trusted, but this spam is using a sub-address that
was sent to an untrusted domain; a stronger filter can be applied
to sub-addresses sent to untrusted domain.
Unless that stronger filter is "drop all" then I don't accept that
somehow there are better filters :( Leastwise not if they don't use
humans in the loop [which might be a better use of cheap labour than
solving CAPTCHAs -- the Good Guys can hire them to clean mailboxes]
But also remember that it is very obvious which domains are sending
harvest spam. 
I don't see that at all -- you specifically make the point right at the
start of the explanation of the scheme that sub-addresses are entirely
transferrable. You even put "used by anyone" into italics to emphasise
this point :(
An ISACS utilizing email service provider may
normally get only 50 bounce generating emails a day from the little
known untrusted domain Sleazy.com.  Now over the last 30 minutes
100,000 bounce generating emails come in from Sleazy.com.
Spammers aren't that dumb -- the emails will be a wide range of
addresses...

For example sleazy-***@yahoo.com, sleazy-***@msn.com and so on.

If you're relying on Yahoo! and MSN to weed out Mr Sleazy (and I cannot
quite understand why you assume that) then the emails will come from
***@sleazy1.plausible.com, ***@sleazy2.plausible.com etc

I don't accept the need for the spammer to set up all the sub-addresses
over "30 minutes" or to be consistent in their return addresses.

I also don't accept that it is easy to tell the difference between
plausible.com (apologies to the owner of that domain, but there is just
one) and uk.com (who sell example.uk.com domains to thousands of
distinct businesses) -- hence sub-domains will work well.
Now the second round of spam comes in using real sub-address but
spoofed "From" fields.  The email service provider can reject and
send ISACS bounces to all of these extremely suspicious
sub-addresses if they do not use the Sleazy.com domain. 
You seem to be redesigning your system :( Your webpage specifically
says (in fact it puts it into italics) "These addresses can be used by
anyone." but you now seem to be associating sub-addresses with
particular sources of email.

That puts your scheme right into the horrible mess that is forwarding
and is therefore unwise. I suggest you redesign it back again :(
Legitimate
correspondents usually would resend the bounce from the same domain
but ISACS usually allows them to use any domain.  Extra
restrictions can be placed on these extraordinarily suspicious
sub-address.  Or this extra-suspicious sub-addresses can just have
a ridiculously strong filter applied to them.
I'm sorry, it's not possible to critique a system that is changing under
one's feet (or one that uses mythical filters with mutable qualities).

Set out more clearly what "extra restrictions" are.

Set out more clearly how you deal with forwarding.

Set out more clearly why you believe no damage is done to legitimate
emails by "ridiculously strong filters"
There are endless ways to play with the numbers, but I'll stick
with the estimate of 1.6 billion spam emails with real return
addresses sent in order to deliver one million spam (And I repeat
the question - Is this even possible?)
The "Dutch botnet" discovered last Autumn is reported to have 1.6
million machines in it (off-the-record reports say that there were a lot
more). If each machine sends 1000 emails a day (which is a factor of 50
or so less than can be easily achieved) then you have the volume desired

So the answer to "is this even possible" is "regrettably, yes"
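Richard's volume figure is a one-line multiplication, using the numbers from the post:

```python
# Checking the volume estimate with the figures from the post.
machines = 1_600_000               # reported size of the "Dutch botnet"
mails_per_machine_per_day = 1_000  # ~50x below what is easily achievable
total_per_day = machines * mails_per_machine_per_day
print(total_per_day)               # 1600000000, the 1.6 billion asked about
```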

- --
richard Richard Clayton

Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety. Benjamin Franklin 11 Nov 1755
B. Johannessen
2006-01-22 16:16:34 UTC
Permalink
Post by John Levine
mail.) I also routinely observe that I send mail, I get a challenge
that I ignore, and a few minutes later I get a live response because
C/R users know that their systems are broken and read their challenged
mail anyway.
I'm in no way a C/R supporter, but I have done some experiments with C/R
in the past. One "enhancement" I quickly discovered was using VERP
tagging on the challenges, and dropping messages from the pending queue
when the challenge bounced. This cut the pending queue by more than 90%.

Use it with a SpamAssassin setup that rejects everything with a score
above 5 and challenges anything with a score between 3 and 5, and you
have a pretty functional spam filter. It's still evil cost-shifting, but
pretty functional :-)
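The VERP trick Bob describes could be sketched like this; the address format, queue, and names are illustrative only, not from any real C/R product:

```python
import re

# Illustrative VERP-style tagging for C/R challenges: encode a pending-
# queue ID into each challenge's return address, so a bounced challenge
# identifies (and removes) the exact message it was challenging.

def challenge_return_path(pending_id, domain="cr.example.org"):
    return f"challenge-{pending_id}@{domain}"

def pending_id_from_bounce(recipient):
    """Extract the pending-queue ID from a bounced challenge's recipient."""
    m = re.match(r"challenge-([A-Za-z0-9]+)@", recipient)
    return m.group(1) if m else None

pending_queue = {"ab12": "held message from an unfamiliar sender"}

# A bounce comes back for the challenge we sent out:
pid = pending_id_from_bounce(challenge_return_path("ab12"))
if pid in pending_queue:
    del pending_queue[pid]  # forged sender: drop it from the pending queue
```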


Bob
John Levine
2006-01-22 19:40:09 UTC
Permalink
Post by B. Johannessen
I'm in no way a C/R supporter, but I have done some experiments with C/R
in the past. One "enhancement" I quickly discovered was using VERP
tagging on the challenges, and dropping messages from the pending queue
when the challenge bounced. This cut the pending queue by more than 90%.
Russ Nelson has experimented with an R (no C) technique based on this
observation. When a message from an unfamiliar address arrives, his
setup sends an auto-ack and puts the mail into a holding pen. If the
auto-ack bounces, he moves the message into the spam folder. If after
15 minutes or so there's no bounce, the message moves into the inbox.

He said it works quite well.
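A minimal sketch of that "response only, no challenge" flow, with hypothetical names (the 15-minute window is the figure from the post):

```python
# Sketch: auto-ack mail from unfamiliar addresses, hold the message,
# spam-folder it if the ack bounces, deliver it if no bounce arrives
# within the grace period. Real queues and timers would differ.
HOLD_SECONDS = 15 * 60

class HoldingPen:
    def __init__(self):
        self.held = {}  # sender -> (message, time received)
        self.inbox, self.spam = [], []

    def receive(self, sender, message, now):
        # (an auto-ack to `sender` would be sent here)
        self.held[sender] = (message, now)

    def ack_bounced(self, sender):
        message, _ = self.held.pop(sender)
        self.spam.append(message)  # bad return address: spam folder

    def tick(self, now):
        for sender in list(self.held):
            message, received = self.held[sender]
            if now - received >= HOLD_SECONDS:
                del self.held[sender]
                self.inbox.append(message)  # ack did not bounce: deliver
```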

R's,
John
Peter J. Holzer
2006-01-22 20:07:40 UTC
Permalink
Post by John Levine
Post by B. Johannessen
I'm in no way a C/R supporter, but I have done some experiments with C/R
in the past. One "enhancement" I quickly discovered was using VERP
tagging on the challenges, and dropping messages from the pending queue
when the challenge bounced. This cut the pending queue by more than 90%.
Russ Nelson has experimented with an R (no C) technique based on this
observation. When a message from an unfamiliar address arrives, his
setup sends an auto-ack and puts the mail into a holding pen. If the
auto-ack bounces, he moves the message into the spam folder. If after
15 minutes or so there's no bounce, the message moves into the inbox.
He said it works quite well.
However, it still sends mails to innocent bystanders. It is mitigated by
the fact that each address only gets one mail, but if this is widely
implemented, the owners of the forged sender addresses used by spammers
will be bombarded with auto-ack messages.

hp
--
_ | Peter J. Holzer | Ich sehe nun ein, dass Computer wenig
|_|_) | Sysadmin WSR | geeignet sind, um sich was zu merken.
| | | ***@hjp.at |
__/ | http://www.hjp.at/ | -- Holger Lembke in dan-am
Michael McConnell
2006-01-22 20:22:33 UTC
Permalink
Post by Peter J. Holzer
Post by John Levine
Post by B. Johannessen
I'm in no way a C/R supporter, but I have done some experiments with C/R
in the past. One "enhancement" I quickly discovered was using VERP
tagging on the challenges, and dropping messages from the pending queue
when the challenge bounced. This cut the pending queue by more than 90%.
Russ Nelson has experimented with an R (no C) technique based on this
observation. When a message from an unfamiliar address arrives, his
setup sends an auto-ack and puts the mail into a holding pen. If the
auto-ack bounces, he moves the message into the spam folder. If after
15 minutes or so there's no bounce, the message moves into the inbox.
He said it works quite well.
However, it still sends mails to innocent bystanders. It is mitigated by
the fact that each address only gets one mail, but if this is widely
implemented, the owners of the forged sender addresses used by spammers
will be bombarded with auto-ack messages.
That would depend on whether the auto-ack is an entire message, DATA and all, or
whether it stops after checking the response code to RCPT TO at the sender's
mailserver.

-- Michael "Soruk" McConnell
Eridani Star System

MailStripper - http://mailstripper.eridani.co.uk/
Mail Me Anywhere - http://www.MailMeAnywhere.com/
Peter J. Holzer
2006-01-23 17:25:33 UTC
Permalink
Post by Michael McConnell
Post by Peter J. Holzer
Post by John Levine
Russ Nelson has experimented with an R (no C) technique based on this
observation. When a message from an unfamiliar address arrives, his
setup sends an auto-ack and puts the mail into a holding pen. If the
auto-ack bounces, he moves the message into the spam folder. If after
15 minutes or so there's no bounce, the message moves into the inbox.
He said it works quite well.
However, it still sends mails to innocent bystanders. It is mitigated by
the fact that each address only gets one mail, but if this is widely
implemented, the owners of the forged sender addresses used by spammers
will be bombarded with auto-ack messages.
That would depend if the auto-ack is an entire message, DATA and all, or
whether it stops after checking the response code to RCPT TO at the sender's
mailserver.
That's a different technique, which is already implemented in some
standard MTAs. Postfix calls this "sender verification", Exim uses the
more descriptive term "smtp callback". The problem with this approach is
that a positive reply to a RCPT TO is no guarantee that the address
exists. Some sites accept all mail and then send bounces. Russ's scheme
gets around this problem, but at the cost of potentially being much more
annoying to forgery victims. (I guess it could be combined with SPF
or DKIM to give victims an easy way to avoid being DDoSed.)
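For reference, Postfix's form of this check is driven by a few main.cf parameters; a minimal, illustrative fragment (consult your version's documentation before deploying):

```
# /etc/postfix/main.cf -- sketch of Postfix "sender verification".
# reject_unverified_sender probes the sender domain's MX with a
# MAIL FROM / RCPT TO exchange and rejects if the address is refused.
smtpd_sender_restrictions =
    permit_mynetworks,
    reject_unverified_sender
# Cache probe results so each address is only verified occasionally.
address_verify_map = btree:$data_directory/verify_cache
unverified_sender_reject_code = 550
```

Note that Postfix defaults `unverified_sender_reject_code` to a soft 450; hardening it to 550 is a deliberate choice, and it runs into exactly the accept-then-bounce problem Peter describes.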

hp
--
_ | Peter J. Holzer | Ich sehe nun ein, dass Computer wenig
|_|_) | Sysadmin WSR | geeignet sind, um sich was zu merken.
| | | ***@hjp.at |
__/ | http://www.hjp.at/ | -- Holger Lembke in dan-am
Douglas Otis
2006-01-23 19:41:56 UTC
Permalink
Russ' Scheme gets around this problem but at the cost of
potentially being much more annoying to forgery victims.
(I guess that it could be combined with SPF or DKIM to give victims
an easy way to avoid being ddossed)
DKIM is not related to the return-path and is not expected to survive
within a DSN. Although often less, SPF has a required minimum of
more than a hundred lookups and then _may_ be related to either return-
path or the PRA. SPF may produce erroneous results in some cases,
such as when applied to the PRA or 1123 5.3.6(a). SPF may provide
open-ended authorizations to enable alternative providers which
perhaps also attracts abuse at the same time. Another potential
problem occurs when SPF is considered a verification of email-address
to justify the accrual of reputation, which is dangerous in most
shared environments.

BATV, much like VERP, offers a solution for preventing any "back-
scatter" problem from affecting the users. The handful of packets
exchanged offers protection, and is better than delivering a bogus
message. This overhead is not that much worse than using a block-
list and returning an error-message indicating which list caused the
rejection.

http://mipassoc.org/batv/
http://cr.yp.to/proto/verp.txt
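A toy version of the BATV idea: the real draft specifies an exact "prvs" tag layout and key-rotation rules, so this sketch only shows the principle that bounces to unsigned envelope senders are back-scatter and can be refused.

```python
import hashlib
import hmac

# Simplified BATV-style signed return path. Outgoing mail uses a signed
# envelope sender; an incoming "bounce" addressed to an unsigned (or
# badly signed) local part is back-scatter. The tag layout here is a
# simplification of the draft's format, and a real deployment would
# accept a window of recent day values, not a single one.
SECRET = b"per-site secret key"

def sign_return_path(local, domain, day):
    mac = hmac.new(SECRET, f"{day:03d}={local}@{domain}".encode(),
                   hashlib.sha256).hexdigest()[:6]
    return f"prvs={day:03d}{mac}={local}@{domain}"

def valid_bounce_recipient(addr, day):
    if not addr.startswith("prvs="):
        return False  # unsigned: reject as back-scatter
    tag, rest = addr[len("prvs="):].split("=", 1)
    local, domain = rest.rsplit("@", 1)
    expected = hmac.new(SECRET, f"{day:03d}={local}@{domain}".encode(),
                        hashlib.sha256).hexdigest()[:6]
    return hmac.compare_digest(tag, f"{day:03d}{expected}")
```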

-Doug
Peter J. Holzer
2006-01-23 20:45:57 UTC
Permalink
Post by Douglas Otis
Russ' Scheme gets around this problem but at the cost of
potentially being much more annoying to forgery victims.
(I guess that it could be combined with SPF or DKIM to give victims
an easy way to avoid being ddossed)
DKIM is not related to the return-path and is not expected to survive
within a DSN.
It doesn't have to be. My idea was simply to exempt domains which use
DKIM from the auto-ack check.

I.e. if a message is received from a sender domain which announces that
it uses DKIM:

If the message has matching signature, accept it.

If the message has no or an incorrect signature reject it.

(Same thing for SPF, etc.)

Otherwise quarantine message and send auto-ack.

I.e., if you are flooded with lots of auto-acks because a spammer
forges your mail addresses, you can simply add an SPF record, or
(a bit less simple) implement DKIM on your outgoing mails to stop the
flood.
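That decision flow, written out as a small function; the booleans stand in for real DKIM/SPF verification, which this sketch omits entirely:

```python
# The exemption flow proposed above: domains that publish DKIM or SPF
# get a hard accept/reject and are never auto-acked; only domains that
# opted into neither receive the quarantine-plus-auto-ack treatment.
def disposition(announces_dkim, dkim_ok, publishes_spf, spf_ok):
    if announces_dkim:
        return "accept" if dkim_ok else "reject"
    if publishes_spf:
        return "accept" if spf_ok else "reject"
    # Domain opted into neither: hold the message and send the auto-ack.
    return "quarantine+auto-ack"
```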

I still don't like that scheme, but this way it would only be annoying
instead of nasty.
Post by Douglas Otis
BATV, much like VERP, offers a solution for preventing any "back-
scatter" problem from affecting the users.
Yes, but it has to be implemented by the sender. If I implement it, I will
get less (or even no) backscatter, but it won't reduce the amount of
"real" spam I get. Russ' scheme tries to achieve that (but is of course
easily circumvented by spammers once it is in wide use).

hp
--
_ | Peter J. Holzer | Ich sehe nun ein, dass Computer wenig
|_|_) | Sysadmin WSR | geeignet sind, um sich was zu merken.
| | | ***@hjp.at |
__/ | http://www.hjp.at/ | -- Holger Lembke in dan-am
Douglas Otis
2006-01-23 21:37:25 UTC
Permalink
Post by Peter J. Holzer
Post by Douglas Otis
DKIM is not related to the return-path and is not expected to
survive within a DSN.
It doesn't have to be. My idea was simply to exempt domains which
use DKIM from the auto-ack check.
I.e. if a message is received from a sender domain which announces
If the message has matching signature, accept it.
If the message has no or an incorrect signature reject it.
Not a good idea. It may be a message munged by a list-server. DKIM
allows cases where the email-address domain does not match the
signing domain, and policies permitting third-party domain
signatures. This mitigation is depending upon the prevalence that
email-addresses are confined to that of the provider. Even when the
email-address domain and the signing domain match, this still has not
confirmed the return-path should there be a reason to bounce the
message.
Post by Peter J. Holzer
(Same thing for SPF, etc.)
SPF is often open-ended. This may not offer an assurance of the
return-path either, and failure may also be in error.
Post by Peter J. Holzer
Otherwise quarantine message and send auto-ack.
I.e., if you are flooded with lots of auto-acks because a spammer
forges your mail addresses, you can simply add an SPF record, or (a
bit less simple) implement DKIM on your outgoing mails to stop the
flood.
If the concern is to ensure the delivery of the message, BATV would
be a safer option for avoiding the back-scatter. DKIM does not
prevent any back-scatter as explained. Rejection based upon SPF has
similar problems with erroneous failures.
Post by Peter J. Holzer
I still don't like that scheme, but this way it would only be
annoying instead of nasty.
Post by Douglas Otis
BATV, much like VERP, offers a solution for preventing any "back-
scatter" problem from affecting the users.
Yes, but it has to implemented by the sender. If I implement it, I
will get less (or even no) backscatter, but it won't reduce the
amount of "real" spam I get.
This comment was limited to your conclusion that DKIM or SPF solves
the back-scatter problem. They don't. SPF depends upon third-
parties reading and acting on the record or perhaps expecting the
spammers to have read your record. Use of SPF also hopes that no one
on your domain is sending to a forwarded account when closed. DKIM
has nothing to do with the return-path. Don't forget that spammers
will be able to sign as well as anyone else and take advantage of
open policies.

-Doug
Peter J. Holzer
2006-01-23 22:34:36 UTC
Permalink
Post by Douglas Otis
Post by Peter J. Holzer
Post by Douglas Otis
BATV, much like VERP, offers a solution for preventing any "back-
scatter" problem from affecting the users.
Yes, but it has to implemented by the sender. If I implement it, I
will get less (or even no) backscatter, but it won't reduce the
amount of "real" spam I get.
This comment was limited to your conclusion that DKIM or SPF solves
the back-scatter problem.
I didn't claim that they solve the problem by themselves. I suggested
that anyone implementing the auto-ack scheme should include a way for
third parties to protect themselves from getting backscatter. SPF and
DKIM were merely examples - I mentioned them because they have already
seen some deployment. Additionally, publishing a "v=spf1 ~all" record is
almost no effort and shouldn't have any negative consequences (except
that your peers might doubt your sanity).

And, as I said, I DON'T LIKE THIS SCHEME and I do not recommend that
anybody should implement it. I merely offered a suggestion which might
turn a very bad idea into merely a bad idea.

BATV solves the problem of backscatter. No adaptation of the
auto-ack scheme is required. However, this means that the deployer of the
auto-ack scheme expects everybody else to implement BATV, which is rude
at best.

hp
--
_ | Peter J. Holzer | Ich sehe nun ein, dass Computer wenig
|_|_) | Sysadmin WSR | geeignet sind, um sich was zu merken.
| | | ***@hjp.at |
__/ | http://www.hjp.at/ | -- Holger Lembke in dan-am
Frank Ellermann
2006-01-24 01:46:50 UTC
Permalink
Post by Douglas Otis
SPF is often open-ended. This may not offer an assurance of
the return-path either, and failure may also be in error.
The latter would be a case of either an erroneous policy, then
it's the problem of the sender, or checking behind the border,
then it's a problem of the receiver. Folks who aren't up to
getting it right better stay away from SMTP and DNS, that's no
specific SPF issue.

The former ("open-ended" is your weaselword for NEUTRAL if I
finally got it) is no problem. Receivers only interested in
FAIL can ignore policies without a single "-" qualifier. And
if they are only interested in PASS they might also be able to
optimize the evaluation.

They can do many interesting things like check only every 112th
mail - just in case because Doug says 112 lookups are the norm.
Post by Douglas Otis
Rejection based upon SPF has similar problems with erroneous
failures.
Quite the contrary "drop FAIL" is extremely dangerous, reject
is always fine.
Post by Douglas Otis
SPF depends upon third-parties reading and acting on the
record or perhaps expecting the spammers to have read your
record.
A combination, the spammers can't be sure who does something in
the direction of "drop FAIL" like e.g. SpamAssassin. While it
is dangerous it has its uses. Get around SA is the jackpot for
a spammer, trying it with a FAIL-protected address is no plan.
Post by Douglas Otis
Use of SPF also hopes that no one on your domain is sending
to a forwarded account when closed.
If that's about 1123 5.3.6(a) with both forwarder and next hop
ignoring the issue - as it's their right -, and if the next hop
checks SPF, then FAIL emulates "551 user not local" and works
as designed. So far I had this once in 20 months, and the 551-
bounce (emulation, actually of course not 551) told me where
to send the message again directly bypassing the forwarder.

The idea behind SPF is pure KISS. And if you stick to "ip4" in
a policy it's also in practice KISS, at worst you have to know
what a "CIDR" might be. If you finally get the idea why you
use either "ip4" or "a" or both in a policy you're done. The
rest of the show like macros, "ptr", and what else is for geeks
or non-trivial mail setups of bigger ISPs with clueful admins.

Bye, Frank
Douglas Otis
2006-01-24 18:26:49 UTC
Permalink
Post by Douglas Otis
SPF is often open-ended. This may not offer an assurance of the
return-path either, and failure may also be in error.
The latter would be a case of either an erroneous policy, then it's
the problem of the sender, or checking behind the border, then it's
a problem of the receiver. Folks who aren't up to getting it right
better stay away from SMTP and DNS, that's no specific SPF isue.
I suppose this means any policy of '-all' would be in error then?
The former ("open-ended" is your weaselword for NEUTRAL if I
finally got it) is no problem.
There are at least two types of open-ended lists, '?all' and '~all'.
It is rather common to see both. The first would be what is
described as neutral, and the second would be soft-fail. For
example, Sender-ID recommends that a "spf2.0/pra ?all" be used as an
"opt-out" of the PRA algorithm. This would be one example where a
"neutral" result is _not_ the same as no record. Of course, there is
no real means for the sender to control how these records are used by
the recipient.
Receivers only interested in FAIL can ignore policies without a
single "-" qualifier. And if they are only interested in PASS they
might also be able to optimize the evaluation.
They can do many interesting things like check only every 112th
mail - just in case because Doug says 112 lookups are the norm.
The SPF draft calls for a _minimum_ number of 112 lookups before
giving up on resolving the records. When this is attempting to
resolve some distant domain, timeouts may mean this will be
reattempted later, perhaps again and again. There are configurations
unable to fit within 112 lookups, even with all the tricks, unless
they include foreign address space or a non-existent address.
Post by Douglas Otis
Rejection based upon SPF has similar problems with erroneous
failures.
Quite the contrary "drop FAIL" is extremely dangerous, reject is
always fine.
While drop fail is not good, rejection may still affect a large
number of users.
Post by Douglas Otis
SPF depends upon third-parties reading and acting on the record or
perhaps expecting the spammers to have read your record.
A combination, the spammers can't be sure who does something in the
direction of "drop FAIL" like e.g. SpamAssassin. While it is
dangerous it has its uses. Get around SA is the jackpot for a
spammer, trying it with a FAIL-protected address is no plan.
Are you saying a closed policy would be bad, but that only a closed
policy would offer benefits?
Post by Douglas Otis
Use of SPF also hopes that no one on your domain is sending to a
forwarded account when closed.
If that's about 1123 5.3.6(a) with both forwarder and next hop
ignoring the issue - as it's their right -, and if the next hop
checks SPF, then FAIL emulates "551 user not local" and works as
designed. So far I had this once in 20 months, and the 551-bounce
(emulation, actually of course not 551) told me where to send the
message again directly bypassing the forwarder.
Many individuals cherish the use of an email-address that affiliates
them with a school or society. Before attempting to describe the
valid path an email-address may take based upon the return-path or
the PRA, these services worked. It would seem that path registration
has created the problem.
The idea behind SPF is pure KISS.
There is very little with respect to SPF that is simple. The
combination of macro expansions, includes, redirects, and flavors of
open-ended authorization makes SPF the opposite of KISS. That the
parser may make 112 lookups in an attempt to resolve an address also
suggests that something this complex has not adhered to the KISS
concept.
And if you stick to "ip4" in a policy it's also in practice KISS,
at worst you have to know what a "CIDR" might be. If you finally
get the idea why you use either "ip4" or "a" or both in a policy
you're done. The rest of the show like macros, "ptr", and what
else is for geeks or non-trivial mail setups of bigger ISPs with
clueful admins.
Expecting large and complex domains to use CIDR notation for outbound
hosts ignores how these tools normally work and invites error. The
fact that larger domains must struggle to constrain their
configuration suggests there is a fundamental problem with the
approach. The need for the 112 minimum lookup requirement also
suggests there is a fundamental problem with this approach.

-Doug
Frank Ellermann
2006-01-25 06:11:39 UTC
Permalink
Post by Douglas Otis
I suppose this means any policy of '-all' would be in error
then?
Of course not, it's perfectly okay for a domain to say that
it's neither used as HELO FQDN nor in any MAIL FROM address.

In cases where the domain also cannot be used in an RCPT TO -
because it has no MX and no IPs, or at least no smtpd - '-all'
is a bit pointless from the POV of this domain. But it could
still help others to reject spam claiming to be "from" this
domain without tricks like "call back verification".

Same idea as DKIM's 'o=.' SSP. Or Mark's old "nullmx" draft.
I vaguely recall that there used to be a DNSBL with the same
effect. Does CSV also offer a "never used as HELO" somehow ?
Yet another reason why SIQ could be a good idea.
Post by Douglas Otis
There are at least two types of open-ended lists, '?all' and
'~all'. It is rather common to see both. The first would
be what is described as neutral, and the second would be
soft-fail.
The latter is mainly for testing purposes while a domain tries
to figure out where its out-going routes are. If that turns
out to be too difficult they should pick '?all', and otherwise
they can go for '-all'.

DKIM also adopted this idea as 't=y' (testing). In the case of
SPF receivers are free to help in the testing, or to treat the
SOFTFAIL like NEUTRAL.

As we know they are in practice also free to do anything they
wish not limited to handle a SOFTFAIL like a FAIL, or refuse to
evaluate SOFTFAIL policies, or try PRA to get fresh entropy for
/dev/random, but these options are not specified.

You still missed the point that SPF policies simply place each
and every IP into precisely one of the four sets +, ?, ~, -

There's nothing special with "all", it's just convenient to use
it for the largest of these four sets. SPF records with "+all"
at the end are not only legal, they can also make sense if it's
an "included" policy:

x.example. "v=spf1 -include:not.x.example tons=of-stuff"
not.x.example. "v=spf1 -ip4:1.0.0.0/8 -ip4:2.0.0.0/8 +all"

That's a fast way to say that any IP not starting with 1.*.*.*
or 2.*.*.* is a FAIL (the "-" for the include) after it matched
the "+all" at the end of the not.x.example policy.

Nothing's "open-ended" in SPF policies. You can get the effect
of "?all" without using it, e.g. "?include:not.x.example" would
result in NEUTRAL for any IP that's not 1.*.*.* or 2.*.*.*
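Frank's include trick can be sketched in a few lines of Python (a toy evaluator, not a full SPF implementation; the policy and helper names are purely illustrative):

```python
import ipaddress

# Toy model of not.x.example's policy: each mechanism is a
# (qualifier, matcher) pair, evaluated first match wins.
NOT_X_EXAMPLE = [
    ("-", lambda ip: ip in ipaddress.ip_network("1.0.0.0/8")),
    ("-", lambda ip: ip in ipaddress.ip_network("2.0.0.0/8")),
    ("+", lambda ip: True),  # "+all" always matches
]

def check(policy, ip):
    """Return the qualifier of the first matching mechanism, else '?' (NEUTRAL)."""
    ip = ipaddress.ip_address(ip)
    for qualifier, matches in policy:
        if matches(ip):
            return qualifier
    return "?"

def check_include(qualifier, included, ip):
    """'include' matches only if the included policy yields PASS ('+');
    the outer qualifier then decides the result. None = no match."""
    return qualifier if check(included, ip) == "+" else None
```

With "-include:not.x.example", any IP outside 1/8 and 2/8 reaches the inner "+all", which the outer '-' turns into FAIL; IPs inside 1/8 or 2/8 simply fall through to the rest of the record.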

It's no problem if you say that you don't _like_ policies with
a potential result NEUTRAL, nor do I: A receiver wasted his
time if he wants either PASS or FAIL, and gets only NEUTRAL.

But as you pointed out there are situations where domains might
wish to say less than PASS for certain shared MTAs, but in that
case they can still offer PASS or FAIL for all other IPs.

And as you also pointed out some domains might not like to use
FAIL at all, but still offer a PASS for some IPs, to be used
in white-listing up to reputation.

So nothing's technically wrong with NEUTRAL, and DKIM SSP also
has a similar concept 'o=~' "some mails signed". I'm too lazy
to dig if CSV also offers a kind of explicit DUNNO, does it ?
Post by Douglas Otis
Sender-ID recommends that a "spf2.0/pra ?all" be used as an
"opt-out" of the PRA algorithm.
Hopefully this issue will be investigated by the IAB. As far
as "sp2.0/pra" (no "?all" needed, it's the default) results in
a NEUTRAL for PRA it might be what domains publishing it want.
Post by Douglas Otis
This would be one example where a "neutral" result is _not_
the same as no record.
I'd certainly like to see the results of investigations by some
other agencies and commissions about this "opt out" scheme. In
practice it's probably moot. SMTP folks know why MAIL FROM and
PRA can be different. Therefore it's bogus to use algorithm B
for identity B with a policy for identity A (or vice versa).
Post by Douglas Otis
Of course, there is no real means for the sender to control
how these records are used by the recipient.
ACK. Receivers are free to check say v=spf1 policies against
Message-Ids if they get some kicks out of this braindead idea.
They could also claim that you can "opt-out" of this scheme by
simply not adding a Message-Id to your mail.

That cries for an experimental RfC for a new use of spf2.0/pra
policies, just use it with Message-Ids. Unfortunately I'm not
sure that the IESG would reject this as net abuse and nonsense.
Post by Douglas Otis
The SPF draft calls for a _minimum_ number of 112 lookups
| To prevent DoS attacks, more than 10 MX names MUST NOT be
| looked up during the evaluation of an "mx" mechanism
[... dito "ptr" ...]
| SPF implementations MUST limit the number of mechanisms and
| modifiers that do DNS lookups to at most 10 per SPF check,
| including any lookups caused by the use of the "include"
| mechanism or the "redirect" modifier.

MUSTard _maximum_ 1 SPF + 1 TXT + 10 MX + 10 A * 10 MX = 112.

When did you last look into an SPF spec? That's state
of the art for years after one Doug Otis whined about the vague
processing limits in early 2004 SPF drafts in MARID.

And all SPF implementations always had similar limits, the only
problem was that it wasn't precisely specified before MARID.
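The 112 figure Frank quotes is just arithmetic on the spec's caps; spelled out (variable names are illustrative):

```python
# Worst-case DNS lookups under the SPF processing limits, per Frank's
# breakdown: 1 SPF RR + 1 TXT fallback, at most 10 lookup mechanisms
# (all spent on "mx" here), and up to 10 A lookups per MX name.
policy_queries = 1 + 1           # SPF type 99 query plus TXT fallback
mx_mechanisms = 10               # the 10-mechanism cap
a_per_mx = 10                    # up to 10 names per MX RRset
worst_case = policy_queries + mx_mechanisms + mx_mechanisms * a_per_mx
print(worst_case)                # 112

# Without "mx" or "ptr", each of the 10 mechanisms costs one lookup:
print(policy_queries + 10)       # 12
```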
Post by Douglas Otis
There are configurations unable to fit within 112 lookups,
Well, if they have more than 100 IPs for their 10 MXs they
should try something simple like ip4:111.222.33.0/24 (256 IPs)
that doesn't need any lookup.

The number of excessively complex policies was very small, and
it was always possible to simplify them drastically. Using the
"mx" mechanism at all is often a bad idea to start with, an MX
is not necessarily related to the sending MTAs of an MON.

Without "mx" or "ptr" the MUSTard maximal DNS lookups are TEN:
Each "a", "include", or stuff within an included record counts.
Post by Douglas Otis
Post by Frank Ellermann
Quite the contrary "drop FAIL" is extremely dangerous,
reject is always fine.
While drop fail is not good, rejection may still affect a
large number of users.
The biggest German ESP has a FAIL policy. My ISP has a FAIL
policy. All want FAIL to be rejected, because then it works
also with receivers screwing up and checking SPF behind their
border. With "drop FAIL" receivers should be sure that their
setup is correct, otherwise they'd shoot themselves in the foot.
Post by Douglas Otis
Are you saying a closed policy would be bad, but that only a
closed policy would offer benefits?
No, I like "my" and all simple 'either PASS or FAIL' policies.

And any policy where PASS or FAIL is a possible outcome offers
some benefits. Only "all NEUTRAL" is useless, because it has
no effect - or at least no specified effect, receivers can of
course (ab)use it in some wild and wonderful unspecified ways.
Post by Douglas Otis
Many individuals cherish the use of an email-address that
affiliates them with a school or society.
That's fine, but unrelated to the reasons why senders cherish
a FAIL policy.
Post by Douglas Otis
It would seem that path registration has created the problem.
Yes, at first glance, but of course it fixes this bug in 1123,
and many forwarders have no problem to switch from 5.3.6(a) to
5.3.6(b). Forwarders must take the responsibility for their
actions as it used to be in STD 10 (821). If they don't like
this there's 551, the real thing or emulated by an SPF FAIL.

Receivers can also white list their more stubborn forwarders.

Last but not least it's perfectly okay if receivers just don't
want any SPF FAILs caused by their own forwarding arrangements:
An odd strategy, but it's their mail, they do what they like.
Post by Douglas Otis
Expecting large and complex domains to use CIDR notation for
outbound hosts ignores how these tools normally work and
invites error.
So far all managed. The worst known sender policy ITW caused
up to 68 lookups, and IIRC it was possible to push that below
the mentioned TEN + two replacing all "mx" by "ip4".
Post by Douglas Otis
The need for the 112 minimum lookup requirement also
suggests there is a fundamental problem with this approach.
That would be true if there is any minimum 112, but that's not
the case. It's a _maximum_ involving 10 MXs each with 10 IPs.

Bye, Frank
Douglas Otis
2006-01-25 09:28:24 UTC
Permalink
Post by Frank Ellermann
Post by Douglas Otis
I suppose this means any policy of '-all' would be in error
then?
This was about unintended failures being due to a policy that ends with
'-all'. This depends upon how changes are made for forwarding to
accommodate path registration. When there is no local mailbox, what
change to the return-path is made?
Post by Frank Ellermann
Does CSV also offer a "never used as HELO" somehow ?
The Priority field is always 1 for version. The Weight field of the
_smtp._client.<ehlo> SRV RR can be either 1 no target, or 2 authorized
and target defined. There is also the Port field where a value of 1
indicates all clients within this domain must have a corresponding CSV
record.
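Doug's field descriptions can be turned into a small decoder (a hypothetical helper; the field semantics here follow this post, not a checked copy of the CSV drafts):

```python
def describe_csv_srv(priority, weight, port):
    """Interpret the numeric fields of an _smtp._client.<ehlo> SRV
    record as described in the post: Priority carries the version,
    Weight the authorization state, Port the domain-wide scope flag."""
    if priority != 1:
        return "unknown CSV version"
    auth = {1: "not authorized (no target)",
            2: "authorized, target defined"}.get(weight, "undefined weight")
    if port == 1:
        auth += " (all clients in this domain must have a CSV record)"
    return auth
```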
Post by Frank Ellermann
Post by Douglas Otis
There are at least two types of open-ended lists, '?all' and
'~all'. It is rather common to see both. The first would
be what is described as neutral, and the second would be
soft-fail.
The latter is mainly for testing purposes while a domain tries
to figure out where its out-going routes are. If that turns
out to be too difficult they should pick '?all', and otherwise
they can go for '-all'.
These multiple methods of open-ending the authorization, however, were
the reason for using the more general terms of open-ended or closed.
Post by Frank Ellermann
You still missed the point that SPF policies simply place each
and every IP into precisely one of the four sets +, ?, ~, -
Nevertheless, these four sets do not provide clarity what the qualifier
means to the sender or the recipient. A PASS may mean many things, and
a NEUTRAL could be treated as a PASS because there was a record and it
did not indicate a FAIL. Even assuming everything is interpreted
correctly by the recipient, there remains a fair amount of risk when
reputations are justified using these results.
Post by Frank Ellermann
SPF records with "+all" at the end are not only legal, they can also
x.example. "v=spf1 -include:not.x.example tons=of-stuff"
not.x.example. "v=spf1 -ip4:1.0.0.0/8 -ip4:2.0.0.0/8 +all"
That's a fast way to say that any IP not starting with 1.*.*.*
or 2.*.*.* is a FAIL (the "-" for the include) after it matched
the "+all" at the end of the not.x.example policy.
Indeed, include is a mechanism; it terminates (matches) on a PASS, which
with a '-' qualifier, upon finding a PASS within the include, would
terminate processing and then result in FAIL. The include would ignore
the -ip4: mechanisms and only see the +all (a mechanism that always
matches). The default of no match is '?'. I understand how this works.
Post by Frank Ellermann
Nothing's "open-ended" in SPF policies. You can get the effect
of "?all" without using it, e.g. "?include:not.x.example" would
result in NEUTRAL for any IP that's not 1.*.*.* or 2.*.*.*
But NEUTRAL and PASS policies both expect the message to be accepted.
This would be an open-ended policy where perhaps any IP address would
still obtain the acceptable NEUTRAL result.
Post by Frank Ellermann
And as you also pointed out some domains might not like to use
FAIL at all, but still offer a PASS for some IPs, to be used
in white-listing up to reputation.
There would be some concern regarding whether this public record becomes
abused by other systems with access to a shared server, or whether the
accrual of behavior only assesses PASS results. Clearly, some have
elected to treat no records with a lower rating than the '?' qualifier.
Offering any value for just having a record means there must be some
method to adjust for abuse, and that adjustment will likely represent an
unfair reputation.
Post by Frank Ellermann
So nothing's technically wrong with NEUTRAL, and DKIM SSP also
has a similar concept 'o=~' "some mails signed". I'm too lazy
to dig if CSV also offers a kind of explicit DUNNO, does it ?
Things are rather cut and dried with CSV. There are modes that have been
defined, but I doubt they would ever be used. These were defined
assuming someone may develop a new way to authenticate, but still would
want the authorization mechanism.
Post by Frank Ellermann
I'd certainly like to see the results of investigations by some
other agencies and commissions about this "opt out" scheme. In
practice it's probably moot. SMTP folks know why MAIL FROM and
PRA can be different. Therefore it's bogus to use algorithm B
for identity B with a policy for identity A (or vice versa).
Agreed.
Post by Frank Ellermann
Post by Douglas Otis
The SPF draft calls for a _minimum_ number of 112 lookups
| To prevent DoS attacks, more than 10 MX names MUST NOT be
| looked up during the evaluation of an "mx" mechanism
[... dito "ptr" ...]
| SPF implementations MUST limit the number of mechanisms and
| modifiers that do DNS lookups to at most 10 per SPF check,
| including any lookups caused by the use of the "include"
| mechanism or the "redirect" modifier.
MUSTard _maximum_ 1 SPF + 1 TXT + 10 MX + 10 A * 10 MX = 112.
When did you last look into an SPF spec? That's state
of the art for years after one Doug Otis whined about the vague
processing limits in early 2004 SPF drafts in MARID.
The maximum that should be found in a record would also be the minimum
that should be examined before quitting. The concern is with respect to
what is being asked of the recipient, and less about what is being
allowed the sender. A lookup may represent many queries. From this
perspective, the SPF draft demands a minimum of more than one hundred
lookups by the recipient if needed. Of course, fewer may be needed.
Post by Frank Ellermann
Post by Douglas Otis
There are configurations unable to fit within 112 lookups,
Well, if they have more than 100 IPs for their 10 MXs they
should try something simple like ip4:111.222.33.0/24 (256 IPs)
that doesn't need any lookup.
A provider has better control of their address space in many cases.

The problem presents itself when there is a diverse configuration of
MTAs, perhaps kiosks sending information in the mail to friends and
family from diverse addresses, then combined with corporate and other
related MTAs, all attempting to utilize the same email-address domain.
Post by Frank Ellermann
Post by Douglas Otis
The need for the 112 minimum lookup requirement also
suggests there is a fundamental problem with this approach.
That would be true if there is any minimum 112, but that's not
the case. It's a _maximum_ involving 10 MXs each with 10 IPs.
A maximum to publish is also the minimum to lookup when needed.

-Doug
Frank Ellermann
2006-01-25 12:44:46 UTC
Permalink
Post by Douglas Otis
Post by Douglas Otis
I suppose this means any policy of '-all' would be in error
then?
This was about unintended failures being due to a policy that
ends with '-all'.
^^^^ ^^^^
Sorry, I wasn't sure what you meant and picked the "only -all"
case. But it's probably clear that I do like "-all" also if
it is at the end, from my POV that's the main purpose of SPF.

Others are much more interested in PASS and don't reject FAIL,
because they want the PASS for their own white-listing magic -
a prominent example is AOL.
Post by Douglas Otis
This depends upon how changes are made for forwarding to
accommodate path registration. When there is no local
mailbox, what change to the return-path is made?
In many cases "just forward" without changing the Return-Path
is fine, just don't check SPF behind your border. SA wizards
might get it right later by looking at the timestamp lines,
but the spec. discourages such schemes.

The "change Return-Path" is only relevant for different "AUs",
the "251 user not local" cases. Even then it's not the only
way, "white list forwarder" is also good enough.

Only if both admins refuse to do anything, and the MX-admin of
the MRN rejects SPF FAIL, then the sender gets the "251 twisted
into 551 emulation", in other words a bounce message from the
forwarder. No pain, no harm, working as designed. But don't
try this with a "drop FAIL" instead of a "reject FAIL" setup.

[CSV variant of "never used as HELO"]
Post by Douglas Otis
There is also the Port field where a value of 1 indicates
all clients within this domain must have a corresponding CSV
record.
Yes, I forgot John's "cut labels 6..2 left to right" algorithm,
if that finds something it's supposed to define a default, and
NEVER is then an obvious case.

[SPF NEUTRAL and SOFTFAIL]
Post by Douglas Otis
These multiple methods of open-ending the authorization
however was the reason for using the more general terms of
open-ended or closed.
When I talk about PASS / FAIL vs. NONE / NEUTRAL / SOFTFAIL I
often simplify the latter results to "DUNNO". Receivers are
free to interpret any "DUNNO" as "inconclusive".

Maybe you could say that a conformance test is "open ended" if
some test cases end with an "inconclusive" result, but those could
also be cases of "n/a".
Post by Douglas Otis
A PASS may mean many things
If it's from somebody you've white listed it means something
like "it's me, trust me". If it's from somebody you don't
know it still means "bounces or auto-replies won't hit innocent
bystanders".

We don't need to discuss that "trust me" is a very interesting
statement, and nothing forces you to start a PASS white-list.
Post by Douglas Otis
a NEUTRAL could be treated as a PASS because there was a
record and it did not indicate a FAIL
Shaky idea, you could also argue that NEUTRAL could be treated
like FAIL because there was no PASS. Receivers can do stupid
things. But the spec. says "MUST be treated exactly like the
NONE result" (for no policy). That's a 2119-MUST, receivers
breaking it own the pieces.
Post by Douglas Otis
there remains a fair amount of risk when reputations are
justified using these results.
Basing reputation on NEUTRAL results also violates this MUST.

Basing reputation on PASS is fine, "considered responsible for
sending the message" => problem of the domain owner to get it
right.

["v=spf1 ?include:not.x.example more=stuff"]
Post by Douglas Otis
This would be an open-ended policy where perhaps any IP
address would still obtain the acceptable NEUTRAL result.
Yes, perhaps, and then it's overly complex. I only wanted it
clear that NEUTRAL results are not necessarily the effect of
an "?all" at the end, the "?" can be anywhere, like the other
three qualifiers (+ - ~).

[PASS / NEUTRAL policy without FAIL]
Post by Douglas Otis
There would be some concern regarding whether this public
record becomes abused by other systems with access to a
shared server
Yes, as discussed, don't use PASS if you have reasons to be
paranoid (that might affect spamcast routes). Now you said
this in a context where you already decided that you don't
like FAIL, even harmless FAILs like -ip4:255.0.0.0/24 or for
all IPs on reliable DNSBLs (using SPF's "exists"-mechanism).

In that case, you want neither PASS nor FAIL, I'd recommend
plan B: forget SPF, it's not for your MAIL FROM domain.

Maybe you can use "v=spf1 a -all" for your HELO, because CSV
is probably not as widely supported as SPF at the moment.
Post by Douglas Otis
Clearly, some have elected to treat no records with a lower
rating than the '?' qualifier.
Some elected to treat any AOL NEUTRAL as "highly suspicious".
<shrug /> See above, receivers are free to do strange things.
In a mathematical sense (for a scoring system) that might be
even correct.
Post by Douglas Otis
that adjustment will likely represent an unfair reputation.
If receivers _want_ to violate a MUST in the spec. there's not
much you can do about it. They could also treat mails without
DKIM signature different from mails with an invalid signature,
AFAIK not covered by the spec. They can reject mails from IPs
without the digit "1" in dotted quad notation. That's not as
bad as it sounds, most IPs have a "1" (number theory), but it
is of course still stupid.

[maximum 112 = 1 + 1 + 10 + 10 * 10]
Post by Douglas Otis
The maximum that should be found in a record, would also be
the minimum that should be examined before quitting.
Yes. Implementations MUST abort with result PermError for the
11-th name (A lookup) in MX records for "mx" mechanisms. They
MUST abort with result PermError for the 11-th mechanism with
a DNS lookup.

"v=spf1 a a a a a a a a a mx ptr -all" never reaches the FAIL.
1 2 3 4 5 6 7 8 9 10 11

In that stupid example you end up with at most 1 SPF + 1 TXT +
9 A + 1 MX + 10 A lookups, if there's an 11th name for the MX
you get a PermError for the MX. Otherwise you get a PermError
for the "ptr", it's the 11th mechanism causing a lookup.

Total worst case 22 lookups if you get no PASS for "a" or "mx".

The only way to get seriously more lookups than 12 is if you
have "mx" or "ptr". Just for fun my stupid example covered it
with its 22 including 10 MXs for the domain, but in practice
domains have less MXs, worst I remember is t-online.de with 8,
an ISP with at least 10,000,000 customers. More MXs wouldn't
work well with UDP and q=mx, but I digress.
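Frank's deliberately bad record makes the 10-mechanism cap easy to simulate (a sketch of just the abort rule, not a full SPF evaluator; names are illustrative):

```python
# "v=spf1 a a a a a a a a a mx ptr -all" as a mechanism list:
POLICY = ["a"] * 9 + ["mx", "ptr", "-all"]

# Mechanisms whose evaluation requires DNS lookups.
LOOKUP_MECHS = {"a", "mx", "ptr", "include", "exists"}

def count_until_permerror(mechanisms):
    """Count lookup mechanisms; the 11th one aborts with PermError,
    so any later mechanism ("-all" here) is never reached."""
    lookups = 0
    for mech in mechanisms:
        if mech in LOOKUP_MECHS:
            lookups += 1
            if lookups > 10:
                return ("PermError", lookups - 1)
    return ("evaluated", lookups)
```

Nine "a" mechanisms plus one "mx" exhaust the budget, and "ptr" as the 11th lookup mechanism triggers the PermError before the final "-all" can produce a FAIL.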
Post by Douglas Otis
The concern is with respect to what is being asked of the
recipient, and less about what is being allowed the sender.
The recipient is asked to throw a PermError as soon as he sees
or hits the magical 11. Just grep for "10" in the spec.

What you called minimum is a hard maximum, and while the spec.
(draft -02) doesn't say what you could do with this PermError
everybody knows (was in draft -00) that a 5xx reject is okay.

For the sender side that would be a clear "don't call again
until you have fixed your policy". The receiver could also
blacklist the offending domain for some time after PermError.
Post by Douglas Otis
A maximum to publish is also the minimum to lookup when
needed.
My stupid example shows that you can get the PermError much
earlier (after 22 lookups), and without any "mx" or "ptr" the
PermError hits after 12 lookups.
Bye, Frank
Douglas Otis
2006-01-25 17:10:56 UTC
Permalink
Post by Douglas Otis
A maximum to publish is also the minimum to lookup when needed.
My stupid example shows that you can get the PermError much earlier
(after 22 lookups), and without any "mx" or "ptr" the PermError
hits after 12 lookups.
At the same time, this SPF example only made accommodations for a
small number of machines. Without requiring any CIDR notation, or
complex text parsing routines using includes, redirects, and macro
expansions, both the EHLO and the MAILFROM could be verified within 1
or perhaps 2 lookups and still permit millions of machines per email-
address domain. This would only require an _smtp._client.<EHLO> SRV
record at the EHLO. If this domain was not within the MAILFROM,
another _csv._client.<MAILFROM-DOMAIN> PTR record could then list
permitted EHLO domains. This would seem a simpler solution, and one
that would not demand so much of DNS and receiving MTAs. Perhaps
conventions for the label used for the EHLO could enable a discovery
process, and there is also the CIDR RR to simply list all the
outbound addresses. This approach provides fewer states, but that
also seems to be a simpler solution.
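The 1-or-2-lookup flow Doug outlines can be sketched with DNS stubbed out as a dict (record names and semantics follow this post and are hypothetical, not checked against the CSV drafts):

```python
# Stub DNS: an SRV record authorizing the EHLO name, and a record at
# the MAIL FROM domain listing its permitted EHLO domains.
DNS = {
    "_smtp._client.mail.example.com": "authorized",
    "_csv._client.example.org": ["mail.example.com"],
}

def verify(ehlo, mailfrom_domain, dns=DNS):
    """Return (accepted, lookups): one lookup checks the EHLO name,
    a second is needed only when the MAIL FROM domain differs."""
    lookups = 1
    if dns.get("_smtp._client." + ehlo) != "authorized":
        return (False, lookups)
    if ehlo == mailfrom_domain or ehlo.endswith("." + mailfrom_domain):
        return (True, lookups)          # one lookup suffices
    lookups += 1
    allowed = dns.get("_csv._client." + mailfrom_domain, [])
    return (ehlo in allowed, lookups)
```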

-Doug

Frank Ellermann
2006-01-24 01:09:17 UTC
Permalink
SPF has a required minimum of more than a hundred lookups
A _maximum_ 112, 2 + 10 mx mechanisms, each MX with 10 names.
The minimum is 2 lookups (policy with only ip4, ip6, all), as
for my address if used as Return-Path. In practice 1 lookup
(TXT RR) for those who don't use the new SPF type 99 RR yet.
then _may_ be related to either return-path or the PRA.
A receiver _may_ of course do anything that pleases him, but a
decent random generator should be cheaper and faster than the
mentioned lookups.
SPF may produce erroneous results in some cases, such as
when applied to the PRA
NOT RECOMMENDED. Same argument as above, same reply as above,
why not simply use a random generator for bogus results ?
or 1123 5.3.6(a).
There are no erroneous results after 1123 5.3.6(a), it emulates
a "551 user not local" if the receiver screws up and tests SPF
behind his border. Working as designed.
SPF may provide open-ended authorizations to enable
alternative providers which perhaps also attracts abuse
NEUTRAL results are by definition the same as NONE, what you'd
get for no policy at all. If it's neither PASS nor FAIL it's
like no policy. Nothing in NEUTRAL / NONE can "attract" abuse.

Unless it's a spammer who has learned why avoiding FAIL is a
good idea, that would be another case of working as designed.
Another potential problem occurs when SPF is considered a
verification of email-address to justify the accrual of
reputation, which is dangerous in most shared environments.
Arranging for a good PASS (= white listed by receivers) and
then becoming a zombie is of course bad. Paranoid folks can
use NEUTRAL for shared servers until the corresponding MSA
supports "enforced submission rights" 2476bis 6.1
BATV, much like VERP, offers a solution for preventing any
"back-scatter" problem from affecting the users.
No, unlike SPF it catches 100% of all identified bogus bounces.

It doesn't catch unidentified backscatter, and it doesn't help
to reject forged Return-Paths a.s.a.p. The latter includes
all cases where forged Return-Paths are _not_ bounced but make
it to their primary victims, BATV doesn't help with that part.
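The BATV mechanism under discussion amounts to signing one's own Return-Path so that any bounce to an unsigned address is identifiable backscatter. A simplified sketch of the "prvs" tagging (the key, day window, and exact layout here are illustrative, not a verbatim copy of the BATV draft):

```python
import hashlib
import hmac

KEY = b"site-secret"  # hypothetical per-site signing key

def batv_sign(local, domain, day):
    """Rewrite a Return-Path local part with a BATV-style 'prvs' tag:
    a key digit, a 3-digit expiry day, and 6 hex digits of an HMAC."""
    payload = ("%03d%s@%s" % (day % 1000, local, domain)).encode()
    sig = hmac.new(KEY, payload, hashlib.sha1).hexdigest()[:6]
    return "prvs=0%03d%s=%s@%s" % (day % 1000, sig, local, domain)

def batv_check(addr, today, window=7):
    """A bounce to an unsigned or expired address cannot be a reply to
    our own mail, i.e. it is identified backscatter (return False)."""
    if not addr.startswith("prvs="):
        return False
    rest = addr.split("=", 2)[2]          # original local@domain
    local, domain = rest.rsplit("@", 1)
    return any(batv_sign(local, domain, d) == addr
               for d in range(today - window, today + 1))
```

As Frank notes, this catches 100% of identified bogus bounces to the signing domain, but does nothing for backscatter hitting other victims of the forgery.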

SPF FAIL at least offers to help for all who support it. For
the actual phase of the game it's simple for all spammers to
avoid FAIL-protected addresses, just forge another unprotected
Return-Path: Working as designed.

Probably there will be a "next phase", but I don't worry about
it until the unprotected addresses become a rare resource. Bye
Bart Schaefer
2006-01-22 20:49:17 UTC
Permalink
On Jan 22, 7:40pm, John Levine wrote:
}
} setup sends an auto-ack and puts the mail into a holding pen. If the
} auto-ack bounces, he moves the message into the spam folder.

This sounds very similar to the "callback" mechanism employed by (for
example) Verizon. During the SMTP exchange, between getting MAIL FROM:
and issuing a response, Verizon's servers attempt to connect to an MX
at the sending domain and issue a RCPT TO: for that address. If Verizon
can't connect or gets a failed response, it issues a 450 response to
the original MAIL FROM:.

This is pretty effective from Verizon's viewpoint, but bombards any
forged domains with HELO/MAIL FROM:<>/RCPT TO:<...>/QUIT, which some
of those domains interpret as an attack.
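The decision logic Bart describes is small; a sketch with the SMTP probe stubbed out as a function (hypothetical names; a real implementation would open a connection to the sender domain's MX and try the RCPT TO):

```python
def callback_verdict(mail_from, probe_rcpt):
    """probe_rcpt(address) returns the RCPT TO reply code from the
    sender domain's MX, or None if no connection could be made.
    The return value is the reply to give the original MAIL FROM."""
    code = probe_rcpt(mail_from)
    if code is None or code >= 500:
        return 450   # can't verify the sender address: temp-reject
    return 250       # sender address accepts bounces: proceed
```

The downside der Mouse raises below follows directly: every forged MAIL FROM turns into a probe against the innocent domain, indistinguishable from a dictionary attack.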
der Mouse
2006-01-23 06:55:01 UTC
Permalink
Post by Bart Schaefer
This sounds very similar to the "callback" mechanism employed by (for
example) Verizon. [...]
This is pretty effective from Verizon's viewpoint, but bombards any
forged domains with HELO/MAIL FROM:<>/RCPT TO:<...>/QUIT, which some
of those domains interpret as an attack.
Not incorrectly, either, IMO.

After all, by doing this, Verizon has turned themselves into a perfect
dictionary-attack launderer; that is, the victim cannot tell who is
actually dictionary-attacking them. (Or even *whether*; the difference
between a spam run with forged senders and a dictionary attack is
approximately none from the victim's immediate point of view.)

/~\ The ASCII der Mouse
\ / Ribbon Campaign
X Against HTML ***@rodents.montreal.qc.ca
/ \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Danny Angus
2006-01-23 09:29:05 UTC
Permalink
Post by Bart Schaefer
This is pretty effective from Verizon's viewpoint, but bombards any
forged domains with HELO/MAIL FROM:<>/RCPT TO:<...>/QUIT, which some
of those domains interpret as an attack.
However if an explicit SMTP callback mechanism existed it might make a very
valuable contribution.

One idea I first discussed a year or two ago is that you could use ETRN to
temporarily reject mail with a 4xx, then use ETRN to fetch it.
If you don't want it, permanently reject it with a 5xx when the sending MTA
retries.
Extend ETRN to allow it to be used to specify which sender and recipient
you are ready to accept. I'm rusty and busy... perhaps it already does this.
Add a command to allow you to say that you will not receive mail for, or
from some entity (host, domain, user) for some period.
During the time you free up you can do all kinds of checks, at your
leisure.
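The defer-then-decide flow can be modelled as a tiny per-(sender, recipient) state machine (the extended ETRN semantics are hypothetical; this only sketches the reply-code logic on each delivery attempt):

```python
pending = {}  # (sender, recipient) -> None while checks run, else bool

def on_mail(sender, recipient):
    """Reply code for each delivery attempt for this sender/recipient."""
    key = (sender, recipient)
    if key not in pending:
        pending[key] = None          # kick off the checks at leisure
        return 450                   # temporary reject: "try again later"
    if pending[key] is None:
        return 450                   # still checking
    return 250 if pending.pop(key) else 550

def checks_done(sender, recipient, ok):
    """Record the verdict once the off-line checks finish."""
    pending[(sender, recipient)] = ok
```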

d.



