
Re: Alternatives to sendmail and milter to control SMTP connections

To: qmail@list.cr.yp.to
Subject: Re: Alternatives to sendmail and milter to control SMTP connections
From: Kyle Wheeler <kyle-qmail@memoryhole.net>
Date: Sat, 9 Dec 2006 01:24:17 -0500
In-reply-to: <F99C82DF392938501D7A2410@xtreme.its.utexas.edu>
Mail-followup-to: qmail@list.cr.yp.to
Mailing-list: contact qmail-help@list.cr.yp.to; run by ezmlm
References: <26face530612062007q8880e62x1561ebcd44e57847@mail.gmail.com> <20061207045256.GA25355@aleut.local> <28569E396A330AB21F20E62B@xtreme.its.utexas.edu> <20061207173840.GA30896@salinan.memoryhole.net> <F5DCD01D914FE2E7E7A5560C@xtreme.its.utexas.edu> <20061207210850.GD30896@salinan.memoryhole.net> <5039794119CA33B61BFC73AD@cpe-70-113-197-215.austin.res.rr.com> <20061208070331.GG25355@aleut.local> <F99C82DF392938501D7A2410@xtreme.its.utexas.edu>
User-agent: Mutt/1.5.13 (2006-11-28)
On Friday, December  8 at 10:50 AM, quoth Donald Nash:
Unless the sending SMTP is smart enough to keep a count of the number of simultaneous outbound connections it has to a particular server, and can interpret the 4xx greeting to mean "N connections are OK, but N+1 aren't."

Do you know of any that do that?

Well, since 4xx greetings currently aren't sanctioned by any standard, no. I was merely speculating about what could be done if SMTP provided the underlying mechanism.

Well, strictly speaking, greetings are unnecessary for an SMTP deliverer to "learn" the maximum allowed concurrency for a given recipient server. If we take a "connection could not be made" as an implicit "4xx, try later", we *could* arrive at the same result. But even though that's certainly possible with today's standards, I don't know of any mail servers that do this. That doesn't make it a bad idea, just one that, in the 40-year history of email, no one else has thought sufficiently important to implement.
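For illustration only, here is a minimal sketch of how a delivering client might "learn" a ceiling from failed connections. Everything in it is an assumption: the class name, the method names, and the initial limit of 10 are hypothetical, and no existing mailer is being described.

```python
from collections import defaultdict


class ConcurrencyLearner:
    """Hypothetical per-destination guess at how many connections are tolerated."""

    def __init__(self, initial_limit=10):
        # Learned ceiling per destination host (starts at an arbitrary guess).
        self.limit = defaultdict(lambda: initial_limit)
        # Number of connections currently open to each host.
        self.open = defaultdict(int)

    def may_connect(self, host):
        return self.open[host] < self.limit[host]

    def connected(self, host):
        self.open[host] += 1

    def closed(self, host):
        self.open[host] = max(0, self.open[host] - 1)

    def refused(self, host):
        # A failed connect while N connections are open is read as
        # "N connections are OK, but N+1 aren't". Never go below 1,
        # since the refusal may simply mean the host was down.
        self.limit[host] = max(1, self.open[host])


learner = ConcurrencyLearner(initial_limit=10)
for _ in range(3):
    learner.connected("mx.example.org")
learner.refused("mx.example.org")        # fourth attempt failed
blocked = learner.may_connect("mx.example.org")
learner.closed("mx.example.org")
allowed_again = learner.may_connect("mx.example.org")
```

Note that the sketch has to keep state for every destination and can only guess at what the refusal meant.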

It does, however, strike me as a poor implementation idea, primarily because guessing at the configurations of other servers is a recipe for frustration and bad side effects. If I were reworking the SMTP protocol to allow for this sort of thing, I'd encode the maximum concurrency into the SMTP greeting. That way a delivering client could know how many concurrent connections to use from the time of the very first connection. The client wouldn't have to store data about every single destination, there'd be no guessing, and if the setting for a given destination ever changed, the results would be immediate.
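As a sketch of what that could look like on the client side: the `MAXCONN=` token below is entirely made up for this example and is not part of SMTP or any extension, and the function name and default are likewise hypothetical.

```python
import re


def max_concurrency_from_greeting(greeting, default=1):
    """Read a hypothetical MAXCONN=<n> hint from a 220 greeting line.

    Falls back to `default` when the greeting is not a 220 reply or
    carries no hint, so servers that say nothing are treated conservatively.
    """
    if not greeting.startswith("220"):
        return default
    match = re.search(r"\bMAXCONN=(\d+)\b", greeting)
    return int(match.group(1)) if match else default


with_hint = max_concurrency_from_greeting("220 mail.example.org ESMTP MAXCONN=5")
without_hint = max_concurrency_from_greeting("220 mail.example.org ESMTP")
```

The client needs no stored history here: the value is read fresh from each greeting.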

The point is, though, even if there is someone else, is it that important?

It might very well be, especially inside an enterprise. Mail from other servers within the enterprise could very well be more important than mail from, say, AOL.

True, but a far simpler way of providing the same service is to have separate internal servers and external servers. Thus, the external servers could be *down* without interrupting internal mail. The same goes for companies spread over multiple locations. If delivery from one location to another is sufficiently important, it should be done with a dedicated server that cannot be abused by any spammer (or at least, is as un-abusable as any machine on a public network can be).

The problem here is that you're trying to justify a general-purpose limit by citing corner cases. If mail from a given location is particularly important, the simplest way to provide extra-reliable service is to have a separate service. And if mail from other locations is insufficiently important to warrant such great lengths, then I can't imagine that it's so important that it can't take being delayed for a few minutes while the email servers that got there first finish their deliveries.

Instead of arguing that you want to prevent the delay of hyper-critical mails from known sources (because that is easy to avoid by simply providing separate channels for those hyper-critical mails), the better argument is that you wish to prevent the lockout of either somewhat desirable emails or important (critical?) mails from unexpected sources. If a server is extremely busy, it's possible that some desirable emails may never get delivered because the server is always busy with something else, just as important, merely because that something else got there first.

And if the server is so busy that messages are being refused so often that some never get accepted, then we're really dealing with failure conditions rather than general-purpose behavior. Your suggested policy attempts to decide which emails get lost when the server is overloaded. But avoiding the problem is simple: if your servers are so loaded that they reject connections so frequently that mail from some domains may never get delivered, you need more capacity. Altering how you use your capacity isn't going to solve the problem of critical emails from unpredictable sources.

If indeed we're dealing with a situation where some critical emails may be coming from unexpected sources, there's no reason to assume that a critical email won't come from any given server, possibly even from a server that is already delivering "too many" concurrent emails. There's no guarantee that a given server is going to send the *critical* emails first.

But that said, SMTP is capable of a certain degree of timeliness, especially within an organization where they control all their own mail servers and network connectivity. In that situation, people may expect intra-organizational mail to go through with almost no delay. If some external server decides to open up and wail on them because it has had connectivity problems and so has a queue full of legitimate mail to deliver, and that's getting in the way of intra-organizational traffic, then I'd certainly want to put limits on that external server.

I guess my question would be: if intra-organizational mail should be maintained even in the face of abuse of the external mail server (which makes perfect sense), why not have separate internal and external servers? They can even be on the same machine, just with separate queues. While the policy you describe may solve *one* way in which your mail server can be interfered with, it does not address *all* of the (quite common) ways in which that server can be interfered with, and there are better solutions that do so without imposing arbitrary limits.
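A toy model of the separate-services idea, to show why saturating one side leaves the other untouched. The address range, the service names, and the limit of 20 connections are assumptions for the example, not a description of any real configuration.

```python
import ipaddress

# Assumed internal address range; a real site would list its own networks.
INTERNAL_NETS = [ipaddress.ip_network("10.0.0.0/8")]


class Service:
    """One SMTP service with its own connection slots (and, by extension, its own queue)."""

    def __init__(self, name, max_connections):
        self.name = name
        self.max_connections = max_connections
        self.active = 0

    def try_accept(self):
        if self.active >= self.max_connections:
            return False  # this service is full; the other one is unaffected
        self.active += 1
        return True


SERVICES = {
    "internal": Service("internal", max_connections=20),
    "external": Service("external", max_connections=20),
}


def service_for(client_ip):
    """Pick the service by where the client connects from."""
    addr = ipaddress.ip_address(client_ip)
    is_internal = any(addr in net for net in INTERNAL_NETS)
    return SERVICES["internal" if is_internal else "external"]


# An outside host hammers the external service until it is full...
external_results = [service_for("203.0.113.9").try_accept() for _ in range(25)]
# ...and an internal host still gets in immediately.
internal_ok = service_for("10.1.2.3").try_accept()
```

No per-sender limit is needed for the internal side to stay available; the isolation does the work.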

I've already acknowledged that I can do that already without the 4xx greeting. I'm just saying that the 4xx greeting gives me a way of telling that external server why I'm turning him down.

I know, I got that. And the "telling him" is, so far, really just a matter of saying "too many connections, try later". I don't object to it; if it's in the spec, that's cool, I don't mind. BUT, is it really necessary? I mean, from a sender's point of view, is there any really useful difference between the recipient's policy forbidding the message because the concurrency was too high, because the frequency was too high, or for any other policy-specific reason? Because the expected behavior of the sending SMTP server is going to be exactly the same either way (it will try again later), the benefit of this 4xx greeting is really just that when someone asks the email administrator why their email hasn't gotten to its destination yet, the admin can say something of dubious utility such as "well, we just have too much mail to that destination, and it's taking a while to send it all because they're limiting our throughput to them" or "because other people at this company sent too many messages to that domain today; maybe they'll receive it tomorrow" instead of "no idea, their server just keeps refusing to talk to us". And I can recognize that there's comfort there, but I don't know that it's actually stunningly useful. Maybe it allows you to call up the other company and give a more detailed complaint about their restrictive policy? I don't know. Again, I'm not against it, I'm just of the opinion that it's kind of redundant.
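To make the "exactly the same either way" point concrete, here is a small sketch of how a delivering client typically acts on the first digit of an SMTP reply code. The function name and the reply texts are invented for the example; only the 2xx/3xx, 4xx, and 5xx classes come from SMTP itself.

```python
def sender_action(reply_line):
    """Map an SMTP reply line to what a delivering client does next."""
    code = int(reply_line[:3])
    if 200 <= code < 400:
        return "continue"
    if 400 <= code < 500:
        # Transient failure: keep the message queued and try again later,
        # whatever the explanatory text says.
        return "defer and retry later"
    if 500 <= code < 600:
        return "bounce"
    raise ValueError(f"not an SMTP reply code: {code}")


replies = [
    "421 too many concurrent connections from your host",
    "421 too many messages from your host this hour",
    "421 service not available, try again later",
]
actions = {sender_action(reply) for reply in replies}
```

All three replies collapse to the same action; the text after the code only changes what ends up in the sender's logs.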

A limit on the number of messages per unit time is another.

THAT is something that *really* burns me up.

Poorly applied, yes they're a big problem. Any tool can be abused. But when implemented properly, the good guys mostly don't notice (you do have to be vigilant).

Mmmm, I guess. Imagine that a given recipient domain receives a lot of email from hotmail.com. By a lot, I mean thousands of messages an hour (it's a big company). Normally, hotmail distributes their outbound deliveries through many servers. What if hotmail changed their policy, and all outbound email messages appeared to be coming from a single IP address?

I guess, put another way, I have difficulty imagining a situation where that particular "tool" can be applied in a way that does not open the door to unintentional abuse.

Plus, I don't see the real benefit. While some spammers love to send lots of mail from a single address, most of the time they send lots of mail from botnets (at least, most of the spammers that send spam to the domains *I* administer). I don't see how such a policy would really do much against spam; the concept seems simply arbitrary, and would more easily trip up honest senders than today's spammers (who don't usually feel like using their own bandwidth).
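A rough sketch of that effect, using a per-IP sliding-window limit. The limit of 100 messages per hour, the volume of 3000 messages, and the addresses (taken from the documentation-only ranges) are all made up for the example.

```python
from collections import defaultdict, deque


class PerIpRateLimit:
    """Allow at most `max_messages` per source IP within `window_seconds`."""

    def __init__(self, max_messages, window_seconds):
        self.max_messages = max_messages
        self.window = window_seconds
        self.seen = defaultdict(deque)  # ip -> timestamps of accepted messages

    def accept(self, ip, now):
        stamps = self.seen[ip]
        # Forget messages that have fallen out of the window.
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        if len(stamps) >= self.max_messages:
            return False
        stamps.append(now)
        return True


def deferred(limit, senders, messages):
    """Send `messages` messages round-robin from `senders` inside one second."""
    refused = 0
    for i in range(messages):
        ip = senders[i % len(senders)]
        if not limit.accept(ip, now=i / messages):
            refused += 1
    return refused


many_hosts = [f"192.0.2.{n}" for n in range(1, 51)]  # 50 sources, e.g. a botnet
single_host = ["198.51.100.7"]                       # one consolidated sender

spread_out = deferred(PerIpRateLimit(100, 3600), many_hosts, 3000)
one_address = deferred(PerIpRateLimit(100, 3600), single_host, 3000)
```

With these numbers the 50-address sender is never deferred, while the single-address sender has 2900 of its 3000 messages deferred, even though the total volume is identical.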

~Kyle
--
No one loves armed missionaries.
                                            -- Maximilien Robespierre

