Background: Completely new (last week) to milter-greylist.

Our campus boundary MX (incoming email) gateway comprises three Linux/FC5 (dual opteron, hyper-threaded) machines, running MailScanner, SpamAssassin, McAfee, ClamAV etc. Each typically gets about 1,000,000 emails per week, attempting to deliver internally to around 30,000 accounts. (Our sendmail MTA uses "virtuser" to accept only known email addresses.)

Have just installed milter-greylist (2.0.2 via RPM) to try to ward off the recent increase in image and botnet emails.

This query is simply to check that what I'm seeing is reasonable (for v2.0.2 of milter-greylist) and whether there might be any obvious improvements I could make in the v2.0.2 configuration.

I am peering milter-greylist across the three MX machines, and that seems to be working fine.

I have set "greylist 15m". (The default seems to be one hour; that feels too long to be blocking first-time attempts of genuine emails.)

The size of "greylist.db" on disk is around 250MB (quarter GB), representing about 2,000,000 entries.

"top" shows milter-greylist constantly at the top, almost constantly at 100% CPU, sometimes even higher(?!). (So does that mean that milter-greylist is threaded?) In "top", "virt" can be over 2000m and "res" about 600m.

In its early days, "ls -l" indicated that it was updating "greylist.db" onto disk every 10 minutes or so (as the "greylist.conf" would suggest), but more recently the update interval seems longer: at the time of typing, about two hours. (Is it simply relatively static now, and no longer needing to refresh the disk version?)

The log file had lots of occurrences of:

   [...]: Milter (greylist): timeout before data read
   [...]: Milter (greylist): to error state

And there were also lots of emails getting through without an "X-Greylist:" header. I guess these escapees are related to such timeouts. (There was a suggestion that these tended to increase a little around the 10-minute update interval.)
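As a rough sanity check on the figures above, here is a minimal sketch of the per-entry arithmetic. It assumes that the "m" suffix in "top" means MiB and that the quoted sizes are exact, which they are not; the variable and function names are purely illustrative.

```python
# Approximate figures quoted in the message (per MX machine).
ENTRIES = 2_000_000           # entries in greylist.db
DISK_BYTES = 250 * 10**6      # greylist.db on disk, about 250MB
VIRT_BYTES = 2000 * 2**20     # "virt" in top, assuming "m" means MiB
RES_BYTES = 600 * 2**20       # "res" in top, assuming "m" means MiB
EMAILS_PER_WEEK = 1_000_000   # incoming emails per week

SECONDS_PER_WEEK = 7 * 24 * 3600


def per_entry(total_bytes, entries=ENTRIES):
    """Average number of bytes attributable to one greylist entry."""
    return total_bytes / entries


disk_per_entry = per_entry(DISK_BYTES)
virt_per_entry = per_entry(VIRT_BYTES)
res_per_entry = per_entry(RES_BYTES)
emails_per_second = EMAILS_PER_WEEK / SECONDS_PER_WEEK

print(f"disk bytes per entry:     {disk_per_entry:.0f}")
print(f"virtual bytes per entry:  {virt_per_entry:.0f}")
print(f"resident bytes per entry: {res_per_entry:.0f}")
print(f"virtual / disk ratio:     {VIRT_BYTES / DISK_BYTES:.1f}")
print(f"average emails per second: {emails_per_second:.2f}")
```

On these assumptions the disk file averages about 125 bytes per entry, against roughly 1KB of virtual and 300 bytes of resident memory per entry, and the average arrival rate is under two emails per second per machine.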
So I adjusted the sendmail milter details:

1. to include "F=T" (if the milter is uncontactable, then temporary failure rather than accept);

2. to adjust timeouts to "T=C:2m;S:30s;R:30s;E:2m" (principally to raise S and R from their default 10 seconds).

But I'm still getting lots of these errors (this time as a triple):

   [...]: Milter (greylist): timeout before data read
   [...]: Milter (greylist): to error state
   [...]: Milter: to=<xxx.yyy@...>, reject=451 4.3.2 Please try again later

(I wonder how much backlog of genuine email this "F=T" (as distinct from intended greylisting) is causing at other sites trying to reach us?)

I'm also seeing lots of:

   [...]: peer 129.234.xxx.yyy queue overflow (1024 entries), discarding new entry

Is that serious or merely a warning?

So the overall question: Is the above behaviour roughly in line with what would be expected (v2.0.2)? (Or have I got something fundamentally wrong?)

Finally a few thoughts:

1. Could the "X-Greylist:" header be made adjustable, so that it can be site-specific for emails which pass through this milter at different sites in their worldwide transit? (Compare the similar feature in MailScanner.)

2. If I read top correctly, then it takes 2GB or more of memory to support 2M entries. This seems a little excessive, when the disk representation is about an eighth of that.

3. The constant 100% CPU seems high: is there some inefficiency in the searching?

4. If not already threaded, could it be made so, particularly the update onto the "greylist.db" disk file?

(I've seen, but not studied, discussion of 3.0rc; are some of the above reflected in 3.0 developments?)

--
:  David Lee                                I.T. Service          :
:  Senior Systems Programmer                Computer Centre       :
:                                           Durham University     :
:  http://www.dur.ac.uk/t.d.lee/            South Road            :
:                                           Durham DH1 3LE        :
:  Phone: +44 191 334 2752                  U.K.                  :
Message
within expectations?
2006-11-07 by David Lee