Yahoo Groups archive

Milter-greylist

within expectations?

2006-11-07 by David Lee

Background:

  Completely new (last week) to milter-greylist.  Our campus boundary MX
  (incoming email) gateway comprises three Linux/FC5 (dual opteron,
  hyper-threaded) machines, running MailScanner, SpamAssassin, McAfee,
  ClamAV etc.  Each typically gets about 1,000,000 emails per week,
  attempting to deliver internally to around 30,000 accounts.  (Our
  sendmail MTA uses "virtuser" to accept only known email addresses.)
  Have just installed milter-greylist (2.0.2 via RPM) to try to ward off
  the recent increase in image and botnet emails.

This query is simply to check that what I'm seeing is reasonable (for
v2.0.2 of milter-greylist) and whether there might be any obvious
improvements I could make in v2.0.2 configuration.

I am peering milter-greylist across the three MX machines, and that seems
to be working fine.

I have set "greylist 15m".  (The default seems to be one hour; that feels
too long to be blocking first-time attempts of genuine emails.)

The size of "greylist.db" on disk is around 250MB (quarter GB),
representing about 2,000,000 entries.  "top" shows the milter-greylist
constantly at the top, almost constantly at 100%, sometimes even
higher(?!).  (So does that mean that milter-greylist is threaded?). In
"top", "virt" can be over 2000m and "res" about 600m.

In its early days, "ls -l" indicated that it was updating "greylist.db"
onto disk every 10 minutes or so (as the "greylist.conf" would suggest)
but more recently the update interval seems longer: at the time of typing,
about two hours.  (Is it simply relatively static now, and no longer
needing to refresh the disk version?)

The log file had lots of occurrences of:
   [...]: Milter (greylist): timeout before data read
   [...]: Milter (greylist): to error state

And there were also lots of emails getting through without an "X-Greylist:"
header.  I guess these escapees are related to such timeouts.

(There was a suggestion that these tended to increase a little around the
10-minute update interval.)
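
One way to test that suggestion is to bucket the timeout lines by minute of the hour modulo 10. The following is a minimal sketch, not a finished tool: it assumes a syslog-style "Mon DD HH:MM:SS" prefix, the sample lines (host name, PIDs, queue IDs) are made up for illustration, and only the "timeout before data read" text comes from the log excerpt above. It is only meaningful if the dump happens to line up with wall-clock ten-minute marks; otherwise compare against the "ls -l" timestamps of "greylist.db" instead.

```python
import re
from collections import Counter

# Assumed syslog-style prefix "Mon DD HH:MM:SS"; adjust for the local log format.
STAMP = re.compile(r"^\w{3}\s+\d+\s+(\d{2}):(\d{2}):(\d{2})\s")
NEEDLE = "Milter (greylist): timeout before data read"

def timeouts_by_minute_mod_10(lines):
    """Count timeout lines by (minute of the hour) modulo 10."""
    counts = Counter()
    for line in lines:
        if NEEDLE not in line:
            continue
        m = STAMP.match(line)
        if m:
            counts[int(m.group(2)) % 10] += 1
    return counts

# Hypothetical sample lines, invented for illustration only.
sample = [
    "Nov  7 10:40:03 mx1 sendmail[111]: id1: Milter (greylist): timeout before data read",
    "Nov  7 10:50:09 mx1 sendmail[112]: id2: Milter (greylist): timeout before data read",
    "Nov  7 10:53:41 mx1 sendmail[113]: id3: Milter (greylist): to error state",
    "Nov  7 10:57:12 mx1 sendmail[114]: id4: Milter (greylist): timeout before data read",
]
counts = timeouts_by_minute_mod_10(sample)
for bucket in sorted(counts):
    print(bucket, counts[bucket])
```

A pronounced peak in one bucket would support the idea that the dump to disk stalls the milter; a flat distribution would point elsewhere.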

So I adjusted the sendmail milter details:
 1. to include "F=T" (if milter uncontactable, then temporary failure
 rather than accept);

 2. to adjust timeouts to "T=C:2m;S:30s;R:30s;E:2m" (principally to adjust
 S and R from their default 10 seconds).

But I'm still getting lots of these errors (this time as a triple):
   [...]: Milter (greylist): timeout before data read
   [...]: Milter (greylist): to error state
   [...]: Milter: to=<xxx.yyy@...>, reject=451 4.3.2 Please try again later

(I wonder how much backlog of genuine email this "F=T" (as distinct from
intended greylisting) is causing at other sites trying to reach us?)

I'm also seeing lots of:
   [...]: peer 129.234.xxx.yyy queue overflow (1024 entries), discarding new entry

Is that serious or merely a warning?

So the overall question: Is the above behaviour roughly in line with what
would be expected (v 2.0.2)?  (Or have I got something fundamentally
wrong?)



Finally a few thoughts:

1. Could the "X-Greylist:" header be made adjustable, so that it can be
site-specific for emails which pass through this milter at different sites
in their worldwide transit?  (Compare the similar feature in MailScanner.)

2. If I read top correctly, then it takes 2GB or more of memory to support
2M entries.  This seems a little excessive, when the disk representation
is about an eighth of that.

3. The constant 100% CPU seems high: is there some inefficiency in the
searching?

3. If not already threaded, could it be made so, particularly the update
onto the "greylist.db" disk file?
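
For reference, the memory figures behind point 2 can be turned into a rough per-entry cost. This is only back-of-envelope arithmetic on the round numbers quoted earlier in this message; it assumes that the "m" suffix in top means MiB and that "250MB" on disk means 250 * 10^6 bytes, and it says nothing about how milter-greylist actually lays out its entries.

```python
# Rough per-entry cost, using only the round figures quoted in the message.
entries = 2_000_000          # approximate entries in greylist.db
disk_bytes = 250 * 10**6     # "around 250MB" on disk (decimal MB assumed)
virt_bytes = 2000 * 2**20    # top "virt" of about 2000m (MiB assumed)
res_bytes = 600 * 2**20      # top "res" of about 600m (MiB assumed)

def per_entry(total_bytes, count=entries):
    """Average bytes per entry, rounded to the nearest byte."""
    return round(total_bytes / count)

disk_per_entry = per_entry(disk_bytes)
virt_per_entry = per_entry(virt_bytes)
res_per_entry = per_entry(res_bytes)

print(disk_per_entry, virt_per_entry, res_per_entry)
```

On these assumptions the on-disk form costs about 125 bytes per entry, against roughly 1 KB of virtual and 300 bytes of resident memory per entry.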


(I've seen, but not studied, discussion of 3.0rc; are some of the above
reflected in 3.0 developments?)


-- 

:  David Lee                                I.T. Service          :
:  Senior Systems Programmer                Computer Centre       :
:                                           Durham University     :
:  http://www.dur.ac.uk/t.d.lee/            South Road            :
:                                           Durham DH1 3LE        :
:  Phone: +44 191 334 2752                  U.K.                  :

Re: [milter-greylist] within expectations?

2006-11-07 by Matthias Scheler

On Tue, Nov 07, 2006 at 10:45:40AM +0000, David Lee wrote:
> ... (dual opteron, hyper-threaded) machines, ...

AMD never made any CPUs with hyper-threading support.

> The size of "greylist.db" on disk is around 250MB (quarter GB),
> representing about 2,000,000 entries.

You can probably reduce the size of that database by whitelisting the
e-mail servers of popular services like AOL, eBay, Googlemail, Yahoo etc.
Or you can use milter-greylist 3.0RC7 and only greylist certain
systems, e.g. hosts with dynamically assigned IP addresses.

> (So does that mean that milter-greylist is threaded?).

All milters are threaded because that's how the API works.

> But I'm still getting lots of these errors (this time as a triple):
>    [...]: Milter (greylist): timeout before data read
>    [...]: Milter (greylist): to error state
>    [...]: Milter: to=<xxx.yyy@...>, reject=451 4.3.2 Please try again later

Do you use SPF? I see such errors if the DNS query for SPF takes too long.
However, it does not really matter in my configuration because the
DNS problem usually indicates spam anyway.

> 3. The constant 100% CPU seems high: is there some inefficiency in the
> searching?

IIRC milter-greylist 3.0RC<x> uses hashes to speed up searching.
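
To illustrate why hashing helps, here is a generic sketch comparing a linear scan with a hash-table lookup. It is not milter-greylist's actual data structure: the key layout (client IP, envelope sender, envelope recipient) is the usual greylisting triplet, and the function names and sample addresses are hypothetical.

```python
# Hypothetical greylist keyed by (client IP, envelope sender, envelope recipient).

def find_linear(entries, key):
    """Scan a list of (key, first_seen) pairs: O(n) work per lookup."""
    for k, first_seen in entries:
        if k == key:
            return first_seen
    return None

def find_hashed(table, key):
    """Look up in a dict (hash table): O(1) work per lookup on average."""
    return table.get(key)

# Build 200 made-up entries; the stored value stands in for a timestamp.
entries = [
    (("192.0.2.%d" % i, "a@example.org", "b@example.net"), i)
    for i in range(200)
]
table = dict(entries)

key = ("192.0.2.199", "a@example.org", "b@example.net")
linear_result = find_linear(entries, key)
hashed_result = find_hashed(table, key)
missing_result = find_hashed(table, ("198.51.100.1", "x@example.org", "y@example.net"))
```

Both lookups return the same entry, but with around 2,000,000 entries a scan touches about a million entries per message on average, while the hashed lookup touches only a handful.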

> 3. If not already threaded, could it be made so, particularly the update
> onto the "greylist.db" disk file?

It is done by a single thread and changing that is hard and probably not
going to improve dump performance a lot.

	Kind regards

-- 
Matthias Scheler                                  http://zhadum.org.uk/

Re: [milter-greylist] within expectations?

2006-11-07 by eclark

Consider also cutting down on the length of time you greylist for. If someone doesn't return in 3 days, screw 'em: they are either coming from a broken MTA or a bot. Originally we greylisted for 5 days, and this caused problems due to the size of the db. Since cutting that down to 2 or 3 days, depending on circumstance, we have not had any problems at all. Definitely use whitelists as suggested by Matt+, as they make the process much more bearable.
