Yahoo Groups archive

Milter-greylist

milter-greylist scalibility/performances : 10.000 to 20.000 email addresses

2004-08-03 by milter@free.fr

Hi all,

milter-greylist seems mature, but I'm wondering about its scalability.

I'm thinking about deploying it for some 10,000 to 20,000 existing
email addresses (4 SMTP mail gateways).
Will milter-greylist work in such a context?
I'm worried about memory usage and performance.

Before setting up milter-greylist, I'm first trying to identify the mail
servers of known partners and mailing lists so that I can whitelist them
and avoid delaying important messages.
Is there any performance problem with using a hundred "domain" statements?

Do any of you use a special log analysis tool (sma?) to get such information
and to analyse email traffic?

Regards,

SL/

Re: [milter-greylist] milter-greylist scalibility/performances : 10.000 to 20.000 email addresses

2004-08-03 by Emmanuel Dreyfus

On Tue, Aug 03, 2004 at 11:19:30AM +0200, milter@... wrote:
> milter-greylist seems mature, but I'm wondering about its scalability.
> I'm thinking about deploying it for some 10,000 to 20,000 existing
> email addresses (4 SMTP mail gateways).
> Will milter-greylist work in such a context?

The question is not about the number of addresses, but about the number
of messages.

> I'm worried about memory usage and performance.

You need enough memory to hold the whole greylist database. I had a good
idea of the memory used before the IPv6 patch. Now the p_addr field
in struct pending takes up an amount of space that I have trouble evaluating.

Apart from that, each message eats 92 bytes. 
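
As a rough illustration of what the 92-byte figure implies, here is a minimal sizing sketch. Only the 92 bytes per message comes from this thread; the assumption that every message leaves exactly one entry behind, the retention window, the traffic figure and the helper name `greylist_memory_bytes` are all hypothetical.

```python
# Rough sizing sketch. BYTES_PER_ENTRY is the figure quoted in the thread;
# one entry per message and the retention window are assumptions.
BYTES_PER_ENTRY = 92  # per-message cost, excluding the p_addr string


def greylist_memory_bytes(messages_per_day, retention_days, addr_bytes=0):
    """Estimate database size if every message leaves one entry behind."""
    entries = messages_per_day * retention_days
    return entries * (BYTES_PER_ENTRY + addr_bytes)


if __name__ == "__main__":
    # Hypothetical load: 30,000 messages a day, entries kept for 3 days,
    # plus 16 bytes per entry for an IPv4 address string.
    size = greylist_memory_bytes(30_000, 3, addr_bytes=16)
    print(f"{size / 1e6:.1f} MB")  # prints "9.7 MB"
```

Under these assumptions the database stays in the low megabytes, so the real unknown is how many entries are retained, not the size of each one.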

The only real performance problem we have is when we dump the database.
If it gets really big it could take some time to flush it to disk, but
you can do it every 10 minutes if you want.

> Before setting up milter-greylist, I'm first trying to identify the mail
> servers of known partners and mailing lists so that I can whitelist them
> and avoid delaying important messages.
> Is there any performance problem with using a hundred "domain" statements?

No, it should be okay. These statements do not cause any DNS lookups.
You only pay for a longer walk through the linked list, but for 100 items
that is still very cheap.

It eats memory, but that should be okay too.
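
To give a feel for why a walk over 100 entries is cheap, here is a minimal sketch of a linear scan with plain string comparisons and no DNS lookup. It is not milter-greylist's code: the suffix-matching rule, the `WHITELIST` contents and the `is_whitelisted` helper are assumptions made for illustration.

```python
# Illustrative only: a linear scan over "domain" entries, standing in for
# the list walk described above. Suffix matching is an assumption.
WHITELIST = [f"partner{i}.example" for i in range(100)]  # hypothetical entries


def is_whitelisted(hostname, domains=WHITELIST):
    """Return True if hostname equals, or is a subdomain of, any entry."""
    host = hostname.lower().rstrip(".")
    for domain in domains:  # one string comparison per list item
        if host == domain or host.endswith("." + domain):
            return True
    return False


if __name__ == "__main__":
    print(is_whitelisted("mx1.partner42.example"))  # True
    print(is_whitelisted("mail.unknown.example"))   # False
```

The worst case is a miss, which costs one comparison per entry, so the cost grows linearly with the number of statements.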

-- 
Emmanuel Dreyfus
manu@...

Re: [milter-greylist] milter-greylist scalibility/performances : 10.000 to 20.000 email addresses

2004-08-03 by Hajimu UMEMOTO

Hi,

>>>>> On Tue, 3 Aug 2004 09:32:53 +0000
>>>>> Emmanuel Dreyfus <manu@...> said:

manu> You need enough memory to hold the whole greylist database. I had a good
manu> idea of the memory used before the IPv6 patch. Now the p_addr field
manu> in struct pending takes up an amount of space that I have trouble evaluating.

Before IPv6 support, p_addr was the string form of an IPv4 address,
17 bytes in size.  Now p_addr is a pointer to the string form of either
an IPv4 address (16 bytes) or an IPv6 address (up to 46 bytes).
An IPv6 address is larger than an IPv4 address, so the space that held
the IP address was changed into a pointer to the address, in order to
minimize the space penalty for each entry.
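
The trade-off can be sketched with some back-of-the-envelope arithmetic. The 16 and 46 byte string sizes are the ones quoted above; the 8-byte pointer (a 64-bit platform), the traffic mix and both helper names are assumptions, and allocator overhead is ignored.

```python
# Comparison of two layouts for the address field of an entry.
# String sizes come from the thread; the pointer size is an assumption.
IPV4_STR = 16
IPV6_STR = 46
POINTER = 8


def inline_bytes(entries):
    """Every entry embeds a buffer large enough for an IPv6 string."""
    return entries * IPV6_STR


def pointer_bytes(ipv4_entries, ipv6_entries):
    """Every entry holds a pointer plus a string sized for its family."""
    return (ipv4_entries * (POINTER + IPV4_STR)
            + ipv6_entries * (POINTER + IPV6_STR))


if __name__ == "__main__":
    # Hypothetical mix: 95,000 IPv4 entries and 5,000 IPv6 entries.
    print(inline_bytes(100_000))         # 4600000
    print(pointer_bytes(95_000, 5_000))  # 2550000
```

With mostly IPv4 traffic, the pointer layout uses less space than reserving room for an IPv6 string in every entry, at the price of one extra allocation per entry.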

Sincerely,

--
Hajimu UMEMOTO @ Internet Mutual Aid Society Yokohama, Japan
ume@...  ume@{,jp.}FreeBSD.org
http://www.imasy.org/~ume/

Re: [milter-greylist] milter-greylist scalibility/performances : 10.000 to 20.000 email addresses

2004-08-03 by Christian Pelissier

>I'm thinking about deploying it for some 10,000 to 20,000 existing
>email addresses (4 SMTP mail gateways).
>Will milter-greylist work in such a context?
>I'm worried about memory usage and performance.
>

With ~1100 email addresses, milter-greylist needs ~13 MB and less than
1% CPU on a 550 MHz / 512 MB server. It is a more efficient filter than
SpamAssassin (both at catching spam and in resources used).

   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP  
 12771 smmsp    4584K 3272K sleep   59    0   0:00:00 0.2% sendmail-8.13.1/1
 18449 smmsp    2824K 2048K sleep   59    0   0:00:25 0.0% spfmilter/7
 18429 smmsp      13M   13M sleep   59    0   0:04:23 0.0% milter-greylist/9

--
C.P.

Re: [milter-greylist] milter-greylist scalibility/performances : 10.000 to 20.000 email addresses

2004-08-03 by Emmanuel Dreyfus

On Tue, Aug 03, 2004 at 07:03:33PM +0900, Hajimu UMEMOTO wrote:
> Before IPv6 support, p_addr was the string form of an IPv4 address,
> 17 bytes in size.  Now p_addr is a pointer to the string form of either
> an IPv4 address (16 bytes) or an IPv6 address (up to 46 bytes).
> An IPv6 address is larger than an IPv4 address, so the space that held
> the IP address was changed into a pointer to the address, in order to
> minimize the space penalty for each entry.

I wonder if we should do the same for the sender and recipient addresses.
How expensive is a malloc?

-- 
Emmanuel Dreyfus
manu@...

Re: [milter-greylist] milter-greylist scalibility/performances : 10.000 to 20.000 email addresses

2004-08-03 by Hajimu UMEMOTO

Hi,

>>>>> On Tue, 3 Aug 2004 11:41:07 +0000
>>>>> Emmanuel Dreyfus <manu@...> said:

manu> I wonder if we should do the same for the sender and recipient addresses.

I think that would be better, too.

manu> How expensive is a malloc?

I'm not sure, but it is implementation-specific.  In any case, it is
more than a little expensive.  However, malloc() is called only once for
each message, so I believe the cost of malloc() is low relative to the
cost of looking up pending entries.

Sincerely,

--
Hajimu UMEMOTO @ Internet Mutual Aid Society Yokohama, Japan
ume@...  ume@{,jp.}FreeBSD.org
http://www.imasy.org/~ume/

Re: [milter-greylist] milter-greylist scalibility/performances : 10.000 to 20.000 email addresses

2004-08-03 by milter@free.fr

Quoting Emmanuel Dreyfus <manu@...>:

> On Tue, Aug 03, 2004 at 11:19:30AM +0200, milter@... wrote:
> > milter-greylist seems mature, but I'm wondering about its scalability.
> > I'm thinking about deploying it for some 10,000 to 20,000 existing
> > email addresses (4 SMTP mail gateways).
> > Will milter-greylist work in such a context?
>
> The question is not about the number of addresses, but about the number
> of messages.

=> sma reports 20,000 to 30,000 messages on a normal day for one mail gateway.

> > I'm worried about memory usage and performance.
>
> You need enough memory to hold the whole greylist database. I had a good
> idea of the memory used before the IPv6 patch. Now the p_addr field
> in struct pending takes up an amount of space that I have trouble evaluating.
>
> Apart from that, each message eats 92 bytes.
=> I'm already using MIMEDefang + SA + antivirus software on these
platforms (40-50 MB per mimedefang.pl process).
So milter-greylist will not take much additional memory, except in
case of harvesting attacks, I guess.

> The only real performance problem we have is when we dump the database.
> If it gets really big it could take some time to flush it to disk, but
> you can do it every 10 minutes if you want.
=> I hope not to hit the 2 GB limit in some cases,
with DoS attackers and spammers around ...

> > Before setting up milter-greylist, I'm first trying to identify the mail
> > servers of known partners and mailing lists so that I can whitelist them
> > and avoid delaying important messages.
> > Is there any performance problem with using a hundred "domain" statements?
>
> No, it should be okay. These statements do not cause any DNS lookups.
> You only pay for a longer walk through the linked list, but for 100 items
> that is still very cheap.
=> Great! I'm starting to gather the information with log analysis tools
(sma + Perl scripts).


> It eats memory, but that should be okay too.
=> Thanks. So I guess I can safely set up version 1.4 and synchronise the
database between the 4 mail gateways.
To start smoothly, how about using "-w 10"? Won't 10 minutes be enough?

Thanks,

SL/

Re: [milter-greylist] milter-greylist scalibility/performances : 10.000 to 20.000 email addresses

2004-08-03 by manu@netbsd.org

<milter@...> wrote:

> => sma reports 20,000 to 30,000 messages on a normal day for one mail gateway.

Sounds fine.

> To start smoothly, how about using "-w 10"? Won't 10 minutes be enough?

"-w 10" means 10 seconds, which is too short. 10 minutes will be fine: use -w 10m
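
The difference between the two forms can be sketched as a small conversion helper. The thread confirms only that a bare number is read as seconds and that an "m" suffix means minutes; the other suffixes in `UNITS` and the `delay_to_seconds` helper itself are assumptions for illustration, not milter-greylist's actual parser.

```python
# Sketch of turning a delay argument into seconds. Only the bare-number
# and "m" cases are confirmed by the thread; other suffixes are assumed.
UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def delay_to_seconds(value):
    """Convert strings such as "10" or "10m" to a number of seconds."""
    value = value.strip()
    if value and value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)  # no suffix: plain seconds


if __name__ == "__main__":
    print(delay_to_seconds("10"))   # 10
    print(delay_to_seconds("10m"))  # 600
```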

-- 
Emmanuel Dreyfus
There are 10 kinds of people in the world: those who understand
binary and those who don't.
manu@...

Re: [milter-greylist] milter-greylist scalibility/performances : 10.000 to 20.000 email addresses

2004-08-03 by manu@netbsd.org

Cyril Guibourg <cg+milter-greylist@...> wrote:

> > => Thanks. So I guess I can safely set up version 1.4 and synchronise the
> > database between the 4 mail gateways.
> Nope, start with 1.5.3

Many bugs have been fixed in 1.5.x that I have not pulled up to the 1.4 branch
because I'm lazy:

1.5.5 (not yet released):
        Fix bad substitutions in rc scripts
        Fix build problems on Solaris
1.5.4:
        Avoid race conditions when reloading the config (Attila Bruncsak)
        Full blown IPv6 support, from Hajimu Umemoto
        rc-debian.sh script, from Joel Bertrand
1.5.3:
        Fix improper MX sync port on little endian machines
1.5.2:
        Add a template Makefile to manually tweak if configure fails
        Feed strtok_r with a NULL initialized pointer
        More mixed I/O fix: another fflush after a fgets()
1.5.1:
        Fix mixed I/O in MX sync on Solaris, from Attila Bruncsak
        Check that compiler and linker accept -Wall
        Document the comment on end of line bug
        Clean up rc-solaris.sh on make clean
        syslog the expired autowhite entry correctly, from Mattieu Herrb
        Handle mailing lists with unique sender by removing '^.*=' from sender
        Minor bug fixes in queue management from Wolfgang Solfrank
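
As an illustration of the 1.5.1 entry about mailing lists with a unique sender, the quoted pattern '^.*=' can be applied as below. The pattern comes from the changelog; how and where milter-greylist applies it is not stated in this thread, so the `normalize_sender` helper and the sample addresses are hypothetical.

```python
import re

# The pattern is the one quoted in the 1.5.1 changelog entry. Because ".*"
# is greedy, it removes everything up to and including the last "=".
UNIQUE_PREFIX = re.compile(r"^.*=")


def normalize_sender(sender):
    """Drop the leading part of the sender that ends with '='."""
    return UNIQUE_PREFIX.sub("", sender)


if __name__ == "__main__":
    # Hypothetical list bounce address that embeds the recipient.
    print(normalize_sender("list-bounces+bob=example.org@lists.example.net"))
    print(normalize_sender("alice@example.com"))  # unchanged, no "="
```

Stripping the variable prefix presumably lets senders that differ only in that prefix be treated as the same sender.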

If you don't run Solaris and if you don't hit the config reload race
condition, 1.4 will be fine. But if you want to help the community, use the
latest version.

-- 
Emmanuel Dreyfus
There are 10 kinds of people in the world: those who understand
binary and those who don't.
manu@...
