Milter-greylist

(greylist): timeout before data read and (greylist): to error state

2006-10-17 by Brian J. Lewis

Milter-greylist is working:
Oct 17 09:38:32 mailscanner6 milter-greylist: k9HGcLRM031656: addr wmflb12na02.ezweb.ne.jp[222.15.69.197] from <> to <xvafeyaoef@...> delayed for 00:04:14 (ACL 111)

Then all of a sudden:
Oct 17 09:38:42 mailscanner6 sendmail[31656]: k9HGcLRM031656: Milter (greylist): timeout before data read
Oct 17 09:38:42 mailscanner6 sendmail[31624]: k9HGcL2F031624: Milter (greylist): timeout before data read
Oct 17 09:38:44 mailscanner6 sendmail[31679]: k9HGcOwh031679: Milter (greylist): timeout before data read
Oct 17 09:38:44 mailscanner6 sendmail[31679]: k9HGcOwh031679: Milter (greylist): to error state
Oct 17 09:38:42 mailscanner6 sendmail[31653]: k9HGcMfq031653: Milter (greylist): timeout before data read
Oct 17 09:38:43 mailscanner6 sendmail[31647]: k9HGcLYm031647: Milter (greylist): timeout before data read
Oct 17 09:38:43 mailscanner6 sendmail[31660]: k9HGcMZq031660: Milter (greylist): timeout before data read
Oct 17 09:38:46 mailscanner6 sendmail[31663]: k9HGcNI5031663: Milter (greylist): to error state
Oct 17 09:38:46 mailscanner6 sendmail[31684]: k9HGcOle031684: Milter (greylist): timeout before data read
Oct 17 09:38:46 mailscanner6 sendmail[31708]: k9HGcP3Y031708: Milter (greylist): timeout before data read


[root@mailscanner6 log]# free
             total       used       free     shared    buffers     cached
Mem:        499388     410436      88952          0      26160      58844
-/+ buffers/cache:     325432     173956
Swap:      2097136       9720    2087416


If I issue: service miltergreylist restart
it starts working again for a few minutes, then bombs again.

I am not using libspf. The problem occurs on both Fedora Core 6 Test 3 and
Fedora Core 2 servers.
The Fedora Core 6 Test 3 server runs Sendmail 8.13.7 with milter-greylist
3.0rc5, with dnsrbl enabled.

In sendmail.mc I have
FEATURE(`milter-greylist')
INPUT_MAIL_FILTER(`greylist',`S=local:/var/milter-greylist/milter-greylist.sock')
define(`confMILTER_MACROS_CONNECT', `j, {if_addr}')
define(`confMILTER_MACROS_HELO', `{verify}, {cert_subject}')
define(`confMILTER_MACROS_ENVFROM', `i, {auth_authen}')
define(`confMILTER_MACROS_ENVRCPT', `{greylist}')

In greylist.conf I have
acl whitelist list "my network"
acl whitelist list "broken mta"
#acl greylist list "grey users" dnsrbl "SORBS DUN" delay 24h autowhite 3d
acl greylist list "grey users" delay 10m autowhite 3d
dnsrbl "SORBS DUN" dnsbl.sorbs.net 127.0.0.10
dnsrbl "SPAMCOP BL" bl.spamcop.net 127.0.0.2
acl greylist dnsrbl "SORBS DUN" delay 12h
acl greylist dnsrbl "SPAMCOP BL" delay 12h
acl greylist default delay 5m autowhite 3d


Peers are temporarily disabled because of the 1024 queue error.


Anything I should try?
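
A log excerpt like the one above is easier to reason about once the failures are counted. The following is a minimal sketch, not part of milter-greylist or sendmail: the function name count_milter_errors is made up for illustration, and it assumes each syslog entry is on a single line (unlike the wrapped lines mail clients produce).

```shell
#!/bin/sh
# Tally sendmail "Milter (<filter>): <message>" log lines by message.
# Reads syslog lines on stdin; $1 is the filter name (default: greylist).
# count_milter_errors is a hypothetical helper name.
count_milter_errors() {
    filter=${1:-greylist}
    awk -v tag="Milter ($filter): " '
        { i = index($0, tag) }                        # position of the tag, 0 if absent
        i > 0 { n[substr($0, i + length(tag))]++ }    # count the text after the tag
        END { for (m in n) printf "%d %s\n", n[m], m }
    ' | sort -rn
}

# Demo on three lines taken from the log excerpt in this message.
count_milter_errors greylist <<'EOF'
Oct 17 09:38:42 mailscanner6 sendmail[31656]: k9HGcLRM031656: Milter (greylist): timeout before data read
Oct 17 09:38:44 mailscanner6 sendmail[31679]: k9HGcOwh031679: Milter (greylist): timeout before data read
Oct 17 09:38:44 mailscanner6 sendmail[31679]: k9HGcOwh031679: Milter (greylist): to error state
EOF
```

On a live system the input would be the mail log, for example count_milter_errors greylist < /var/log/maillog (that path is an assumption; it depends on the syslog configuration).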

Re: (greylist): timeout before data read and (greylist): to error state

2006-10-17 by Brian J. Lewis

Maybe this is a clue?  Check this out, these are weird errors: it's
greylisting, but it's also erroring at the same time.  I can email you a
gzip of the log that has all this weird stuff in it if it will help.
Maybe a corrupt greylist.db? Or just too large? (21 megs)

Oct 17 08:10:25 mailscanner6 sendmail[14264]: k9HF9FC6014264: Milter (greylist): write(Q) returned -1, expected 5: Broken pipe
Oct 17 08:10:25 mailscanner6 sendmail[14264]: k9HF9FC6014264: Milter (greylist): to error state
but then it corrects itself
Oct 17 08:10:26 mailscanner6 milter-greylist: k9HFAE71014964: addr mail-in4.tiscali.nl[195.241.79.168] from <> to <kuudiwtnsoa@...> delayed for 00:05:00 (ACL 111)
Oct 17 08:10:26 mailscanner6 sendmail[14964]: k9HFAE71014964: Milter: to=<kuudiwtnsoa@...>, reject=451 4.7.1 Greylisting in action, please come back in 00:05:00
then it errors again
Oct 17 08:10:26 mailscanner6 sendmail[14276]: k9HF9GCJ014276: Milter (greylist): write(Q) returned -1, expected 5: Broken pipe
Oct 17 08:10:26 mailscanner6 sendmail[14276]: k9HF9GCJ014276: Milter (greylist): to error state

Re: [milter-greylist] Re: (greylist): timeout before data read and (greylist): to error state

2006-10-17 by manu@netbsd.org

Brian J. Lewis <brianlewis@...> wrote:

> Maybe this is a clue?  Check this out, these are weird errors, its 
> greylisting but its also erroring same time.  I can email you a GZIP 
> of the log that has all this weird stuff in it if it will help.  
> Maybe a corrupt greylist.db? Or just too large? (21 megs)

The "timeout before data read" simply means the milter stopped
responding, probably because it crashed.

There are two usual suspects:
1) DNS query with a thread-unsafe resolver (if you use DNSRBL or SPF)
2) system limit (ulimit) exhausted.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu@...
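
For the second suspect, the limits that matter are those of the running daemon, which can differ from what ulimit -a prints in an interactive shell. The sketch below assumes a Linux kernel recent enough to provide /proc/<pid>/limits; the function name fd_usage is made up for illustration. It prints the soft open-file limit next to the number of descriptors currently open.

```shell
#!/bin/sh
# Print the soft open-file limit and the current descriptor count of a
# process, using Linux /proc. $1 is a PID. fd_usage is a hypothetical name.
fd_usage() {
    pid=$1
    # The "Max open files" row reads: Max open files <soft> <hard> files
    soft=$(awk '/^Max open files/ { print $4 }' "/proc/$pid/limits")
    # Each entry in /proc/<pid>/fd is one open descriptor.
    used=$(ls "/proc/$pid/fd" | wc -l)
    echo "pid=$pid open_fds=$((used)) soft_limit=$soft"
}

# Demo on the current shell; for the daemon, pass its PID instead.
fd_usage $$
```

For the milter itself this would be run as root with the daemon's PID, for example fd_usage "$(pidof milter-greylist)". A descriptor count that climbs toward the soft limit before each hang would point at limit exhaustion rather than at the resolver.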

Re: (greylist): timeout before data read and (greylist): to error state

2006-10-17 by Brian J. Lewis

First thank you for your quick replies Emmanuel.  I appreciate it.
I am not a Linux expert, so I am doing the best I can on this.
I compiled using 
'./configure --prefix= --enable-dnsrbl --with-libbind=/usr/local'
on Fedora Core 5
with RPM installs of
bind-libs-9.3.2-20.FC5 and bind-utils-9.3.2-20.FC5

From what I can tell I have a libbind.so and libbind9.so in /usr/lib, 
so maybe I should be compiling using --with-libbind=/usr/lib ?

Also, how can I figure out whether the RPM-installed bind-libs
9.3.2-20 is thread safe or not?

Thank you for your help...

Re: [milter-greylist] Re: (greylist): timeout before data read and (greylist): to error state

2006-10-17 by manu@netbsd.org

Brian J. Lewis <brianlewis@...> wrote:

> First thank you for your quick replies Emmanuel.  I appreciate it.
> I am not a linux expert, so I am trying the best I can on this.
> I compiled using 
> './configure --prefix= --enable-dnsrbl --with-libbind=/usr/local'
> on Fedora Core 5
> with RPM installs of
> bind-libs-9.3.2-20.FC5 and bind-utils-9.3.2-20.FC5
> 
> From what I can tell I have a libbind.so and libbind9.so in /usr/lib,
> so maybe I should be compiling using --with-libbind=/usr/lib ?

It would be --with-libbind=/usr

> Also how can I figure out if those RPM installed versions of bind-
> libs 9.3.2-20 is Thread Safe or not?

If it's libbind from BIND9, it's thread safe. 

Run ldd on milter-greylist to check that you actually linked with the
appropriate libbind.so

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu@...

Re: (greylist): timeout before data read and (greylist): to error state

2006-10-17 by Brian J. Lewis

It did. There is only one libbind.so.4 (actually a symbolic link to
libbind.so.4.0.2) in /usr/lib/, and ldd shows that this is what it is
linked against.  libbind.so is a symbolic link to the same file as well.
These are from BIND 9.3.2, which is thread safe.

[root@mailscanner1 bin]# whereis milter-greylist
milter-greylist: /bin/milter-greylist
[root@mailscanner1 bin]# ldd milter-greylist
 linux-gate.so.1 =>  (0x00f02000)
 libbind.so.4 => /usr/lib/libbind.so.4 (0x00d2e000)
 libresolv.so.2 => /lib/libresolv.so.2 (0x00918000)
 libnsl.so.1 => /lib/libnsl.so.1 (0x008d1000)
 libpthread.so.0 => /lib/libpthread.so.0 (0x0085f000)
 libc.so.6 => /lib/libc.so.6 (0x006e8000)
 /lib/ld-linux.so.2 (0x006cb000)
[root@mailscanner1 bin]#
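
The ldd listing above can be reduced to the resolver-related entries with a small filter. This is only a sketch; resolver_libs is a made-up name, and the pattern covers just the two library families discussed in this thread. Note that the listing shows both libbind.so.4 and glibc's libresolv.so.2 loaded into the same process; whether that mix matters here is not established in the thread.

```shell
#!/bin/sh
# Reduce `ldd` output (on stdin) to the libbind/libresolv entries,
# printed as "name path". resolver_libs is a hypothetical helper name.
resolver_libs() {
    awk '$1 ~ /^lib(bind|resolv)[0-9]*\.so/ && $2 == "=>" { print $1, $3 }'
}

# Demo on lines copied from the ldd output above.
resolver_libs <<'EOF'
	libbind.so.4 => /usr/lib/libbind.so.4 (0x00d2e000)
	libresolv.so.2 => /lib/libresolv.so.2 (0x00918000)
	libnsl.so.1 => /lib/libnsl.so.1 (0x008d1000)
	libc.so.6 => /lib/libc.so.6 (0x006e8000)
EOF
```

On a live system the input would come from ldd directly: ldd /bin/milter-greylist | resolver_libs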


I recompiled without DNSRBL and removed the DNSRBL lines in
greylist.conf, and the servers have been stable!  In fact I only
recompiled on one server; on the other 3 servers I just removed the
lines from greylist.conf and restarted milter-greylist, and they are
stable as well. So there seems to be a bug in milter-greylist when
compiled on Fedora Core 5 or 6 with BIND 9.3.x that causes
instability under high load (2000-3000 messages rejected per hour).



Re: [milter-greylist] Re: (greylist): timeout before data read and (greylist): to error state

2006-10-18 by AIDA Shinra

At Tue, 17 Oct 2006 21:50:49 +0200,
manu@... wrote:
> 
> Brian J. Lewis <brianlewis@...> wrote:
> 
> > Maybe this is a clue?  Check this out, these are weird errors, its 
> > greylisting but its also erroring same time.  I can email you a GZIP 
> > of the log that has all this weird stuff in it if it will help.  
> > Maybe a corrupt greylist.db? Or just too large? (21 megs)
> 
> The "timeout before data read" simply means the milter stopped
> responding, probably because it crashed.
> 
> There are two usual suspects
> 1) DNS query with a thread-unsafe resolver (if you use DNSRBL or SPF)
> 2) system limit (ulimit) exhausted.

I frequently see temporary "timeout before data read" messages simply
because of high latency in SPF lookups on my lower-traffic server.
Brian's server may be spending a long time on slow DNSRBL lookups
rather than deadlocking somewhere.
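
The latency hypothesis can be checked by hand by timing a single DNSRBL query. A DNSBL lookup queries the IPv4 address with its octets reversed, followed by the list's zone. The sketch below only builds that name; dnsbl_qname is a made-up helper, and the address and zone are taken from the log and greylist.conf earlier in the thread.

```shell
#!/bin/sh
# Build the DNS name queried by a DNSBL lookup: the IPv4 octets in
# reverse order, followed by the zone. dnsbl_qname is a hypothetical name.
dnsbl_qname() {
    ip=$1
    zone=$2
    echo "$ip" | awk -F. -v z="$zone" \
        'NF == 4 { print $4 "." $3 "." $2 "." $1 "." z }'
}

dnsbl_qname 222.15.69.197 bl.spamcop.net
# prints 197.69.15.222.bl.spamcop.net
```

Timing the lookup is then a matter of running something like time dig +short "$(dnsbl_qname 222.15.69.197 bl.spamcop.net)" on the mail server. Answers that come back in milliseconds make slow DNS an unlikely cause.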

Re: [milter-greylist] Re: (greylist): timeout before data read and (greylist): to error state

2006-10-18 by manu@netbsd.org

Brian J. Lewis <brianlewis@...> wrote:

> It did, there is only one libbind.so.4 (actually a symbolic link to 
> libbind.so.4.0.2) in /usr/lib/ and LDD shows that this is where its 
> linked.  libbind.so is also a symbolic link to the same file as well.
> These are from Bind 9.3.2 which is thread safe.
(snip)
>  libbind.so.4 => /usr/lib/libbind.so.4 (0x00d2e000)

Are you sure that one is from BIND9? Please run 
nm /usr/lib/libbind.so.4| grep res_ninit 

> I recompiled without DNSRBL and removed the DNSRBL lines in 
> greylist.conf and the servers have been stable!  In fact I only 
> recompiled on one server, the other 3 servers I just removed the 
> lines from greylist.conf and restarted Milter-Greylist and they are 
> stable as well, so there seems to be a bug in Milter-Greylist when 
> compiled on Fedora Core 5 or 6 with Bind 9.3.x that causes 
> instability under high load (2000-3000 messages rejected per hour)

That smells like thread unsafety.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu@...

Re: [milter-greylist] Re: (greylist): timeout before data read and (greylist): to error state

2006-10-18 by manu@netbsd.org

AIDA Shinra <shinra@...> wrote:

> I frequently see temporary "timeout before data read" simply because
> of high latency in lookuping SPF on my lesser traffic server. Brian's
> server can possibly be consuming a long time due to latency in
> lookuping DNSRBLs rather than deadlocking at somewhere. 

Then he would not have to restart the milter to get the whole thing
working again...

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu@...

Re: (greylist): timeout before data read and (greylist): to error state

2006-10-18 by hwdahm

--- In milter-greylist@yahoogroups.com, manu@... wrote:
>
> Brian J. Lewis <brianlewis@...> wrote:
> 
> > Maybe this is a clue?  Check this out, these are weird errors, its 
> > greylisting but its also erroring same time.  I can email you a GZIP 
> > of the log that has all this weird stuff in it if it will help.  
> > Maybe a corrupt greylist.db? Or just too large? (21 megs)
> 
> The "timeout before data read" simply means the milter stopped
> responding, probably because it crashed.
> 
> There are two usual suspects
> 1) DNS query with a thread-unsafe resolver (if you use DNSRBL or SPF)
> 2) system limit (ulimit) exhausted.
> 
> -- 
> Emmanuel Dreyfus
> http://hcpnet.free.fr/pubz
> manu@...
>

I had the same problems, but I don't use SPF or DNSRBL.
I used ./configure without arguments (I think --with-libbind is only
necessary with SPF or DNSRBL, am I right?).

So - imho - there is only one "usual suspect" left, i.e. system limit
exhausted, although ulimit -a shows an "unlimited" system.
Are there any other system restrictions to look at? Are there any
recommended sizes of RAM or virtual memory?

Deactivating mx-syncing made my systems more reliable.

Regards - and thank you all for your work and assistance
Hans

Re: (greylist): timeout before data read and (greylist): to error state

2006-10-18 by Brian J. Lewis

I run a dedicated server using rbldnsd, rsync'd databases and dnscache;
all it does is DNS for the mailscanners!  It's wicked fast and creates
no delays.  Restarting milter-greylist fixes the issue for a period
of time.  Thanks for your comments!

Re: (greylist): timeout before data read and (greylist): to error state

2006-10-18 by Brian J. Lewis

nm : libbind.so.4.0.2 : no symbols   :(

libbind.so.4.0.2 is part of bind-libs-9.3.2-20.FC5

Searching the system reveals only this libbind.so.4.0.2 (and the
symbolic link to it); reinstalling bind-libs-9.3.2-20.FC5 results in
the same file being created.

I went ahead and downloaded the latest development packages:
bind-libs-9.3.2-41.fc6.i386.rpm
bind-utils-9.3.2-41.fc6.i386.rpm
then ran rpm -U bind-util* bind-libs*

The new file (libbind.so.4.0.2) was created Sep 11th, 2006.
Same symbol results (nm: libbind.so.4.0.2: no symbols).

Obviously this is a BIND 9.3.2-41 file, so it has to be thread safe.

Database size is 40 megs now (greylist.db)
System stable after 24 hours with DNSRBL disabled in greylist.conf, 
no MX Sync.

All your comments are appreciated!  I would love to implement
DNSRBL if it can be stable on Fedora Core 5 / 6.


Re: (greylist): timeout before data read and (greylist): to error state

2006-10-18 by Brian J. Lewis

While most of my scanners have 512 MB of RAM, the main scanner has a
gig of RAM, and 'free' shows 946 megs used, 85 megs available and 445
megs in buffers/cache, so it's operating just fine.  ps -aux looks
good in terms of the number of processes and % of memory.  I can't see
this being a resource issue given the above. Thanks for your
comments! We'll get this figured out eventually!

Re: (greylist): timeout before data read and (greylist): to error state

2006-10-19 by Brian J. Lewis

Yep, it's thread safe!

[root@mailscanner5 lib]# nm libbind.so -D | grep res_ninit
0003b980 T __res_ninit


Btw, it's been stable all day today with MX sync enabled, so it's just
DNSRBL that I can't enable on Fedora Core with BIND 9.3.2!

greylist Database Size : 80 megs!

--- In milter-greylist@yahoogroups.com, manu@... wrote:
>
> Brian J. Lewis <brianlewis@...> wrote:
> 
> > nm : libbind.so.4.0.2 : no symbols   :(
> 
> Try nm -D
> 
> -- 
> Emmanuel Dreyfus
> http://hcpnet.free.fr/pubz
> manu@...
>
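
Plain nm reads the regular symbol table, which is removed from stripped libraries (hence "no symbols"), while nm -D reads the dynamic symbol table that a shared library must keep. The sketch below checks nm output for a given exported function. The helper name has_text_symbol is made up, and a positive result only shows that the library exports the re-entrant res_ninit entry point; it says nothing about how a particular binary uses the library.

```shell
#!/bin/sh
# Succeed if `nm` output on stdin lists <symbol> as a defined text ("T")
# symbol. Leading underscores and any "@VERSION" suffix are ignored.
# has_text_symbol is a hypothetical helper name.
has_text_symbol() {
    awk -v s="$1" '
        $2 == "T" {
            n = $3
            sub(/@.*/, "", n)     # drop a symbol-version suffix, if present
            sub(/^_+/, "", n)     # drop leading underscores
            if (n == s) found = 1
        }
        END { exit !found }
    '
}

# Demo on the line Brian posted.
if echo '0003b980 T __res_ninit' | has_text_symbol res_ninit; then
    echo "res_ninit is exported"
fi
```

Against the real library this would be: nm -D /usr/lib/libbind.so.4 | has_text_symbol res_ninit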

Re: (greylist): timeout before data read and (greylist): to error state

2006-10-19 by randersson2

--- In milter-greylist@yahoogroups.com, "Brian J. Lewis"
<brianlewis@...> wrote:
>
> Milter-greylist is working:
> Oct 17 09:38:32 mailscanner6 milter-greylist: k9HGcLRM031656: addr 
> wmflb12na02.ezweb.ne.jp[222.15.69.197] from <> to 
> <xvafeyaoef@...> delayed for 00:04:14 (ACL 111)
> 
> Then all of a sudden:
> Oct 17 09:38:42 mailscanner6 sendmail[31656]: k9HGcLRM031656: Milter 
> (greylist): timeout before data read
> Oct 17 09:38:42 mailscanner6 sendmail[31624]: k9HGcL2F031624: Milter 
> (greylist): timeout before data read
> Oct 17 09:38:44 mailscanner6 sendmail[31679]: k9HGcOwh031679: Milter 
> (greylist): timeout before data read

FWIW I also had milter-greylist crash on me on FC3, FC5 and RHEL4
configured with --enable-dnsrbl.

If you just do --enable-dnsrbl, then you get linking errors during
make on all of FC3, FC5 and RHEL4.

If you do --enable-dnsrbl --with-libbind=/usr then the make is OK, but
the resulting binary crashes as described by Brian above.

What does work is to:
configure --enable-dnsrbl            (Note, *no* --with-libbind)
Then edit the resulting Makefile and replace:
LIBS=            -lresolv -lnsl -lpthread -lmilter
with:
LIBS=            /usr/lib/libresolv.a -lnsl -lpthread -lmilter

After this the make works and the resulting binary works fine even
under heavy load.

Something is really strange here. Why are libresolv.a and libresolv.so
different? And why doesn't libbind.so appear to be thread safe? Go
figure, the above workaround made things work for me.

Regards, Robert.
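
The Makefile edit described above can be scripted so that it is repeatable after each ./configure run. This is a sketch of that same substitution with sed; static_resolv is a made-up name, and /usr/lib/libresolv.a is the path from the message above (it may live elsewhere, for example on 64-bit systems).

```shell
#!/bin/sh
# Rewrite the LIBS line of a generated Makefile so that libresolv is
# linked statically. Reads the Makefile on stdin, writes to stdout.
# static_resolv is a hypothetical helper name.
static_resolv() {
    sed '/^LIBS=/ s|-lresolv|/usr/lib/libresolv.a|'
}

# Demo on the LIBS line quoted in the message above.
echo 'LIBS=            -lresolv -lnsl -lpthread -lmilter' | static_resolv
# prints LIBS=            /usr/lib/libresolv.a -lnsl -lpthread -lmilter
```

A full build would then look like: ./configure --enable-dnsrbl && static_resolv < Makefile > Makefile.new && mv Makefile.new Makefile && make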

thread-unsafety on RedHat solved

2006-10-19 by manu@netbsd.org

randersson2 <robert@...> wrote:

> What does work is to:
> configure --enable-dnsrbl            (Note, *no* --with-libbind)
> Then edit the resulting Makefile and replace:
> LIBS=            -lresolv -lnsl -lpthread -lmilter
> with:
> LIBS=            /usr/lib/libresolv.a -lnsl -lpthread -lmilter
> 
> After this the make works and the resulting binary works fine even
> under heavy load.
> 
> Something is really strange here. Why is libresolv.a and libresolv.so
> different? And why doesn't libbind.so appear to be threadsafe? Go
> figure, the above workaround made things work for me.

Very strange indeed. If you build milter-greylist with -lresolv -lnsl
-lpthread -lmilter, does ldd show that /usr/lib/libresolv.so is used?

Do you have multiple /usr/lib/libresolv.so.*?   

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu@...

Re: thread-unsafety on RedHat solved

2006-10-20 by randersson2

--- In milter-greylist@yahoogroups.com, manu@... wrote:
>
> randersson2 <robert@...> wrote:
> 
> > What does work is to:
> > configure --enable-dnsrbl            (Note, *no* --with-libbind)
> > Then edit the resulting Makefile and replace:
> > LIBS=            -lresolv -lnsl -lpthread -lmilter
> > with:
> > LIBS=            /usr/lib/libresolv.a -lnsl -lpthread -lmilter
> > 
> > After this the make works and the resulting binary works fine even
> > under heavy load.
> > 
> > Something is really strange here. Why is libresolv.a and libresolv.so
> > different? And why doesn't libbind.so appear to be threadsafe? Go
> > figure, the above workaround made things work for me.
> 
> Very strange indeed. If you build a milter-greylist with -lresolv -lnsl
> -lpthread -lmilter, does ldd shows /usr/lib/libresolv.so is used?

That library mix is what I get with:
./configure --enable-dnsrbl 
This doesn't link at all, the error message is:
dnsrbl.o: In function `dnsrbl_check_source':
/local/home/robert/milter-greylist-3.0rc5/dnsrbl.c:167: undefined reference to `__ns_initparse'
/local/home/robert/milter-greylist-3.0rc5/dnsrbl.c:174: undefined reference to `__ns_parserr'

ns_initparse and ns_parserr are #defined to __ns_initparse and
__ns_parserr in /usr/include/arpa/nameser.h, but in
/usr/lib/libresolv.so (which symlinks to /lib/libresolv.so.2, which in
turn symlinks to /lib/libresolv-2.4.so) these two symbols are defined as
local.

In /usr/lib/libresolv.a the two symbols are global and the link succeeds.

Hmm, looks like a glibc 2.4 bug if you ask me.


> Do you have multiple /usr/lib/libresolv.so.*?  

No.


It seems this has been an issue in the glibc resolver for years, just
google for:
ns_initparse glibc


Regards, Robert.
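
The local-versus-global difference described above can be read off nm output: nm prints a lowercase type letter for local symbols and an uppercase one for global symbols, with u, v and w as lowercase exceptions for unique and weak globals. The sketch below classifies one symbol that way. symbol_binding is a made-up name, and the two demo lines use invented addresses purely for illustration; they are not output from the systems in this thread.

```shell
#!/bin/sh
# Classify one symbol from `nm` output (on stdin) by its type letter.
# symbol_binding is a hypothetical helper name.
symbol_binding() {
    awk -v s="$1" '
        $NF == s {
            t = $(NF - 1)                     # type letter precedes the name
            if (t == "U")             kind = "undefined"
            else if (t ~ /^[uvw]$/)   kind = "weak or unique global"
            else if (t ~ /^[a-z]$/)   kind = "local"
            else                      kind = "global"
            print s ": " kind " (" t ")"
        }
    '
}

# Hypothetical sample lines (addresses invented for the demo).
echo '00005a10 t __ns_initparse' | symbol_binding __ns_initparse
echo '00000000 T __ns_initparse' | symbol_binding __ns_initparse
```

Run against real files this would be nm /usr/lib/libresolv.a | symbol_binding __ns_initparse for the static archive. For a shared object, nm -D lists only dynamic symbols, so readelf -s may be the more informative tool there.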

Re: [milter-greylist] Re: thread-unsafety on RedHat solved

2006-10-20 by manu@netbsd.org

randersson2 <robert@...> wrote:

> Hmm, looks like a glibc 2.4 bug if you ask me.

I'm not sure I want to fight against that. 
Could you draft some documentation on the problem and the workarounds?
We could include that in the README, section 14.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu@...

Re: thread-unsafety on RedHat solved

2006-10-20 by randersson2

--- In milter-greylist@yahoogroups.com, manu@... wrote:
>
> randersson2 <robert@...> wrote:
> 
> > Hmm, looks like a glibc 2.4 bug if you ask me.
> 
> I'm not sure I want to fight against that. 
> Could you draft some documentation on the problem and the workarounds?
> We could include that in the README, section 14.

I'd like some more feedback from other users here first. I feel I sort
of stumbled onto something that worked for me, without really
understanding why.

Why doesn't the libbind distributed by Fedora and Redhat work
multithreaded?

And why have the glibc maintainers deliberately hidden some of the
documented interfaces in libresolv?

Surely I'm not the only one here using milter-greylist on a glibc 2.x
based Linux distribution with dnsrbl enabled?

Regards, Robert.

Re: [milter-greylist] (greylist): timeout before data read and (greylist): to error state

2006-11-20 by Jacques Beigbeder

Hello,

It was discussed last October, starting October 17.

I also get messages like this:

  >> Oct 17 09:38:42 mailscanner6 sendmail[31656]: k9HGcLRM031656: Milter (greylist): timeout before data read
  >> Oct 17 09:38:42 mailscanner6 sendmail[31624]: k9HGcL2F031624: Milter (greylist): timeout before data read

But my problem is related to the size of my tables.

My OS: FreeBSD 4.8, 1 GB memory, SCSI disks.
My mail-server gets 700.000 hits per day, that is 10 connections per second.
With a limited set of rules, I now have:
	-rw-------  1 root  wheel  128422056 Nov 20 14:35 /var/milter-greylist/greylist.db
	root   57002 11.2 23.9 249312 248876  ??  Ss   Tue10PM 381:08.14 /usr/sbin/milter-greylist...
/var/milter-greylist/greylist.db has now 1.263.835 lines.

The message 'timeout before data read' occurs every 30m (= dumpfreq),
for 14 seconds,
when milter-greylist writes /var/milter-greylist/greylist.db.

If all my email addresses are protected by milter-greylist,
the file will be 300 MB, the process will be 600 MB,
the delay to write the file will be 40 seconds,
and within 40 s, 10 sendmail/second gives 400 sendmail processes
waiting for an extended timeout:
	Xgreylist, S=local:/var/milter-greylist/milter-greylist.sock, T=C:1m;S:30s;R:2m;E:2m
                                                                      ^^^^^^^^^^^^^^^^^^^^^^
Splitting in 2 MX doesn't change the delay for writing greylist.db.
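
The arithmetic above can be reproduced with a few lines of shell. In sendmail's milter T= field, C is the connect timeout, S the timeout for sending to the filter, R the timeout for reading a reply, and E the end-of-message timeout. The helper names to_seconds and milter_timeout are made up for this sketch, and treating a bare number as seconds is an assumption.

```shell
#!/bin/sh
# Convert a timeout value such as "30s" or "2m" to seconds.
# to_seconds and milter_timeout are hypothetical helper names.
to_seconds() {
    case $1 in
        *s) echo "${1%s}" ;;
        *m) echo $(( ${1%m} * 60 )) ;;
        *h) echo $(( ${1%h} * 3600 )) ;;
        *)  echo "$1" ;;              # assumption: a bare number is seconds
    esac
}

# Look up one key (C, S, R or E) in a T= string like "C:1m;S:30s;R:2m;E:2m".
milter_timeout() {
    echo "$2" | tr ';' '\n' | awk -F: -v k="$1" '$1 == k { print $2 }'
}

T='C:1m;S:30s;R:2m;E:2m'
reply_timeout=$(to_seconds "$(milter_timeout R "$T")")
rate=10       # connections per second, as stated above
dump=40       # projected seconds to write greylist.db
echo "average rate: $(( 700000 / 86400 )) connections/s"
echo "backlog=$(( rate * dump )) reply_timeout=${reply_timeout}s"
# prints:
#   average rate: 8 connections/s
#   backlog=400 reply_timeout=120s
```

With these numbers a 40-second dump stays below the 2-minute read timeout, so the cost is the roughly 400 sendmail processes held open during the dump rather than timeouts as such.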

How can I solve this problem?

Some ideas:
. splitting /var/milter-greylist/greylist.db into several files,
  for instance 10 files of 30 MB each. Each file is still written
  every 30 mn, and milter-greylist writes one file every 3 mn.
  ( 3 mn = dumpfreq / #files )
. splitting milter-greylist into several processes, maybe
  on several computers. NB: this approach also solves
  the problem of CPU and memory: a set of MX talking
  to a set of milter-greylist daemons.

Thanks,

--
Jacques Beigbeder                    |  Jacques.Beigbeder@...
Service de Prestations Informatiques |     http://www.spi.ens.fr
Ecole normale supérieure             |
45 rue d'Ulm                         |Tel : (+33 1)1 44 32 37 96
F75230 Paris cedex 05                |Fax : (+33 1)1 44 32 20 75
