Yahoo Groups archive

Milter-greylist

Milter-greylist instability

2005-07-10 by Will Aoki

Since putting milter-greylist 2.0 into production a few days ago on a
low- to moderate-volume Sendmail 8.12.10 system, it's started failing
regularly after about 24 hours of operation.

During the live testing phase, I had it set to greylist about 8
addresses and whitelist everything else. During that time I never
observed it to die. When I put it into production, I removed the 'acl
greylist rcpt' and 'acl whitelist default' lines and replaced them with
'acl greylist default'.

Symptoms of the failure are the following lines in the syslog:

Jul  9 13:34:23 bluebird sm-mta[5352]: j69JYDPV005352: Milter (greylist): timeout before data read
Jul  9 13:34:23 bluebird sm-mta[5352]: j69JYDPV005352: Milter (greylist): to error state
Jul  9 13:34:23 bluebird sm-mta[5352]: j69JYDPV005352: Milter (greylist): init failed to open
Jul  9 13:34:23 bluebird sm-mta[5352]: j69JYDPV005352: Milter (greylist): to error state

During normal operation there are five milter-greylist processes
running. When failure occurs, there's only a single running
milter-greylist process.

For the failure reported above, the greylist.db database was last
written about 13 minutes before the crash and contains 654 records.

I have not yet noticed any pattern in the messages which it processes
just before failure.

I'm going to investigate further with a debugger to see if I can
determine the cause.

-- 
William Aoki     KD7YAF    waoki@...    5-1924

Re: [milter-greylist] Milter-greylist instability

2005-07-11 by manu@netbsd.org

Will Aoki <waoki@...> wrote:

> Since putting milter-greylist 2.0 into production a few days ago on a
> low- to moderate-volume Sendmail 8.12.10 system, it's started failing
> regularly after about 24 hours of operation.

Maybe you hit a system limit? 

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu@...
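
[Archive note] One limit that is easy to check is the number of open file descriptors. The following is a minimal sketch, assuming a Linux /proc filesystem; count_fds is a hypothetical helper and not part of milter-greylist. Reading another user's /proc entries usually requires root, and `ulimit -n` reports the limit of the current shell, which may differ from the limit the daemon was started with.

```shell
#!/bin/sh
# Sketch: count the open file descriptors of a process via /proc (Linux only).
# count_fds is a hypothetical helper name, not part of milter-greylist.

count_fds() {
    # Each entry under /proc/PID/fd is one open descriptor.
    # Prints 0 if the process does not exist or is not readable.
    ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
}

# Usage (pidfile path taken from the ps output quoted in this thread):
#   pid=$(cat /var/run/greylist.pid)
#   echo "open fds: $(count_fds "$pid"), shell soft limit: $(ulimit -n)"
```

A count that climbs steadily toward the limit over the daemon's uptime would point at a descriptor leak; a flat count would suggest looking elsewhere.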

Re: [milter-greylist] Milter-greylist instability

2005-08-01 by Will Aoki

On Mon, Jul 11, 2005 at 08:02:06AM +0200, manu@... wrote:
> Will Aoki <waoki@...> wrote:
> 
> > Since putting milter-greylist 2.0 into production a few days ago on a
> > low- to moderate-volume Sendmail 8.12.10 system, it's started failing
> > regularly after about 24 hours of operation.

I changed the milter options to T=C:3s;E:8s to reduce the effects of the
hanging milter, and milter-greylist started staying up for four to ten
days at a time. I wonder if it's related or just a coincidence?

> Maybe you hit a system limit? 

Possibly, but I can't think of what. System logs don't indicate anything
running out of threads or being killed for exceeding resource limits.
Milter-greylist memory size starts around 12M and slowly grows to the
20M to 25M range. Resource limits are set high enough that they
shouldn't cause problems unless something is broken and out of control.

I don't yet have this data for a milter-greylist compiled with debugging
symbols, but attaching GDB to a hung milter-greylist process and getting
a backtrace showed this call stack:

 #0  0x0fe0f4d8 in sigset () from /lib/libc.so.6
 #1  0x0ff38c90 in sigwait () from /lib/libpthread.so.0
 #2  0x10010288 in mi_signal_thread ()
 #3  0x0ff35428 in pthread_start_thread () from /lib/libpthread.so.0
 #4  0x0feb5e68 in clone () from /lib/libc.so.6

I should have more useful data the next time it dies, since I'm now
running a milter-greylist compiled with debugging symbols.



On a somewhat related note, has anyone else noticed that the supplied
rc-debian.sh doesn't work to stop the milter? It usually won't die
unless I kill all five of its processes.

-- 
William Aoki     KD7YAF    waoki@...    5-1924

Re: [milter-greylist] Milter-greylist instability

2005-08-01 by manu@netbsd.org

Will Aoki <waoki@...> wrote:

> I don't yet have this data for a milter-greylist compiled with debugging
> symbols, but attaching GDB to a hung milter-greylist process and getting
> a backtrace showed this call stack:
> 
>  #0  0x0fe0f4d8 in sigset () from /lib/libc.so.6
>  #1  0x0ff38c90 in sigwait () from /lib/libpthread.so.0
>  #2  0x10010288 in mi_signal_thread ()
>  #3  0x0ff35428 in pthread_start_thread () from /lib/libpthread.so.0
>  #4  0x0feb5e68 in clone () from /lib/libc.so.6

That's probably not the thread that caused the crash. Try bt on all the
threads.
 
> On a somewhat related note, has anyone else noticed that the supplied
> rc-debian.sh doesn't work to stop the milter? It usually won't die
> unless I kill all five of its processes.

Can you contribute a fix?

-- 
Emmanuel Dreyfus
Un bouquin en français sur BSD:
http://www.eyrolles.com/Informatique/Livre/9782212114638/livre-bsd.php
manu@...
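
[Archive note] "bt on all the threads" can be scripted so the same commands run every time the milter hangs. This is a minimal sketch: write_bt_commands is a hypothetical helper, and /tmp/bt.gdb is an arbitrary path. It relies only on the standard gdb commands `info threads` and `thread apply all bt` and on the `-batch`, `-x` and `-p` options; whether gdb can see every thread depends on the gdb and thread library versions in use.

```shell
#!/bin/sh
# Sketch: write a gdb command file that backtraces every thread.
# write_bt_commands is a hypothetical helper name.

write_bt_commands() {
    # "info threads" lists the threads gdb can see;
    # "thread apply all bt" prints a backtrace for each of them.
    cat > "$1" <<'EOF'
info threads
thread apply all bt
EOF
}

# Usage (pidfile path taken from the ps output quoted in this thread):
#   write_bt_commands /tmp/bt.gdb
#   sudo gdb -batch -x /tmp/bt.gdb -p "$(cat /var/run/greylist.pid)"
```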

Re: [milter-greylist] Milter-greylist instability

2005-08-04 by Will Aoki

On Mon, Aug 01, 2005 at 10:26:17PM +0200, manu@... wrote:
> Will Aoki <waoki@...> wrote:
> 
> > I don't yet have this data for a milter-greylist compiled with debugging
> > symbols, but attaching GDB to a hung milter-greylist process and getting
> > a backtrace showed this call stack:
> > 
> >  #0  0x0fe0f4d8 in sigset () from /lib/libc.so.6
> >  #1  0x0ff38c90 in sigwait () from /lib/libpthread.so.0
> >  #2  0x10010288 in mi_signal_thread ()
> >  #3  0x0ff35428 in pthread_start_thread () from /lib/libpthread.so.0
> >  #4  0x0feb5e68 in clone () from /lib/libc.so.6
> 
> That's probably not the thread that caused the crash. Try bt on all the
> threads.

All the other threads seem to have vanished mysteriously.

$ ps auxw | grep milter-grey | grep -v grep
milterg  24832  0.0  0.3 15600 2280 ?        S    Aug01   0:00 /usr/local/bin/milter-greylist -P /var/run/greylist.pid -u miltergreylist -p /var/run/milter-greylist/greylist.sock
$ sudo gdb -p 24832
GNU gdb 2002-04-01-cvs
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "powerpc-linux".
Attaching to process 24832
Reading symbols from /usr/local/stow/milter-greylist/bin/milter-greylist...done.
Reading symbols from /usr/lib/libspf-1.0.so.0...done.
Loaded symbols for /usr/lib/libspf-1.0.so.0
Reading symbols from /lib/libresolv.so.2...done.
Loaded symbols for /lib/libresolv.so.2
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/libpthread.so.0...done.
[New Thread 1024 (LWP 24828)]
Error while reading shared library symbols:
Can't attach LWP 24828: No such process
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld.so.1...done.
Loaded symbols for /lib/ld.so.1
Reading symbols from /lib/libnss_compat.so.2...done.
Loaded symbols for /lib/libnss_compat.so.2
Reading symbols from /lib/libnss_ldap.so.2...done.
Loaded symbols for /lib/libnss_ldap.so.2
Reading symbols from /usr/lib/libldap.so.2...done.
Loaded symbols for /usr/lib/libldap.so.2
Reading symbols from /usr/lib/liblber.so.2...done.
Loaded symbols for /usr/lib/liblber.so.2
Reading symbols from /lib/libdl.so.2...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/libcrypt.so.1...done.
Loaded symbols for /lib/libcrypt.so.1
Reading symbols from /usr/lib/libsasl.so.7...done.
Loaded symbols for /usr/lib/libsasl.so.7
Reading symbols from /usr/lib/libkrb4.so.2...done.
Loaded symbols for /usr/lib/libkrb4.so.2
Reading symbols from /usr/lib/libdes425.so.3...done.
Loaded symbols for /usr/lib/libdes425.so.3
Reading symbols from /usr/lib/libkrb5.so.3...done.
Loaded symbols for /usr/lib/libkrb5.so.3
Reading symbols from /usr/lib/libk5crypto.so.3...done.
Loaded symbols for /usr/lib/libk5crypto.so.3
Reading symbols from /lib/libcom_err.so.2...done.
Loaded symbols for /lib/libcom_err.so.2
Reading symbols from /usr/lib/libssl.so.0.9.6...done.
Loaded symbols for /usr/lib/libssl.so.0.9.6
Reading symbols from /usr/lib/libcrypto.so.0.9.6...done.
Loaded symbols for /usr/lib/libcrypto.so.0.9.6
Reading symbols from /lib/libdb2.so.2...done.
Loaded symbols for /lib/libdb2.so.2
Reading symbols from /lib/libpam.so.0...done.
Loaded symbols for /lib/libpam.so.0
Reading symbols from /lib/libnss_files.so.2...done.
Loaded symbols for /lib/libnss_files.so.2
Reading symbols from /usr/lib/sasl/libgssapiv2.so...done.
Loaded symbols for /usr/lib/sasl/libgssapiv2.so
Reading symbols from /usr/lib/libgssapi_krb5.so.2...done.
Loaded symbols for /usr/lib/libgssapi_krb5.so.2
Reading symbols from /usr/lib/sasl/libcrammd5.so...done.
Loaded symbols for /usr/lib/sasl/libcrammd5.so
Reading symbols from /usr/lib/sasl/libanonymous.so...done.
Loaded symbols for /usr/lib/sasl/libanonymous.so
Reading symbols from /usr/lib/sasl/libplain.so...done.
Loaded symbols for /usr/lib/sasl/libplain.so
Reading symbols from /usr/lib/sasl/liblogin.so...done.
Loaded symbols for /usr/lib/sasl/liblogin.so
Reading symbols from /lib/libnss_dns.so.2...done.
Loaded symbols for /lib/libnss_dns.so.2
0x0fe0f4d8 in sigset () from /lib/libc.so.6
(gdb) info thread
[New Thread 2049 (LWP 24829)]
Can't attach LWP 24829: No such process
(gdb) bt
#0  0x0fe0f4d8 in sigset () from /lib/libc.so.6
#1  0x0ff38c90 in sigwait () from /lib/libpthread.so.0
#2  0x10015204 in mi_signal_thread ()
#3  0x0ff35428 in pthread_start_thread () from /lib/libpthread.so.0
#4  0x0feb5e68 in clone () from /lib/libc.so.6
(gdb) quit
A debugging session is active.
Do you still want to close the debugger?(y or n) y
Can't detach LWP 24828: No such process
(gdb)

> > On a somewhat related note, has anyone else noticed that the supplied
> > rc-debian.sh doesn't work to stop the milter? It usually won't die
> > unless I kill all five of its processes.
> 
> Can you contribute a fix?

So far as I can tell, in order to kill milter-greylist, I need to signal
all its processes, not just the one whose pid is recorded in the
pidfile. I'll need to tinker with some other systems to see if that's
what the other scripts do - if so, I can just modify the Debian init
script as a quick solution; otherwise, I'll have to dig into the
milter's code.

-- 
William Aoki     KD7YAF    waoki@...    5-1924
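
[Archive note] Signalling every process rather than only the pidfile's PID could look roughly like the sketch below. It is not the supplied rc-debian.sh and not a tested fix; matching_pids is a hypothetical helper, and the match is on the command name as printed by `ps -o comm=`, which some systems truncate (the name "milter-greylist" is exactly 15 characters, so it fits the usual Linux limit).

```shell
#!/bin/sh
# Sketch: pick every PID whose command name matches, from "PID COMMAND" lines.
# matching_pids is a hypothetical helper name; it reads ps output on stdin.

matching_pids() {
    awk -v name="$1" '$2 == name { print $1 }'
}

# Usage in a stop action (sends the default SIGTERM to each match):
#   for pid in $(ps -e -o pid= -o comm= | matching_pids milter-greylist); do
#       kill "$pid"
#   done
```

Matching on the exact command name avoids killing unrelated processes that merely mention milter-greylist in their arguments, such as an editor or a grep.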

Re: [milter-greylist] Milter-greylist instability

2005-08-05 by Matthias Scheler

On Thu, Aug 04, 2005 at 01:08:27PM -0600, Will Aoki wrote:
> So far as I can tell, in order to kill milter-greylist, I need to signal
> all its processes, not just the one whose pid is recorded in the
> pidfile. I'll need to tinker with some other systems to see if that's
> what the other scripts do - if so, I can just modify the Debian init
> script as a quick solution; otherwise, I'll have to dig into the
> milter's code.

This should really be handled in the Linux init scripts, because it is
actually a Linux bug. POSIX threads should all share the same PID, which
is the case under Solaris, NetBSD and, IIRC, Linux with the new thread
implementation.

Your system is apparently using the old Linux thread implementation
based on the "clone" system call which used a PID per thread.

	Kind regards

-- 
Matthias Scheler                                  http://scheler.de/~matthias/
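
[Archive note] On glibc systems that support it, `getconf GNU_LIBPTHREAD_VERSION` reports which thread implementation is in use. The sketch below classifies that string; thread_impl is a hypothetical helper, and older glibc versions may not know the variable at all, in which case the result is "unknown".

```shell
#!/bin/sh
# Sketch: classify the string printed by "getconf GNU_LIBPTHREAD_VERSION".
# thread_impl is a hypothetical helper name.

thread_impl() {
    case "$1" in
        NPTL*)         echo nptl ;;          # one PID shared by all threads
        linuxthreads*) echo linuxthreads ;;  # one PID per thread
        *)             echo unknown ;;
    esac
}

# Usage:
#   thread_impl "$(getconf GNU_LIBPTHREAD_VERSION 2>/dev/null)"
```

An init script could use the result to decide whether signalling the pidfile's PID is enough or whether every process has to be signalled.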
