Yahoo Groups archive

Milter-greylist

Re: [milter-greylist] thread leak

2014-02-14 by Emmanuel Dreyfus

On Thu, Feb 13, 2014 at 08:48:00AM +0000, Bruncsak, Attila wrote:
> Did you had libmilter compilation option defined "_FFR_WORKERS_POOL"  ?
> (FFR: for future release)
> By the way, your libmilter is coming from which version of sendmail?

No, and it is 8.14.7.

But I managed to track down the offending code. It was tricky because
once milter-greylist has too many threads, gdb becomes unable to
inspect them. Catching the process early enough (351 threads) gives me this:

#0  0x00007f7ff6875d6a in ___lwp_park50 () from /usr/lib/libc.so.12
#1  0x00007f7ff70088f1 in ?? () from /usr/lib/libpthread.so.1
#2  0x00007f7ff78245a1 in ldap_send_initial_request ()
   from /usr/pkg/lib/libldap_r-2.4.so.2
#3  0x00007f7ff7815668 in ldap_pvt_search ()
   from /usr/pkg/lib/libldap_r-2.4.so.2 
#4  0x00007f7ff781576f in ldap_pvt_search_s ()
   from /usr/pkg/lib/libldap_r-2.4.so.2
#5  0x00007f7ff7815839 in ldap_search_ext_s ()
   from /usr/pkg/lib/libldap_r-2.4.so.2
#6  0x00000000004157d9 in ldapcheck_validate (ad=<optimized out>,
    stage=<optimized out>, ap=0x7f7fdffff4d0, priv=0x7f7ff5110800)
    at ldapcheck.c:502
#7  0x00000000004120e8 in acl_filter (stage=AS_RCPT, ctx=<optimized out>,
    priv=0x7f7ff5110800) at acl.c:2407
#8  0x0000000000408f53 in real_envrcpt (ctx=0x7f7ff7332220,
    envrcpt=0x7f7ff511b3d0) at milter-greylist.c:725
#9  0x000000000040928f in mlfi_envrcpt (ctx=0x7f7ff7332220,
    envrcpt=0x7f7ff511b3d0) at milter-greylist.c:230
#10 0x00000000004231b6 in st_rcpt ()
#11 0x000000000042301a in mi_engine ()
#12 0x0000000000420bbf in mi_handle_session ()
#13 0x000000000041faf9 in mi_thread_handle_wrapper ()
#14 0x00007f7ff700b2ce in ?? () from /usr/lib/libpthread.so.1
#15 0x00007f7ff6875d80 in ___lwp_park50 () from /usr/lib/libc.so.12

ldap_send_initial_request() uses two mutexes. I think one thread gets
stuck opening the connection or sending the request, and the other
threads wait.

The timelimit option of ldap_search_ext_s() will not help: this is
a server-side timeout for the request.

I think the fix is to start a new LDAP connection when we detect the
deadlock. Detection could trigger either because the number of threads
involved in LDAP operations reaches a threshold, or because the oldest
operation hits a timeout. I suspect the second approach is better.
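
To make the second approach concrete, here is a minimal sketch of
tracking the age of the oldest in-flight LDAP operation. It is not
milter-greylist code: struct ldap_inflight, the inflight_*() helpers
and LDAP_MAX_INFLIGHT are hypothetical names, and a real version would
need a mutex around the table since several milter threads would
update it.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define LDAP_MAX_INFLIGHT 64	/* hypothetical cap on tracked operations */

/* Hypothetical tracker: start time of each in-flight search, 0 = free slot. */
struct ldap_inflight {
	time_t started[LDAP_MAX_INFLIGHT];
};

/* Record the start of an operation; return its slot, or -1 if the table is full. */
static int
inflight_begin(struct ldap_inflight *t, time_t now)
{
	assert(t != NULL);
	for (int i = 0; i < LDAP_MAX_INFLIGHT; i++) {
		if (t->started[i] == 0) {
			t->started[i] = now;
			return i;
		}
	}
	return -1;
}

/* Mark an operation as finished; out-of-range slots are ignored. */
static void
inflight_end(struct ldap_inflight *t, int slot)
{
	assert(t != NULL);
	if (slot >= 0 && slot < LDAP_MAX_INFLIGHT)
		t->started[slot] = 0;
}

/* Age in seconds of the oldest in-flight operation, or 0 if there is none. */
static time_t
inflight_oldest_age(const struct ldap_inflight *t, time_t now)
{
	time_t oldest = 0;

	assert(t != NULL);
	for (int i = 0; i < LDAP_MAX_INFLIGHT; i++) {
		if (t->started[i] != 0 &&
		    (oldest == 0 || t->started[i] < oldest))
			oldest = t->started[i];
	}
	return (oldest == 0) ? 0 : now - oldest;
}

/* Non-zero when the oldest operation has been running longer than max_age. */
static int
inflight_is_stuck(const struct ldap_inflight *t, time_t now, time_t max_age)
{
	return inflight_oldest_age(t, now) > max_age;
}
```

The idea would be to call inflight_begin() just before the LDAP search
and inflight_end() right after it returns, and to have new requests
check inflight_is_stuck() before deciding to reuse the shared
connection.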

There is still a problem with that approach: correctly handling the
case where the LDAP directory is misbehaving. We do not want to open
an infinite number of connections if they all get stuck.
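
One possible way to bound this, sketched under assumptions: keep a
count of open connections (stuck ones included) and refuse to open a
new one past a hard cap. struct ldap_conn_budget, budget_try_open()
and budget_close() are hypothetical names, not existing
milter-greylist functions, and locking is again left out. What the
caller does on refusal (for example, failing the lookup) is also an
open choice.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical budget limiting how many LDAP connections may exist at once. */
struct ldap_conn_budget {
	int open;	/* connections currently open, stuck ones included */
	int max_open;	/* hard cap, a hypothetical tunable */
};

/*
 * Return 1 and account for a new connection if the cap allows it,
 * or 0 if the cap is reached and no new connection should be opened.
 */
static int
budget_try_open(struct ldap_conn_budget *b)
{
	assert(b != NULL);
	if (b->open >= b->max_open)
		return 0;
	b->open++;
	return 1;
}

/* Account for a connection that has been closed or abandoned. */
static void
budget_close(struct ldap_conn_budget *b)
{
	assert(b != NULL);
	if (b->open > 0)
		b->open--;
}
```

With a cap of, say, 2, a directory that hangs every connection would
cost at most two stuck connections instead of an unbounded number.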

-- 
Emmanuel Dreyfus
manu@...
