Yahoo Groups archive

Milter-greylist

Index last updated: 2026-04-28 23:32 UTC

crash on db dump (still)

2007-08-30 by Jeff A. Earickson

Hi,

New to the list, but I did search the list archives on this
issue...  I am having frequent crashes when milter-greylist
wants to write out its db file (platform: Sparc Solaris 10,
compiler gcc 4.2).

I started with version 3.0 last week.  I saw in the list where the issue was
"too many open files" and the general fix is to boost the process file
descriptors via ulimit or plimit.  I have done this in steps, going from 
the default of 256 to 2048 currently.
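
What ulimit or plimit does from the shell can also be done by the process itself. The following is a minimal sketch in C, assuming a POSIX system with getrlimit() and setrlimit(); the helper name raise_fd_limit is hypothetical and is not part of milter-greylist. As the rest of this thread shows, raising the descriptor limit does not by itself lift the separate stdio stream limit of 32-bit Solaris binaries.

```c
#include <sys/resource.h>

/*
 * Raise the soft limit on open file descriptors to at least `wanted`,
 * capped at the hard limit. The limit is never lowered.
 * On success stores the resulting soft limit in *result and returns 0;
 * returns -1 if the limit could not be read or changed.
 */
int raise_fd_limit(rlim_t wanted, rlim_t *result)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;

    if (rl.rlim_cur < wanted) {
        /* An unprivileged process cannot go above the hard limit. */
        rl.rlim_cur = (wanted > rl.rlim_max) ? rl.rlim_max : wanted;
        if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
            return -1;
    }
    *result = rl.rlim_cur;
    return 0;
}
```

Calling raise_fd_limit(2048, &got) early in startup would mirror the 2048 setting described above.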

This morning I installed 4.0b1; still running with 2048 maximum.
Instead of getting a crash like in 3.0:

    cannot write dumpfile "/var/milter-greylist/greylist.db-XXdpaOEl":
    Too many open files

I now get this with 4.0b1:

    cannot write dumpfile "/var/milter-greylist/greylist.db-XXn4aOmb":
    Error 0

I checked the permissions of /var/milter-greylist and the db file;
no problems.  Any ideas?

BTW, I saw a bunch of references to 3.1.x releases in the 4.0b1
ChangeLog.  But they aren't available for download on the website.
What gives???

Jeff Earickson
Colby College

Re: [milter-greylist] crash on db dump (still)

2007-08-30 by manu@netbsd.org

Jeff A. Earickson <jaearick@...> wrote:

> I now get this with 4.0b1:
> 
>     cannot write dumpfile "/var/milter-greylist/greylist.db-XXn4aOmb":
>     Error 0

Hum... errno was not set after a failed fdopen? Any Solaris expert that
could explain what is going on?
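
The temporary name in the error message (greylist.db-XX...) and the mention of fdopen() suggest a write-to-temp-then-rename pattern. The sketch below shows that general pattern in C. It is an assumption about the approach, not the actual milter-greylist source, and dump_atomically is a hypothetical name.

```c
#define _XOPEN_SOURCE 700   /* for mkstemp() and fdopen() declarations */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Write `text` to `path` through a temporary file in the same directory,
 * then rename the temporary file into place.
 * Returns 0 on success, -1 on failure (the temporary file is removed).
 */
int dump_atomically(const char *path, const char *text)
{
    char tmp[1024];
    FILE *fp;
    int fd;
    int n = snprintf(tmp, sizeof(tmp), "%s-XXXXXX", path);

    if (n < 0 || (size_t)n >= sizeof(tmp))
        return -1;
    if ((fd = mkstemp(tmp)) == -1)
        return -1;

    /* The step discussed in this thread: wrapping the descriptor in a FILE. */
    if ((fp = fdopen(fd, "w")) == NULL) {
        close(fd);
        unlink(tmp);
        return -1;
    }
    if (fputs(text, fp) == EOF) {
        fclose(fp);
        unlink(tmp);
        return -1;
    }
    if (fclose(fp) != 0 || rename(tmp, path) != 0) {
        unlink(tmp);
        return -1;
    }
    return 0;
}
```

With this pattern a failed dump leaves the previous file untouched, because the rename only happens after the new contents were written and closed successfully.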

Many Solaris users have trouble with this stupid libc limitation. There
are two ways of fixing that:
1) build a 64 bit binary
2) Give a try to Johann E. Klasek's patch, which I should integrate just
after 4.0 is released:
http://jk.kom.tuwien.ac.at/~jklasek/software/milter-greylist/mg.stdio-solaris.patch
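
The limitation referred to here is that the 32-bit Solaris stdio FILE structure keeps its file descriptor in a single byte, so a stream cannot be attached to a descriptor above 255. The following C sketch only illustrates that constraint. struct tiny_stream and both helpers are hypothetical and are not the real libc definitions.

```c
#include <limits.h>

/*
 * Illustration only: a stand-in for a stdio stream whose descriptor field
 * is one byte wide. This is NOT the real libc FILE structure.
 */
struct tiny_stream {
    unsigned char fd;   /* can only hold 0..UCHAR_MAX */
};

/* Returns 1 if `fd` can be stored in the one-byte field without loss. */
int fd_fits(int fd)
{
    return fd >= 0 && fd <= UCHAR_MAX;
}

/* Mimics an fdopen() that refuses descriptors it cannot represent. */
int tiny_attach(struct tiny_stream *s, int fd)
{
    if (!fd_fits(fd))
        return -1;
    s->fd = (unsigned char)fd;
    return 0;
}
```

A busy milter holds many sockets open, so the descriptor returned for the dump file can easily be above 255 even when the process limit is 2048, which is why raising the limit alone does not help a 32-bit build.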

> BTW, I saw a bunch of references to 3.1.x releases in the 4.0b1
> ChangeLog.  But they aren't available for download on the website.
> What gives???

ftp://ftp.espci.fr/pub/milter-greylist
But don't use 3.1.x, as 4.0b1 is just the same code base with many bug
fixes.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu@...

Re: [milter-greylist] crash on db dump (still)

2007-08-31 by Jeff A. Earickson

On Thu, 30 Aug 2007, manu@... wrote:

> Date: Thu, 30 Aug 2007 19:29:47 +0200
> From: manu@...
> Reply-To: milter-greylist@yahoogroups.com
> To: Milter Greylist list <milter-greylist@yahoogroups.com>
> Subject: Re: [milter-greylist] crash on db dump (still)
> 
> Jeff A. Earickson <jaearick@...> wrote:
>
>> I now get this with 4.0b1:
>>
>>     cannot write dumpfile "/var/milter-greylist/greylist.db-XXn4aOmb":
>>     Error 0
>
> Hum... errno was not set after a failed fdopen? Any Solaris expert that
> could explain what is going on?
>
> Many Solaris users have trouble with this stupid libc limitation. There
> are two ways of fixing that:
> 1) build a 64 bit binary
> 2) Give a try to Johann E. Klasek's patch, which I should integrate just
> after 4.0 is released:
> http://jk.kom.tuwien.ac.at/~jklasek/software/milter-greylist/
> mg.stdio-solaris.patch

After a day of running this patch, I have backed it out.  It did not help
and may have made things worse.  I had a dumpfile crash 7 times yesterday
and 6 times so far since midnight.  I have a cron job that restarts
milter-greylist if it disappears from the process list.
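
The liveness check behind such a restart job can be sketched in C as follows, assuming the daemon's process ID is known (for example, read from a pid file). The helper name process_alive is hypothetical, and this is not the cron job actually used here.

```c
#define _XOPEN_SOURCE 700   /* for the kill() declaration */
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/*
 * Returns 1 if a process with this pid exists, 0 if it does not,
 * and -1 if the pid is invalid or the check itself failed.
 */
int process_alive(pid_t pid)
{
    if (pid <= 0)
        return -1;
    /* Signal 0 performs the permission and existence checks only. */
    if (kill(pid, 0) == 0)
        return 1;
    if (errno == EPERM)
        return 1;       /* exists, but belongs to another user */
    if (errno == ESRCH)
        return 0;       /* no such process */
    return -1;
}
```

A watchdog would restart the daemon only when this returns 0.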

Jeff Earickson
Colby College

Re: [milter-greylist] crash on db dump (still)

2007-08-31 by manu@netbsd.org

Jeff A. Earickson <jaearick@...> wrote:

> After a day of running this patch, I have backed it out.  It did not help
> and may have made things worse.  I had a dumpfile crash 7 times yesterday
> and 6 times so far since midnight.  I have a cron job that restarts
> milter-greylist if it disappears from the process list.

Do you have the possibility of building a 64 bit binary, so that we can
completely rule out the Solaris-specific stream limit?

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu@...

Re: [milter-greylist] crash on db dump (still)

2007-08-31 by Jeff A. Earickson

On Fri, 31 Aug 2007, manu@... wrote:

> Date: Fri, 31 Aug 2007 14:42:40 +0200
> From: manu@...
> Reply-To: milter-greylist@yahoogroups.com
> To: Milter Greylist list <milter-greylist@yahoogroups.com>
> Subject: Re: [milter-greylist] crash on db dump (still)
> 
> Jeff A. Earickson <jaearick@...> wrote:
>
>> After a day of running this patch, I have backed it out.  It did not help
>> and may have made things worse.  I had a dumpfile crash 7 times yesterday
>> and 6 times so far since midnight.  I have a cron job that restarts
>> milter-greylist if it disappears from the process list.
>
> Do you have the possibility of building a 64 bit binary, so that we can
> completely rule out the Solaris-specific stream limit?

I thought about this for a minute.  I build both sendmail (8.14.1) and
Berkeleydb (4.6.19) with Sun's Studio 11 C compiler.  I was building
milter-greylist with gcc 4.2.  Maybe not a bright idea to mix compilers.
I have rebuilt and reinstalled milter-greylist with Sun C.  The make
output, and the configure script, are attached.  There are some minor
compile complaints.

I'll have to read the acc manpages to figure out how to do a 64 bit
compile.

BTW, how come the database isn't a real db?  I realized this when I
did an ldd and didn't see that BerkeleyDB was linked in?  I guess 
milter-greylist is standalone and maybe compiler mixing is not an
issue...

Jeff Earickson
Colby College

Re: [milter-greylist] crash on db dump (still)

2007-08-31 by Richard Whelan

Hi,

I'm also seeing this same problem, but only on one machine. I have three
identical systems, all running Solaris 9, Sendmail 8.14.1, Berkeley db
4.3.27, milter-greylist 3.0, mimedefang, spamassassin & clamav. All have
been built using gcc (but not 64 bit). Two of the machines are
absolutely fine, but on the other, when it tries to write to the
greylist.db file, it crashes. This afternoon, I managed to get a write
to the file at 13:07, but not since. Without affecting everything else,
I'm not going to be able to change these to 64bit either (being that
they have a lot of live traffic going through them).

Cheers,

Richard
-------- Original Message --------
Subject: Re:[milter-greylist] crash on db dump (still)
From: Jeff A. Earickson <jaearick@...>
To: Milter Greylist list <milter-greylist@yahoogroups.com>
Date: 31/08/2007 14:25
>
> On Fri, 31 Aug 2007, manu@... <mailto:manu%40netbsd.org> wrote:
>
> > Date: Fri, 31 Aug 2007 14:42:40 +0200
> > From: manu@... <mailto:manu%40netbsd.org>
> > Reply-To: milter-greylist@yahoogroups.com
> <mailto:milter-greylist%40yahoogroups.com>
> > To: Milter Greylist list <milter-greylist@yahoogroups.com
> <mailto:milter-greylist%40yahoogroups.com>>
> > Subject: Re: [milter-greylist] crash on db dump (still)
> >
> > Jeff A. Earickson <jaearick@... <mailto:jaearick%40colby.edu>>
> wrote:
> >
> >> After a day of running this patch, I have backed it out. It did not
> help
> >> and may have made things worse. I had a dumpfile crash 7 times
> yesterday
> >> and 6 times so far since midnight. I have a cron job that restarts
> >> milter-greylist if it disappears from the process list.
> >
> > Do you have the possibility of building a 64 bit binary, so that we can
> > completely rule out the Solaris-specific stream limit?
>
> I thought about this for a minute. I build both sendmail (8.14.1) and
> Berkeleydb (4.6.19) with Sun's Studio 11 C compiler. I was building
> milter-greylist with gcc 4.2. Maybe not a bright idea to mix compilers.
> I have rebuilt and reinstalled milter-greylist with Sun C. The make
> output, and the configure script, are attached. There are some minor
> compile complaints.
>
> I'll have to read the acc manpages to figure out how to do a 64 bit
> compile.
>
> BTW, how come the database isn't a real db? I realized this when I
> did an ldd and didn't see that BerkeleyDB was linked in? I guess
> milter-greylist is standalone and maybe compiler mixing is not an
> issue...
>
> Jeff Earickson
> Colby College
>

-- 
Richard Whelan
Senior Systems & NMS Administrator

Pipex Communications

Tel:  +44 (0) 1865 381568
Mob:  +44 (0) 7786 276020
Web:  http://www.pipex.com
 
This e-mail is subject to: http://www.pipex.net/disclaimer.html

Re: [milter-greylist] crash on db dump (still)

2007-08-31 by manu@netbsd.org

Jeff A. Earickson <jaearick@...> wrote:

> BTW, how come the database isn't a real db? 

Some time ago, I gave a try to a BDB backend, but the result was not
convincing. You can dig CVS for it if you are curious.

>  I realized this when I
> did an ldd and didn't see that BerkeleyDB was linked in?  I guess 
> milter-greylist is standalone and maybe compiler mixing is not an
> issue...

It should not be.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu@...

Re: [milter-greylist] crash on db dump (still)

2007-08-31 by Jeff A. Earickson

On Fri, 31 Aug 2007, manu@... wrote:

> Date: Fri, 31 Aug 2007 14:42:40 +0200
> From: manu@...
> Reply-To: milter-greylist@yahoogroups.com
> To: Milter Greylist list <milter-greylist@yahoogroups.com>
> Subject: Re: [milter-greylist] crash on db dump (still)
> 
> Jeff A. Earickson <jaearick@...> wrote:
>
>> After a day of running this patch, I have backed it out.  It did not help
>> and may have made things worse.  I had a dumpfile crash 7 times yesterday
>> and 6 times so far since midnight.  I have a cron job that restarts
>> milter-greylist if it disappears from the process list.
>
> Do you have the possibility of building a 64 bit binary, so that we can
> completely rule out the Solaris-specific stream limit?
>

Hi,

I have struggled with this a great deal today, using gcc -m64.  My first
error was configuration of milter-greylist:

    checking for smfi_register in -lmilter... no
    checking for smfi_register in -lmilter -lsm... no
    checking for smfi_register in -lmilter -lsmutil... no
    Required libmilter not found. Use --with-libmilter

This was something I did not get with 32 bit.  I found an old version
of libmilter in /usr/lib, and realized (a) libmilter was out-of-date
relative to sendmail, (b) it had been built with cc, not gcc, and (c)
it was a 32-bit library.  So I went back to my sendmail code and 
realized that libmilter does not automatically get built and installed
with sendmail.

So I tried to build a 64-bit gcc shared library version of libmilter
for use here.  The URL:

http://www.technoids.org/libmilter.so.html#Appendix

gave me a general approach on how to tackle this.  I'm to the point
where the .c files for libmilter compile just fine, but the shared
library won't link:

    gcc -O2 -I. -I../../sendmail   -I../../include  -I/opt/BerkeleyDB/include -I/opt/sasl/include/sasl -I/opt/openssl/include -I/usr/local/include -DSOLARIS=21000 -DNETINET6 -UNIS -UNISPLUS -DNEWDB -DSASL=2 -DSTARTTLS -DSM_CONF_LDAP_MEMFREE -DTCPWRAPPERS -DHASURANDOMDEV -DIDENTPROTO=0 -DDSN=0 -DNOT_SENDMAIL -Dsm_snprintf=snprintf -m64 -g -D_REENTRANT -DXP_MT  -c  strl.c
    gcc -lpthread -Wl,"-64" -G -o libmilter.so -h libmilter.so. main.o engine.o listener.o worker.o handler.o comm.o smfi.o signal.o sm_gethost.o monitor.o errstring.o strl.o
    ld: fatal: file /usr/local/lib/gcc/sparc-sun-solaris2.10/4.2.0/crt1.o: wrong ELF class: ELFCLASS32
    ld: fatal: File processing errors. No output written to libmilter.so

or:

    gcc -lpthread -G -o libmilter.so -h libmilter.so. main.o engine.o listener.o worker.o handler.o comm.o smfi.o signal.o sm_gethost.o monitor.o errstring.o strl.o
    ld: fatal: file main.o: wrong ELF class: ELFCLASS64
    ld: fatal: File processing errors. No output written to libmilter.so

Obviously, I'm mixing 32 and 64 bit binaries here but I don't know
why (nor how to fix it with loader options).  I googled on "wrong
ELF class" but didn't find an answer...
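
Both errors are the linker reporting that 32-bit and 64-bit objects were combined. One likely cause, offered as an assumption rather than a verified diagnosis, is that the link commands do not pass -m64 to the gcc driver, so it selects 32-bit startup files and a 32-bit link. To check which class a given object or library has, the fifth byte of its ELF header can be inspected. A minimal C sketch, with the hypothetical helper elf_class_bits:

```c
#include <stdio.h>
#include <string.h>

/*
 * Returns 32 or 64 according to the ELF class byte of the file at `path`,
 * 0 if the file is not ELF or the class is unknown,
 * and -1 if the file cannot be opened or is shorter than five bytes.
 */
int elf_class_bits(const char *path)
{
    unsigned char ident[5];
    size_t n;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;
    n = fread(ident, 1, sizeof(ident), fp);
    fclose(fp);
    if (n != sizeof(ident))
        return -1;

    /* Magic bytes: 0x7f 'E' 'L' 'F'. */
    if (memcmp(ident, "\177ELF", 4) != 0)
        return 0;

    /* Byte 4 is EI_CLASS: 1 = ELFCLASS32, 2 = ELFCLASS64. */
    switch (ident[4]) {
    case 1:  return 32;
    case 2:  return 64;
    default: return 0;
    }
}
```

Running this over main.o and the crt1.o named in the error would show which side of the link is 32-bit and which is 64-bit.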

Visions of Will Ferrell in his Elf outfit come to mind here.  :)

Any help here?

Jeff Earickson
Colby College

Re: [milter-greylist] crash on db dump (still)

2007-09-03 by Matthieu Herrb

manu@... wrote:
> Jeff A. Earickson <jaearick@...> wrote:
> 
>> I now get this with 4.0b1:
>>
>>     cannot write dumpfile "/var/milter-greylist/greylist.db-XXn4aOmb":
>>     Error 0
> 
> Hum... errno was not set after a failed fdopen? Any Solaris expert that
> could explain what is going on?
> 
> Many Solaris users have trouble with this stupid libc limitation. There
> are two ways of fixing that:
> 1) build a 64 bit binary
> 2) Give a try to Johann E. Klasek's patch, which I should integrate just
> after 4.0 is released:
> http://jk.kom.tuwien.ac.at/~jklasek/software/milter-greylist/
> mg.stdio-solaris.patch
> 

Sun has now released a "patch" for this issue. See
<http://developers.sun.com/solaris/articles/stdio_256.html> for details
and possible solutions.
-- 
Matthieu Herrb

Re: [milter-greylist] crash on db dump (FIX, MAYBE)

2007-09-04 by Jeff A. Earickson

On Mon, 3 Sep 2007, Matthieu Herrb wrote:

> Date: Mon, 03 Sep 2007 12:26:10 +0200
> From: Matthieu Herrb <matthieu.herrb@...>
> Reply-To: milter-greylist@yahoogroups.com
> To: milter-greylist@yahoogroups.com
> Subject: Re: [milter-greylist] crash on db dump (still)
> 
> manu@... wrote:
>> Jeff A. Earickson <jaearick@...> wrote:
>> 
>>> I now get this with 4.0b1:
>>>
>>>     cannot write dumpfile "/var/milter-greylist/greylist.db-XXn4aOmb":
>>>     Error 0
>> 
>> Hum... errno was not set after a failed fdopen? Any Solaris expert that
>> could explain what is going on?
>> 
> Sun has now released a "patch" for this issue. See 
> <http://developers.sun.com/solaris/articles/stdio_256.html> for details   and 
> possible solutions.
> -- 
> Matthieu Herrb

Matthieu,

Thank you, this webpage was most interesting reading.

Emmanuel,

I have reworked 4.0b1 per the "Programming Solutions" part of this
webpage.  Basically I added the "F" option to calls to fopen(), fdopen(),
and popen(), and I surrounded the changes with "ifdef __sun" blocks.
Attached is my makepatch output of my changes.  I recompiled with gcc
4.2 and reinstalled and I am running this code at my site now.
I will see how it performs and write back.
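
For readers without the attachment, the shape of such a change can be sketched as below. This is a minimal illustration assuming the "F" mode character described in the Sun article. The macro STDIO_MODE_EXT and the function open_for_dump are hypothetical names, not the ones used in the actual patch.

```c
#include <stdio.h>

/*
 * On Solaris, append "F" to stdio mode strings to request the extended
 * FILE facility; on every other platform the suffix is empty, so the
 * mode string is unchanged.
 */
#ifdef __sun
#define STDIO_MODE_EXT "F"
#else
#define STDIO_MODE_EXT ""
#endif

/* Open `path` for writing: mode "wF" on Solaris, plain "w" elsewhere. */
FILE *open_for_dump(const char *path)
{
    /* Adjacent string literals are concatenated at compile time. */
    return fopen(path, "w" STDIO_MODE_EXT);
}
```

The same suffix would be appended to the mode strings passed to fdopen() and popen(). Keeping the platform test in one macro avoids scattering ifdef blocks around every call site.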

I have a feature suggestion too.  It would be nice if one could send
a "kill" signal to milter-greylist (like HUP or SIGUSR1) to have the
process dump the greylist.db file.  Then the startup/shutdown script
could be modified so that the db file was dumped right before 
milter-greylist was shut down, eg:

   stop)
         # Stop daemons.
         echo "Dump greylist.db and shut down milter-greylist: ... \c"
         /usr/bin/pkill -HUP milter-greylist
         /usr/bin/pkill milter-greylist
         echo "done."

Jeff Earickson
Colby College

Solaris crash on db dump (FIXED!)

2007-09-05 by Jeff A. Earickson

>> Jeff A. Earickson <jaearick@...> wrote:
>> 
>>> I now get this with 4.0b1:
>>>
>>>     cannot write dumpfile "/var/milter-greylist/greylist.db-XXn4aOmb":
>>>     Error 0
>> 
>> Hum... errno was not set after a failed fdopen? Any Solaris expert that
>> could explain what is going on?
>> 
>> Many Solaris users have trouble with this stupid libc limitation. There
>> are two ways of fixing that:
>> 1) build a 64 bit binary
>> 2) Give a try to Johann E. Klasek's patch, which I should integrate just
>> after 4.0 is released:
>> http://jk.kom.tuwien.ac.at/~jklasek/software/milter-greylist/
>> mg.stdio-solaris.patch
>> 
>
> Sun has now released a "patch" for this issue. See 
> <http://developers.sun.com/solaris/articles/stdio_256.html> for details   and 
> possible solutions.
> -- 
> Matthieu Herrb
>

I am happy to report that my changes to the 4.0b1 code that I submitted
yesterday via makepatch have solved my issues with milter-greylist
crashing at db dump time.  Before the code change, it would crash at
least a dozen times a day; it has not crashed once since yesterday's
code change (15+ hours).  Many thanks to Mr. Herrb for finding that
Sun article.

Jeff Earickson
Colby College

Re: [milter-greylist] Solaris crash on db dump (FIXED!)

2007-09-05 by manu@netbsd.org

Jeff A. Earickson <jaearick@...> wrote:

> I am happy to report that my changes to the 4.0b1 code that I submitted
> yesterday via makepatch have solved my issues with milter-greylist
> crashing at db dump time.  Before the code change, it would crash at
> least a dozen times a day; it has not crashed once since yesterday's
> code change (15+ hours).  Many thanks to Mr. Herrb for finding that
> Sun article.

The change looks really minor, I wonder if this is safe for inclusion in
4.0b2. Could the F flag be harmful to some Solaris setups? Or is the
Solaris situation so bad that nobody can get milter-greylist working
without that change?

In the first case, we should probably make this available as a configure
option, disabled by default.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu@...

Re: [milter-greylist] crash on db dump (FIX, MAYBE)

2007-09-05 by manu@netbsd.org

Jeff A. Earickson <jaearick@...> wrote:

> I have a feature suggestion too.  It would be nice if one could send
> a "kill" signal to milter-greylist (like HUP or SIGUSR1) to have the
> process dump the greylist.db file.  Then the startup/shutdown script
> could be modified so that the db file was dumped right before 
> milter-greylist was shut down, eg:

The problem is that the milter API says that libmilter shall catch
signals. So we cannot use signals.

A workaround would be to open a Unix socket and receive instructions
there.
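
A minimal sketch of that workaround in C, assuming a stream-type Unix-domain socket and a one-word text command. The command names, the socket path passed by the caller, and all function names are hypothetical; nothing like this exists in milter-greylist as discussed here.

```c
#define _XOPEN_SOURCE 700
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

enum ctl_cmd { CTL_UNKNOWN = 0, CTL_DUMP, CTL_QUIT };

/* Map one received line to a command; the trailing newline is optional. */
enum ctl_cmd parse_ctl(const char *line)
{
    if (strcmp(line, "dump") == 0 || strcmp(line, "dump\n") == 0)
        return CTL_DUMP;
    if (strcmp(line, "quit") == 0 || strcmp(line, "quit\n") == 0)
        return CTL_QUIT;
    return CTL_UNKNOWN;
}

/* Create a listening Unix-domain socket at `path`. Returns the fd or -1. */
int ctl_listen(const char *path)
{
    struct sockaddr_un sa;
    int fd;

    if (strlen(path) >= sizeof(sa.sun_path))
        return -1;
    if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) == -1)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.sun_family = AF_UNIX;
    strcpy(sa.sun_path, path);
    unlink(path);               /* remove a stale socket file, if any */

    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) == -1 ||
        listen(fd, 4) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Accept one connection, read one short command, and return it. */
enum ctl_cmd ctl_read_one(int listen_fd)
{
    char buf[32];
    ssize_t n;
    int c = accept(listen_fd, NULL, NULL);

    if (c == -1)
        return CTL_UNKNOWN;
    n = read(c, buf, sizeof(buf) - 1);
    close(c);
    if (n <= 0)
        return CTL_UNKNOWN;
    buf[n] = '\0';
    return parse_ctl(buf);
}
```

A dedicated thread could loop on ctl_read_one() and trigger a dump on CTL_DUMP, which keeps signal handling entirely in libmilter's hands.
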
-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu@...

Re: [milter-greylist] Solaris crash on db dump (FIXED!)

2007-09-05 by shuttlebox

On 9/5/07, manu@... <manu@...> wrote:
> Jeff A. Earickson <jaearick@...> wrote:
>
>  > I am happy to report that my changes to the 4.0b1 code that I submitted
>  > yesterday via makepatch have solved my issues with milter-greylist
>  > crashing at db dump time.  Before the code change, it would crash at
>  > least a dozen times a day; it has not crashed once since yesterday's
>  > code change (15+ hours).  Many thanks to Mr. Herrb for finding that
>  > Sun article.
>
>  The change looks really minor, I wonder if this is safe for inclusion in
>  4.0b2. Could the F flag be harmful to some Solaris setups? Or is the
>  Solaris situation so bad that nobody can get milter-greylist working
>  without that change?
>
>  In the first case, we should probably make this available as a configure
>  option, disabled by default.

I for one would be very happy if you could include it in the next
beta, especially if it's an option, then no harm will be done.

I can't get it to run reliably either, sometimes it runs for weeks and
sometimes it crashes several times an hour.

I'm building it for the Blastwave project (http://www.blastwave.org)
but I can't release it when I know that it's not stable. I would just
get a lot of bug reports. :-)

Please include it since it concerns stability.

-- 
/peter

Re: [milter-greylist] Solaris crash on db dump (FIXED!)

2007-09-05 by Jeff A. Earickson

On Wed, 5 Sep 2007, manu@... wrote:

> Date: Wed, 5 Sep 2007 14:42:02 +0200
> From: manu@...
> Reply-To: milter-greylist@yahoogroups.com
> To: milter-greylist@yahoogroups.com
> Subject: Re: [milter-greylist] Solaris crash on db dump (FIXED!)
> 
> Jeff A. Earickson <jaearick@...> wrote:
>
>> I am happy to report that my changes to the 4.0b1 code that I submitted
>> yesterday via makepatch have solved my issues with milter-greylist
>> crashing at db dump time.  Before the code change, it would crash at
>> least a dozen times a day; it has not crashed once since yesterday's
>> code change (15+ hours).  Many thanks to Mr. Herrb for finding that
>> Sun article.
>
> The change looks really minor, I wonder if this is safe for inclusion in
> 4.0b2. Could the F flag be harmful to some Solaris setups? Or is the
> Solaris situation so bad that nobody can get milter-greylist working
> without that change?

IMHO, yes the situation is that bad.  Your code really is unstable without 
this change.  But....   The caveats here are:

    * You must be running Solaris 10.  There may be patches for Solaris 9
    and earlier, but I didn't see any reference to earlier releases in the
    Sun doc.

    * You must have the following patches installed if your version of S10
    is less than the release of July 2007 (which I haven't seen yet):

    SPARC platform:

        * 125100-04 Kernel Update Patch
        * 120473-05 libc nss ldap PAM zfs Patch
        * 125800-01 Fault Manager Patch

    x86/x64 platform:

        * 125101-04 Kernel Update Patch
        * 120037-15 libc nss ldap PAM zfs Patch
        * 125801-01 Fault Manager Patch

    * You must raise the file descriptor limit in the parent process/shell
    before launching milter-greylist via "ulimit -n".  I have attached my
    /etc/init.d script as an example.

I did take a look at the runtime option in the doc instead of the 
programming option.  It does work, but it seemed kludgy to me.  Since
your code "follows the rules" regarding the use of FILE, the use of "F"
was a quick fix for me.  You may not be happy cluttering up your code
with OS-specific ifdefs.

Probably the real solution is to get the Makefile to just automatically
build a 64-bit application for Solaris.  I couldn't quickly figure this
out; school is just starting and I don't have a lot of spare time right
now.

>
> In the first case, we should probably make this available as a configure
> option, disabled by default.

Words of warning about this problem in the README are always a good thing.

Jeff Earickson
Colby College

Re: [milter-greylist] Solaris crash on db dump (FIXED!)

2007-09-05 by Richard Whelan

-------- Original Message --------
Subject: Re:[milter-greylist] Solaris crash on db dump (FIXED!)
From: shuttlebox <shuttlebox@...>
To: milter-greylist@yahoogroups.com
Date: 05/09/2007 13:56

> 
> 
> On 9/5/07, manu@... <mailto:manu%40netbsd.org> <manu@...
> <mailto:manu%40netbsd.org>> wrote:
>> Jeff A. Earickson <jaearick@... <mailto:jaearick%40colby.edu>>
> wrote:
>>
>> > I am happy to report that my changes to the 4.0b1 code that I submitted
>> > yesterday via makepatch have solved my issues with milter-greylist
>> > crashing at db dump time. Before the code change, it would crash at
>> > least a dozen times a day; it has not crashed once since yesterday's
>> > code change (15+ hours). Many thanks to Mr. Herrb for finding that
>> > Sun article.
>>
>> The change looks really minor, I wonder if this is safe for inclusion in
>> 4.0b2. Could the F flag be harmful to some Solaris setups? Or is the
>> Solaris situation so bad that nobody can get milter-greylist working
>> without that change?
>>
>> In the first case, we should probably make this available as a configure
>> option, disabled by default.
> 
> I for one would be very happy if you could include it in the next
> beta, especially if it's an option, then no harm will be done.
> 
> I can't get it to run reliably either, sometimes it runs for weeks and
> sometimes it crashes several times an hour.
> 

What actually happens at the point of writing the DB file? Do you just
write the updates that have changed since the previous write, or dump
the whole DB every time? The reason I'm asking is that I have also made
a small change, but only to my config file, reducing the dump interval
from the default of 10m down to 3m. Since then, almost a week now, I
have not had the process crash on me once. Up until then, I was seeing
the process crash every 10 minutes for hours on end. It seems as though
it's potentially having to write less to the file.

This is still using version 3.0 on Solaris 9, not 10, and in a 32bit
environment.

Cheers,

Richard

-- 
Richard Whelan
Senior Systems & NMS Administrator

Pipex Communications

Tel:  +44 (0) 1865 381568
Mob:  +44 (0) 7786 276020
Web:  http://www.pipex.com

This e-mail is subject to: http://www.pipex.net/disclaimer.html

Re: [milter-greylist] crash on db dump (still)

2007-09-05 by Johann Klasek

On Thu, Aug 30, 2007 at 07:29:47PM +0200, manu@... wrote:
> Jeff A. Earickson <jaearick@...> wrote:
> 
> > I now get this with 4.0b1:
> > 
> >     cannot write dumpfile "/var/milter-greylist/greylist.db-XXn4aOmb":
> >     Error 0
> 
> Hum... errno was not set after a failed fdopen? Any Solaris expert that
> could explain what is going on?

Just for completeness:

Solaris man page (eg. for fdopen) says (even on Sol 10):

     The fdopen() function may fail and not set  errno  if  there
     are no free stdio streams.


This issue is addressed by

http://jk.kom.tuwien.ac.at/~jklasek/software/milter-greylist/mg.stdio-handling.patch

which seems not to interfere with other platforms but improves
compatibility somehow.


Johann Klasek

Re: [milter-greylist] crash on db dump (still)

2007-09-05 by Jeff A. Earickson

On Wed, 5 Sep 2007, Johann Klasek wrote:

> Date: Wed, 5 Sep 2007 15:49:53 +0200
> From: Johann Klasek <johann@...>
> Reply-To: milter-greylist@yahoogroups.com
> To: milter-greylist@yahoogroups.com
> Subject: Re: [milter-greylist] crash on db dump (still)
> 
> On Thu, Aug 30, 2007 at 07:29:47PM +0200, manu@... wrote:
>> Jeff A. Earickson <jaearick@...> wrote:
>>
>>> I now get this with 4.0b1:
>>>
>>>     cannot write dumpfile "/var/milter-greylist/greylist.db-XXn4aOmb":
>>>     Error 0
>>
>> Hum... errno was not set after a failed fdopen? Any Solaris expert that
>> could explain what is going on?
>
> Just for completness:
>
> Solaris man page (eg. for fdopen) says (even on Sol 10):
>
>     The fdopen() function may fail and not set  errno  if  there
>     are no free stdio streams.
>
>
> This issue is addressed by
>
> http://jk.kom.tuwien.ac.at/~jklasek/software/milter-greylist/mg.stdio-handling.patch
>
> which seems not to interfere with other platforms but improves
> compatibility somehow.
>
>
> Johann Klasek

Hi,

I tried this patch a few days ago (Sparc, using gcc 4.2) and it did not
help at all.  I had the same number of crashes as before.

Jeff Earickson
Colby College

Re: [milter-greylist] crash on db dump (still)

2007-09-05 by Chris Hoogendyk

Johann Klasek wrote:
> On Thu, Aug 30, 2007 at 07:29:47PM +0200, manu@... wrote:
>   
>> Jeff A. Earickson <jaearick@...> wrote:
>>
>>     
>>> I now get this with 4.0b1:
>>>
>>>     cannot write dumpfile "/var/milter-greylist/greylist.db-XXn4aOmb":
>>>     Error 0
>>>       
>> Hum... errno was not set after a failed fdopen? Any Solaris expert that
>> could explain what is going on?
>>     
>
> Just for completness:
>
> Solaris man page (eg. for fdopen) says (even on Sol 10):
>
>      The fdopen() function may fail and not set  errno  if  there
>      are no free stdio streams.
>
>
> This issue is addressed by
>
> http://jk.kom.tuwien.ac.at/~jklasek/software/milter-greylist/mg.stdio-handling.patch
>
> which seems not to interfere with other platforms but improves
> compatibility somehow.

I see that exact line in the man page on my Solaris 9 system. We are
still running the milter-greylist 1.6 that we installed over a year and
a half ago. At the time we had some issues with 2.0.2 and decided to
stick with 1.6. Since then, I've never bothered to update. Ahh,
remembering now, we are running poprelayd which creates popip.db from
the uw-imap.log and there is a patch to milter-greylist that looks up
the IP in popip.db and bypasses greylisting if it is found. The guy who
wrote that left just as 2.0.2 came out, and I hadn't had time to dig
into it yet when we went online.

We did have some issues with stability. I don't recall what we did to
make things better. It wasn't coding. Even so, we have a cron script
that runs every 15 minutes or so called greycheck. It checks to see that
greylist is still running; and, if it isn't, starts it up again and
sends us an email. It doesn't happen that often. Interesting thing is
that when it happened a couple of weeks ago, it happened on both of the
departmental servers we are responsible for. These servers are
independent, on different subnets, and with completely different users
and community of contacts.

I don't know if there is any connection between the code currently being
discussed and the version that we are running. It seems that a great
many of the features that have been added ought to have been pretty
independent of the db writing segment. Anyway, I just thought I would
throw this into the discussion.

One of my projects this fall is to update everything on our mail
servers. Up to now, we've been tuning and adjusting. But it seems time
to upgrade, especially if there seem to be some Solaris stability
improvements coming in.


---------------

Chris Hoogendyk

-
   O__  ---- Systems Administrator
  c/ /'_ --- Biology & Geology Departments
 (*) \(*) -- 140 Morrill Science Center
~~~~~~~~~~ - University of Massachusetts, Amherst 

<hoogendyk@...>

--------------- 

Re: [milter-greylist] crash on db dump (still)

2007-09-05 by Johann Klasek

On Wed, Sep 05, 2007 at 10:46:36AM -0400, Jeff A. Earickson wrote:
> On Wed, 5 Sep 2007, Johann Klasek wrote:
> 
> > Date: Wed, 5 Sep 2007 15:49:53 +0200
> > From: Johann Klasek <johann@...>
> > Reply-To: milter-greylist@yahoogroups.com
> > To: milter-greylist@yahoogroups.com
> > Subject: Re: [milter-greylist] crash on db dump (still)
> > 
> > On Thu, Aug 30, 2007 at 07:29:47PM +0200, manu@... wrote:
> >> Jeff A. Earickson <jaearick@...> wrote:
> >>
> >>> I now get this with 4.0b1:
> >>>
> >>>     cannot write dumpfile "/var/milter-greylist/greylist.db-XXn4aOmb":
> >>>     Error 0
> >>
> >> Hum... errno was not set after a failed fdopen? Any Solaris expert that
> >> could explain what is going on?
> >
> > Just for completness:
> >
> > Solaris man page (eg. for fdopen) says (even on Sol 10):
> >
> >     The fdopen() function may fail and not set  errno  if  there
> >     are no free stdio streams.
> >
> >
> > This issue is addressed by
> >
> > http://jk.kom.tuwien.ac.at/~jklasek/software/milter-greylist/mg.stdio-handling.patch
> >
> > which seems not to interfere with other platforms but improves
> > compatibility somehow.
> >
> >
> > Johann Klasek
> 
> Hi,
> 
> I tried this patch a few days ago (Sparc, using gcc 4.2) and it did not
> help at all.  I had the same number of crashes as before.

Sure, because the above patch is only one part of a patch cluster and
does *not* resolve the problem with the crashes on its own (see
http://jk.kom.tuwien.ac.at/~jklasek/software/milter-greylist/milter-greylist-4.0a6.patch
for a companion patch, slightly different, alas against 4.0a6).

You have to manually add module file_ext.o during the linkage process (Makefile)
and define the macro USE_FD_POOL (e.g. via CFLAGS).

When running, log messages like

    Sep  5 14:53:19 tuvok milter-greylist: [ID 411381 mail.info] fdopen_ext: get_pool_desc: descriptor 263 reused as 3

should appear (grep for "fdopen_ext").

We already have this on heavily used mail servers in a production
environment, and it really seems to work.
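
The log message suggests that a descriptor above 255 is moved onto a low-numbered one held in reserve. The following C sketch shows one way such a remapping could work with dup2(). It is inferred from the log text only and is not the code of the patch; reserve_low_fd and reuse_as_low are hypothetical names.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <unistd.h>

/*
 * Reserve a low-numbered descriptor early, while low numbers are still
 * free, by opening /dev/null as a placeholder. Returns the fd or -1.
 */
int reserve_low_fd(void)
{
    return open("/dev/null", O_RDONLY);
}

/*
 * Move `high_fd` onto the reserved slot. dup2() closes the placeholder
 * and makes `reserved_fd` refer to the same open file as `high_fd`,
 * after which the high descriptor is closed.
 * Returns the low descriptor, or -1 on failure (high_fd stays open).
 */
int reuse_as_low(int high_fd, int reserved_fd)
{
    if (high_fd == reserved_fd)
        return high_fd;
    if (dup2(high_fd, reserved_fd) == -1)
        return -1;
    close(high_fd);
    return reserved_fd;
}
```

The low descriptor returned by reuse_as_low() can then be handed to fdopen(), since it fits within the range a 32-bit Solaris stream can represent.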

Johann

RE: [milter-greylist] crash on db dump (FIX, MAYBE)

2007-09-05 by attila.bruncsak@itu.int

> I have a feature suggestion too.  It would be nice if one could send
> a "kill" signal to milter-greylist (like HUP or SIGUSR1) to have the
> process dump the greylist.db file.  Then the startup/shutdown script
> could be modified so that the db file was dumped right before 
> milter-greylist was shut down, eg:
> 
>    stop)
>          # Stop daemons.
>          echo "Dump greylist.db and shut down milter-greylist: ... \c"
>  		/usr/bin/pkill -HUP milter greylist
>          /usr/bin/pkill milter-greylist
>          echo "done."
> 
> Jeff Earickson
> Colby College
> 

As far as I know the dump always happens now on exit of milter-greylist
(HUP and TERM signals), at least with the recent beta version.

Bests,
Attila
