Yahoo Groups archive

Milter-greylist

Index last updated: 2026-04-28 23:32 UTC

milter-greylist-2.0.2 crash

2005-12-19 by Eugene Filatov

Hello.

I use milter-greylist-2.0.2 under SunOS 5.8 (SPARC) with sendmail 8.12.11
and alt_spf. Last week I enabled it for the whole site, and my db is
currently 55 MB. It seems that once the db reaches a certain size (or under
high load) milter-greylist starts to crash.

I wrote a small script to start greylist and first ran it on Friday:

#!/bin/sh

ulimit -HS -n 2048

while ( true )
do
date >> /var/milter-greylist/start.log
/usr/local/bin/milter-greylist -D
sleep 3
done

Now the log file contains:

# cat /var/milter-greylist/start.log
Fri Dec 16 10:04:06 EET 2005
Sun Dec 18 18:43:19 EET 2005
Sun Dec 18 20:40:15 EET 2005
Mon Dec 19 11:43:50 EET 2005
Mon Dec 19 13:23:16 EET 2005

It crashed 4 times within two days :( Does anyone have an idea how it can be
fixed?

Thanks.

Best Regards, 					mailto:eugenef@...
	Eugene.

Re: [milter-greylist] milter-greylist-2.0.2 crash

2005-12-19 by Emmanuel Dreyfus

On Mon, Dec 19, 2005 at 01:35:58PM +0200, Eugene Filatov wrote:
> It crashed 4 times within two days :( Does anyone have an idea how it can be
> fixed?

1) Does running with option nospf fix your problem? If it does, then that
means your libspf is linked with a thread-unsafe DNS resolver.

2) Change the Makefile and add -g in CFLAGS, rebuild milter-greylist, and
run it with gdb:
# gdb milter-greylist
(gdb) run -Fv

When it crashes, type bt and send the output.
-- 
Emmanuel Dreyfus
manu@...

Re: [milter-greylist] milter-greylist-2.0.2 crash

2005-12-21 by Eugene Filatov

On Mon, 19 Dec 2005, Emmanuel Dreyfus wrote:

> On Mon, Dec 19, 2005 at 01:35:58PM +0200, Eugene Filatov wrote:
> > It crashed 4 times within two days :( Does anyone have an idea how it can be
> > fixed?
> 1) Does running with option nospf fix your problem? If it does, then that
> means your libspf is linked with a thread-unsafe DNS resolver.

Unfortunately I could not find any options in spf_alt related to threads
or the resolver. I have now compiled milter-greylist with libspf2 and a
thread-safe resolver (bind) and it seems to work. I hope it stays stable.

One thing confused me: after I reinstalled milter-greylist, the owner of
/var/milter-greylist changed to root. After that milter-greylist could
not start because it could not create its socket. I run greylist as the
smmsp user.

I think it is not necessary to change the owner of this directory if it
already exists, and it would be good to have milter-greylist log an error
message such as "Can't create socket".


>
> 2) Change the Makefile and add -g in CFLAGS, rebuild milter-greylist, and
> run it with gdb:
> # gdb milter-greylist
> (gdb) run -Fv
>
> When it crashes, type bt and send the output.


Best Regards, 					mailto:eugenef@...
	Eugene.

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-25 by Eugene Filatov

On Mon, 19 Dec 2005, Emmanuel Dreyfus wrote:

> 2) Change the Makefile and add -g in CFLAGS, rebuild milter-greylist, and
> run it with gdb:
> # gdb milter-greylist
> (gdb) run -Fv

I upgraded to milter-greylist 2.1.2 and found that the -g flag is already in
CFLAGS in the Makefile. I don't want to run milter-greylist under gdb, because
I don't want the process to stay stopped when it crashes.

I removed the limit on greylist's core size but, unfortunately, no core was
dumped after the crash. Is it possible that greylist does not leave any
core file after a crash, or did I miss something?

Current limits for greylist:
 plimit 20180
20180:  /usr/local/bin/milter-greylist
   resource              current         maximum
  time(seconds)         unlimited       unlimited
  file(blocks)          unlimited       unlimited
  data(kbytes)          unlimited       unlimited
  stack(kbytes)         8192            unlimited
  coredump(blocks)      unlimited       unlimited
  nofiles(descriptors)  2048            2048
  vmemory(kbytes)       unlimited       unlimited

Currently I am running greylist as root.

> When it crashes, type bt and send the output.


Best Regards, 					mailto:eugenef@...
	Eugene.

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-25 by Emmanuel Dreyfus

On Wed, Jan 25, 2006 at 02:03:22PM +0200, Eugene Filatov wrote:
> I upgraded to milter-greylist 2.1.2 and found that the -g flag is already in
> CFLAGS in the Makefile. I don't want to run milter-greylist under gdb, because
> I don't want the process to stay stopped when it crashes.

Won't any process stop when it crashes?

> I removed the limit on greylist's core size but, unfortunately, no core was
> dumped after the crash. Is it possible that greylist does not leave any
> core file after a crash, or did I miss something?

It is not up to milter-greylist to decide that. Maybe it does not have write
access to its current directory.
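
The two preconditions mentioned so far (a non-zero core size limit and a writable current directory) can be checked from C. This is only a sketch: `core_dump_possible` is a hypothetical helper, not part of milter-greylist, and a system may have further policies that suppress core files even when both checks pass.

```c
#include <sys/resource.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of milter-greylist): returns 1 if the
 * soft core size limit is non-zero and the current directory is
 * writable, 0 otherwise. Passing both checks does not guarantee that
 * a core file will actually be written.
 */
int
core_dump_possible(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) == -1)
        return 0;
    if (rl.rlim_cur == 0)
        return 0;            /* core files disabled by the soft limit */
    /* access() checks with the real uid/gid, not the effective ones. */
    if (access(".", W_OK) == -1)
        return 0;
    return 1;
}
```

Logging the result once at startup would show whether either of the two conditions is the reason for a missing core file.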

> Current limits for greylist:
>  plimit 20180
> 20180:  /usr/local/bin/milter-greylist
>    resource              current         maximum
>   time(seconds)         unlimited       unlimited
>   file(blocks)          unlimited       unlimited
>   data(kbytes)          unlimited       unlimited
>   stack(kbytes)         8192            unlimited
>   coredump(blocks)      unlimited       unlimited
>   nofiles(descriptors)  2048            2048
>   vmemory(kbytes)       unlimited       unlimited

If it's not a limit issue, then it's a bug...

-- 
Emmanuel Dreyfus
manu@...

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-25 by Eugene Filatov

On Wed, 25 Jan 2006, Emmanuel Dreyfus wrote:

> > I upgraded to milter-greylist 2.1.2 and found that the -g flag is already in
> > CFLAGS in the Makefile. I don't want to run milter-greylist under gdb, because
> > I don't want the process to stay stopped when it crashes.
> Won't any process stop when it crashes?

I use a small script which starts greylist again after a crash. I don't know
how to restart it automatically under gdb. Another problem is that
sometimes greylist works well for two or three days, and sometimes it crashes
twice a day. I think that keeping gdb running on my server is not a very good
use of the server's resources.

Here, for example, is my "crash" log for the last few days:

Fri Jan 20 11:08:24 EET 2006
Fri Jan 20 14:29:30 EET 2006
Fri Jan 20 20:21:25 EET 2006
Mon Jan 23 10:27:59 EET 2006
Mon Jan 23 19:27:36 EET 2006
Tue Jan 24 16:44:13 EET 2006
Wed Jan 25 14:04:07 EET 2006
Wed Jan 25 14:56:23 EET 2006

> > I removed the limit on greylist's core size but, unfortunately, no core was
> > dumped after the crash. Is it possible that greylist does not leave any
> > core file after a crash, or did I miss something?
> It is not up to milter-greylist to decide that. Maybe it does not have write
> access to its current directory.

Usually I run greylist as the smmsp user from the /var/milter-greylist
directory (which is owned by smmsp). When I got no core as smmsp, I
started it as root, but the result is still the same: no core.

> If it's not a limit issue, then it's a bug...

How can I catch it? Any ideas?

Best Regards, 					mailto:eugenef@...
	Eugene.

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-25 by Eugene Filatov

On Wed, 25 Jan 2006, Emmanuel Dreyfus wrote:

> If it's not a limit issue, then it's a bug...

In addition to my previous letter: I found the following lines in my
syslog just before the crash. Maybe they will be helpful.

Jan 25 16:08:39 snark milter-greylist: [ID 525239 mail.error] cannot write
dumpfile "/var/milter-greylist/greylist.db-XXJeaaLX": No such file or directory
Jan 25 16:08:39 snark milter-greylist: [ID 421540 mail.info] Final
database dump: no change to dump
Jan 25 16:08:39 snark milter-greylist: [ID 649572 mail.info] Exiting

At the same time an empty file greylist.db-XXJeaaLX was created and left in
/var/milter-greylist/:

-rw-------   1 root     other          0 Jan 25 16:08 greylist.db-XXJeaaLX

Best Regards, 					mailto:eugenef@...
	Eugene.

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-25 by Emmanuel Dreyfus

On Wed, Jan 25, 2006 at 03:52:46PM +0200, Eugene Filatov wrote:
> I use a small script which starts greylist again after a crash. I don't know
> how to restart it automatically under gdb. Another problem is that
> sometimes greylist works well for two or three days, and sometimes it crashes
> twice a day. I think that keeping gdb running on my server is not a very good
> use of the server's resources.

I don't think it's such a problem. gdb will just sit quietly until the
kernel tells it that the traced process got a signal.

What you need is a way to detect that the milter crashed. On my system
(NetBSD), ps reports a traced and stopped program with TX in the flags
column. Maybe you can tweak your script so that it restarts milter-greylist
when it sees an instance in that state?

> > If it's not a limit issue, then it's a bug...
> How can I catch it? Any ideas?

We need to know where it happens, and a gdb backtrace will tell us that.

-- 
Emmanuel Dreyfus
manu@...

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-25 by Emmanuel Dreyfus

On Wed, Jan 25, 2006 at 04:16:37PM +0200, Eugene Filatov wrote:
> Jan 25 16:08:39 snark milter-greylist: [ID 525239 mail.error] cannot write
> dumpfile "/var/milter-greylist/greylist.db-XXJeaaLX": No such file or directory
> Jan 25 16:08:39 snark milter-greylist: [ID 421540 mail.info] Final
> database dump: no change to dump
> Jan 25 16:08:39 snark milter-greylist: [ID 649572 mail.info] Exiting
 
Oh then it does not crash, it quits. Forget the gdb story.

Here is the relevant code section:
        if ((dumpfd = mkstemp(newdumpfile)) == -1) {
                syslog(LOG_ERR, "mkstemp(\"%s\") failed: %s",
                    newdumpfile, strerror(errno));
                exit(EX_OSERR);
        }  
 
        if ((dump = fdopen(dumpfd, "w")) == NULL) {
                syslog(LOG_ERR, "cannot write dumpfile \"%s\": %s",
                    newdumpfile, strerror(errno));
                exit(EX_OSERR);
        } 

You say the file exists:
> -rw-------   1 root     other          0 Jan 25 16:08 greylist.db-XXJeaaLX

It sounds like an OS-specific mystery. Can you remind me what OS you use?

-- 
Emmanuel Dreyfus
manu@...

RE: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-25 by attila.bruncsak@itu.int

> In addition to my previous letter: I found the following lines in my
> syslog just before the crash. Maybe they will be helpful.
>
> Jan 25 16:08:39 snark milter-greylist: [ID 525239 mail.error] cannot write
> dumpfile "/var/milter-greylist/greylist.db-XXJeaaLX": No such file or directory
> Jan 25 16:08:39 snark milter-greylist: [ID 421540 mail.info] Final
> database dump: no change to dump
> Jan 25 16:08:39 snark milter-greylist: [ID 649572 mail.info] Exiting
>
> At the same time an empty file greylist.db-XXJeaaLX was created and left in
> /var/milter-greylist/:
>
> -rw-------   1 root     other          0 Jan 25 16:08 greylist.db-XXJeaaLX

It might be a disk-full condition, but more likely the milter has
run out of file pointers.

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-25 by Emmanuel Dreyfus

On Wed, Jan 25, 2006 at 03:56:15PM +0100, attila.bruncsak@... wrote:
> It might be a disk-full condition, but more likely the milter has
> run out of file pointers.

I recall a stupid limit on Solaris on the number of FILE * you could
open.  The thing was dependent on the ABI, as far as I recall.

-- 
Emmanuel Dreyfus
manu@...

RE: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-25 by attila.bruncsak@itu.int

> > It might be a disk-full condition, but more likely the milter has
> > run out of file pointers.
>
> I recall a stupid limit on Solaris on the number of FILE * you could
> open.  The thing was dependent on the ABI, as far as I recall.
>
Yes, I remember now. It simply does not work with 32-bit binaries;
you have to compile a 64-bit version.

RE: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-25 by Eugene Filatov

On Wed, 25 Jan 2006 attila.bruncsak@... wrote:

> > Jan 25 16:08:39 snark milter-greylist: [ID 525239 mail.error] cannot write
> > dumpfile "/var/milter-greylist/greylist.db-XXJeaaLX": No such file or directory
> It might be a disk-full condition, but more likely the milter has
> run out of file pointers.

I just looked at the file descriptors. greylist has its limit set to 2048
and currently uses approx. 150-180 descriptors (the mail server is under
average load right now). I don't think the limit of 2048 descriptors can
be reached in my case.

To avoid problems with disk space I moved greylist's db file to another file
system with plenty of free space.

Will write about results later.

Anyway, is it a good thing to exit if we're unable to save the db file to disk?
I think it's better to write an error message to the log and keep running.

Best Regards, 					mailto:eugenef@...
	Eugene.

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-25 by Eugene Filatov

On Wed, 25 Jan 2006, Emmanuel Dreyfus wrote:

> It sounds like an OS-specific mystery. Can you remind me what OS you use?

#uname -a
SunOS 5.8 Generic_108528-22 sun4u sparc SUNW,Ultra-80

Eugene.

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-25 by Emmanuel Dreyfus

On Wed, Jan 25, 2006 at 05:45:04PM +0200, Eugene Filatov wrote:
> > It sounds like an OS-specific mystery. Can you remind me what OS you use?
> #uname -a
> SunOS 5.8 Generic_108528-22 sun4u sparc SUNW,Ultra-80

Try looking in the list archive. I'm now convinced this is the same problem
we had before with Solaris: there is a rather low limit on the number of
file pointers (FILE *) you can have. The limit is lower than the file
descriptor limit, and it is hardcoded in libc.

As far as I remember, the limit was specific to a given ABI. The problem
was fixed by rebuilding with another ABI (was it 64 bit?)
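
One way to see whether such a stream limit applies on a given build is to open streams until fopen() fails and compare the count with the descriptor limit. The probe below is a sketch only; `count_open_streams` is a hypothetical name and nothing like it exists in milter-greylist.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical probe (not part of milter-greylist): fopen() the given
 * path repeatedly until fopen() fails or `cap` streams are open, then
 * close them all. Returns the number of streams that could be open at
 * once, or -1 if the bookkeeping array cannot be allocated.
 */
int
count_open_streams(const char *path, int cap)
{
    FILE **fps;
    int n = 0;
    int i;

    if (cap <= 0)
        return 0;
    fps = calloc((size_t)cap, sizeof(*fps));
    if (fps == NULL)
        return -1;

    while (n < cap) {
        fps[n] = fopen(path, "r");
        if (fps[n] == NULL)
            break;           /* stream or descriptor limit reached */
        n++;
    }

    for (i = 0; i < n; i++)
        (void)fclose(fps[i]);
    free(fps);
    return n;
}
```

Called as `count_open_streams("/dev/null", 1024)` in a process whose descriptor limit is 2048, a result well below 1024 would point at a stdio limit rather than the descriptor limit.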
 
-- 
Emmanuel Dreyfus
manu@...

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-25 by Eugene Filatov

On Wed, 25 Jan 2006, Emmanuel Dreyfus wrote:

> Try looking in the list archive. I'm now convinced this is the same problem
> we had before with Solaris: there is a rather low limit on the number of
> file pointers (FILE *) you can have. The limit is lower than the file
> descriptor limit, and it is hardcoded in libc.
> As far as I remember, the limit was specific to a given ABI. The problem
> was fixed by rebuilding with another ABI (was it 64 bit?)

In my case I have a 32-bit OS...

Best Regards, 					mailto:eugenef@...
	Eugene.

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-25 by Matthias Scheler

On Wed, Jan 25, 2006 at 04:18:46PM +0000, Emmanuel Dreyfus wrote:
> Try looking in the list archive. I'm now convinced this is the same problem
> we had before with Solaris: there is a rather low limit on the number of
> file pointers (FILE *) you can have.

It's 256 for 32-bit binaries.

> As far as I remember, the limit was specific to a given ABI.

Yes, it's specific to the 32-bit ABI.

> The problem was fixed by rebuilding with another ABI (was it 64 bit?)

Yes, that works around the problem.

	Kind regards

-- 
Matthias Scheler                                  http://scheler.de/~matthias/

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-25 by Matthias Scheler

On Wed, Jan 25, 2006 at 02:34:23PM +0000, Emmanuel Dreyfus wrote:
> Here is the relevant code section:
>         if ((dumpfd = mkstemp(newdumpfile)) == -1) {
>                 syslog(LOG_ERR, "mkstemp(\"%s\") failed: %s",
>                     newdumpfile, strerror(errno));
>                 exit(EX_OSERR);
>         }  

Crude untested hack:

#ifdef SOLARIS_FD_WORKARROUND
	if (dumpfd > 255) {
		int	retries = 10;

		while (dumpfd > 255) {
			int	lowfd;

			lowfd = dup(dumpfd);
			if (lowfd <= 255) {
				(void)close(dupfd);
				dupfd = lowfd;
				break;
			}

			if (retries-- == 0) {
				/* Handle error gracefully */
				...
			}

			(void) usleep(10);
		}
	}
#endif /* SOLARIS_FD_WORKARROUND */

	Kind regards

-- 
Matthias Scheler                                  http://scheler.de/~matthias/

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-25 by manu@netbsd.org

Matthias Scheler <tron@...> wrote:

> Crude untested hack:

That looks painful :-)

Eugene, can you finish and test Matthias' hack? If it works, I'll
integrate it.

We also need a configure test to enable the hack. Any suggestion?

-- 
Emmanuel Dreyfus
Un bouquin en français sur BSD:
http://www.eyrolles.com/Informatique/Livre/9782212114638/livre-bsd.php
manu@netbsd.org

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-25 by Matthias Scheler

On Wed, Jan 25, 2006 at 10:55:26PM +0100, Emmanuel Dreyfus wrote:
> Eugene, can you finish and test Matthias' hack? If it works, I'll
> integrate it.

There's a typo in it ("dupfd" instead of "dumpfd") but the compiler
will catch that.

> We also need a configure test to enable the hack.

Do we? The check below might be good enough:

#if defined(__sun__) && !defined(_LP64)

If you insist on a configure check you can include "stdio_impl.h",
check whether "struct __FILE_TAG" has a member "_file" and examine
its size. If the size is 1 (unsigned char) you got the API/ABI problem.
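
As a rough illustration, the preprocessor test above could gate a small predicate that the dump code calls before fdopen(). `STDIO_FD_LIMITED` and `fd_usable_for_stdio` are made-up names for this sketch, and the 254 threshold is the one used in the patch posted later in this thread, not a value taken from Solaris documentation.

```c
/*
 * Compile-time test proposed in the thread: a 32-bit build on Solaris.
 * STDIO_FD_LIMITED is a made-up macro name for this sketch.
 */
#if defined(__sun__) && !defined(_LP64)
#define STDIO_FD_LIMITED 1
#else
#define STDIO_FD_LIMITED 0
#endif

/*
 * Hypothetical helper: returns 1 if `fd` can be handed to fdopen()
 * without running into the limit discussed in the thread, else 0.
 * The 254 threshold is an assumption taken from the later patch.
 */
int
fd_usable_for_stdio(int fd)
{
    if (fd < 0)
        return 0;
    if (STDIO_FD_LIMITED && fd > 254)
        return 0;
    return 1;
}
```

On every other platform the macro is 0 and the predicate accepts any valid descriptor, so the workaround code would compile everywhere but only take effect where it is needed.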

	Kind regards

-- 
Matthias Scheler                                  http://scheler.de/~matthias/

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-26 by Eugene Filatov

On Wed, 25 Jan 2006 manu@... wrote:

> > Crude untested hack:
> That looks painful :-)
> Eugene, can you finish and test Matthias' hack? If it works, I'll
> integrate it.

Yes.

I modified it a little with the help of a friend who knows Solaris and
C much better than I do :-) My friend told me that an FD equal to 255 is also
not acceptable on Solaris, and he added a "close" call for a duplicated FD
that is not acceptable.

Here is the difference:

/opt/tmp/milter-greylist-2.1.2# diff -c dump.c.orig dump.c
*** dump.c.orig Thu Jan 26 09:36:32 2006
--- dump.c      Thu Jan 26 10:11:26 2006
***************
*** 196,201 ****
--- 196,233 ----
                exit(EX_OSERR);
        }

+ /* SOLARIS_FD_WORKARROUND */
+         if (dumpfd > 255) {
+                 int     retries = 10;
+
+                 while (dumpfd > 254) {
+                         int     lowfd;
+                       syslog(LOG_ERR, "current FD is %d (>254). we're duplicating it.",
+                           dumpfd);
+                         lowfd = dup(dumpfd);
+                         if (lowfd <= 254) {
+                                 (void)close(dumpfd);
+                                 dumpfd = lowfd;
+                               syslog(LOG_ERR, "new FD is %d. now we can write dump file.",
+                                  dumpfd);
+                                 break;
+                         } else {
+                             (void)close(lowfd);
+                             syslog(LOG_ERR, "current FD is %d (>254). duplicating it again.",
+                                dumpfd);
+                         }
+
+                         if (retries-- == 0) {
+                                /* Handle error gracefully */
+                                syslog(LOG_ERR, "cannot get FD lower than 255. sad but true.");
+                                exit(EX_OSERR);
+                         }
+
+                         (void) usleep(10);
+                 }
+         }
+
+
        if ((dump = fdopen(dumpfd, "w")) == NULL) {
                syslog(LOG_ERR, "cannot write dumpfile \"%s\": %s",
                    newdumpfile, strerror(errno));

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-26 by Oliver Fromme

manu@... wrote:
 > We also need a configure test to enable the hack. Any suggestion?

I don't know if this helps, but the canonical way on Solaris
to find out what binaries are supported (32bit and/or 64bit)
is to use /usr/bin/isainfo.  On a 32bit system, it prints
one line:

   32-bit sparc applications

(Or "i386" instead of "sparc", depending on hardware.)
When 64bit binaries are supported, it prints two lines:

   64-bit sparcv9 applications
   32-bit sparc applications

Of course, if /usr/bin/isainfo does not exist, then the
system is too old and supports 32bit only.

In order to find out whether you're compiling for 32bit
or 64bit, the safest way is probably to compile a small
dummy binary (with user-supplied $CC, $CFLAGS etc.), and
then just parse the output from "file binary".  Depending
on the compilation target architecture, it prints:

   dummy:  ELF 32-bit MSB executable SPARC Version 1, ...

or:

   dummy:  ELF 64-bit MSB executable SPARCV9 Version 1, ...

Please note that I'm not a Solaris expert.  There might be
better ways to do it, but the above has worked for me.

Best regards
   Oliver

-- 
Oliver Fromme,  secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing
Dienstleistungen mit Schwerpunkt FreeBSD: http://www.secnetix.de/bsd
Any opinions expressed in this message may be personal to the author
and may not necessarily reflect the opinions of secnetix in any way.

With Perl you can manipulate text, interact with programs, talk over
networks, drive Web pages, perform arbitrary precision arithmetic,
and write programs that look like Snoopy swearing.

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-26 by Matthias Scheler

On Thu, Jan 26, 2006 at 10:34:43AM +0200, Eugene Filatov wrote:
> I modified it a little with the help of a friend who knows Solaris and
> C much better than I do :-) My friend told me that an FD equal to 255 is also
> not acceptable on Solaris ...

Ah, I didn't know that.

> ... and he added a "close" call for a duplicated FD that is not acceptable.

Oops. :-)

> Here is the difference:

Does that patch actually help?

	Kind regards

-- 
Matthias Scheler                                  http://scheler.de/~matthias/

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-26 by Eugene Filatov

On Thu, 26 Jan 2006, Matthias Scheler wrote:

> On Thu, Jan 26, 2006 at 10:34:43AM +0200, Eugene Filatov wrote:
> > I modified it a little with the help of a friend who knows Solaris and
> > C much better than I do :-) My friend told me that an FD equal to 255 is also
> > not acceptable on Solaris ...
> Ah, I didn't know that.
> > ... and he added a "close" call for a duplicated FD that is not acceptable.
> Oops. :-)
> > Here is the difference:
> Does that patch actually help?

It works well, but there are still no interesting errors in my syslog.
I decreased the dump interval to 5 minutes. I hope to catch a "wrong"
file descriptor soon :-)

Will write later about results.

Best Regards, 					mailto:eugenef@...
	Eugene.

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-26 by Eugene Filatov

On Thu, 26 Jan 2006, Eugene Filatov wrote:

> > Does that patch actually help?
> Will write later about results.

Yes, it worked.

I found some bugs in the syslog output of the previous patch; a corrected
version is below. I also increased the usleep time.

My friend who knows C told me that "usleep" is thread unsafe and
it's better to use something like "nanosleep". I tried "nanosleep" but
something went wrong with milter-greylist (it crashed every two
minutes).

What about  using "open, malloc+snprintf+write+free" in milter-greylist?
It should be the best solution.

Best Regards, 					mailto:eugenef@...
	Eugene.


*** dump.c.orig Thu Jan 26 09:36:32 2006
--- dump.c      Thu Jan 26 16:48:54 2006
***************
*** 196,201 ****
--- 196,232 ----
                exit(EX_OSERR);
        }

+ /* SOLARIS_FD_WORKARROUND */
+         if (dumpfd > 255) {
+                 int     retries = 10;
+
+                 while (dumpfd > 254) {
+                         int     lowfd;
+                       syslog(LOG_ERR, "current FD is %d (>254). we're duplicating it.",
+                           dumpfd);
+                         lowfd = dup(dumpfd);
+                         if (lowfd <= 254) {
+                                 (void)close(dumpfd);
+                                 dumpfd = lowfd;
+                               syslog(LOG_ERR, "new FD is %d. now we can write dump file.",
+                                  dumpfd);
+                                 break;
+                         } else {
+                             (void)close(lowfd);
+                             syslog(LOG_ERR, "new FD is %d (>254). duplicating it again (%d tries remain).",
+                                lowfd, retries);
+                         }
+
+                         if (retries-- == 0) {
+                                /* Handle error gracefully */
+                                syslog(LOG_ERR, "cannot get FD lower than 255. sad but true.");
+                                exit(EX_OSERR);
+                         }
+
+                         (void) usleep (1000000);
+                 }
+         }
+
        if ((dump = fdopen(dumpfd, "w")) == NULL) {
                syslog(LOG_ERR, "cannot write dumpfile \"%s\": %s",
                    newdumpfile, strerror(errno));
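
The core step of the patch, one attempt to move the descriptor into the usable range, can be isolated in a small function so that the retry and logging policy stays in the caller. This is a sketch under that assumption; `lower_fd` is a hypothetical name and not code from milter-greylist.

```c
#include <unistd.h>

/*
 * Hypothetical helper (not part of milter-greylist): make one attempt
 * to replace `fd` with a duplicate numbered `max_fd` or lower.
 * Returns the descriptor to use from now on, or -1 if no descriptor in
 * range was free; in that case `fd` is left open and unchanged.
 */
int
lower_fd(int fd, int max_fd)
{
    int lowfd;

    if (fd < 0)
        return -1;
    if (fd <= max_fd)
        return fd;           /* already usable */

    /* dup() returns the lowest-numbered free descriptor. */
    lowfd = dup(fd);
    if (lowfd == -1)
        return -1;
    if (lowfd > max_fd) {
        (void)close(lowfd);  /* do not leak the useless duplicate */
        return -1;
    }

    (void)close(fd);
    return lowfd;
}
```

A caller would invoke `lower_fd(dumpfd, 254)` in a loop, pausing between attempts as the patch does, and decide for itself whether to exit or to skip the dump once the retries are used up.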

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-26 by Matthias Scheler

On Thu, Jan 26, 2006 at 05:14:43PM +0200, Eugene Filatov wrote:
> My friend who knows C told me that "usleep" is thread unsafe ...

Where does he have that information from? The Solaris 9 manual page says:

     ____________________________________________________________
    |       ATTRIBUTE TYPE        |       ATTRIBUTE VALUE       |
    |_____________________________|_____________________________|
    | MT-Level                    | Safe                        |
    |_____________________________|_____________________________|

That means it *is* thread safe.

> What about  using "open, malloc+snprintf+write+free" in milter-greylist?

A memory buffer large enough for the whole dump file would dramatically increase
the memory footprint of "milter-greylist". And you don't know how large the
buffer needs to be. Writing code which manages a small buffer means
reimplementing a lot of stdio.
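
To make the trade-off concrete, here is a minimal sketch of a small fixed-size buffer on top of a raw descriptor. `struct dumpbuf`, `dumpbuf_init`, `dumpbuf_put` and `dumpbuf_flush` are hypothetical names, not milter-greylist code. Formatting (the snprintf part) and recovery after a failed write are left out, and that is where most of the stdio reimplementation work would lie.

```c
#include <string.h>
#include <unistd.h>

#define DUMPBUF_SIZE 4096

/* Hypothetical fixed-size output buffer on top of a raw descriptor. */
struct dumpbuf {
    int    fd;
    size_t used;
    char   data[DUMPBUF_SIZE];
};

void
dumpbuf_init(struct dumpbuf *b, int fd)
{
    b->fd = fd;
    b->used = 0;
}

/* Write out everything buffered so far. Returns 0, or -1 on error. */
int
dumpbuf_flush(struct dumpbuf *b)
{
    size_t off = 0;

    while (off < b->used) {
        ssize_t n = write(b->fd, b->data + off, b->used - off);

        if (n == -1)
            return -1;       /* caller inspects errno */
        off += (size_t)n;
    }
    b->used = 0;
    return 0;
}

/* Append len bytes, flushing whenever the buffer fills up. */
int
dumpbuf_put(struct dumpbuf *b, const char *p, size_t len)
{
    while (len > 0) {
        size_t room = DUMPBUF_SIZE - b->used;
        size_t chunk = len < room ? len : room;

        memcpy(b->data + b->used, p, chunk);
        b->used += chunk;
        p += chunk;
        len -= chunk;
        if (b->used == DUMPBUF_SIZE && dumpbuf_flush(b) == -1)
            return -1;
    }
    return 0;
}
```

Memory use stays at one 4 KB buffer regardless of the size of the dump, at the cost of maintaining this code next to the existing stdio path.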

> It should be the best solution.

I disagree. It would mean writing and maintaining a lot of extra code just
because of a problem with 32-bit binaries under Solaris.

	Kind regards

-- 
Matthias Scheler                                  http://scheler.de/~matthias/

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-26 by Eugene Filatov

On Thu, 26 Jan 2006, Matthias Scheler wrote:

> On Thu, Jan 26, 2006 at 05:14:43PM +0200, Eugene Filatov wrote:
> > My friend who knows C told me that "usleep" is thread unsafe ...
> Where does he have that information from? The Solaris 9 manual page says:
> That means it *is* thread safe.

We have Solaris 8. And manual says that usleep is unsafe :(

> > What about  using "open, malloc+snprintf+write+free" in milter-greylist?
> A memory buffer large enough for the whole dump file would dramatically increase
> the memory footprint of "milter-greylist". And you don't know how large the
> buffer needs to be. Writing code which manages a small buffer means
> reimplementing a lot of stdio.

I'm not very proficient in this area, but I think that it's possible to
calculate the size of each string and allocate memory for each string.

> > It should be the best solution.
> I disagree. It would mean writing and maintaining a lot of extra code just
> because of a problem with 32-bit binaries under Solaris.

I agree with you, it could be a big change to the code.
Do you see any way to _fix_ this problem under Solaris?
The patch which I used only decreases the chance of hitting the problem but does
not solve it.

Another idea: is it possible to grab a file descriptor lower than 254
at startup and keep (reserve) it for future use when we need to
dump?

Best Regards, 					mailto:eugenef@...
	Eugene.

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-26 by Matthias Scheler

On Thu, Jan 26, 2006 at 06:01:42PM +0200, Eugene Filatov wrote:
> I'm not very proficient in this area, but I think that it's possible to
> calculate the size of each string and allocate memory for each string.

Yes, but that would be very slow because of the huge number of calls to
malloc(), free() and write().

> Do you see any way to _fix_ this problem under Solaris?

Yes, boot your machine with a 64Bit kernel, download the free Studio 10
compiler(*) from Sun and build "milter-greylist" with that.

> Another idea: is it possible to grab a file descriptor lower than 254
> at startup and keep (reserve) it for future use when we need to
> dump?

There are still possible race conditions.  If you use fclose() which
calls close() the file descriptor is free to be reused.

	Kind regards

(*) Check whether it works with Solaris 8 first. If it doesn't build your
    own 64Bit GCC. Or just update to Solaris 10 which is faster anyway
    and comes with a 64Bit GCC.

-- 
Matthias Scheler                                  http://scheler.de/~matthias/

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-26 by manu@netbsd.org

Eugene Filatov <eugenef@...> wrote:

> What about  using "open, malloc+snprintf+write+free" in milter-greylist?
> It should be the best solution.

I'm not sure it's easy to do in a clean way. Do you want to give it a
try?
-- 
Emmanuel Dreyfus
Un bouquin en français sur BSD:
http://www.eyrolles.com/Informatique/Livre/9782212114638/livre-bsd.php
manu@...

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-26 by Oliver Fromme

Matthias Scheler wrote:
 > Eugene Filatov wrote:
 > > Another idea: is it possible to grab a file descriptor lower than 254
 > > at startup and keep (reserve) it for future use when we need to
 > > dump?

That sounds like a good idea, I think.

 > There are still possible race conditions.  If you use fclose() which
 > calls close() the file descriptor is free to be reused.

How about not closing the file at all, but keeping it open?
Of course, fflush() must be called after each dump (maybe
also fsync()), and fseek(0) and truncate(0) before starting
to write the next dump.

Actually, _two_ file descriptors will be required which are
used alternating, so there's always one complete dump on
the disk, while the other is being written, then rename()
is used to exchange them.  But both files can be kept open
all the time, as described above.

I think that should be quite easy to implement.  And all
operating systems should benefit from that strategy, not
only Solaris/32bit.  It makes sure that the greylist is
always safely stored to the disk, even if some descriptor
limit is approached.
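
The "keep it open and rewrite in place" part of this proposal might look like the sketch below. `rewrite_dump` is a hypothetical name, the real dump writes many records rather than a single string, and the alternation between two files with rename() is not shown. Note that the POSIX call for truncating through an open descriptor is ftruncate().

```c
#define _POSIX_C_SOURCE 200809L

#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of milter-greylist): overwrite the
 * contents of an already-open stream with `text`, keeping the stream
 * and its file descriptor open for the next dump.
 * Returns 0 on success, -1 on error.
 */
int
rewrite_dump(FILE *fp, const char *text)
{
    /* Flush pending output, then go back to the start of the file. */
    if (fflush(fp) == EOF)
        return -1;
    if (fseek(fp, 0L, SEEK_SET) == -1)
        return -1;
    /* Drop the previous dump; the descriptor number stays the same. */
    if (ftruncate(fileno(fp), 0) == -1)
        return -1;

    if (fputs(text, fp) == EOF)
        return -1;
    if (fflush(fp) == EOF)
        return -1;
    /* Ask the kernel to push the data to disk. */
    if (fsync(fileno(fp)) == -1)
        return -1;
    return 0;
}
```

Because the stream is opened once at startup, while low descriptor numbers are still free, later dumps never need a new FILE * and so cannot hit the stream limit at dump time.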

Best regards
   Oliver

-- 
Oliver Fromme,  secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing
Dienstleistungen mit Schwerpunkt FreeBSD: http://www.secnetix.de/bsd
Any opinions expressed in this message may be personal to the author
and may not necessarily reflect the opinions of secnetix in any way.

"In My Egoistical Opinion, most people's C programs should be indented
six feet downward and covered with dirt."
        -- Blair P. Houghton

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-27 by Eugene Filatov

On Thu, 26 Jan 2006 manu@... wrote:

> > What about  using "open, malloc+snprintf+write+free" in milter-greylist?
> > It should be the best solution.
> I'm not sure it's easy to do in a clean way. Do you want to give it a
> try?

Unfortunately, I'm not proficient in C and I will not be able to implement
it :( I'll suggest it to my friend, but I'm not sure that he will agree to
do it for us, because he is quite a busy person.

Another idea: we can just skip the dump and report it in the log if we can't get
a usable file descriptor on Solaris. In the last patch we just exit from
milter-greylist in this case.

As we found before, "usleep" is thread unsafe on Solaris 8.
What about using "poll" instead of "usleep"?

Something like:

poll(NULL, NULL, 1000);

Best Regards, 					mailto:eugenef@...
	Eugene.

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-27 by Matthias Scheler

On Fri, Jan 27, 2006 at 10:19:59AM +0200, Eugene Filatov wrote:
> As we found before, "usleep" is thread unsafe on solaris 8.
> what about using "poll" instead of "usleep"?
> 
> Something like:
> 
> poll(NULL, NULL, 1000);

poll() takes a timeout value in milliseconds, and its second argument is
a descriptor count, not a pointer. The original code only waited
10 microseconds, so '(void) poll(NULL, 0, 1);' seems more advisable.

	Kind regards

-- 
Matthias Scheler                                  http://scheler.de/~matthias/

RE: [milter-greylist] milter-greylist-2.0.2 crash

2006-01-30 by attila.bruncsak@itu.int

> Current limits for greylist:
>  plimit 20180
> 20180:  /usr/local/bin/milter-greylist
>    resource              current         maximum
>   time(seconds)         unlimited       unlimited
>   file(blocks)          unlimited       unlimited
>   data(kbytes)          unlimited       unlimited
>   stack(kbytes)         8192            unlimited
>   coredump(blocks)      unlimited       unlimited
>   nofiles(descriptors)  2048            2048
>   vmemory(kbytes)       unlimited       unlimited
> 
 
Could you try to limit the number of available file descriptors to 254 as a
workaround? Please test with the vanilla code of milter-greylist-2.0.2,
if you have the time and are interested in the test, of course.

RE: [milter-greylist] milter-greylist-2.0.2 crash

2006-02-01 by Eugene Filatov

On Mon, 30 Jan 2006 attila.bruncsak@... wrote:

> > 20180:  /usr/local/bin/milter-greylist
> >    resource              current         maximum
> >   nofiles(descriptors)  2048            2048
>
> Could you try to limit the number of available file descriptors to 254
> as a workaround?

And what would happen then? We would just get errors when sendmail tries
to communicate with a milter that has reached its descriptor limit. We
would have the same problem when creating the db file for dumping memory
to disk.

> Please use the vanilla code of milter-greylist-2.0.2 to test with,
> of course if you have time and interested in the test.

Pardon, explain please, what is "vanilla code"?

Best Regards, 					mailto:eugenef@...
	Eugene.

RE: [milter-greylist] milter-greylist-2.0.2 crash

2006-02-01 by fredrik.pettai@vattenfall.com

> > Please use the vanilla code of milter-greylist-2.0.2 to test with,
> > of course if you have time and interested in the test.
>
> Pardon, explain please, what is "vanilla code"?

The original milter-greylist-2.0.2 package, with no modified/patched
code 

/P

RE: [milter-greylist] milter-greylist-2.0.2 crash

2006-02-01 by Eugene Filatov

On Wed, 1 Feb 2006 fredrik.pettai@... wrote:

> > Pardon, explain please, what is "vanilla code"?
> The original milter-greylist-2.0.2 package, with no modified/patched
> code

thanks! :-)

Eugene.

RE: [milter-greylist] milter-greylist-2.0.2 crash

2006-02-02 by attila.bruncsak@itu.int

> > Could you try to limit the number of available file descriptors to 254
> > as a workaround?
> 
> And what should happen? We will just get errors when sendmail will try
> to communicate with milter which reached descriptors limit. The same
> problem we will have with creating db file for dumping memory to disk.

The OS may not give the smallest available file descriptor on open().
It is possible that it only wraps around when it reaches the "nofiles" limit.
It is very likely that you are not short of file descriptors,
you just got one which is bigger than 254.
If you limit it to 254, the OS may wrap around at that value and give
the smallest free one, which does not conflict with the stdio implementation.

> 
> > Please use the vanilla code of milter-greylist-2.0.2 to test with,
> > of course if you have time and interested in the test.
> 
> Pardon, explain please, what is "vanilla code"?
> 
non-patched version

RE: [milter-greylist] milter-greylist-2.0.2 crash

2006-02-09 by Eugene Filatov

On Thu, 2 Feb 2006 attila.bruncsak@... wrote:

> > > Could you try to limit the number of available file
> > descriptors to 254 as a workaround?
> The OS may not give smallest available file descriptor on open().
> It is possible that it only wraps around when reaches "nofiles" limit.
> It is very likely the case that you are not short of file descriptors,
> just got one which is bigger than 254.

No. Here is part of the open() man page from Solaris:

     The open() function returns a file descriptor for the  named
     file  that  is the lowest file descriptor not currently open
     for that process. The open  file  description  is  new,  and
     therefore  the  file  descriptor  does not share it with any

Best Regards, 					mailto:eugenef@...
	Eugene.

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-07-26 by Eugene Filatov

On Wed, 25 Jan 2006, Emmanuel Dreyfus wrote:

> > > It sounds like an OS-specific mystery. Can you remind me what OS you use?
> > #uname -a
> > SunOS 5.8 Generic_108528-22 sun4u sparc SUNW,Ultra-80
> Try looking in the list archive. I'm now convinced this is the same problem
> we had before with Solaris: there is a rather low limit on the number of
> file pointers (FILE *) you can have. The limit is lower than the file
> descriptor limit, and it is hardcoded in the libc.

I hope you guys remember the Solaris problem with file
pointers which we discussed before :-)

We found that greylist stops at this place (dump.c):

        if ((dump = fdopen(dumpfd, "w")) == NULL) {
                syslog(LOG_ERR, "cannot write dumpfile \"%s\": %s",
                    newdumpfile, strerror(errno));
                exit(EX_OSERR);
        }

I made a crude patch which helps me work around this problem (I posted it
before).

I have another simple idea - just replace "exit()" with "return" in this
function. So greylist will just skip dumping and inform about it via
syslog but it will continue to work.

Emmanuel, what do you think about it?

> As far as I remember, the limit was specific to a given ABI. The problem
> was fixed by rebuilding with another ABI (was it 64 bit?)

Best Regards, 					mailto:eugenef@...
	Eugene.

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-07-26 by manu@netbsd.org

Eugene Filatov <eugenef@...> wrote:

> I have another simple idea - just replace "exit()" with "return" in this
> function. So greylist will just skip dumping and inform about it via
> syslog but it will continue to work.
> 
> Emmanuel, what do you think about it?

I'm not sure it's acceptable for the average administrator: if you don't
check the logs, you get toasted when milter-greylist restarts because it
never dumped its database.

Perhaps we could figure a way of booking a FILE *?
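
One possible reading of "booking a FILE *", sketched under assumptions: the thread does not spell out a design, and book_stream(), borrow_stream() and return_stream() are hypothetical names. The stream is created once at startup, while low descriptor numbers are still free, and is later pointed at the dump file with dup2().

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static FILE *booked;    /* stream reserved at startup, parked on /dev/null */

/* Call early in main(), before many descriptors are in use. */
int book_stream(void)
{
    int fd = open("/dev/null", O_WRONLY);

    if (fd == -1)
        return -1;
    booked = fdopen(fd, "w");
    if (booked == NULL) {
        close(fd);
        return -1;
    }
    return 0;
}

/* Redirect the booked stream to dumpfd; dumpfd itself stays open. */
FILE *borrow_stream(int dumpfd)
{
    if (booked == NULL || fflush(booked) == EOF)
        return NULL;
    if (dup2(dumpfd, fileno(booked)) == -1)
        return NULL;
    clearerr(booked);
    return booked;
}

/* Flush the dump and park the stream on /dev/null again. */
int return_stream(void)
{
    int rc = 0, nullfd;

    if (booked == NULL)
        return -1;
    if (fflush(booked) == EOF)
        rc = -1;
    nullfd = open("/dev/null", O_WRONLY);
    if (nullfd == -1)
        return -1;
    if (dup2(nullfd, fileno(booked)) == -1)
        rc = -1;
    close(nullfd);
    return rc;
}
```

The sketch is not thread-safe as written; a real version would need a lock around the borrow and return steps, and the caller would still call fsync() on and close its own dumpfd.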

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu@...

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-07-27 by Eugene Filatov

On Wed, 26 Jul 2006 manu@... wrote:

> > I have another simple idea - just replace "exit()" with "return" in this
> > function. So greylist will just skip dumping and inform about it via
> > syslog but it will continue to work.
> I'm not sure it's acceptable for the average administrator: if you don't
> check the logs, you get toasted when milter-greylist restarts because it
> never dumped its database.

What about making it an option in config file which will be "off" by
default?

like:

dont_stop_on_dumperror

I can try to make a simple patch if it is interesting.

Best Regards, 					mailto:eugenef@...
	Eugene.

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-07-27 by Oliver Fromme

manu@... wrote:
 > Eugene Filatov wrote:
 > 
 > > I have another simple idea - just replace "exit()" with "return" in this
 > > function. So greylist will just skip dumping and inform about it via
 > > syslog but it will continue to work.
 > > 
 > > Emmanuel, what do you think about it?
 > 
 > I'm not sure it's acceptable for the average administrator: if you don't
 > check the logs, you get toasted when milter-greylist restarts because it
 > never dumped its database.
 > 
 > Perhaps we could figure a way of booking a FILE *?

Sorry I don't remember exactly what the actual problem is,
but ...

Shouldn't it be sufficient to open() the dump file right
from the start and never close() it?  Only always write to
it, then fsync()+lseek(0).  (And fflush()+rewind() if stdio
functions are used to write the file.)

Best regards
   Oliver

-- 
Oliver Fromme,  secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing
Dienstleistungen mit Schwerpunkt FreeBSD: http://www.secnetix.de/bsd
Any opinions expressed in this message may be personal to the author
and may not necessarily reflect the opinions of secnetix in any way.

One Unix to rule them all, One Resolver to find them,
One IP to bring them all and in the zone to bind them.

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-07-27 by Eugene Filatov

On Thu, 27 Jul 2006, Oliver Fromme wrote:

>  > Perhaps we could figure a way of booking a FILE *?
> Sorry I don't remember exactly what the actual problem is,
> but ...

The problem is the low limit on the number of file pointers
you can have on 32-bit Solaris. It's 256.

> Shouldn't it be sufficient to open() the dump file right
> from the start and never close() it?  Only always write to
> it, then fsync()+lseek(0).  (And fflush()+rewind() if stdio
> functions are used to write the file.)

I like this idea. I can try to implement it.
I'm quite weak in C programming, but I can try :-)

Best Regards, 					mailto:eugenef@...
	Eugene.

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-07-27 by Emmanuel Dreyfus

On Thu, Jul 27, 2006 at 10:03:50AM +0300, Eugene Filatov wrote:
> > > I have another simple idea - just replace "exit()" with "return" in this
> > > function. So greylist will just skip dumping and inform about it via
> > > syslog but it will continue to work.
> > I'm not sure it's acceptable for the average administrator: if you don't
> > check the logs, you get toasted when milter-greylist restarts because it
> > never dumped its database.
> 
> What about making it an option in config file which will be "off" by
> default?

That sounds like a good compromise.

> like:
> 
> dont_stop_on_dumperror

no_mandatory_dump ?
optional_dump?
opt_dump?
dump_error_ok?
nodumpfatal?

I like the last one. Opinions?

> I can try to make a simple patch if it is interesting.

Please wait a few days for the next release, or the patch will be difficult
to merge.

-- 
Emmanuel Dreyfus
manu@...

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-07-27 by Emmanuel Dreyfus

On Thu, Jul 27, 2006 at 09:22:52AM +0200, Oliver Fromme wrote:
> Shouldn't it be sufficient to open() the dump file right
> from the start and never close() it?  Only always write to
> it, then fsync()+lseek(0).  (And fflush()+rewind() if stdio
> functions are used to write the file.)

The idea is to preserve the file in the event of a crash during the
dump. So the dump is currently done in a temporary file, which is 
moved to replace the original file once the dump completes. 

rename(2) can buy us that. I'm not sure it's possible by using the 
same file.
-- 
Emmanuel Dreyfus
manu@...

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-07-27 by Eugene Filatov

On Thu, 27 Jul 2006, Oliver Fromme wrote:

> Shouldn't it be sufficient to open() the dump file right
> from the start and never close() it?  Only always write to
> it, then fsync()+lseek(0).  (And fflush()+rewind() if stdio
> functions are used to write the file.)

Another way is to detect in configure which system we're on, and if we're
on Solaris (and possibly other systems?) compile a different function
that uses open()+snprintf+write to dump to the temp file.
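
A minimal sketch of the open()+snprintf+write approach, which needs no FILE * at all. The helper names and the tab-separated record layout are placeholders for illustration, not the real dump file format.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Write the whole buffer, retrying on short writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);

        if (n == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Format one record into a stack buffer and write it with write(2).
 * Returns 0 on success, -1 on error or if the record does not fit.
 */
int dump_record(int fd, const char *addr, const char *from,
    const char *rcpt, long when)
{
    char line[1024];
    int n = snprintf(line, sizeof line, "%s\t%s\t%s\t%ld\n",
        addr, from, rcpt, when);

    if (n < 0 || (size_t)n >= sizeof line)
        return -1;
    return write_all(fd, line, (size_t)n);
}
```

One write() per record is simple but slow for a large database; collecting several records in a bigger buffer before writing would reduce the number of system calls.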

Best Regards, 					mailto:eugenef@...
	Eugene.

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-07-27 by Oliver Fromme

Emmanuel Dreyfus wrote:
 > Oliver Fromme wrote:
 > > Shouldn't it be sufficient to open() the dump file right
 > > from the start and never close() it?  Only always write to
 > > it, then fsync()+lseek(0).  (And fflush()+rewind() if stdio
 > > functions are used to write the file.)
 > 
 > The idea is to preserve the file in the event of a crash during the
 > dump. So the dump is currently done in a temporary file, which is 
 > moved to replace the original file once the dump completes.

That makes a lot of sense.

 > rename(2) can buy us that. I'm not sure it's possible by using the 
 > same file.

Certainly.  rename(2) just changes the name of a directory
entry, it does _not_ change the underlying inode:  You can
see in "ls -li" that the inode stays the same when you mv(1)
or rename(2) a file.  The same happens when you hardlink
a file, except that the source name is not removed.

Also, any open file descriptors associated with the inode
don't change (neither on rename(2) nor on link(2)), they
will just work with the new name.  You can see that with
daemons like Apache:  When you rotate the logfiles (e.g.
"mv access.log access.log.0") and forget to send the daemon
a SIGHUP, it will continue to write into the file under the
new name.
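
The point about open descriptors surviving a rename can be checked with a small sketch like the following. fd_follows_rename() is a hypothetical name used only for this illustration.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Rename `oldname` to `newname` while `fd` (already open on oldname)
 * stays open, then write one byte through the same descriptor.
 * Returns 1 if the descriptor and the new name refer to the same
 * inode afterwards, 0 if they do not, -1 on error.
 */
int fd_follows_rename(int fd, const char *oldname, const char *newname)
{
    struct stat by_fd, by_name;

    if (rename(oldname, newname) == -1)
        return -1;
    if (write(fd, "x", 1) != 1)
        return -1;
    if (fstat(fd, &by_fd) == -1 || stat(newname, &by_name) == -1)
        return -1;

    return by_fd.st_dev == by_name.st_dev &&
        by_fd.st_ino == by_name.st_ino;
}
```

The byte written after the rename ends up in the file under its new name, which is the log rotation behaviour described above.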

So, the solution for milter-greylist would be to open two
files at the start, and use them alternating to write the
dump file and then hardlink them to the final name.  So,
in pseudo-code, it would look like this:

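    start:

        name1 = "greylist.db.new1"
        name2 = "greylist.db.new2"

        fd1 = open(name1, O_WRONLY | O_CREAT | O_TRUNC, 0600)
        fd2 = open(name2, O_WRONLY | O_CREAT | O_TRUNC, 0600)

    ...
    dump:

        ftruncate(fd1, 0)  // drop the previous contents of this file

        write db to fd1

        fflush(fp1)        // only if stdio used (fp1 = FILE * for fd1)
        fsync(fd1)
        rewind(fp1)        // only if stdio used
        lseek(fd1, 0, SEEK_SET)

        unlink("greylist.db.old")
        link("greylist.db", "greylist.db.old")
        link(name1, "greylist.db.tmp")
        rename("greylist.db.tmp", "greylist.db")

        swap fd1 and fd2
        swap name1 and name2

That's off the top of my head and I haven't implemented
it, but it should work (I've been familiar with UNIX/POSIX
file semantics for many years).  Note that link() fails with
EEXIST if the target already exists, so the new link is
created under a temporary name and then moved into place
with rename(), which replaces an existing target atomically.
That ensures that a good copy of greylist.db always exists.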

The first dump is written to *.new1, and after the link(2)
functions, these files exist:

    ,--> greylist.db
    `--> greylist.db.new1
         greylist.db.old

where greylist.db and greylist.db.new1 are hardlinked.
The second dump is written to *.new2.  After the linking,
the files look like this:

    ,--> greylist.db
    |    greylist.db.new1
    `--> greylist.db.new2
         greylist.db.old

where greylist.db and greylist.db.new2 are hardlinked.
The third dump is written to *.new1 again (which is not
hardlinked to greylist.db at this time).  Afterwards:

    ,--> greylist.db
    `--> greylist.db.new1
         greylist.db.new2
         greylist.db.old

again, greylist.db and greylist.db.new1 are hardlinked.

The alternation of fd1/fd2 and name1/name2 ensures that
the dump is always written to the file which is _not_
currently hardlinked with greylist.db, so it will be
preserved if the process crashes.
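
A minimal C sketch of the alternating scheme, under stated assumptions. The file names come from the pseudo-code; dump_init(), dump_write() and the temporary name greylist.db.tmp are hypothetical, and the greylist.db.old backup is left out for brevity. Because POSIX link() returns EEXIST when the target already exists, the sketch links the new dump to a temporary name and then uses rename(), which replaces the target atomically.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

struct slot {
    int fd;
    const char *name;
};

/* Both files stay open for the life of the process. */
static struct slot active = { -1, "greylist.db.new1" };
static struct slot spare  = { -1, "greylist.db.new2" };

int dump_init(void)
{
    active.fd = open(active.name, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    spare.fd = open(spare.name, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    return (active.fd == -1 || spare.fd == -1) ? -1 : 0;
}

/* Write one dump into the file that is NOT linked to greylist.db. */
int dump_write(const char *data, size_t len)
{
    struct slot tmp;

    /* Start from an empty file at offset 0. */
    if (lseek(active.fd, 0, SEEK_SET) == -1 ||
        ftruncate(active.fd, 0) == -1)
        return -1;
    /* A short write is treated as an error in this sketch. */
    if (write(active.fd, data, len) != (ssize_t)len)
        return -1;
    if (fsync(active.fd) == -1)
        return -1;

    /* Publish: extra hardlink under a temp name, then atomic rename. */
    unlink("greylist.db.tmp");          /* ignore "does not exist" */
    if (link(active.name, "greylist.db.tmp") == -1)
        return -1;
    if (rename("greylist.db.tmp", "greylist.db") == -1)
        return -1;

    /* The next dump goes to the other file. */
    tmp = active;
    active = spare;
    spare = tmp;
    return 0;
}
```

If the process dies anywhere before the rename(), greylist.db still names the previous complete dump, because the file being rewritten is never the one it is linked to.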

In my opinion, that would be a very clean solution.

Best regards
   Oliver

PS:  fflush() and rewind() are only required when stdio
functions (those that use a "FILE *") are used to write
to the file, e.g. fwrite(3), fputs(3), fprintf(3) etc.
But you will also need fsync() in this case!  Note that
the fileno(3) function can be used to get the file
descriptor from a "FILE *", which is needed for fsync().

If only file descriptor functions are used (such as
write(2)), then fsync()+lseek() is sufficient.

Please don't be offended if you already knew all of that
(I'm sure you knew).  I mention it just to be safe.  :-)


-- 
Oliver Fromme,  secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing
Dienstleistungen mit Schwerpunkt FreeBSD: http://www.secnetix.de/bsd
Any opinions expressed in this message may be personal to the author
and may not necessarily reflect the opinions of secnetix in any way.

Passwords are like underwear.  You don't share them,
you don't hang them on your monitor or under your keyboard,
you don't email them, or put them on a web site,
and you must change them very often.

Re: [milter-greylist] milter-greylist-2.0.2 crash

2006-07-27 by Emmanuel Dreyfus

On Thu, Jul 27, 2006 at 10:41:53AM +0200, Oliver Fromme wrote:
> So, the solution for milter-greylist would be to open two
> files at the start, and use them alternating to write the
> dump file and then hardlink them to the final name.  

Looks great. I hope someone will contribute that patch (but please, wait
for 2.1.7 release)

-- 
Emmanuel Dreyfus
manu@...
