[milter-greylist] crash of milter-greylist process due to stack corruption
2015-06-03 by Bruncsak, Attila
Yahoo Groups archive
Index last updated: 2026-04-28 23:32 UTC
Thread
2015-06-03 by Bruncsak, Attila
Hello, At the end I got a usable core dump to figure out the reason of the long standing problem of milter-greylist process crash. In the function sync_waitdata() the value of "fd" is not checked against of FD_SETSIZE. It must have a value smaller than that. Please apply the patch attached to that message. While I was debugging I found code which seemingly left over (pthread_mutex_(init|lock) on an auto variable in a thread, which is nowhere else referred to) and I could optimize out one member of the struct sync (s_peer). It is not needed since each peer has his own private sync list. Best, Attila
2015-06-07 by manu@...
'Bruncsak, Attila' attila.bruncsak@... [milter-greylist]
<milter-greylist@yahoogroups.com> wrote:
> In the function sync_waitdata() the value of "fd" is not checked
> against of FD_SETSIZE. It must have a value smaller than that.
>
> Please apply the patch attached to that message.
Is there any merit to keep the select() code along with the poll() flavor?
I would rather replace it all (like below). What is your opinion?
Index: sync.c
===================================================================
RCS file: /cvsroot/milter-greylist/sync.c,v
retrieving revision 1.90
diff -U 4 -r1.90 sync.c
--- sync.c 23 Mar 2013 01:45:02 -0000 1.90
+++ sync.c 7 Jun 2015 04:30:14 -0000
@@ -47,8 +47,9 @@
#include <strings.h>
#include <pthread.h>
#include <syslog.h>
#include <sysexits.h>
+#include <poll.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
@@ -1316,34 +1317,48 @@
sync_waitdata(fd, stimeout)
int fd;
time_t stimeout;
{
- fd_set fdr, fde;
- struct timeval timeout;
+ struct pollfd fds;
int retval;
- FD_ZERO(&fdr);
- FD_SET(fd, &fdr);
+ if (stimeout == 0)
+ stimeout = INFTIM;
+ else
+ stimeout = 1000 * stimeout;
- FD_ZERO(&fde);
- FD_SET(fd, &fde);
+ fds.fd = fd;
+ fds.events = POLLIN;
+
+ retval = poll(&fds, 1, stimeout);
- timeout.tv_sec = stimeout;
- timeout.tv_usec = 0;
+ switch (retval) {
+ case 0:
+ mg_log(LOG_ERR, "%s: poll timeout", __func__);
+ break;
- retval = select(fd + 1, &fdr, NULL, &fde, stimeout == 0 ? NULL : &timeout);
+ case -1:
+ mg_log(LOG_ERR, "%s: poll error: %s",
+ __func__, strerror(errno));
+ break;
- if (retval == 0) {
- mg_log(LOG_ERR, "sync_waitdata : select timeout");
-
- } else {
- if (retval < 0) {
- mg_log(LOG_ERR, "sync_waitdata : error");
- } else {
- if (FD_ISSET(fd,&fde)) {
- mg_log(LOG_ERR, "sync_waitdata : exception fd is
set");
- }
- }
+ default:
+ if (fds.revents & POLLPRI)
+ mg_log(LOG_ERR, "%s: high priority data available",
+ __func__);
+
+ if (fds.revents & POLLERR)
+ mg_log(LOG_ERR, "%s: exceptional condition",
+ __func__);
+
+ if (fds.revents & POLLHUP)
+ mg_log(LOG_ERR, "%s: socket was disconnected",
+ __func__);
+
+ if (fds.revents & POLLHUP)
+ mg_log(LOG_ERR, "%s: file descriptor not open",
+ __func__);
+ break;
}
return retval;
}
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu@...2015-06-07 by manu@...
'Bruncsak, Attila' attila.bruncsak@... [milter-greylist] <milter-greylist@yahoogroups.com> wrote: > While I was debugging I found code which seemingly left over > (pthread_mutex_(init|lock) on an auto variable in a thread, > which is nowhere else referred to) Indeed that one must die. > and I could optimize out one member of the struct sync (s_peer). > It is not needed since each peer has his own private sync list. Not sure I grasped this one. Could you elaborate? -- Emmanuel Dreyfus http://hcpnet.free.fr/pubz manu@...
2015-06-08 by Bruncsak, Attila
> > and I could optimize out one member of the struct sync (s_peer). > > It is not needed since each peer has his own private sync list. > > Not sure I grasped this one. Could you elaborate? > I suggest to do: grep s_peer *.[ch] and you find no more than the three references for it: 1. The declaration in sync.h 2. Its usage in sync_sender() where sync->s_peer is always equal to peer, 3. since the value assignment of s_peer in sync_queue() is coming from the sync list owning peer value. So in sync_sender() sync->s_peer can be replaced via peer. No more use of the s_peer, no need then for its assignment, than no need for its declaration.
2015-06-08 by Bruncsak, Attila
> Is there any merit to keep the select() code along with the poll() flavor? I thought about this as well, but I was not brave enough to propose this one. (minimalistic change approach) The poll() is superior than select(). Actually select() is archaic with its file descriptor number restriction, its use should be avoided as much as possible. > I would rather replace it all (like below). What is your opinion? In principle OK, but I couldn't apply your code since all tabs are replaced with spaces in the inline text of this patch. Cut & paste leads to invalid patch for me. May be in format of attachment instead of inline? The second occurrence of POLLHUP should be POLLNVAL, just reading the code of the patch. You may optimise out the check for select() from the configure script as well.
2015-06-08 by manu@...
'Bruncsak, Attila' attila.bruncsak@... [milter-greylist] <milter-greylist@yahoogroups.com> wrote: > The second occurrence of POLLHUP should be POLLNVAL, just > reading the code of the patch. Thank you for spoting that one. I checked the change in. > You may optimise out the check for select() from the configure script as well. Well, many things could be changed there... select is still used in a configue test. -- Emmanuel Dreyfus http://hcpnet.free.fr/pubz manu@...