Hello,
> attila> I have rearranged the code of sync_master_restart() to take into
> attila> account that any of the sync_master threads can exit independently.
> attila> It also reports possible error conditions better.
> attila> The side effect of this is that on Tru64 UNIX the compilation
> attila> environment supports IPv6 but the run-time does not by default.
> attila> Both sync_master threads are running, but I get a warning:
>
> attila> milter-greylist: cannot set IPV6_V6ONLY: Invalid argument
>
> attila> After that, both sync_master threads try to run on an IPv4 socket.
> attila> The second one fails on the bind, so milter-greylist exits.
>
> attila> To fix this error condition I also had to add
> attila> the SO_REUSEPORT code to the sync_listen() function.
>
> It seems your code tries to listen on IPv4 first and then on IPv6. That
> doesn't work correctly in some environments. So you need to try to
> listen on IPv6 before IPv4. I think trying IPv6 before IPv4 should fix
> your problem without using SO_REUSEPORT.
I have tried it on my system with IPv6 first and IPv4 second, and it
behaves exactly the same way as with the reverse order.
I do not know whether the order is a problem on other systems.
If it is, we can change the order.
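
For anyone who wants to repeat the comparison on another system, here is a
minimal standalone sketch. It is not the milter-greylist code:
open_sync_listener() is a hypothetical helper, the use of SO_REUSEADDR is
my assumption, and a failing IPV6_V6ONLY is deliberately treated as
non-fatal. Calling it for AF_INET6 and AF_INET in either order shows
whether the second bind fails on a given platform.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from milter-greylist): open a listening TCP
 * socket on the wildcard address of the given family.
 * Returns the descriptor, or -1 on failure.
 */
static int
open_sync_listener(int family, unsigned short port)
{
	int on = 1;
	int s = socket(family, SOCK_STREAM, 0);

	if (s == -1)
		return -1;

	/* Assumption: allow quick rebinding of the port; failure is ignored */
	(void)setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));

	switch (family) {
	case AF_INET: {
		struct sockaddr_in sin;

		memset(&sin, 0, sizeof(sin));
		sin.sin_family = AF_INET;
		sin.sin_addr.s_addr = htonl(INADDR_ANY);
		sin.sin_port = htons(port);
		if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1)
			goto fail;
		break;
	}
#ifdef AF_INET6
	case AF_INET6: {
		struct sockaddr_in6 sin6;

#ifdef IPV6_V6ONLY
		/* Best effort: some run-time environments reject this option */
		(void)setsockopt(s, IPPROTO_IPV6, IPV6_V6ONLY, &on, sizeof(on));
#endif
		memset(&sin6, 0, sizeof(sin6));
		sin6.sin6_family = AF_INET6;
		sin6.sin6_addr = in6addr_any;
		sin6.sin6_port = htons(port);
		if (bind(s, (struct sockaddr *)&sin6, sizeof(sin6)) == -1)
			goto fail;
		break;
	}
#endif
	default:
		goto fail;	/* unsupported address family */
	}

	if (listen(s, 5) == -1)
		goto fail;
	return s;

fail:
	close(s);
	return -1;
}
```

If the IPv6 socket ends up accepting IPv4-mapped addresses, the later IPv4
bind on the same port may fail even though IPv4 peers are still served.
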
>
> attila> The patch is attached.
>
> You changed the code to test sync_master4.runs and sync_master6.runs
> independently. It seems to me that milter-greylist now exits when either
> the IPv6 thread or the IPv4 thread fails. I suspect milter-greylist
> no longer works at least on IPv4-only hosts or in environments where
> an IPv4-mapped IPv6 address is required to handle both IPv6 and IPv4.
> This is why the following code existed:
>
> if (!sync_master4.runs && !sync_master6.runs) {
>         mg_log(LOG_ERR, "cannot start MX sync, socket failed: %s",
>             strerror(errno));
>         exit(EX_OSERR);
> }
>
In the code I provided, the IPv6 code is protected with
#ifdef AF_INET6
so that should not be a problem for IPv4-only systems.
I do not know what the status is for the
IPv4-mapped IPv6 address case.
Can someone try this out and let us know?
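
Note that #ifdef AF_INET6 only covers the compile-time side. A run-time
probe could complement it, roughly as in the sketch below.
ipv6_runtime_available() is a hypothetical name, not an existing
milter-greylist function, and it only detects the case where creating an
IPv6 socket fails outright. It would not catch a system where the socket
is created but a later setsockopt() call is rejected.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical probe (not from milter-greylist): returns 1 if an IPv6 TCP
 * socket can be created at run time, 0 otherwise.
 */
static int
ipv6_runtime_available(void)
{
#ifdef AF_INET6
	int s = socket(AF_INET6, SOCK_STREAM, 0);

	if (s == -1)
		return 0;	/* compiled with IPv6, but not usable at run time */
	close(s);
	return 1;
#else
	return 0;		/* no IPv6 support at compile time */
#endif
}
```
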
Another thought:
sync_master_restart() is called very frequently,
actually at every new envelope recipient.
At the entry point of this function we do not
have enough information about why a sync_master is not running:
either we had a permanent failure starting it up earlier, or
it simply exited because the peer list became empty.
If it is a permanent failure (due to improper IP version support),
it is not practical to try to restart so frequently.
We should try to restart only if the thread stopped because of
a previous empty peer list exit condition.
What do you think about encoding the additional
information about the reason for the non-running
condition into sync_master[46].runs?
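
A minimal sketch of what I mean, assuming the runs field could hold more
than two values. The enum, the struct and the function name below are all
hypothetical and only illustrate the idea. They are not the existing
milter-greylist definitions.

```c
#include <stdbool.h>

/* Hypothetical replacement for a plain boolean "runs" flag */
enum sync_run_state {
	SYNC_RUNNING,		/* thread is up */
	SYNC_IDLE_NO_PEERS,	/* not running: peer list was empty */
	SYNC_FAILED		/* not running: permanent startup failure */
};

/* Hypothetical stand-in for the real sync_master structure */
struct sync_master_sketch {
	enum sync_run_state runs;
};

/*
 * Restart only when the thread stopped because the peer list was empty
 * and there are peers again. A permanent failure is never retried.
 */
static bool
sync_master_should_restart(const struct sync_master_sketch *sm,
    bool have_peers)
{
	return sm->runs == SYNC_IDLE_NO_PEERS && have_peers;
}
```

With this, a thread that has never been started would begin in
SYNC_IDLE_NO_PEERS, and the frequent calls at each envelope recipient
would cost one comparison once a permanent failure has been recorded.
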
Best regards,
Attila
RE: [milter-greylist] MX synchronization loss critical bug
2009-05-15 by attila.bruncsak@itu.int