[milter-greylist] hash table acceleration breaks lazyaw
2006-03-26 by attila.bruncsak@itu.int
Yahoo Groups archive
Index last updated: 2026-04-28 23:32 UTC
Thread
2006-03-26 by attila.bruncsak@itu.int
Hello, The subject tells the problem, the record of the autowhitelist which otherwise would match the actual recipient with "lazyaw on" cannot be found since the record is in an other bucket. I think only the connecting IP should be used as hash value for the autowhitelist, and even just part of the IP if subnetmatch/subnetmatch6 is in effect. Bests, Attila
2006-03-26 by Ranko Zivojnovic
Hi, On Sun, 2006-03-26 at 23:12 +0200, attila.bruncsak@... wrote: > Hello, > > The subject tells the problem, the record of the autowhitelist which > otherwise would match the actual recipient with "lazyaw on" cannot be > found since the record is in an other bucket. > I think only the connecting IP should be used as hash value for the > autowhitelist, and even just part of the IP if > subnetmatch/subnetmatch6 is in effect. Thanks for discovering the issues. This patch should address them both. Best regards, R.
2006-03-26 by Ranko Zivojnovic
Take two: was missing "6" on "conf.c_match_mask6"... R.
On Mon, 2006-03-27 at 02:01 +0300, Ranko Zivojnovic wrote: > Hi, > > On Sun, 2006-03-26 at 23:12 +0200, attila.bruncsak@... wrote: > > Hello, > > > > The subject tells the problem, the record of the autowhitelist which > > otherwise would match the actual recipient with "lazyaw on" cannot > be > > found since the record is in an other bucket. > > I think only the connecting IP should be used as hash value for the > > autowhitelist, and even just part of the IP if > > subnetmatch/subnetmatch6 is in effect. > > Thanks for discovering the issues. This patch should address them > both. > > Best regards, > > R. > >
2006-03-27 by Hajimu UMEMOTO
Hi,
>>>>> On Mon, 27 Mar 2006 02:01:20 +0300
>>>>> Ranko Zivojnovic <ranko@...> said:
ranko> ! #define BUCKET_HASH_V6(sa, from, rcpt, bucket_count) \
ranko> ! ((((uint32_t)(SADDR6(sa)->s6_addr)[3] & \
ranko> ! (uint32_t)(((ipaddr *)&conf.c_match_mask)->in6.s6_addr)[3])\
ranko> ! ^ F2B_SPICE(from, rcpt)) \
ranko> ! % bucket_count)
It seems this test is broken for an IPv6. Why do you check only
(uint32_t)(SADDR6(sa)->s6_addr)[3]? We need to check whole address.
Sincerely,
--
Hajimu UMEMOTO @ Internet Mutual Aid Society Yokohama, Japan
ume@... ume@{,jp.}FreeBSD.org
http://www.imasy.org/~ume/2006-03-27 by Ranko Zivojnovic
Hi, On Mon, 2006-03-27 at 09:18 +0900, Hajimu UMEMOTO wrote: > Hi, > > >>>>> On Mon, 27 Mar 2006 02:01:20 +0300 > >>>>> Ranko Zivojnovic <ranko@...> said: > > ranko> ! #define BUCKET_HASH_V6(sa, from, rcpt, bucket_count) > \ > ranko> ! ((((uint32_t)(SADDR6(sa)->s6_addr)[3] & \ > ranko> ! (uint32_t)(((ipaddr > *)&conf.c_match_mask)->in6.s6_addr)[3])\ > ranko> ! ^ F2B_SPICE(from, rcpt)) \ > ranko> ! % bucket_count) > > It seems this test is broken for an IPv6. Why do you check only > (uint32_t)(SADDR6(sa)->s6_addr)[3]? We need to check whole address. > I was taking only the last part of the v6 address because that was the place I would expect the most "activity" - and we are only creating the hash. But either way you're right - we should take the whole address length into account. Especially now with the netmask in the game. Please consider this version. Best regards, Ranko
2006-03-28 by Sysadmin
Tere.
> Please consider this version.
>
[test@test milter-greylist-2.1.3]# patch -p0 < milter-greylist.h.patch
patching file `milter-greylist.h'
Hunk #1 FAILED at 80.
1 out of 1 hunk FAILED -- saving rejects to milter-greylist.h.rej
cat milter-greylist.h.rej
***************
*** 80,106 ****
#endif
/* Notes:
! * -For IPv6 not using s6_addr32 as Solaris 8 for some reason has it
only defined for its kernel...
! * -Using also first two characters in "from" and "rcpt" to distribute
potentially lot of triplets
! * coming from a single host (first two chars only because "<>" is
the "shortest" email address)
*/
#define F2B(s) (tolower((int)*(s)) | (tolower((int)*((s)+1)) << 8))
#ifdef AF_INET6
! #define BUCKET_HASH(sa, from, rcpt, bucket_count) \
! (sa->sa_family == AF_INET ? \
! ((ntohl(SADDR4(sa)->s_addr) ^ F2B(from) ^ F2B(rcpt)) \
! % bucket_count) \
! : sa->sa_family == AF_INET6 ? \
! (((uint32_t)(SADDR6(sa)->s6_addr)[3] ^ F2B(from) ^ F2B(rcpt)) \
! % bucket_count) \
! : 0)
- #else
- #define BUCKET_HASH(sa, from, rcpt, bucket_count) \
- (sa->sa_family == AF_INET ? \
- ((ntohl(SADDR4(sa)->s_addr) ^ F2B(from) ^ F2B(rcpt)) \
- % bucket_count) \
- : 0)
#endif
struct mlfi_priv {
--- 80,127 ----
#endif
/* Notes:
! * -For IPv6 not using s6_addr32 as Solaris 8 for some reason has it only
! * defined for its kernel...
! * -Using also first two characters in "from" and "rcpt" to distribute
! * potentially lot of triplets coming from a single host (first two
chars
! * only because "<>" is the "shortest" email address)
*/
#define F2B(s) (tolower((int)*(s)) | (tolower((int)*((s)+1)) << 8))
+ #define F2B_SPICE(from, rcpt) (conf.c_lazyaw ? 0 : (F2B(from) ^
F2B(rcpt)))
+
+ #define BUCKET_HASH_V4(sa, from, rcpt, bucket_count) \
+ ((ntohl(SADDR4(sa)->s_addr & \
+ ((ipaddr *)&conf.c_match_mask)->in4.s_addr) \
+ ^ F2B_SPICE(from, rcpt)) \
+ % bucket_count)
+
#ifdef AF_INET6
! #define BUCKET_HASH_V6(sa, from, rcpt, bucket_count) \
! (((((uint32_t)(SADDR6(sa)->s6_addr)[0] & \
! (uint32_t)(((ipaddr *)&conf.c_match_mask6)->in6.s6_addr)[0]) ^ \
! ((uint32_t)(SADDR6(sa)->s6_addr)[1] & \
! (uint32_t)(((ipaddr *)&conf.c_match_mask6)->in6.s6_addr)[1]) ^ \
! ((uint32_t)(SADDR6(sa)->s6_addr)[2] & \
! (uint32_t)(((ipaddr *)&conf.c_match_mask6)->in6.s6_addr)[2]) ^ \
! ((uint32_t)(SADDR6(sa)->s6_addr)[3] & \
! (uint32_t)(((ipaddr *)&conf.c_match_mask6)->in6.s6_addr)[3])) \
! ^ F2B_SPICE(from, rcpt)) \
! % bucket_count)
!
! #define BUCKET_HASH(sa, from, rcpt, bucket_count) \
! (sa->sa_family == AF_INET ? \
! BUCKET_HASH_V4(sa, from, rcpt, bucket_count) \
! : sa->sa_family == AF_INET6 ? \
! BUCKET_HASH_V6(sa, from, rcpt, bucket_count) \
! : 0)
!
! #else /* AF_INET6 */
!
! #define BUCKET_HASH(sa, from, rcpt, bucket_count) \
! (sa->sa_family == AF_INET ? \
! BUCKET_HASH_V4(sa, from, rcpt, bucket_count) \
! : 0)
#endif
struct mlfi_priv {
--
Sysadmin2006-03-28 by Ranko Zivojnovic
On Tue, 2006-03-28 at 11:35 +0300, Sysadmin wrote: > Tere. > > Please consider this version. > > > [test@test milter-greylist-2.1.3]# patch -p0 < milter-greylist.h.patch > patching file `milter-greylist.h' > Hunk #1 FAILED at 80. > 1 out of 1 hunk FAILED -- saving rejects to milter-greylist.h.rej Works for me(tm). Have you patched your source with something else too? Have you patched with an older version of milter-greylist.h.patch? R.
2006-03-28 by Sysadmin
Tere. > Works for me(tm). > Really? > Have you patched your source with something else too? Have you patched > with an older version of milter-greylist.h.patch? > > Nop, downloaded latest 2.1.3 source, unpacked and patched. -- Sysadmin
2006-03-28 by Hajimu UMEMOTO
Hi,
>>>>> On Mon, 27 Mar 2006 07:21:56 +0300
>>>>> Ranko Zivojnovic <ranko@...> said:
ranko> ! #define BUCKET_HASH_V6(sa, from, rcpt, bucket_count) \
ranko> ! (((((uint32_t)(SADDR6(sa)->s6_addr)[0] & \
ranko> ! (uint32_t)(((ipaddr *)&conf.c_match_mask6)->in6.s6_addr)[0]) ^ \
ranko> ! ((uint32_t)(SADDR6(sa)->s6_addr)[1] & \
ranko> ! (uint32_t)(((ipaddr *)&conf.c_match_mask6)->in6.s6_addr)[1]) ^ \
ranko> ! ((uint32_t)(SADDR6(sa)->s6_addr)[2] & \
ranko> ! (uint32_t)(((ipaddr *)&conf.c_match_mask6)->in6.s6_addr)[2]) ^ \
ranko> ! ((uint32_t)(SADDR6(sa)->s6_addr)[3] & \
ranko> ! (uint32_t)(((ipaddr *)&conf.c_match_mask6)->in6.s6_addr)[3])) \
ranko> ! ^ F2B_SPICE(from, rcpt)) \
ranko> ! % bucket_count)
It seems you are confused. The s6_addr is uint8_t, and is not casted
to uint32_t. Further, cast to `ipaddr *' is redundant.
Sincerely,
--
Hajimu UMEMOTO @ Internet Mutual Aid Society Yokohama, Japan
ume@... ume@{,jp.}FreeBSD.org
http://www.imasy.org/~ume/2006-03-28 by Ranko Zivojnovic
On Tue, 2006-03-28 at 14:13 +0300, Sysadmin wrote: > Tere. > > Works for me(tm). > > > Really? > > Have you patched your source with something else too? Have you > patched > > with an older version of milter-greylist.h.patch? > > > > > Nop, downloaded latest 2.1.3 source, unpacked and patched. > Well - then I'm really not sure where the problem is - anybody else with the same problem out there? Maybe a patch in a different format would help? Try the one attached here. ---test patching--- [ranko@ranko-fc3 tmp]$ gtar zxf milter-greylist-2.1.3.tgz [ranko@ranko-fc3 tmp]$ cd milter-greylist-2.1.3 [ranko@ranko-fc3 milter-greylist-2.1.3]$ patch -p0 < ../milter-greylist.h.patch patching file milter-greylist.h [ranko@ranko-fc3 milter-greylist-2.1.3]$ patch --version patch 2.5.4 Copyright 1984-1988 Larry Wall Copyright 1989-1999 Free Software Foundation, Inc. This program comes with NO WARRANTY, to the extent permitted by law. You may redistribute copies of this program under the terms of the GNU General Public License. For more information about these matters, see the file named COPYING. written by Larry Wall and Paul Eggert [ranko@ranko-fc3 milter-greylist-2.1.3]$ ---test patching--- R.
2006-03-28 by Ranko Zivojnovic
On Tue, 2006-03-28 at 21:19 +0900, Hajimu UMEMOTO wrote: > It seems you are confused. Hopefully not too much ... > The s6_addr is uint8_t, and is not casted to uint32_t. True - the mistake was in the typecast, but I still want to use 32 bit values in the hash calculus in order to allow for the large hash table. Linux and others do provide for s6_addr32, but Solaris 8 unfortunately does not (for user space at least) - thus the cast. If you have better or more portable solution, please do let me know. > Further, cast to `ipaddr *' is redundant. True... Anyhow - I have rewritten macros and have _tested_ them this time around, so now both v4 and v6 implementations do behave as expected. Attached is the working version. Sorry for any confusion :) caused. Best regards, R.
2006-03-28 by Matthias Scheler
On Tue, Mar 28, 2006 at 11:49:24PM +0300, Ranko Zivojnovic wrote: > True - the mistake was in the typecast, but I still want to use 32 bit > values in the hash calculus in order to allow for the large hash table. > Linux and others do provide for s6_addr32, but Solaris 8 unfortunately > does not (for user space at least) - thus the cast. If you have better > or more portable solution, please do let me know. Run a standard byte string hash function over the full IPv6 address. Your assumption that the last 32bit of an IPv6 address provide a good hash key is not true in a lot of cases anyway (e.g. for statically assigned IPv6 addresses). Kind regards -- Matthias Scheler http://scheler.de/~matthias/
2006-03-29 by Hajimu UMEMOTO
Hi,
>>>>> On Tue, 28 Mar 2006 23:49:24 +0300
>>>>> Ranko Zivojnovic <ranko@...> said:
ranko> True - the mistake was in the typecast, but I still want to use 32 bit
ranko> values in the hash calculus in order to allow for the large hash table.
I have no objection about use of 32 bit values.
ranko> Linux and others do provide for s6_addr32, but Solaris 8 unfortunately
ranko> does not (for user space at least) - thus the cast. If you have better
ranko> or more portable solution, please do let me know.
s6_addr32 is not standard. Only s6_addr is defined in RFC 3493. So,
you cannot use s6_addr32 as long as you consider portability.
BSDs such as FreeBSD and NetBSD don't expose s6_addr32 to userland
intentionally, as well.
ranko> True... Anyhow - I have rewritten macros and have _tested_ them this
ranko> time around, so now both v4 and v6 implementations do behave as
ranko> expected.
It seems okay to me.
ranko> Attached is the working version. Sorry for any confusion :) caused.
No problem. :-)
--
Hajimu UMEMOTO @ Internet Mutual Aid Society Yokohama, Japan
ume@... ume@{,jp.}FreeBSD.org
http://www.imasy.org/~ume/2006-03-29 by manu@netbsd.org
Ranko Zivojnovic <ranko@...> wrote: > True - the mistake was in the typecast, but I still want to use 32 bit > values in the hash calculus in order to allow for the large hash table. > Linux and others do provide for s6_addr32, but Solaris 8 unfortunately > does not (for user space at least) - thus the cast. If you have better > or more portable solution, please do let me know. Make a configure check for s6_addr32, and define it if it does not exist? -- Emmanuel Dreyfus Le cahier de l'admin BSD 2eme ed. est dans toutes les bonnes librairies http://www.eyrolles.com/Informatique/Livre/9782212114638/livre-bsd.php manu@...
2006-03-29 by Hajimu UMEMOTO
Hi,
>>>>> On Wed, 29 Mar 2006 08:02:17 +0200
>>>>> manu@... said:
manu> Make a configure check for s6_addr32, and define it if it does not
manu> exist?
I think it is overkill. :-)
Sincerely,
--
Hajimu UMEMOTO @ Internet Mutual Aid Society Yokohama, Japan
ume@... ume@{,jp.}FreeBSD.org
http://www.imasy.org/~ume/2006-03-29 by manu@netbsd.org
Hajimu UMEMOTO <ume@...> wrote: > manu> Make a configure check for s6_addr32, and define it if it does not > manu> exist? > I think it is overkill. :-) We already have a configure test to check if we can have __RCSID twice in a .c file :-) -- Emmanuel Dreyfus Un bouquin en français sur BSD: http://www.eyrolles.com/Informatique/Livre/9782212114638/livre-bsd.php manu@...
2006-03-30 by Ranko Zivojnovic
On Wed, 2006-03-29 at 08:02 +0200, manu@... wrote: > Ranko Zivojnovic <ranko@...> wrote: > > > True - the mistake was in the typecast, but I still want to use 32 > bit > > values in the hash calculus in order to allow for the large hash > table. > > Linux and others do provide for s6_addr32, but Solaris 8 > unfortunately > > does not (for user space at least) - thus the cast. If you have > better > > or more portable solution, please do let me know. > > Make a configure check for s6_addr32, and define it if it does not > exist? Would that be really necessary? `s6_addr' used by the equation is defined by RFC and having redundant checks and #if's would IMHO only clutter the code. R.
2006-03-30 by Ranko Zivojnovic
On Tue, 2006-03-28 at 23:25 +0100, Matthias Scheler wrote: > On Tue, Mar 28, 2006 at 11:49:24PM +0300, Ranko Zivojnovic wrote: > > True - the mistake was in the typecast, but I still want to use 32 > bit > > values in the hash calculus in order to allow for the large hash > table. > > Linux and others do provide for s6_addr32, but Solaris 8 > unfortunately > > does not (for user space at least) - thus the cast. If you have > better > > or more portable solution, please do let me know. > > Run a standard byte string hash function over the full IPv6 address. > Your assumption that the last 32bit of an IPv6 address provide a good > hash key is not true in a lot of cases anyway (e.g. for statically > assigned IPv6 addresses). Some tests I've done show that the current hash function does pretty good job distributing records across buckets. Originally, before I have published any patches, I've only used the IP as a bucket reference. The distribution was not really satisfactory as certain buckets would hold a lot of records. This was because one bucket would hold for example ALL the records coming from one regular mail server - which can be really a lot. This is when I have added some "salt" to the equation which also takes into consideration first two characters of both sender and recipient. The distribution then evened across the buckets. The same "salt" trick is used for both IPv4 and IPv6 thus I cannot really see any obvious need to change the current algorithm. Best regards, Ranko