Synth-DIY Yahoo! Groups Archives

Hi,

I'm very sorry, this is going to be a somewhat lengthy mail.
I hope your spam filter doesn't drop it.  ;-)
But the issue at hand is _not_ trivial and cannot be
discussed in a single paragraph.

Matt Kettler wrote:
 > That said, do you really think your regex execution is THAT slow
 > that it will take longer than 50us of real CPU time to evaluate
 > one on a 2ghz processor?

Do you know how regular expressions work?  The expression
itself (i.e. the string from the configuration file) is
not stored in memory, neither is it used directly for
matching.

Instead, during parsing, the regex is "compiled" into a
finite state automaton.  That automaton is basically a
data structure consisting of several tables (arrays),
typically several hundred bytes in size, sometimes even
thousands.  The size depends on the structure of the regexp
and the actual implementation of the regular expression
library, and how it trades off speed and size.  I assume
that milter-greylist uses the system library (regex(3)),
which can behave quite differently on different platforms,
depending on whether the authors turned their attention
to speed or to memory consumption.
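
To make the "compile once, execute many times" idea concrete,
here is a minimal sketch in Python.  Note that Python's re
module is a backtracking engine and not the regex(3) library
that milter-greylist presumably uses, so sizes and timings
will differ.  The pattern and the host names are made up for
illustration only.

```python
import re

# Hypothetical pattern for dial-up style reverse DNS names (an assumption
# for illustration, not taken from any real milter-greylist configuration).
# It is compiled once, when the module is loaded.
DYNAMIC_HOST = re.compile(r"^(dsl|dial|ppp)[0-9-]*\.")

def looks_dynamic(hostname):
    """Apply the precompiled pattern; the regex is not parsed again here."""
    return DYNAMIC_HOST.search(hostname) is not None

if __name__ == "__main__":
    for name in ("ppp-10-0-0-1.example.net", "mail.example.org"):
        print(name, looks_dynamic(name))
```

The compiled object plays the role of the automaton described
above: it is built once at configuration time and then only
executed for each connection.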

When the automaton is applied ("executed") to a string,
the speed also depends very much on the structure of the
regular expression.  If the automaton only contains simple
states (e.g. fixed characters and unlimited repetitions),
it is almost as fast as a plain string comparison, i.e.
negligible.  On the other hand, complicated states such
as repetition ranges or back-references require recursion
and will need significantly more CPU time.  If you nest
them several levels deep, CPU consumption can grow
exponentially.

Bottom line:  When using regular expressions, it's worth
understanding how they work and then crafting them in a way
that is most efficient.  Perform benchmarks if necessary
(using the same regex library, of course).  It can make a
huge difference.
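
A benchmark of that kind could look roughly like the sketch
below.  It uses Python's re module, which is a backtracking
matcher, so the numbers say nothing about your system's
regex(3); as said above, you would have to repeat the
measurement with the library you actually use.  The two
patterns and the input length are arbitrary choices for the
example.

```python
import re
import time

# Both patterns reject the same input; only their structure differs.
SIMPLE = re.compile(r"a+$")      # one plain repetition
NESTED = re.compile(r"(a+)+$")   # a repetition nested inside another

def time_match(pattern, text):
    """Return the wall-clock seconds spent on one match attempt."""
    start = time.perf_counter()
    pattern.match(text)
    return time.perf_counter() - start

def compare(n=18):
    # n 'a' characters followed by a 'b' force the match to fail, so a
    # backtracking engine tries many ways of splitting the run of 'a's.
    text = "a" * n + "b"
    return time_match(SIMPLE, text), time_match(NESTED, text)

if __name__ == "__main__":
    simple, nested = compare()
    print("simple: %.6f s, nested: %.6f s" % (simple, nested))
```

Increase n in small steps when experimenting; with the nested
pattern each additional character roughly doubles the work in
a backtracking engine.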

UDP is a very efficient protocol, especially when run
over the loopback interface (as is the case when you have
a local caching nameserver, which you definitely should
have when running an MTA).  Of course it depends on the
implementation of the IP stack in the kernel.  UDP is
connection-less, doesn't have to care about re-ordering,
retransmits etc., and the loopback interface doesn't have
to produce an ethernet frame and doesn't have to perform
fragmentation and re-assembly.  So it's several orders of
magnitude less complicated and more efficient than a TCP
connection to a remote host.

Basically, the resolver library generates a request packet
(typically between 60 and 80 bytes).  That packet is copied
around once or twice in the kernel, maybe not even once if
the kernel is well optimized for that case.  It doesn't
matter much anyway for 80 bytes.  For DNS black lists, the
reply is usually not much larger ("real" DNS replies are
noticeably larger, typically between 100 and 400 bytes,
depending on how many RRs and glue records the server
associates with the request, but that's still not much).
The processing on the server side depends a lot on the
efficiency of the implementation of the caching nameserver
(as far as I know, BIND is quite good in that regard).
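
To get a feeling for the packet size, the following sketch
builds the query that a DNS BL lookup would send.  It assumes
the usual convention of reversing the IPv4 octets and
appending the list's zone; "dnsbl.example.org" is a made-up
zone, the query ID is arbitrary, and no EDNS options are
included.  It only constructs the message and does not send
anything.

```python
import struct

def dnsbl_query_name(ipv4, zone):
    """Reverse the octets of an IPv4 address and append the list's zone."""
    octets = ipv4.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address")
    return ".".join(reversed(octets)) + "." + zone

def build_query(name, query_id=0x1234):
    """Build a DNS query message for an A record (no EDNS options)."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label prefixed by its length, ended by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.split(".")
    ) + b"\x00"
    # QTYPE 1 = A, QCLASS 1 = IN.
    return header + qname + struct.pack("!HH", 1, 1)

if __name__ == "__main__":
    name = dnsbl_query_name("127.0.0.2", "dnsbl.example.org")
    message = build_query(name)
    # Add 8 bytes of UDP header and 20 bytes of IPv4 header (no options).
    print(name, len(message), len(message) + 8 + 20)
```

For this example name the DNS message is 45 bytes, or 73
bytes including the UDP and IPv4 headers, which is in line
with the 60 to 80 bytes mentioned above.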

If the answer to the query is not already cached, then it
has to be fetched from a parent (forwarder), or from an
authoritative nameserver directly.  Of course, this will
lead to a much higher latency, but during that time the
local CPU is free to do other things, as Emmanuel already
pointed out, so it doesn't matter at all.  On my FreeBSD 4
machine the overhead is near zero; it takes quite
sophisticated benchmarks to measure it at all.

In fact, you can probably run a caching nameserver on a
different machine (or load-balance on several machines) in
the same LAN (gigabit) instead of the local machine, and
you won't notice a difference.

Some final notes:  First, when using a packet filter (IPFW
or PF on BSD systems, IPF on Solaris, iptables/ipchains on
Linux or whatever), make sure that loopback and/or UDP/53
traffic is short-cut right at the beginning of your rules,
or even completely exempt from filtering if possible.  If
your DNS traffic has to pass through hundreds of rules,
it can add noticeable latency _and_ CPU processing.  The
rule of thumb from regular expressions also applies to
packet filters:  Craft the rules very carefully, and add
short-cuts for the bulk of the traffic wherever possible.
If possible, use a different machine as a filtering bridge
or similar, and don't use packet filtering at all on the
MTA machines.
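
The effect of rule ordering can be shown with a toy model.
The sketch below is not the syntax or the exact semantics of
IPFW, PF or any other real packet filter; it simply assumes a
linear rule list with first-match evaluation and counts how
many rules are inspected for a loopback DNS packet.  The 300
placeholder rules and the packet fields are invented for the
example.

```python
def rules_checked(rules, packet):
    """Count how many rules a first-match filter inspects for one packet."""
    for index, (predicate, _action) in enumerate(rules, start=1):
        if predicate(packet):
            return index
    return len(rules)

def is_local_dns(packet):
    """Match loopback traffic or UDP traffic on port 53."""
    return packet["iface"] == "lo0" or (
        packet["proto"] == "udp" and packet["port"] == 53
    )

def never_matches(packet):
    """Stand-in for a rule that is unrelated to DNS traffic."""
    return False

# 300 placeholder rules stand in for a long, unrelated rule set.
BULK = [(never_matches, "block")] * 300
SHORTCUT_LAST = BULK + [(is_local_dns, "pass")]
SHORTCUT_FIRST = [(is_local_dns, "pass")] + BULK

if __name__ == "__main__":
    dns = {"iface": "lo0", "proto": "udp", "port": 53}
    print(rules_checked(SHORTCUT_LAST, dns), rules_checked(SHORTCUT_FIRST, dns))
```

With the short-cut at the top, every DNS packet is settled by
the first rule instead of walking the whole list.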

Second:  Note the fact that sendmail performs a DNS lookup
on every incoming connection anyway.  The result from that
lookup is passed to milter-greylist for matching with the
"acl domain" feature.  So even when using regexps on domain
names and no BLs at all, you cannot avoid DNS lookups
completely.

A note on memory efficiency:  For each regular expression,
the finite state automaton has to be stored in memory, but
no additional data has to be stored for every connection
from a remote host (except for the data that is present
anyway, no matter if regexps are used or not).  However,
for DNS BL lookups, the result is always cached per remote
host (by your name server, not by milter-greylist).  So
using regular expressions scales better in that regard.
However, if the size of your named process isn't critical,
then it doesn't matter.  The size of the milter-greylist
process is dominated by the size of the database, but not
by the number of DNS lookups or regular expressions.

So, the bottom line is:  Whether regular expressions or
DNS BLs are more efficient for someone depends on a whole
lot of things.  You cannot generically say that one is
always more efficient than the other.  I think on really
large servers it's worth investigating to find the best
mix of the two.  For example, in a particular case it
might be best to first filter a bunch of domains out with
a broad set of _simple_ regular expressions, then perform
DNS BL lookups on the rest, and apply further refined
regexps (possibly less efficient ones) depending on the
outcome of the lookup.  (I'm not sure if the current
version of milter-greylist supports such constructs; I
haven't given it a try yet because of personal resource
constraints on my side.)
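
Such a mix could be sketched as follows.  This is plain
Python, not milter-greylist configuration syntax, and it shows
only one possible ordering of the three steps.  The patterns,
the function names and the "greylist"/"pass" results are
hypothetical, and the DNS BL lookup is passed in as a function
so that the sketch runs without any network access.

```python
import re

# Hypothetical patterns; real ones would come from your own mail logs.
CHEAP = [re.compile(r"\.dyn\."), re.compile(r"^ppp[0-9]")]
REFINED = [re.compile(r"^host-[0-9]+-[0-9]+\.")]

def classify(hostname, ip, is_listed):
    """Return "greylist" or "pass", trying the cheapest checks first.

    is_listed is a callable that takes an IP string and returns True if
    a DNS BL lists it; it is injected so no real lookup is made here.
    """
    # Step 1: broad, simple regexps settle the obvious cases.
    if any(p.search(hostname) for p in CHEAP):
        return "greylist"
    # Step 2: DNS BL lookup for everything that is left.
    if is_listed(ip):
        return "greylist"
    # Step 3: refined (possibly slower) regexps only for hosts that the
    # lookup did not settle.
    if any(p.search(hostname) for p in REFINED):
        return "greylist"
    return "pass"

if __name__ == "__main__":
    fake_bl = lambda ip: ip == "192.0.2.1"   # stand-in for a real lookup
    print(classify("ppp7.example.net", "198.51.100.5", fake_bl))
    print(classify("mail.example.org", "198.51.100.5", fake_bl))
```

Whether this ordering or a different one is best for a given
server is exactly the kind of question that needs measuring.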

Best regards
   Oliver

-- 
Oliver Fromme,  secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing
Dienstleistungen mit Schwerpunkt FreeBSD: http://www.secnetix.de/bsd
Any opinions expressed in this message may be personal to the author
and may not necessarily reflect the opinions of secnetix in any way.

Perl is worse than Python because people wanted it worse.
        -- Larry Wall
Milter-greylist

Re: [milter-greylist] Re: Limiting resident memory usage
