According to manu@...:
>
> > > While I'm here, I'd like to seek comments on the following ACL, which is
> > > designed to fight image spam:
> > > dacl blacklist body /src[:blank:]*=(3D)?[:blank:]*["']?[:blank:]*cid:/ \
> > > msg "HTML messages with embedded images are now refused"
> >
> > not usable for me. That matches at least a third of my business emails,
> > including customers and suppliers.
> > All seems to be company enforced sigs with logos and links.
>
> And the images are embedded in the mail and referenced with a cid: URL?

Correct. Such a regexp could only lead to tons of false positives.

What you can do is feed all emails matching this to 'gocr', then check
the result against a list of words. That may not be enough, as direct OCR
is still too weak (animated GIFs, transcoding, and so on).
Add that and you obtain FuzzyOcr.

/Fabien
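To illustrate the two-stage idea above, here is a minimal Python sketch, not milter-greylist or FuzzyOcr code. It assumes the regexp is translated to Python syntax, with `[ \t]` standing in for the POSIX blank class (Python's `re` module has no POSIX classes). The names `references_cid_image`, `ocr_text_is_spammy`, the word list and the threshold are all hypothetical. Running the OCR tool itself is left to the caller; the second function only inspects text that an OCR pass has already produced.

```python
import re

# Python translation of the quoted ACL regexp; [ \t] replaces the POSIX
# blank class, which Python's re module does not support.
CID_SRC = re.compile(r"""src[ \t]*=(3D)?[ \t]*["']?[ \t]*cid:""")

# Hypothetical word list; a real one would be tuned on actual spam.
SPAM_WORDS = {"viagra", "stock", "pills"}


def references_cid_image(body: str) -> bool:
    """Stage 1: does the body reference an embedded image via a cid: URL?"""
    return CID_SRC.search(body) is not None


def ocr_text_is_spammy(ocr_text: str, words=SPAM_WORDS, threshold: int = 2) -> bool:
    """Stage 2: count distinct listed words in text produced by an OCR pass."""
    found = {w for w in re.findall(r"[a-z]+", ocr_text.lower()) if w in words}
    return len(found) >= threshold


# A legitimate company signature with an embedded logo passes stage 1 too,
# which is why stage 1 alone cannot be used to reject mail.
signature = '<img src=3D"cid:logo@example.com" alt=3D"Company logo">'
print(references_cid_image(signature))
```

Only messages that pass stage 1 would be handed to the OCR step, and only the OCR result decides whether the message is treated as spam.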
Message
Re: block cid: URL as image source (was: Re: [milter-greylist] milter-greylist 3.1.3 is available)
2007-01-04 by Fabien Tassin