Enabling Bayesian filter with amavisd-new + Spamassassin

Thu Sep 25 17:17:25 CEST 2014

I posted the following message on ServerFault several days ago, but since
it has gotten very little attention, I thought I would ask on this mailing
list.

I'm trying to figure out a few things:
1) Whether SpamAssassin is configured properly and whether I'm editing the
right config files
2) How I can get the Bayes filter's headers into my emails (even if the
Bayes filter isn't used)
3) How many messages (100 vs. 200) Bayes has to be trained on (with
sa-learn), and whether that's ultimately the issue I'm experiencing right now.

Here's the original post below:

I run a Postfix mailserver on CentOS, and am trying to enable
SpamAssassin's Bayes filter, but I seem to be missing something.

We're running amavisd-new 2.9.1:

Name        : amavisd-new
Arch        : noarch
Version     : 2.9.1
Release     : 2.el6
Size        : 3.0 M
Repo        : installed
From repo   : epel

... with SpamAssassin 3.3.1:

Installed Packages
Name        : spamassassin
Arch        : x86_64
Version     : 3.3.1
Release     : 3.el6
Size        : 3.1 M
Repo        : installed
From repo   : updates

From what I can tell, my only spamassassin config files are located in
/etc/mail/spamassassin.

The local.cf file in this directory contains the following:

# These values can be overridden by editing ~/.spamassassin/user_prefs.cf
# (see spamassassin(1) for details)

# These should be safe assumptions and allow for simple visual sifting
# without risking lost emails.

required_hits 5
report_safe 0
rewrite_header Subject [SPAM]
use_bayes 1
bayes_auto_learn 1
bayes_auto_expire 0
bayes_path /var/amavis/var/.spamassassin/
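One thing that may be worth checking in a file like this: as far as the SpamAssassin configuration documentation describes it, bayes_path is a directory plus filename prefix (the default is ~/.spamassassin/bayes) to which suffixes such as _toks and _seen are appended, not a bare directory. The sketch below is a small lint under that assumption. The function names are hypothetical, and it says nothing about whether amavisd actually reads this file.

```python
def parse_local_cf(text):
    """Return a dict of directive -> value from local.cf-style text."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        settings[parts[0]] = parts[1] if len(parts) > 1 else ""
    return settings


def bayes_warnings(settings):
    """Flag settings that commonly keep Bayes from working (assumed checks)."""
    warnings = []
    if settings.get("use_bayes", "1") != "1":
        warnings.append("use_bayes is not 1")
    if settings.get("bayes_path", "").endswith("/"):
        # Assumption: bayes_path should be a prefix such as .../bayes
        warnings.append("bayes_path ends in '/', expected a filename prefix")
    return warnings


# The directives quoted above, copied verbatim.
SAMPLE = """\
required_hits 5
report_safe 0
rewrite_header Subject [SPAM]
use_bayes 1
bayes_auto_learn 1
bayes_auto_expire 0
bayes_path /var/amavis/var/.spamassassin/
"""

warnings = bayes_warnings(parse_local_cf(SAMPLE))
for w in warnings:
    print("warning:", w)
```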

amavisd.conf is located in /etc/amavisd/, and I *think* I've included all
of the settings needed to turn SpamAssassin "on", but I'm not positive.

Some websites I've read indicate that the Bayesian filter needs to be
trained on 100 messages (of both spam and non-spam) using sa-learn, but
I've seen at least one website indicating that it needs to be trained on
200 messages. That said, I can confirm I've trained the filter on at
least 100 spam messages.
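For reference, the SpamAssassin documentation lists defaults of 200 for both bayes_min_spam_num and bayes_min_ham_num, so the classifier is not consulted until it has learned at least that many spam and that many non-spam messages. The learned counts can be read from the nspam and nham lines of `sa-learn --dump magic`. Below is a minimal sketch that parses such output. The sample text is illustrative only (it is not output from this server), and the column layout is assumed from typical sa-learn output.

```python
import re

# Documented defaults for bayes_min_spam_num / bayes_min_ham_num.
MIN_SPAM = 200
MIN_HAM = 200

# Illustrative sample only; not real output from the system in this post.
SAMPLE_DUMP = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0        100          0  non-token data: nspam
0.000          0          0          0  non-token data: nham
0.000          0      15230          0  non-token data: ntokens
"""

LINE_RE = re.compile(
    r"\s*\S+\s+\S+\s+(\d+)\s+\S+\s+non-token data: (nspam|nham)\s*$"
)


def parse_dump_magic(text):
    """Extract the nspam and nham counters from 'sa-learn --dump magic' text."""
    counts = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            counts[m.group(2)] = int(m.group(1))
    return counts


def bayes_ready(counts, min_spam=MIN_SPAM, min_ham=MIN_HAM):
    """True once both counters reach their minimums."""
    return (counts.get("nspam", 0) >= min_spam
            and counts.get("nham", 0) >= min_ham)


counts = parse_dump_magic(SAMPLE_DUMP)
ready = bayes_ready(counts)
print(counts, "ready" if ready else "not enough training data")
```

Note that with these defaults, training on spam alone would never be enough: the non-spam counter has to reach its minimum as well.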

So now, after training the filter on these 100 spam messages, whenever I
receive an email I'm still seeing no indication in the mail headers that
the Bayesian filter is being used:

X-Virus-Scanned: amavisd-new at developcents.com
X-Spam-Flag: NO
X-Spam-Score: -0.525
X-Spam-Level:
X-Spam-Status: No, score=-0.525 tagged_above=-999 required=4
    tests=[HK_RANDOM_FROM=1, HTML_MESSAGE=0.001, RP_MATCHES_RCVD=-2.499,
    SPF_SOFTFAIL=0.972, URIBL_BLOCKED=0.001] autolearn=unavailable

Even if bayes isn't fully trained and ready to be "used" yet, shouldn't I
be seeing a tag in the X-Spam-Status section that indicates whether or not
it's using the Bayes filter?

(For what it's worth, the email whose partial header I've posted above
was spam, and obviously didn't get marked as such.)

Is there something I'm missing?

-- 
David White
Founder & CEO

*Develop CENTS *
Computing, Equipping, Networking, Training & Supporting
Nonprofit Organizations Worldwide
http://developcents.com