Training Amavis

btb listsb-amavis at bitrate.net
Mon Feb 1 14:43:36 CET 2016


On 2016.02.01 02.32, @lbutlr wrote:
>
>> On Jan 31, 2016, at 9:49 PM, listsb-amavis at bitrate.net wrote:
>>
>>
>>> On Jan 31, 2016, at 23.07, @lbutlr <kremels at kreme.com> wrote:
>>>
>>> I get daily mails from wordpress verifying backups and these are
>>> all tagged as spam (at a very high score in the 7-13 range).
>>>
>>> How do I train amavis? Do I just run normal sa-learn as root? As
>>> the user? As the scan user?
>>
>> you don't train amavis.  you train spamassassin.  they are two
>> different pieces of software, which work well together.  while
>> training spamassassin is good to do regardless of if you are having
>> a problem or not, blindly training it to solve a specific problem
>> is not a sensible approach.
>
> I am not blindly training it. I am training false positives as
> ham.
>
> What I need to know is what user to train them as so that amavis will
> use the bayes database that I am training.
>
> They all hit BAYES_99 and BAYES_999, some hit other rules as well.
>
> X-Spam-Status: Yes, score=10.2 required=5.0
> tests=BAYES_99,BAYES_999,
> HEADER_FROM_DIFFERENT_DOMAINS,NO_RELAYS,TVD_SPACE_RATIO,TVD_SPACE_RATIO_MINFP
>  autolearn=no autolearn_force=no version=3.4.1
>
>
>> instead, look at the *actual* scoring the message was given
>> [X-Spam-Status header], and see which rule[s] are the ones which
>> significantly contributed to the score.
>
> Yes, that’s what I’ve done.
>
>> then you can determine the right way to solve the problem.
>
> Training falsely classified mail is *always* a good idea.
>
> The question still remains, do I train SA as root, as the user (which
> is a problem for most of the users since they are virtual users in a
> database) or as the vscan user?
>
> That is to say:
>
> sa-learn -u *WHAT* --ham /path/to/ham

you must train the database that is used during message evaluation.
that is to say, the one named by the spamassassin bayes_path setting of
whatever user is running amavis.  this may be undefined, in which case
it is the default of ~/.spamassassin/bayes; it may be defined in the
global spamassassin config; or it may be defined in the amavis user's
spamassassin config [e.g. ~/.spamassassin/user_prefs].

see 
https://spamassassin.apache.org/full/3.4.x/doc/Mail_SpamAssassin_Conf.html 
for further detail on bayes_path
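
as a rough sketch of how you might check whether bayes_path is set anywhere, something like the following could help.  the helper name find_bayes_path is made up for illustration, and the locations searched [/etc/spamassassin and an account named "amavis"] are assumptions - adjust them to your system:

```shell
#!/bin/sh
# hypothetical helper: print every uncommented bayes_path line found in
# the files or directories given as arguments
find_bayes_path() {
    grep -rhE '^[[:space:]]*bayes_path[[:space:]]' "$@" 2>/dev/null
}

# assumed locations: a global config directory and the user_prefs of an
# account named "amavis"; both vary between systems
find_bayes_path /etc/spamassassin ~amavis/.spamassassin/user_prefs \
    || echo "no bayes_path line found in those locations"
```

if nothing turns up in any of the config locations your system uses, the default described above applies.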

once you have identified this detail, the simplest way may be to just 
run sa-learn as the user running amavis - but as with anything, there 
are numerous methods, and the one which best fits your conditions can 
vary greatly.  all that matters is that the database files which are 
worked on are the ones amavis uses when running spamassassin.
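
a minimal sketch of the "run sa-learn as the user running amavis" approach, assuming sudo is available and that the account is named "amavis" [on your system it may be "vscan" or something else].  the wrapper name as_amavis is made up for illustration, and it only prints the commands unless you turn the dry run off:

```shell
#!/bin/sh
# assumption: amavis runs as the account "amavis"; override with
# AMAVIS_USER=... in the environment if yours differs
AMAVIS_USER="${AMAVIS_USER:-amavis}"

# dry run by default: commands are printed, not executed.
# set DRY_RUN to the empty string to really run them.
DRY_RUN="${DRY_RUN-1}"

# run sa-learn as the amavis account.  sudo -H sets HOME to that
# account's home directory, so a bayes_path under ~/.spamassassin
# resolves there rather than in your own home.
as_amavis() {
    ${DRY_RUN:+echo} sudo -H -u "$AMAVIS_USER" sa-learn "$@"
}

as_amavis --ham /path/to/ham    # train the false positives as ham
as_amavis --dump magic          # show the db counters [nham, nspam, ...]
```

comparing the --dump magic output before and after training is one way to confirm that the database being updated is the one you expect.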

for reference, i use the following setting for spamassassin, which i
find helpful in keeping clear which files make up the db and keeping
them organized/separated from other spamassassin files:

/etc/spamassassin/99_local-config.cf:
# note: the value specified here is *not* a directory.  it is
# a directory plus a prefix used in the names of the various
# files that comprise the entirety of the bayes database
bayes_path				~/.spamassassin/bayes_db/bayes
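
to illustrate the prefix behavior, here is a small sketch that prints the file names such a bayes_path value leads to.  it assumes the default file-based storage, where spamassassin appends suffixes such as _toks and _seen to the prefix; other files [a journal, lock files] may show up alongside them.  the function name bayes_files is made up for illustration:

```shell
#!/bin/sh
# hypothetical helper: given a bayes_path value [directory plus prefix],
# print the main database file names derived from it
bayes_files() {
    for suffix in toks seen; do
        printf '%s_%s\n' "$1" "$suffix"
    done
}

# with the setting above, run as the amavis user, this prints
# .../.spamassassin/bayes_db/bayes_toks and .../bayes_db/bayes_seen
bayes_files "$HOME/.spamassassin/bayes_db/bayes"
```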

