Bayes expire files
Mark Martinec
Mark.Martinec+amavis at ijs.si
Thu Oct 20 21:00:51 CEST 2011
Julien,
> "Gorn" <Gorn at xs4all.nl> wrote:
> > http://old.nabble.com/bayes_toks.expire-problem-td22502372.html
>
> I already tried some of the ways indicated here, and nothing very
> good..
You did try disabling auto-expire and running the expiry manually,
as indicated in that thread?
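A minimal sketch of that approach, assuming your SpamAssassin
settings live in local.cf and that the manual run is done as the
user that owns the bayes files. In local.cf:

    # do not attempt opportunistic expiry while scanning mail
    bayes_auto_expire 0

and then from cron, at a quiet hour, as that same user:

    sa-learn --force-expire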
> root at rei ~ % ls -lah /usr/jails/mail/var/amavis/.spamassassin/
> total 92138
> drwx------ 2 110 110 9B 19 oct 12:35 .
> drwxr-xr-x 6 110 110 9B 19 oct 12:44 ..
> -rw------- 1 110 110 25K 19 oct 12:46 bayes_journal
> -rw------- 1 110 110 4,9M 19 oct 12:35 bayes_seen
> -rw------- 1 110 110 39M 19 oct 12:35 bayes_toks
> -rw------- 1 110 110 65M 18 oct 11:31 bayes_toks.expire10463
> -rw------- 1 110 110 128T 19 oct 12:16 bayes_toks.expire21624
> -rw------- 1 110 110 8,0T 19 oct 11:35 bayes_toks.expire64012
> -rw-r----- 1 110 110 109B 18 oct 11:31 razor-agent.log
>
> On a 80 GB disk, this is a very good compression :)
:-)
If a temporary tokens database grows so much larger than
the original database, my guess is that the current database
is corrupted.
> On debug mode, I got a lot of :
> Oct 19 12:16:16.061 [21624] dbg: locker: refresh_lock:
> refresh /var/amavis/.spamassassin/bayes.lock
>
> and after some time :
> HASH: Out of overflow pages. Increase page size
> Segmentation fault (core dumped)
For bayes databases of any substantial size, an SQL-based bayes
usually offers faster and more reliable operation. Instructions are
in the sql directory of the SpamAssassin distribution (files README.bayes
and bayes_mysql.sql or bayes_pg.sql). Choose either MySQL with InnoDB
and Mail::SpamAssassin::BayesStore::MySQL as bayes_store_module,
or a fairly recent version of PostgreSQL. With bayes on SQL it is usually
just fine to leave auto-expiry enabled.
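A minimal sketch of the local.cf side for MySQL, assuming the schema
from bayes_mysql.sql has already been loaded. The database name, host,
user name and password below are placeholders, not values from your
setup:

    bayes_store_module  Mail::SpamAssassin::BayesStore::MySQL
    bayes_sql_dsn       DBI:mysql:sa_bayes:localhost
    bayes_sql_username  sa_user
    bayes_sql_password  changeme

For PostgreSQL the module would be Mail::SpamAssassin::BayesStore::PgSQL,
with a DSN of the form DBI:Pg:dbname=sa_bayes;host=localhost (again
with placeholder names).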
As long as the rest of your SA rules and network tests are good,
it is not a big deal to start a new bayes database from scratch and
leave it to auto-learning. For the first couple of hours it may be
prudent to lower the scores of the BAYES_00 and BAYES_99 rules.
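For example, something along these lines in local.cf, where the score
values are only illustrative and should be chosen relative to your
own thresholds:

    # temporary, while the new database is still being trained
    score BAYES_00 -0.5
    score BAYES_99  2.0

Remove the two lines again once the database has seen a reasonable
amount of ham and spam.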
Btw, if starting from scratch, it is also a good idea to set:
bayes_auto_learn_on_error 1
(introduced with SpamAssassin 3.3).
See the Mail::SpamAssassin::Plugin::AutoLearnThreshold man page
for a description of this setting.
Mark