Bayes expire files
Julien Gormotte
julien at gormotte.info
Fri Oct 21 09:02:06 CEST 2011
On Thu, 20 Oct 2011 21:00:51 +0200,
Mark Martinec <Mark.Martinec+amavis at ijs.si> wrote:
> Julien,
>
> > "Gorn" <Gorn at xs4all.nl> wrote:
> > > http://old.nabble.com/bayes_toks.expire-problem-td22502372.html
> >
> > I already tried some of the ways indicated here, and nothing very
> > good..
>
> You did try disabling auto-expire and running it manually,
> as indicated in that thread?
Yes, I set:
bayes_expiry_max_db_size 300000
bayes_auto_expire 0
And ran:
sa-learn --force-expire
It ran for quite some time, and I got these huge files. Before that, the
files were using "just" 34 GB.
>
> > root at rei ~ % ls -lah /usr/jails/mail/var/amavis/.spamassassin/
> > total 92138
> > drwx------ 2 110 110 9B 19 oct 12:35 .
> > drwxr-xr-x 6 110 110 9B 19 oct 12:44 ..
> > -rw------- 1 110 110 25K 19 oct 12:46 bayes_journal
> > -rw------- 1 110 110 4,9M 19 oct 12:35 bayes_seen
> > -rw------- 1 110 110 39M 19 oct 12:35 bayes_toks
> > -rw------- 1 110 110 65M 18 oct 11:31 bayes_toks.expire10463
> > -rw------- 1 110 110 128T 19 oct 12:16 bayes_toks.expire21624
> > -rw------- 1 110 110 8,0T 19 oct 11:35 bayes_toks.expire64012
> > -rw-r----- 1 110 110 109B 18 oct 11:31 razor-agent.log
> >
> > On an 80 GB disk, this is a very good compression :)
>
> :-)
>
> If a temporary tokens database grows so much larger than
> the original database, my guess is that the current database
> is corrupted.
I tried running:
sa-learn --clear
and then:
sa-learn --force-expire
It did not remove the expire files, so I deleted them manually. I'll see
what happens.
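
Before deleting leftover expire files by hand, it can help to see how much
disk space they really occupy. The sketch below is only an illustration, not
part of SpamAssassin: it assumes a POSIX system (where st_blocks is counted in
512-byte units) and assumes the oversized files are sparse, which would
explain sizes in the terabyte range on an 80 GB disk. The function name
expire_file_sizes is hypothetical.

```python
import sys
from pathlib import Path


def expire_file_sizes(directory):
    """Return (name, apparent_bytes, allocated_bytes) for each
    bayes_toks.expire* file found directly in `directory`."""
    rows = []
    for path in sorted(Path(directory).glob("bayes_toks.expire*")):
        st = path.stat()
        # st_size is the apparent size (what ls -l reports);
        # st_blocks * 512 is the space actually allocated on disk (POSIX).
        rows.append((path.name, st.st_size, st.st_blocks * 512))
    return rows


if __name__ == "__main__":
    # Directory to inspect; defaults to the current directory.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, apparent, allocated in expire_file_sizes(target):
        print(f"{name}: apparent={apparent} bytes, allocated={allocated} bytes")
```

Run with the bayes directory as its argument (here that would be
/var/amavis/.spamassassin inside the jail). An allocated size far below the
apparent size suggests a sparse file.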
>
> > In debug mode, I got a lot of:
> > Oct 19 12:16:16.061 [21624] dbg: locker: refresh_lock:
> > refresh /var/amavis/.spamassassin/bayes.lock
> >
> > and after some time:
> > HASH: Out of overflow pages. Increase page size
> > Segmentation fault (core dumped)
>
> For bayes databases of any substantial size, choosing an SQL-based
> bayes usually offers faster and more reliable operation.
> Instructions are in the sql directory of the SpamAssassin
> distribution (files README.bayes and bayes_mysql.sql or
> bayes_pg.sql). Choose either MySQL with InnoDB and
> Mail::SpamAssassin::BayesStore::MySQL as bayes_store_module, or a
> fairly recent version of PostgreSQL. With a bayes on SQL it is
> usually just fine to leave auto-expiry enabled.
I'll see what happens after my last operations, and it may be a good
idea to try the SQL backend afterwards.
>
> As long as the rest of your SA rules and network tests are good,
> it is not a big deal to start a new bayes database from scratch and
> leave it to auto-learning. For the first couple of hours it may be
> prudent to lower the scores of the BAYES_00 and BAYES_99 rules.
>
> Btw, if starting from scratch, it is also a good idea to set:
> bayes_auto_learn_on_error 1
> (introduced with SpamAssassin 3.3).
> See Mail::SpamAssassin::Plugin::AutoLearnThreshold man page
> for a description of this setting.
>
> Mark
I'll take some time to look into this as soon as I can. Thanks for the
advice :)