Increasing spam filtering with spamassassin
Nikolaos Milas
nmilas at noa.gr
Sat Aug 27 15:37:52 CEST 2016
On 27/8/2016 1:02 AM, Marc Pujol wrote:
> ...
> At this point I would ditch the entire database and start from scratch, disabling auto-learning first (put "bayes_auto_learn 0" in your config).
> ...
> You could also try to move/copy your /root/.spamassassin database over to the amavis location (check the permissions!).
> ...
Thank you all for your suggestions. Your remarks were all correct, and
I have begun to understand and correct things (I believe).
I started by doing the above, and here is the result on a sample message:
Aug 27 15:04:41 mailgw3 amavis[7963]: (07963-15) SA dbg: bayes:
tie-ing to DB file R/O /var/amavis/var/.spamassassin/bayes_toks
Aug 27 15:04:41 mailgw3 amavis[7963]: (07963-15) SA dbg: bayes:
tie-ing to DB file R/O /var/amavis/var/.spamassassin/bayes_seen
Aug 27 15:04:41 mailgw3 amavis[7963]: (07963-15) SA dbg: bayes:
found bayes db version 3
Aug 27 15:04:41 mailgw3 amavis[7963]: (07963-15) SA dbg: bayes: DB
journal sync: last sync: 0
Aug 27 15:04:41 mailgw3 amavis[7963]: (07963-15) SA dbg: bayes:
corpus size: nspam = 1680, nham = 0
Aug 27 15:04:41 mailgw3 amavis[7963]: (07963-15) SA dbg: bayes:
cannot use bayes on this message; none of the tokens were found in
the database
Aug 27 15:04:41 mailgw3 amavis[7963]: (07963-15) SA dbg: bayes: not
scoring message, returning undef
Aug 27 15:04:41 mailgw3 amavis[7963]: (07963-15) SA dbg: bayes: DB
expiry: tokens in DB: 154999, Expiry max size: 300000, Oldest atime:
1219096335, Newest atime: 1472206207, Last expire: 1471602636,
Current time: 1472299481
Aug 27 15:04:41 mailgw3 amavis[7963]: (07963-15) SA dbg: bayes: DB
journal sync: last sync: 0
Aug 27 15:04:41 mailgw3 amavis[7963]: (07963-15) SA dbg: bayes:
untie-ing
...
Aug 27 15:39:16 mailgw3 amavis[8866]: (08866-03) SA dbg: bayes:
tie-ing to DB file R/O /var/amavis/var/.spamassassin/bayes_toks
Aug 27 15:39:16 mailgw3 amavis[8866]: (08866-03) SA dbg: bayes:
tie-ing to DB file R/O /var/amavis/var/.spamassassin/bayes_seen
Aug 27 15:39:16 mailgw3 amavis[8866]: (08866-03) SA dbg: bayes:
found bayes db version 3
Aug 27 15:39:16 mailgw3 amavis[8866]: (08866-03) SA dbg: bayes: DB
journal sync: last sync: 0
Aug 27 15:39:16 mailgw3 amavis[8866]: (08866-03) SA dbg: bayes:
corpus size: nspam = 1680, nham = 0
Aug 27 15:39:16 mailgw3 amavis[8866]: (08866-03) SA dbg: bayes:
cannot use bayes on this message; none of the tokens were found in
the database
Aug 27 15:39:16 mailgw3 amavis[8866]: (08866-03) SA dbg: bayes: not
scoring message, returning undef
...
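The atime, "Last expire" and "Current time" values in the "DB expiry" line of the first excerpt are Unix epoch seconds. As a reading aid, here is a minimal Python sketch that decodes them to UTC; the helper name epoch_to_utc is made up for illustration, and the four numbers are copied from the log above.

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds: int) -> str:
    """Format Unix epoch seconds as a UTC timestamp string."""
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


# Values copied from the "DB expiry" debug line in the first excerpt.
expiry_fields = {
    "Oldest atime": 1219096335,
    "Newest atime": 1472206207,
    "Last expire": 1471602636,
    "Current time": 1472299481,
}

for name, value in expiry_fields.items():
    print(f"{name}: {epoch_to_utc(value)} UTC")
```

"Current time" decodes to 12:04:41 UTC on 27 Aug 2016, which agrees with the 15:04:41 syslog stamp if the server clock runs at UTC+3. The oldest token atime falls in August 2008, so the copied database carries tokens that are about eight years old.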
So now the corpus size is "nspam = 1680, nham = 0", and the usual
result at the moment is (as seen above) "cannot use bayes on this
message; none of the tokens were found in the database".
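To keep an eye on the corpus counts over time, something like the following Python sketch could pull them out of the amavis debug log. It assumes each syslog entry is on a single line (the excerpts above are wrapped by the mail client). The default thresholds of 200 correspond to SpamAssassin's documented defaults for bayes_min_spam_num and bayes_min_ham_num; whether this installation overrides them is not shown here. The function name corpus_status is hypothetical.

```python
import re

# Matches the "corpus size" debug message emitted by the Bayes plugin.
CORPUS_RE = re.compile(r"corpus size: nspam = (\d+), nham = (\d+)")


def corpus_status(log_line, min_spam=200, min_ham=200):
    """Return the learned spam/ham counts from one log line, or None.

    min_spam and min_ham are assumptions (SpamAssassin's documented
    defaults), not values read from the local configuration.
    """
    match = CORPUS_RE.search(log_line)
    if match is None:
        return None
    nspam, nham = int(match.group(1)), int(match.group(2))
    return {
        "nspam": nspam,
        "nham": nham,
        "spam_ok": nspam >= min_spam,
        "ham_ok": nham >= min_ham,
    }


# One entry from the excerpt above, joined back onto a single line.
line = (
    "Aug 27 15:39:16 mailgw3 amavis[8866]: (08866-03) SA dbg: bayes: "
    "corpus size: nspam = 1680, nham = 0"
)
print(corpus_status(line))
```

With nham = 0 the ham side is empty, so the database still has to be trained with known-good mail before Bayes can contribute meaningful scores.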
It seems things are more under control, since a lot of messages are no
longer automatically designated as "ham". I don't see spam detections
yet, but at least I no longer see messages wrongly classified (and
auto-learned!) as ham either!
I am waiting for user feedback and am trying to monitor spam filtering
behavior as closely as I can.
Any and all additional advice will be appreciated!
With regard to the comments on rpmforge repo obsolescence, you are
right, but I am afraid there is currently no easy way to switch to EPEL
packages, because, as far as I remember, the respective amavisd-new /
clamd / spamassassin EPEL packages do not use the same paths /
structure / setup, and I don't want to mess things up.
When we rebuild a new system as a successor, it will probably be using
CentOS 8 (probably in two years or so)... Then we will use EPEL for
sure! (We currently have a lot of CentOS 5 systems to rebuild using
CentOS 7 in the immediate future...)
Thanks again,
Nick