clearing just ham data

Fri Jun 30 09:30:30 CEST 2017

That's because I don't keep the learnt messages; they get deleted once learnt.
Running sa-learn --backup gives me a backup file containing plain-text data.
The first group of rows starts with a 't'; the second group starts with an 's' and has a second column containing 's' or 'h'.
Do you think I can remove all the 'h' entries from the file and restore from it?
What are the 't' rows?
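A minimal sketch of such a filter, assuming the dump layout that sa-learn --backup writes in SpamAssassin 3.x: tab-separated rows, where 'v' rows hold database variables such as num_spam and num_nonspam, 't' rows hold one token each (spam count, ham count, last-access time, token), and 's' rows list already-learnt message IDs flagged 's' (spam) or 'h' (ham). If that layout holds, deleting only the 'h' rows would not be enough, because the ham counts live in the 't' rows and in num_nonspam. The function name strip_ham is hypothetical, and the layout should be checked against a real dump first.

```python
def strip_ham(lines):
    """Return backup rows with ham counts zeroed and ham 'seen' rows dropped.

    Assumed tab-separated row layout (an assumption, check your own dump):
      v <value> <name>                            database variable
      t <spam_count> <ham_count> <atime> <token>  one Bayes token
      s <s|h> <message-id>                        learnt message, spam or ham
    """
    out = []
    for line in lines:
        # Split at most 4 times so the token column is kept intact.
        fields = line.rstrip("\n").split("\t", 4)
        kind = fields[0]
        if kind == "v" and len(fields) >= 3 and fields[2] == "num_nonspam":
            fields[1] = "0"          # no ham messages learnt any more
        elif kind == "t" and len(fields) == 5:
            if int(fields[1]) == 0:
                continue             # token was only ever seen in ham
            fields[2] = "0"          # keep the spam count, forget the ham count
        elif kind == "s" and len(fields) >= 3 and fields[1] == "h":
            continue                 # drop "already learnt as ham" markers
        out.append("\t".join(fields))
    return out
```

The filtered rows could then be written to a new file and loaded into a test copy of the database with sa-learn --restore. Note that with num_nonspam at 0, SpamAssassin will not use Bayes at all until enough ham has been learnt again (bayes_min_ham_num, 200 by default).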
From: Dino Edwards
To: amavis-users at amavis.org
Date: 29 June 2017 19:02:56 CEST
Subject: RE: RE: RE: RE: RE: different spamassassin behaviours
I don’t know of a way of clearing just the ham, unless someone else knows of one. I have always just cleared the whole database and started feeding it ham and spam again.
From: Gabriele Bulfon [mailto:gabriele.bulfon at sonicle.com]
Sent: Thursday, June 29, 2017 9:22 AM
To: Dino Edwards; amavis-users at amavis.org
Subject: [SUSPECTED SPAM]RE: RE: RE: RE: different spamassassin behaviours
Great Dino! ;) I think you got to the point :)
My training is almost exclusively spam; we never feed it ham to learn.
Do you think I can clear just the ham database? Is there any way?
I would like to retain the spam learnt.
----------------------------------------------------------------------------------------
Sonicle S.r.l.: http://www.sonicle.com
Music: http://www.gabrielebulfon.com
Quantum Mechanics: http://www.cdbaby.com/cd/gabrielebulfon
From: Dino Edwards <dino.edwards at mydirectmail.net>
To: amavis-users at amavis.org
Date: 29 June 2017 14:14:57 CEST
Subject: RE: RE: RE: RE: different spamassassin behaviours
As you can see, this particular message has auto-trained the bayes database as ham (autolearn=ham) when in fact it is spam.
Here’s what I recommend:
1. Turn off autolearn
2. Clear your bayes database and start over fresh
3. Train your bayes database with legitimate ham and spam
It looks to me like your major issue is that your bayes database is completely jacked, and the problem gets worse as each incoming message incorrectly trains it further.
The levels really depend on your setup; there are no magic levels per se. Train your database properly and you will be able to adjust them to a level that’s good for your environment.
From: Gabriele Bulfon [mailto:gabriele.bulfon at sonicle.com]
Sent: Thursday, June 29, 2017 7:58 AM
To: Dino Edwards <dino.edwards at mydirectmail.net>; amavis-users at amavis.org
Subject: RE: RE: RE: different spamassassin behaviours
I will check your suggestions (sa_tag_level_deflt, bayes_auto_learn).
BTW, what levels would you suggest instead of my 5.0, 10.0, 10?
Meanwhile, look at the email I attached: even when run manually, spamassassin does not detect anything, although it is evidently spam...
Here is the manual result:
X-Spam-Status: No, score=-0.5 required=5.0 tests=BAYES_20,HTML_MESSAGE,
    RCVD_IN_DNSWL_NONE,RP_MATCHES_RCVD,TVD_RCVD_SPACE_BRACKET,UNPARSEABLE_RELAY
    autolearn=ham autolearn_force=no version=3.4.1
From: Dino Edwards <dino.edwards at mydirectmail.net>
To: amavis-users at amavis.org
Date: 28 June 2017 20:12:17 CEST
Subject: RE: RE: RE: different spamassassin behaviours
Right off the bat, the following should be set like below:
$sa_tag_level_deflt = undef;
Setting it this way adds spam info headers to all email, not just messages that score 2 or above. After you change this setting, send a test email through normally and paste the headers generated, then run the same email manually and paste those headers too. I would love to see the difference between the two, to see what drives up the scores and what doesn’t.
The following settings seem really high to me:
$sa_tag2_level_deflt = 5.0;  # add 'spam detected' headers at that level
$sa_kill_level_deflt = 10.0;  # triggers spam evasive actions (e.g. blocks mail)
$sa_dsn_cutoff_level = 10;   # spam level beyond which a DSN is not sent
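As a rough illustration of how these thresholds relate, here is a simplified sketch in Python. It models only the comments in amavisd.conf, not amavisd's actual code; the function name amavis_actions is hypothetical, and what the "evasive action" really is depends on other settings such as $final_spam_destiny.

```python
def amavis_actions(score, tag_level=None, tag2_level=5.0, kill_level=10.0):
    """List which threshold-driven steps a given spam score would trigger.

    tag_level=None stands in for $sa_tag_level_deflt = undef,
    i.e. spam info headers are added regardless of score.
    """
    actions = []
    if tag_level is None or score >= tag_level:
        actions.append("add spam info headers")
    if score >= tag2_level:
        actions.append("add 'spam detected' headers")
    if score >= kill_level:
        actions.append("spam evasive action")
    return actions


# With the values quoted above, a message scoring -0.5 only gets info headers,
# while one scoring 18.1 crosses every threshold.
low = amavis_actions(-0.5)
high = amavis_actions(18.1)
```

Under this model, a message scoring between 5.0 and 10.0 is marked as spam but still delivered, which is why the gap between the tag2 and kill levels matters.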
You may want to add these settings:
$sa_spam_modifies_subj = 1;
$sa_spam_subject_tag = '[SPAM]';
Also I suggest you completely turn off auto learn in your SA until you get this figured out. As a matter of fact, I never turn it on because it has caused MAJOR issues for me in the past where the bayes database gets really screwy and things don’t get tagged correctly. Up to you.
#bayes
bayes_path /path/to/your/bayes/database
bayes_file_mode 0777
use_bayes 1
use_bayes_rules 1
bayes_auto_learn 0
----------------
Dino Edwards
----------------
Hermes Secure Email Gateway
Hermes Secure Email Gateway is a Free Open Source (Hermes SEG Community Only) Email Gateway that provides Spam, Virus and Malware protection, full in-transit and at-rest email encryption as well as email archiving. Hermes Secure Email Gateway combines Open Source technologies such as Postfix, Apache SpamAssassin, ClamAV, Amavisd-new and CipherMail under one unified web-based GUI for easy administration and management of your organization's incoming and outgoing email. It can be deployed to protect your in-house email solution as well as cloud email solutions such as Google Mail and Microsoft Office 365.
Learn more and download the free open-source appliance at:
https://www.deeztek.com/hermes-secure-email-gateway/
From: Gabriele Bulfon [mailto:gabriele.bulfon at sonicle.com]
Sent: Tuesday, June 27, 2017 10:24 AM
To: Dino Edwards <dino.edwards at mydirectmail.net>; amavis-users at amavis.org
Subject: RE: RE: different spamassassin behaviours
Here it is, thanks!
Gabriele
From: Dino Edwards <dino.edwards at mydirectmail.net>
To: Gabriele Bulfon <gabriele.bulfon at sonicle.com>; amavis-users at amavis.org
Date: 27 June 2017 15:37:22 CEST
Subject: RE: RE: different spamassassin behaviours
The X-Spam-Status header should always be present, spam or not. So are you saying that the X-Spam-Status headers are missing when email goes through normally, or when it is run manually?
Can you paste your amavis config here?
From: Gabriele Bulfon [mailto:gabriele.bulfon at sonicle.com]
Sent: Tuesday, June 27, 2017 9:03 AM
To: Dino Edwards <dino.edwards at mydirectmail.net>; amavis-users at amavis.org
Subject: [SUSPECTED SPAM]RE: different spamassassin behaviours
In those cases the X-Spam-Status headers are not present, because the score is too low and the message is considered non-spam.
Is there any way I can force the injection of the X-Spam-Status header even for low scores? This may help.
I meant that all the cf files (the rules files) are taken from the same place by spamassassin, both manually and automatically during postfix injection, as I can tell from the spam that does get caught.
And finally, yes, I can find the log lines you mention, where the mail (which manually scores 18.0+) passes as "CLEAN" in amavis and goes back into postfix.
I attach an example email, and here is the corresponding log entry from when it came in:
Jun 27 14:30:15 cloudserver amavis[28190]: [ID 702911 mail.notice] (28190-16) Passed CLEAN, [107.175.149.43] [107.175.149.43] VIVINT.Premier-Provider at tmess.us - davide.dicosola at eurovetrocap.com, Message-ID: 037996f410ef6dcfefa9bbb8b98e2681.3964721.19453093 at tmess.us_ys9, mail_id: tW7q84X98Ieq, Hits: -0.347, size: 5698, queued_as: 9C78D27B16D, 1781 ms
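To pull the verdict and score out of entries like this one in bulk, here is a small sketch. It assumes the wrapped entry has been joined back into a single line and that the fields appear as "Passed CLEAN", "mail_id: ..." and "Hits: ..." in that order, as in the entry above. The function name parse_amavis_line is hypothetical.

```python
import re

# Matches the action/verdict pair, then the mail_id and Hits fields that follow.
LINE_RE = re.compile(
    r"\b(?P<action>Passed|Blocked) (?P<verdict>[A-Z][A-Z0-9-]*)"
    r".*?mail_id: (?P<mail_id>[^,\s]+)"
    r".*?Hits: (?P<hits>[^,\s]+)"
)


def parse_amavis_line(line):
    """Return action, verdict, mail_id and hits from one amavis log line.

    Returns None if the line does not look like a Passed/Blocked entry.
    A non-numeric Hits value is reported as None.
    """
    m = LINE_RE.search(line)
    if m is None:
        return None
    try:
        hits = float(m.group("hits"))
    except ValueError:
        hits = None
    return {
        "action": m.group("action"),
        "verdict": m.group("verdict"),
        "mail_id": m.group("mail_id"),
        "hits": hits,
    }
```

Filtering the parsed entries for a "CLEAN" verdict with a low score gives a list of mail_id values to re-run manually and compare.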
----------------------------------------------------------------------------------
From: Dino Edwards <dino.edwards at mydirectmail.net>
To: amavis-users at amavis.org
Date: 27 June 2017 13:59:37 CEST
Subject: RE: different spamassassin behaviours
Can you provide the x-spam-status headers for the same email when run through Postfix normally and then manually so we can see the differences?
Also, I'm a little confused, what do you mean when you say " All the files are taken from /sonicle/etc/mail/spamassassin and /sonicle/share/spamassassin"?
Also, in your mail log, do you see lines similar to the ones below? The first one is Amavis passing the message as CLEAN and then re-injecting it back to Postfix on port 10025 for delivery. Your port config may vary.
Jun 27 07:55:32 smtp amavis[22662]: (22662-15) Passed CLEAN [198.241.162.22]:12141 [198.241.162.22] noreply at visaprepaidprocessing.com - , Queue-ID: D19FC40B0A, Message-ID: d5360d$6ue5tu at cportal1.visa.com, mail_id: X1sVYvfQoUFh, Hits: -0.877, size: 2490, queued_as: 250 2.6.0 Message received, dkim_sd=cportal:visaprepaidprocessing.com, 1280 ms
Jun 27 07:55:32 smtp postfix/smtp[22949]: D19FC40B0A: to=someone at domain.tld, relay=127.0.0.1[127.0.0.1]:10021, delay=2.6, delays=1.3/0/0/1.3, dsn=2.6.0, status=sent (250 2.6.0 from MTA(smtp:[127.0.0.1]:10025): 250 2.6.0 Message received)
From: Gabriele Bulfon [mailto:gbulfon at sonicle.com]
Sent: Tuesday, June 27, 2017 2:35 AM
To: Dino Edwards <dino.edwards at mydirectmail.net>; amavis-users at amavis.org
Subject: RE: different spamassassin behaviours
Hi, thanks for your response.
There are a lot of things raising the score when run manually:
X-Spam-Status: Yes, score=18.1 required=5.0 tests=BAYES_50,CUSTOM_MANY_BL,
    HTML_FONT_LOW_CONTRAST,HTML_MESSAGE,MIME_HTML_ONLY,RCVD_IN_DNSBL_INPS_DE,
    RCVD_IN_HOSTKARMA_BL,RCVD_IN_MSPIKE_H2,RCVD_IN_UCEPROTECT2,
    RCVD_IN_UCEPROTECT3,RCVD_IN_WPBL,SPF_HELO_PASS,TVD_RCVD_SPACE_BRACKET,
    T_REMOTE_IMAGE,UNPARSEABLE_RELAY,URIBL_ABUSE_SURBL,URIBL_DBL_SPAM
    autolearn=spam autolearn_force=no version=3.4.1
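To compare the header from a manual run with the one from an amavis run test by test, here is a small sketch. It relies only on the "score=", "tests=" and "autolearn=" fields visible in the header above; the function names parse_spam_status and diff_tests are hypothetical.

```python
import re


def parse_spam_status(header):
    """Extract score, test names and autolearn verdict from X-Spam-Status."""
    # Unfold continuation lines, then rejoin test names split at a fold.
    text = re.sub(r",\s+", ",", " ".join(header.split()))
    score = re.search(r"score=(-?\d+(?:\.\d+)?)", text)
    tests = re.search(r"tests=(\S*)", text)
    autolearn = re.search(r"\bautolearn=(\w+)", text)
    return {
        "score": float(score.group(1)) if score else None,
        "tests": (set(tests.group(1).split(",")) - {""}) if tests else set(),
        "autolearn": autolearn.group(1) if autolearn else None,
    }


def diff_tests(manual_header, amavis_header):
    """Return (tests only in the manual run, tests only in the amavis run)."""
    manual = parse_spam_status(manual_header)["tests"]
    amavis = parse_spam_status(amavis_header)["tests"]
    return sorted(manual - amavis), sorted(amavis - manual)
```

If the network tests (the RCVD_IN_* and URIBL_* names) show up only on the manual side, that would point at DNS lookups behaving differently under amavisd rather than at Bayes.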
All the files are taken from /sonicle/etc/mail/spamassassin and /sonicle/share/spamassassin, and they appear to be read both manually and during the postfix run, as many of the mails are caught and contain X-Spam-Status headers with tests taken from there (sare cf files, kam file, fili_br file etc).
Also, many of the auto-learnt mails get tagged as spam after being trained.
Bayes is configured as:
use_bayes 1
bayes_auto_learn 1
bayes_path /sonicle/var/spamassassin/bayes_db/bayes
bayes_file_mode 0777
and here are the files:
sonicle at www:~$ ls -l /sonicle/var/spamassassin/bayes_db
total 12699
-rw-rw-rw- 1 snclamav snclamav 25680 Jun 27 08:28 bayes_journal
-rw-rw-rw- 1 snclamav snclamav 10567680 Jun 27 07:58 bayes_seen
-rw-rw-rw- 1 snclamav snclamav 5128192 Jun 27 07:58 bayes_toks
here are the amavis processes:
sonicle at www:~$ ps -ef | grep amavisd
snclamav 23517 20393 0 07:43:58 ? 0:04 /sonicle/bin/perl -T /sonicle/sbin/amavisd -u snclamav -c /sonicle/etc/amavis/a...
snclamav 20393 6278 0 May 12 ? 0:49 /sonicle/bin/perl -T /sonicle/sbin/amavisd -u snclamav -c /sonicle/etc/amavis/a...
snclamav 29614 20393 0 08:28:49 ? 0:00 /sonicle/bin/perl -T /sonicle/sbin/amavisd -u snclamav -c /sonicle/etc/amavis/a...
Is there any way I can run amavisd manually exactly as postfix would do during an incoming email?
I bet I need debugging output, but enabling it live may fill my mail logs, and I would have to wait for some spam to get in.
Thanks again,
Gabriele
________________________________________
From: Dino Edwards <dino.edwards at mydirectmail.net>
To: amavis-users at amavis.org
Date: 26 June 2017 19:08:11 CEST
Subject: RE: different spamassassin behaviours
Do you know for a fact that the bayes database is making those scores get higher when you run it in debug? If so, where is your bayes database stored and who is the owner of that path? Do you know for a fact that Amavis calls Spamassassin to scan emails?
From: amavis-users [mailto:amavis-users-bounces+dino.edwards=mydirectmail.net at amavis.org] On Behalf Of Gabriele Bulfon
Sent: Monday, June 26, 2017 11:57 AM
To: amavis-users at amavis.org
Subject: different spamassassin behaviours
Hi,
I have several installations of amavis+postfix, where I discovered that some spam is coming in with a very low score, but if I run spamassassin in debug mode on the same emails they get a very high score.
On my installations, amavisd runs under the "snclamav" user, while the smtp-amavis postfix daemons run under the "snclmail" user.
I run the bayes learning as the snclamav user, and also run spamassassin in debug mode as the same user, which stores the bayes database in a specific path.
Any idea what might happen when amavisd invokes spamassassin that does not happen in manual debug mode?
Thanks for any help
Gabriele