Email classification in Perl

André Rodier andre at rodier.me
Tue Sep 16 06:44:09 CEST 2014


Hi everybody,

There is one feature I have missed since I stopped using Gmail: the
automatic recognition of 'Important' messages. I am currently trying to
replicate it with a custom amavis Perl script and a virtual folder in
Dovecot. So far, the results are encouraging.

I can even create rules in Thunderbird to automatically tag messages as
'Important'.

So far, the script tries to classify emails into 4 categories:
- Internal: an email sent within the company.
- List: an official mailing list message, i.e. one containing a valid
'List-Id' header, etc.
- Bulk: a mailing list email without that header.
- Private: a private email sent by an individual.
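
To make the four categories concrete, here is a minimal sketch of the
decision order. It is written in Python only for compactness and is not
the emclass code, which is a Perl amavis hook. The domain list, the
'Precedence' / 'List-Unsubscribe' test for Bulk, and the "sender and all
recipients share a company domain" test for Internal are my assumptions
for illustration.

```python
from email import message_from_string
from email.utils import getaddresses, parseaddr

# Hypothetical company domains; not taken from emclass.
INTERNAL_DOMAINS = {"example.com"}


def domain_of(address):
    """Return the lower-cased domain part of an address, or '' if absent."""
    return address.rpartition("@")[2].lower() if "@" in address else ""


def classify(raw_message):
    """Return 'Internal', 'List', 'Bulk' or 'Private' for a raw message."""
    msg = message_from_string(raw_message)

    # List: the message carries a List-Id header.
    if msg.get("List-Id"):
        return "List"

    # Bulk (assumed heuristic): list-like markers but no List-Id.
    precedence = (msg.get("Precedence") or "").strip().lower()
    if precedence in {"bulk", "list", "junk"} or msg.get("List-Unsubscribe"):
        return "Bulk"

    # Internal (assumed heuristic): sender and every To/Cc recipient
    # belong to one of the company domains.
    sender = parseaddr(msg.get("From", ""))[1]
    recipients = getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))
    domains = {domain_of(sender)} | {domain_of(addr) for _, addr in recipients}
    if domains <= INTERNAL_DOMAINS:
        return "Internal"

    # Everything else is treated as a private email.
    return "Private"


def is_important(raw_message):
    """'Important' is the union of the Internal and Private categories."""
    return classify(raw_message) in {"Internal", "Private"}
```

The order matters in this sketch: the list and bulk checks run first, so
a company-internal mailing list would still be classified as List rather
than Internal.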

My Dovecot filter then collects all 'Internal' and 'Private' emails into
a virtual folder called 'Important'.

Please understand that this feature adds value on top of spam
filtering. It is very useful for people who are subscribed to many
mailing lists with their main email address. They often want important
emails recognised automatically without losing access to the mailing
lists.

Since I could not find appropriate documentation about Amavis hooks,
especially an up-to-date reference for $msginfo, the Perl script is
probably not optimal, and this is where I need your help. However, it is
doing the job so far, and my users are thrilled. The code is simple,
readable and efficient enough to have had no impact on performance (so
far).

Notes and Todo:
1 - The bulk email recognition is probably where there is the most room
for improvement.
2 - I need to stop analysing emails marked as Spam.
3 - Internal email recognition may not work in some cases; I need your
help here.
4 - A nice-to-have feature would be per-user rules, using a simple
database of important email senders.

Feature 4 is essentially the missing piece needed to offer the same
features as Gmail.
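
One possible shape for the per-user rules in item 4 is a single table
keyed by recipient and sender. The sketch below is in Python with SQLite
purely as an illustration; nothing like it exists in emclass yet, and
the table and function names are hypothetical.

```python
import sqlite3


def open_rules(path=":memory:"):
    """Open (or create) the hypothetical per-user rules database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS important_senders ("
        " recipient TEXT NOT NULL,"
        " sender TEXT NOT NULL,"
        " PRIMARY KEY (recipient, sender))"
    )
    return conn


def add_important_sender(conn, recipient, sender):
    """Record that `recipient` considers mail from `sender` important."""
    conn.execute(
        "INSERT OR IGNORE INTO important_senders (recipient, sender)"
        " VALUES (?, ?)",
        (recipient.lower(), sender.lower()),
    )
    conn.commit()


def is_important_sender(conn, recipient, sender):
    """Return True if `sender` is on `recipient`'s important list."""
    row = conn.execute(
        "SELECT 1 FROM important_senders"
        " WHERE recipient = ? AND sender = ?",
        (recipient.lower(), sender.lower()),
    ).fetchone()
    return row is not None
```

Addresses are lower-cased on both insert and lookup so that matching is
case-insensitive; a lookup like this could override the category for
that one user, for example to treat a particular List sender as
Important.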

I am not a Perl hacker either, so the Perl experts here may have some
ideas.

If you want to help, or if you are interested, the code is here: 
https://github.com/arodier/emclass

Comments are welcome.
André

