SpamAssassin custom rule to block PDF files containing URLs

Matus UHLAR - fantomas uhlar at fantomas.sk
Wed Jul 6 11:03:21 CEST 2022


>> >I need a rule that can detect urls inside a PDF file. Can you guide me?
>>
>> do you have SA 4.0 installed and plugin:
>> Mail::SpamAssassin::Plugin::ExtractText
>> enabled?
>>
>>     This is a production server running CentOS7 with spamassassin-3.4.0-6.

On 05.07.22 11:00, Indunil Jayasooriya wrote:
>    According to your statement,  spamassassin-3.4 will NOT be able to
>fulfill this job.

Hardly, since ExtractText is only in SA 4. SA 4 should be released soon AFAIK;
you may test it as I do. I was unable to find any spam with a PDF containing
a URL in my archive, but other rules do catch text contained in PDF attachments.

>    spamassassin current version is release 3.4.6 (Stable).
>    Am I right? Where can I find Spam assassin 4?
>
>    I have NOT installed Mail::SpamAssassin::Plugin::ExtractText.
>    I could NOT find any RPM for the above package.

It's in the SA 4 distribution. I have no idea where you can get a working SA 4
package for CentOS, though.


>> did you install pdftotext and configure SA to use it?
>>
>        I installed it with the below command.
>            yum install poppler-utils
>
>      How to configure spamassassin to use it?
>               Is just a file called rule.cf enough?

No, you must have SA 4 with ExtractText enabled, and configure it to use
pdftotext, e.g. according to the ExtractText docs:

        loadplugin Mail::SpamAssassin::Plugin::ExtractText

        ifplugin Mail::SpamAssassin::Plugin::ExtractText

          extracttext_external  pdftotext  /usr/bin/pdftotext -nopgbrk -layout -enc UTF-8 {} -
          extracttext_use       pdftotext  .pdf application/pdf

        endif
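
Before relying on SA rules, it can help to check what text pdftotext actually
produces for a given attachment, since that is all ExtractText will hand to
SA. The following is only a rough Python sketch under stated assumptions: it
runs the same pdftotext command line as in the config above and lists
anything that looks like a URL. The names find_urls and pdf_urls and the URL
pattern are made up for illustration; the pattern is deliberately simple and
is not what SA uses internally.

```python
import re
import subprocess
import sys

# Same command and flags as the extracttext_external line above.
PDFTOTEXT = ["/usr/bin/pdftotext", "-nopgbrk", "-layout", "-enc", "UTF-8"]

# Illustrative pattern only (an assumption): http/https followed by
# anything up to whitespace, quotes or angle brackets.
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def find_urls(text):
    """Return URLs found in text, in order of appearance, without duplicates."""
    urls = []
    for match in URL_RE.findall(text):
        url = match.rstrip(".,;:)")  # drop trailing sentence punctuation
        if url not in urls:
            urls.append(url)
    return urls


def pdf_urls(pdf_path):
    """Run pdftotext on pdf_path (output to stdout) and return the URLs in it."""
    result = subprocess.run(
        PDFTOTEXT + [pdf_path, "-"],
        capture_output=True,
        check=True,
    )
    return find_urls(result.stdout.decode("utf-8", errors="replace"))


if __name__ == "__main__" and len(sys.argv) > 1:
    for url in pdf_urls(sys.argv[1]):
        print(url)
```

Saved under a name of your choice and run with a PDF file as its only
argument, it prints one URL per line. If nothing is printed, the URL is
probably not present as visible text (for example, it may exist only as a
clickable link), and text-based rules are unlikely to see it either.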


>            If so, Could you please give me an example to catch a URL
> inside PDF attachment?
>
>     I googled it. But could not find a way for spamassassin to use it.

SA 4 is the way, but note that this is the amavis mailing list.
While amavis can use SA for spam scanning, it's not the best list for this question.
-- 
Matus UHLAR - fantomas, uhlar at fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Emacs is a complicated operating system without good text editor.