Amavis process pinning CPU at 100% returning after 5 to 30 minutes

Arrigo Triulzi arrigo at alchemistowl.org
Tue Aug 30 15:15:42 CEST 2022


OK, so this is getting surreal.

I went to fetch the email from $TEMPFILE which had caused a stalled Perl process (i.e. process time > 5m) and, running it individually (on the parts) using spamassassin on the command line I get:

/var/amavisd/tmp/amavis-20220830T125008-24524-HdayDdgJ/parts

spamassassin -D --progress . > /tmp/out.txt

Aug 30 13:08:12.039 [10165] dbg: timing: total 1412 ms - parse: 28 (2.0%), extract_message_metadata: 167 (11.8%), get_uri_detail_list: 25 (1.8%), tests_pri_-1000: 106 (7.5%), tests_pri_-950: 1.74 (0.1%), tests_pri_-900: 1.82 (0.1%), tests_pri_-90: 1.55 (0.1%), tests_pri_0: 1068 (75.6%), check_spf: 38 (2.7%), poll_dns_idle: 0.07 (0.0%), check_dkim_signature: 15 (1.1%), check_dkim_adsp: 6 (0.4%), tests_pri_20: 1.80 (0.1%), tests_pri_30: 2.1 (0.2%), check_pyzor: 0.28 (0.0%), tests_pri_500: 8 (0.6%)

i.e. finished in 1412 ms.

Running it on the complete email.txt (which is not how amavis invokes SpamAssassin):

Aug 30 13:10:30.866 [73086] dbg: timing: total 7235 ms - init: 3338 (46.1%), parse: 28 (0.4%), extract_message_metadata: 266 (3.7%), get_uri_detail_list: 56 (0.8%), tests_pri_-1000: 106 (1.5%), compile_gen: 186 (2.6%), compile_eval: 39 (0.5%), tests_pri_-950: 5 (0.1%), tests_pri_-900: 6 (0.1%), tests_pri_-90: 5 (0.1%), tests_pri_0: 1419 (19.6%), check_spf: 63 (0.9%), dkim_load_modules: 40 (0.6%), check_dkim_signature: 14 (0.2%), check_dkim_adsp: 7 (0.1%), poll_dns_idle: 1876 (25.9%), tests_pri_20: 6 (0.1%), tests_pri_30: 6 (0.1%), check_pyzor: 0.37 (0.0%), tests_pri_500: 2024 (28.0%)


The only interesting extra debug output is:

Aug 30 13:10:30.871 [73086] dbg: check: tagrun - tag DKIMDOMAIN is still blocking action 3
Aug 30 13:10:30.871 [73086] dbg: check: tagrun - tag SENDERDOMAIN is still blocking action 0


The complete logs from the actual processing by amavis are:

Aug 30 12:50:08 mailgate-fra-1 amavis[24524]: (24524-01) LMTP [127.0.0.1]:10024 /var/amavisd/tmp/amavis-20220830T125008-24524-HdayDdgJ: <a at example.comm> -> <b at example.com> SIZE=569551 Received: from mailgate-fra-1.example.com ([127.0.0.1]) by localhost (mailgate-fra-1.example.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP for <b at example.com>; Tue, 30 Aug 2022 12:50:08 +0000 (UTC)
Aug 30 12:50:09 mailgate-fra-1 amavis[24524]: (24524-01) Checking: 2teaTkNr0qP8 MYNETS [192.168.10.15] <a at example.com> -> <b at example.com>
Aug 30 12:50:09 mailgate-fra-1 amavis[24524]: (24524-01) p005 1 Content-Type: multipart/mixed
Aug 30 12:50:09 mailgate-fra-1 amavis[24524]: (24524-01) p006 1/1 Content-Type: multipart/related
Aug 30 12:50:09 mailgate-fra-1 amavis[24524]: (24524-01) p007 1/1/1 Content-Type: multipart/alternative
Aug 30 12:50:09 mailgate-fra-1 amavis[24524]: (24524-01) p001 1/1/1/1 Content-Type: text/plain, QP, size: 8075, SHA1 digest: 431f0d85db39820a71f8e1f91fd0a8b48e5e8370
Aug 30 12:50:09 mailgate-fra-1 amavis[24524]: (24524-01) p002 1/1/1/2 Content-Type: text/html, QP, size: 28264, SHA1 digest: ed31c8daeb5ee2aa7679c52419ea8829f06c9759
Aug 30 12:50:09 mailgate-fra-1 amavis[24524]: (24524-01) p003 1/1/2 Content-Type: image/png, base64, size: 69506, SHA1 digest: aec78c99358bebbac17e5c42ffd373fbe760d905, name: image001.png
Aug 30 12:50:09 mailgate-fra-1 amavis[24524]: (24524-01) p004 1/2 Content-Type: application/pdf, base64, size: 313882, SHA1 digest: e59a7cf1564dda7234bd34515045abd74e0ff1c5, name: Blog Update.pdf
Aug 30 12:50:09 mailgate-fra-1 amavis[24524]: (24524-01) truncating a message passed to SA at 414138 bytes, orig 569545

[process gets killed after 5m]

Aug 30 12:54:48 mailgate-fra-1 amavis[24524]: (24524-01) (!)TempDir removal: tempdir is to be PRESERVED: /var/amavisd/tmp/amavis-20220830T125008-24524-HdayDdgJ

Any better debugging ideas?

Cheers,

Arrigo 


More information about the amavis-users mailing list