need help finding origin of timeout

Mark Martinec Mark.Martinec+amavis at ijs.si
Tue Jul 15 14:50:57 CEST 2014


Patrick,

> I'm using Amavisd-new for years with success on MX servers @work, but few days ago I've noticed few timeouts during email reception. I'm using Postfix 2.10.0 and amavisd-new-2.8.0_2 on FreeBSD, in before queue content filtering.
> The timeout occurs only for two different sender-recipient couple. I've activated amavisd-new debuging for both sender, to get some data, but I can't find the culprit. Any help would be greatly appreciated.
>
> (I'm subscribed to digest, feel free to Cc me)
>
> Postfix side of the error:
>
> Jul 15 13:20:26 host postfix/smtpd[60906]: connect from sender.domain.tld[xxx.yyy.130.2]
> Jul 15 13:20:26 host postfix/smtpd[60906]: NOQUEUE: client=sender.domain.tld[xxx.yyy.130.2]
> Jul 15 13:22:57 host postfix/smtpd[60906]: warning: timeout talking to proxy 127.0.0.1:10024
> Jul 15 13:22:57 host postfix/smtpd[60906]: proxy-reject: END-OF-MESSAGE: 451 4.3.0 Error: queue file write error; from=<sender at domain.tld> to=<user.1 at at-work.tld> proto=ESMTP helo=<sender.domain.tld>
> Jul 15 13:24:33 host postfix/smtpd[60906]: disconnect from sender.domain.tld[xxx.yyy.130.2]

Postfix has a timeout on a proxy content filter set to 2 minutes.

> Amavisd-new debug logs:
>
> Jul 15 13:20:26 host amavis[58256]: (58256-16)
>   get_deadline switch_to_my_time(rx SMTP RCPT) - deadline in 480.0 s, set to 336.000 s

Your amavisd timeout is set to 8 minutes.

Set the $child_timeout to a value less than the postfix timeout
(time in seconds),
e.g.  $child_timeout = 90;

or increase the postfix timeout (although the 120 seconds may
already be on a high side for some impatient SMTP clients,
so I wouldn't increase it).

> Jul 15 13:20:27 host amavis[58256]: (58256-16) CALLING SA check (0)
> Jul 15 13:22:57 host postfix/smtpd[60906]: warning: timeout talking to proxy 127.0.0.1:10024
> Jul 15 13:23:20 host amavis[58256]: (58256-16) DONE SA check (0)

Here you can see that postfix gave up after 2 minutes
and SpamAssassin took 3 minutes to complete.


> Jul 15 13:23:20 host amavis[58256]: (58256-16) TIMING-SA total 173319 ms- parse: 27 (0.0%), extract_message_metadata: 578 (0.3%), get_uri_detail_list: 834 (0.5%), tests_pri_-1000: 18 (0.0%), tests_pri_-950: 3 (0.0%), tests_pri_-900: 3 (0.0%), tests_pri_-400: 9418 (5.4%), check_bayes: 9393 (5.4%), tests_pri_0: 162515 (93.8%), check_dkim_adsp: 5 (0.0%), check_razor2: 517 (0.3%), check_pyzor: 187 (0.1%), tests_pri_500: 13 (0.0%), tests_pri_899: 260 (0.2%), check_crm114: 256 (0.1%), learn: 456 (0.3%), get_report: 1.33 (0.0%)

Bayes took 9 seconds which seems high.
Most of the time was taken by rules at priority 0.
It could indicate a busy host, or a message hitting
some worst case situation in a regexp of some rule.
Capturing such message and running it from am
command line to spamassassin -D  may tell more.

   Mark


More information about the amavis-users mailing list