Penpals and Envelope From != Header From
Stef Simoens
stef+au at bgs.org
Sun Jul 1 14:25:17 CEST 2012
Hello Mark,
On 29 Jun 2012, at 19:31, Mark Martinec wrote:
>> I wrote a small patch that adds the check if the sender of the reply is the
>> recipient of the outgoing e-mail. I know that this might not be the case
>> in case of forwarding, but it's just to help the penpals algorithm a bit
>> more. Tested with my local set-up.
>> The SQL impact should be limited, as both sid and rid are indexed fields.
>>
>> --- /usr/sbin/amavisd.orig 2012-06-04 00:17:50.000000000 +0200
>> +++ /usr/sbin/amavisd 2012-06-12 21:43:58.380743445 +0200
>> @@ -1324,7 +1324,7 @@
>> 'sel_penpals_msgid' => # with a nonempty list of message-id references
>> "SELECT msgs.time_num, msgs.mail_id, subject, message_id, rid".
>> " FROM msgs JOIN msgrcpt USING (partition_tag,mail_id)".
>> - " WHERE sid=? AND msgs.content!='V' AND ds='P' AND message_id IN (%m)".
>> + " WHERE (sid=? OR rid=?) AND msgs.content!='V' AND ds='P' AND message_id IN (%m)".
>> " AND rid!=sid".
>> " ORDER BY rid=? DESC, msgs.time_num DESC", # LIMIT 1
>> );
>> @@ -21784,7 +21784,7 @@
>> if (defined($sel_penpals_msgid) && @$message_id_list && defined($sid)) {
>> # list of refs to Message-ID is nonempty, try reference or recipient match
>> my($n) = scalar(@$message_id_list); # number of keys
>> - my(@args) = ($sid,$rid); my(@pos_args); local($1);
>> + my(@args) = ($sid,$rid,$rid); my(@pos_args); local($1);
>> my($sel_taint) = substr($sel_penpals_msgid,0,0); # taintedness
>> $sel_penpals_msgid =~
>> s{ ( %m | \? ) } # substitute %m for keys and ? for next arg
>>
>> A better approach would probably be to have Amavis store the (calculated)
>> From/Reply-To like it stores the envelope-from and the recipients.
>> However, I'm not familiar enough with the code for that. I would be
>> willing to work on that, if this feature would be appreciated.
>
> The 'OR rid=?' seems too broad, it implies *anyone* from this site
> who have mailed to such recipient.
I understand your concerns to avoid this being an "open hole" for potential abuse.
However, there always needs to be a match with the References: (in the incoming e-mail) and the Message-Id: (from the outgoing e-mail).
I'm running the patch on my domain now (since the beginning of June) without any issues or abuse.
> More recent versions of amavisd-new do offer parsed
> addresses from certain mail header fields:
>
> sub rfc2822_from #author addresses list (rfc allows one or more), parsed 'From'
> sub rfc2822_sender # sender address (rfc allows none or one), parsed 'Sender'
> sub rfc2822_resent_from # resending author addresses list, parsed 'Resent-From'
> sub rfc2822_resent_sender # resending sender addresses, parsed 'Resent-Sender'
> sub rfc2822_to # parsed 'To' header field: a list of recipients
> sub rfc2822_cc # parsed 'Cc' header field: a list of Cc recipients
I'm using a recent version (2.7.1).
Consider the following use-case (I'm sure I'm not the only amavis-user using this technique).
*** Outgoing e-mail
Envelope-From / Return-Path: <bounce-handler at mydomain.com>
From: Mailing List <mailinglist at mydomain.com>
To: User <user at yourdomain.net>
Message-Id: <outgoing_message_12345 at mydomain.com>
Amavis will store the following data:
maddr
- id 1 / email bounce-handler at mydomain.com / domain com.mydomain
- id 2 / email user at yourdomain.net / domain net.yourdomain
msgs
- mail_id aaaaa / sid 1 / message_id <outgoing_message_12345 at mydomain.com>
msgrcpt
- mail_id aaaaa / rid 2
*** The user hits "Reply"
*** Incoming e-mail
Envelope-From / Return-Path: <user at yourdomain.net>
From: User <user at yourdomain.net>
To: Mailing List <mailinglist at mydomain.com>
Message-Id: <other_message_id_67890 at yourdomain.net>
References: <outgoing_message_12345 at mydomain.com>
Amavis will store the following data:
maddr
- id 3 / email mailinglist at mydomain.com
msgs
- mail_id bbbbb / sid 2 / message_id <other_message_id_67890 at yourdomain.net>
msgrcpt
- mail_id bbbbb / rid 3
For the penpals feature, amavis tries two queries. Neither returns any results for this use-case.
Query 1 (sel_penpals_msgid):
SELECT msgs.time_num, msgs.mail_id, subject, message_id, rid
FROM msgs JOIN msgrcpt USING (partition_tag,mail_id)
WHERE sid=3 AND msgs.content!='V' AND ds='P' AND message_id IN ('<outgoing_message_12345 at mydomain.com>') AND rid!=sid
ORDER BY rid=2 DESC, msgs.time_num DESC
==>> 0 results
Query 2 (sel_penpals):
SELECT msgs.time_num, msgs.mail_id, subject
FROM msgs JOIN msgrcpt USING (partition_tag,mail_id)
WHERE sid=3 AND rid=2 AND msgs.content!='V' AND ds='P'
ORDER BY msgs.time_num DESC
==>> 0 results
(BTW, why do you allow a penpals match without a Message-Id match?)
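The two lookups above can be reproduced with a minimal sketch in Python, using an in-memory SQLite database as a stand-in for the amavis SQL log. The schema is reduced to the columns the queries touch and is an assumption, as are the sample values for time_num, content and subject; the ids follow the example (1 = bounce-handler, 2 = user, 3 = mailinglist). %m is replaced by a single placeholder because there is only one reference.

```python
import sqlite3

# Reduced, hypothetical schema: only the columns the two penpals queries touch.
SCHEMA = """
CREATE TABLE msgs    (partition_tag INTEGER, mail_id TEXT, sid INTEGER,
                      time_num INTEGER, content TEXT, subject TEXT,
                      message_id TEXT);
CREATE TABLE msgrcpt (partition_tag INTEGER, mail_id TEXT, rid INTEGER, ds TEXT);
"""

OUT_MSGID = "<outgoing_message_12345@mydomain.com>"

# Stock queries as quoted in the mail; %m became one "?" (single reference).
SEL_PENPALS_MSGID = (
    "SELECT msgs.time_num, msgs.mail_id, subject, message_id, rid"
    " FROM msgs JOIN msgrcpt USING (partition_tag,mail_id)"
    " WHERE sid=? AND msgs.content!='V' AND ds='P' AND message_id IN (?)"
    " AND rid!=sid"
    " ORDER BY rid=? DESC, msgs.time_num DESC"
)
SEL_PENPALS = (
    "SELECT msgs.time_num, msgs.mail_id, subject"
    " FROM msgs JOIN msgrcpt USING (partition_tag,mail_id)"
    " WHERE sid=? AND rid=? AND msgs.content!='V' AND ds='P'"
    " ORDER BY msgs.time_num DESC"
)


def outgoing_db(envelope_sid):
    """Log one outgoing mail to the user (id 2) with the given envelope sender id."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    # time_num, content 'C' and subject are sample values, not taken from the mail.
    db.execute(
        "INSERT INTO msgs VALUES (0,'aaaaa',?,1341000000,'C','list post',?)",
        (envelope_sid, OUT_MSGID),
    )
    db.execute("INSERT INTO msgrcpt VALUES (0,'aaaaa',2,'P')")
    return db


def stock_lookup(db, sid, rid, reference):
    """Run both stock queries for a reply from rid to sid; return both row lists."""
    by_msgid = db.execute(SEL_PENPALS_MSGID, (sid, reference, rid)).fetchall()
    by_pair = db.execute(SEL_PENPALS, (sid, rid)).fetchall()
    return by_msgid, by_pair


# Reply from user (2) to mailinglist (3), referencing the outgoing Message-Id.
bounce_case = stock_lookup(outgoing_db(1), 3, 2, OUT_MSGID)  # envelope = bounce-handler
direct_case = stock_lookup(outgoing_db(3), 3, 2, OUT_MSGID)  # envelope = From address
print(bounce_case)
print(direct_case)
```

With the bounce-handler as envelope sender both row lists come back empty; when the envelope sender equals the From address (direct correspondence), both queries find mail_id aaaaa.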
>> However, penpals doesn't work when the "Envelope From" differs from the
>> "From" or the "Reply-To" e-mail headers.
>
> Yes, except that if a reply references the original Message-ID,
> a match is still found. This covers most correspondence over
> mailing lists, as well as direct correspondence.
The example above shows that direct correspondence works well;
however, it doesn't work with mailing lists that do bounce handling (including this mailing list ;-))
I discovered this "the hard way": some replies to mailing list messages were sent to the Spam folder, where Penpals would/should have saved the message...
>
>> Currently, Amavis only stores the "Envelope From" (as msgs.sid)
>> and the recipients of a message (as msgrcpt.rid).
>> The "Header From"/"Header Reply-To" is not stored.
>
> Right. Adding an author address (From) would require extending
> the SQL schema with another field.
>
> I'm not sure how frequent are cases where matching a From would help
> while matching a referenced Message-ID failed.
The current lookup requires the reply to match both the Message-Id and the original (envelope) sender.
I'm afraid that each sent-out message where From/Reply-To != Envelope-From/Return-Path would fail…
My quick patch matches either the original (envelope) sender and the Message-Id, or the original recipient (To) and the Message-Id.
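As a rough illustration of the patched condition, here is a minimal Python sketch against an in-memory SQLite database. The reduced schema and the sample values for time_num, content and subject are assumptions; the ids follow the example above (1 = bounce-handler, 2 = user, 3 = mailinglist), and %m is replaced by a single placeholder. It also shows that a reply carrying an unrelated reference still finds nothing, because the Message-Id condition stays in place.

```python
import sqlite3

# Reduced, hypothetical schema: only the columns the query touches.
SCHEMA = """
CREATE TABLE msgs    (partition_tag INTEGER, mail_id TEXT, sid INTEGER,
                      time_num INTEGER, content TEXT, subject TEXT,
                      message_id TEXT);
CREATE TABLE msgrcpt (partition_tag INTEGER, mail_id TEXT, rid INTEGER, ds TEXT);
"""

OUT_MSGID = "<outgoing_message_12345@mydomain.com>"

# sel_penpals_msgid with the patch applied; %m became one "?" (single reference).
PATCHED_SEL_PENPALS_MSGID = (
    "SELECT msgs.time_num, msgs.mail_id, subject, message_id, rid"
    " FROM msgs JOIN msgrcpt USING (partition_tag,mail_id)"
    " WHERE (sid=? OR rid=?) AND msgs.content!='V' AND ds='P'"
    " AND message_id IN (?)"
    " AND rid!=sid"
    " ORDER BY rid=? DESC, msgs.time_num DESC"
)


def example_db():
    """Log the outgoing list mail: envelope sender 1, recipient 2."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    # time_num, content 'C' and subject are sample values, not taken from the mail.
    db.execute(
        "INSERT INTO msgs VALUES (0,'aaaaa',1,1341000000,'C','list post',?)",
        (OUT_MSGID,),
    )
    db.execute("INSERT INTO msgrcpt VALUES (0,'aaaaa',2,'P')")
    return db


def patched_lookup(db, sid, rid, reference):
    """Patched lookup for a reply from rid to sid; args mirror ($sid,$rid,$rid)."""
    return db.execute(
        PATCHED_SEL_PENPALS_MSGID, (sid, rid, reference, rid)
    ).fetchall()


db = example_db()
# Reply from user (2) to mailinglist (3) with the right References header.
hit = patched_lookup(db, 3, 2, OUT_MSGID)
# Same sender and recipient, but a reference that was never sent out.
miss = patched_lookup(db, 3, 2, "<unrelated@elsewhere.example>")
print(hit)
print(miss)
```

Here hit contains the row for mail_id aaaaa (matched through rid=2 plus the Message-Id), while miss is empty.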
Ideally:
- amavis could store the "intended reply destination": the e-mail address found in the Reply-To header or, in its absence, the one found in the From header. This could go in e.g. a new "did" field in msgs, with the address stored/looked up in maddr.
- for the penpals feature, the best match would then be the one where From == msgrcpt.rid, To == msgs.did and References IN (msgs.message_id)
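The two points above could look roughly like the following Python sketch. Everything in it is hypothetical: the "did" column does not exist in the amavis schema, the reduced tables and sample values are assumptions, and the helper names (reply_destination, addr_id, log_outgoing, find_penpal) are made up for illustration. For brevity it handles a single reference in References.

```python
import sqlite3
from email import message_from_string
from email.utils import parseaddr

# Hypothetical schema: "did" is the proposed reply-destination column in msgs.
SCHEMA = """
CREATE TABLE maddr   (partition_tag INTEGER, id INTEGER PRIMARY KEY, email TEXT UNIQUE);
CREATE TABLE msgs    (partition_tag INTEGER, mail_id TEXT, sid INTEGER, did INTEGER,
                      time_num INTEGER, content TEXT, subject TEXT, message_id TEXT);
CREATE TABLE msgrcpt (partition_tag INTEGER, mail_id TEXT, rid INTEGER, ds TEXT);
"""

# Proposed match: From == msgrcpt.rid, To == msgs.did, References IN (message_id).
SEL_PENPALS_DID = (
    "SELECT msgs.time_num, msgs.mail_id, subject, message_id"
    " FROM msgs JOIN msgrcpt USING (partition_tag,mail_id)"
    " WHERE rid=? AND did=? AND msgs.content!='V' AND ds='P'"
    " AND message_id IN (?)"
    " ORDER BY msgs.time_num DESC"
)


def reply_destination(msg):
    """Address a reply would go to: Reply-To if present, else From."""
    return parseaddr(msg.get("Reply-To") or msg.get("From") or "")[1].lower()


def addr_id(db, email):
    """Look up an address in maddr, storing it first if it is new."""
    row = db.execute("SELECT id FROM maddr WHERE email=?", (email,)).fetchone()
    if row:
        return row[0]
    cur = db.execute("INSERT INTO maddr (partition_tag, email) VALUES (0, ?)", (email,))
    return cur.lastrowid


def log_outgoing(db, mail_id, envelope_from, rcpt, raw):
    """Store an outgoing mail, including the computed reply destination."""
    msg = message_from_string(raw)
    # time_num and content 'C' are sample values.
    db.execute(
        "INSERT INTO msgs VALUES (0,?,?,?,1341000000,'C',?,?)",
        (mail_id, addr_id(db, envelope_from), addr_id(db, reply_destination(msg)),
         msg.get("Subject", ""), msg.get("Message-Id")),
    )
    db.execute("INSERT INTO msgrcpt VALUES (0,?,?,'P')", (mail_id, addr_id(db, rcpt)))


def find_penpal(db, raw_reply):
    """Match an incoming reply against logged outgoing mail via rid/did/message_id."""
    msg = message_from_string(raw_reply)
    sender = parseaddr(msg.get("From", ""))[1].lower()
    target = parseaddr(msg.get("To", ""))[1].lower()
    ref = (msg.get("References") or "").strip()  # single reference only
    return db.execute(
        SEL_PENPALS_DID, (addr_id(db, sender), addr_id(db, target), ref)
    ).fetchall()


OUTGOING = (
    "From: Mailing List <mailinglist@mydomain.com>\n"
    "To: User <user@yourdomain.net>\n"
    "Message-Id: <outgoing_message_12345@mydomain.com>\n\n"
)
REPLY = (
    "From: User <user@yourdomain.net>\n"
    "To: Mailing List <mailinglist@mydomain.com>\n"
    "Message-Id: <other_message_id_67890@yourdomain.net>\n"
    "References: <outgoing_message_12345@mydomain.com>\n\n"
)

db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
log_outgoing(db, "aaaaa", "bounce-handler@mydomain.com", "user@yourdomain.net", OUTGOING)
rows = find_penpal(db, REPLY)
print(rows)
```

With this layout the reply is matched even though the envelope sender of the outgoing mail was the bounce handler, since the lookup no longer depends on msgs.sid.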
Regards
Stef