Penpals and Envelope From != Header From

Stef Simoens stef+au at bgs.org
Sun Jul 1 14:25:17 CEST 2012


Hello Mark,

On 29 Jun 2012, at 19:31, Mark Martinec wrote:

>> I wrote a small patch that adds the check if the sender of the reply is the
>> recipient of the outgoing e-mail. I know that this might not be the case
>> in case of forwarding, but it's just to help the penpals algorithm a bit
>> more. Tested with my local set-up.
>> The SQL impact should be limited, as both sid and rid are indexed fields.
>> 
>> --- /usr/sbin/amavisd.orig	2012-06-04 00:17:50.000000000 +0200
>> +++ /usr/sbin/amavisd	2012-06-12 21:43:58.380743445 +0200
>> @@ -1324,7 +1324,7 @@
>>     'sel_penpals_msgid' =>  # with a nonempty list of message-id references
>>       "SELECT msgs.time_num, msgs.mail_id, subject, message_id, rid".
>>       " FROM msgs JOIN msgrcpt USING (partition_tag,mail_id)".
>> -      " WHERE sid=? AND msgs.content!='V' AND ds='P' AND message_id IN (%m)".
>> +      " WHERE (sid=? OR rid=?) AND msgs.content!='V' AND ds='P' AND message_id IN (%m)".
>>         " AND rid!=sid".
>>       " ORDER BY rid=? DESC, msgs.time_num DESC",  # LIMIT 1
>>   );
>> @@ -21784,7 +21784,7 @@
>>   if (defined($sel_penpals_msgid) && @$message_id_list && defined($sid)) {
>>     # list of refs to Message-ID is nonempty, try reference or recipient match
>>     my($n) = scalar(@$message_id_list);  # number of keys
>> -    my(@args) = ($sid,$rid);  my(@pos_args);  local($1);
>> +    my(@args) = ($sid,$rid,$rid);  my(@pos_args);  local($1);
>>     my($sel_taint) = substr($sel_penpals_msgid,0,0);   # taintedness
>>     $sel_penpals_msgid =~
>>            s{ ( %m | \? ) }  # substitute %m for keys and ? for next arg
>> 
>> A better solution would probably be to have Amavis store the (calculated)
>> From/Reply-To like it stores the envelope-from and the recipients.
>> However, I'm not familiar enough with the code for that. I would be
>> willing to work on that, if this feature would be appreciated.
> 
> The 'OR rid=?' seems too broad: it implies *anyone* from this site
> who has mailed to such a recipient.

I understand your concern about this becoming an "open hole" for potential abuse.
However, there always needs to be a match with the References: (in the incoming e-mail) and the Message-Id: (from the outgoing e-mail).

I'm running the patch on my domain now (since the beginning of June) without any issues or abuse.

> More recent versions of amavisd-new do offer parsed
> addresses from certain mail header fields:
> 
> sub rfc2822_from #author addresses list (rfc allows one or more), parsed 'From'
> sub rfc2822_sender  # sender address (rfc allows none or one), parsed 'Sender'
> sub rfc2822_resent_from # resending author addresses list, parsed 'Resent-From'
> sub rfc2822_resent_sender  # resending sender addresses, parsed 'Resent-Sender'
> sub rfc2822_to  # parsed 'To' header field: a list of recipients
> sub rfc2822_cc  # parsed 'Cc' header field: a list of Cc recipients

I'm using a recent version (2.7.1).

Consider the following use-case (I'm sure I'm not the only amavis-user using this technique).

*** Outgoing e-mail
Envelope-From / Return-Path: <bounce-handler at mydomain.com>
From: Mailing List <mailinglist at mydomain.com>
To: User <user at yourdomain.net>
Message-Id: <outgoing_message_12345 at mydomain.com>

Amavis will store the following data:
maddr
- id 1 / email bounce-handler at mydomain.com / domain com.mydomain
- id 2 / email user at yourdomain.net / domain net.yourdomain

msgs
- mail_id aaaaa / sid 1 / message_id <outgoing_message_12345 at mydomain.com>

msgrcpt
- mail_id aaaaa / rid 2

*** The user hits "Reply"

*** Incoming e-mail
Envelope-From / Return-Path: <user at yourdomain.net>
From: User <user at yourdomain.net>
To:  Mailing List <mailinglist at mydomain.com>
Message-Id: <other_message_id_67890 at yourdomain.net>
References: <outgoing_message_12345 at mydomain.com>

Amavis will store the following data:
maddr
- id 3 / email mailinglist at mydomain.com

msgs
- mail_id bbbbb / sid 2 / message_id <other_message_id_67890 at yourdomain.net>

msgrcpt
- mail_id bbbbb / rid 3

For the penpals feature, amavis tries two queries. Neither returns any results for this use-case.

Query 1 (sel_penpals_msgid):
SELECT msgs.time_num, msgs.mail_id, subject, message_id, rid
FROM msgs JOIN msgrcpt USING (partition_tag,mail_id)
WHERE sid=3 AND msgs.content!='V' AND ds='P' AND message_id IN ('<outgoing_message_12345 at mydomain.com>') AND rid!=sid
ORDER BY rid=2 DESC, msgs.time_num DESC

==>> 0 results

Query 2 (sel_penpals):
SELECT msgs.time_num, msgs.mail_id, subject
FROM msgs JOIN msgrcpt USING (partition_tag,mail_id)
WHERE sid=3 AND rid=2 AND msgs.content!='V' AND ds='P'
ORDER BY msgs.time_num DESC

(BTW, why do you allow a penpals match without Message-Id match?)

==>> 0 results
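
The walk-through above can be reproduced with a small sketch. It uses Python's sqlite3 and a heavily reduced schema holding only the columns named in this mail; the real amavisd tables have more columns, and the values for partition_tag, time_num, subject and content ('C') are assumptions made up for the illustration. %m is replaced by a single placeholder because only one reference is involved.

```python
import sqlite3

# Reduced schema with only the columns named in this mail (assumption: the
# real amavisd tables have more columns and indexes).
SCHEMA = """
CREATE TABLE maddr   (id INTEGER PRIMARY KEY, email TEXT);
CREATE TABLE msgs    (partition_tag INTEGER, mail_id TEXT, sid INTEGER,
                      time_num INTEGER, content TEXT, subject TEXT,
                      message_id TEXT);
CREATE TABLE msgrcpt (partition_tag INTEGER, mail_id TEXT, rid INTEGER,
                      ds TEXT);
"""

# The two stock queries, with %m reduced to one placeholder.
SEL_PENPALS_MSGID = (
    "SELECT msgs.time_num, msgs.mail_id, subject, message_id, rid"
    " FROM msgs JOIN msgrcpt USING (partition_tag,mail_id)"
    " WHERE sid=? AND msgs.content!='V' AND ds='P' AND message_id IN (?)"
    " AND rid!=sid"
    " ORDER BY rid=? DESC, msgs.time_num DESC"
)
SEL_PENPALS = (
    "SELECT msgs.time_num, msgs.mail_id, subject"
    " FROM msgs JOIN msgrcpt USING (partition_tag,mail_id)"
    " WHERE sid=? AND rid=? AND msgs.content!='V' AND ds='P'"
    " ORDER BY msgs.time_num DESC"
)

REF = "<outgoing_message_12345@mydomain.com>"

db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
db.executemany("INSERT INTO maddr VALUES (?, ?)", [
    (1, "bounce-handler@mydomain.com"),
    (2, "user@yourdomain.net"),
    (3, "mailinglist@mydomain.com"),
])
# The outgoing list mail: envelope sender id 1, recipient id 2.
db.execute(
    "INSERT INTO msgs VALUES (0, 'aaaaa', 1, 1341000000, 'C', 'Hello', ?)",
    (REF,),
)
db.execute("INSERT INTO msgrcpt VALUES (0, 'aaaaa', 2, 'P')")

# Reply from the user (id 2) to the list address (id 3): no match either way.
by_msgid = db.execute(SEL_PENPALS_MSGID, (3, REF, 2)).fetchall()
by_pair = db.execute(SEL_PENPALS, (3, 2)).fetchall()

# The same reply addressed to the envelope sender (id 1) would have matched.
direct = db.execute(SEL_PENPALS_MSGID, (1, REF, 2)).fetchall()

print(by_msgid, by_pair, direct)
```

The third lookup is only there for contrast: it shows that the stock query finds the outgoing message as long as the reply goes back to the envelope sender.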


>> However, penpals doesn't work when the "Envelope From" differs from the
>> "From" or the "Reply-To" e-mail headers.
> 
> Yes, except that if a reply references the original Message-ID,
> a match is still found. This covers most correspondence over
> mailing lists, as well as direct correspondence.

The example above shows that direct correspondence works well;
however, it doesn't work with mailing lists that do bounce-handling (including this mailing list ;-))

I discovered this "the hard way". Some replies to mailing list messages were sent to the Spam-folder, although Penpals should have rescued them...

> 
>> Currently, Amavis only stores the "Envelope From" (as msgs.sid)
>> and the recipients of a message (as msgrcpt.rid).
>> The "Header From"/"Header Reply-To" is not stored.
> 
> Right. Adding an author address (From) would require extending
> the SQL schema with another field.
> 
> I'm not sure how frequent the cases are where matching a From would help
> while matching a referenced Message-ID failed.

The reply matches the Message-Id and the original (Envelope) sender.
I'm afraid that each sent-out message where From/Reply-To != Envelope-From/Return-Path would fail…

My quick patch matches either the original (envelope) sender and the Message-Id, or the original recipient (To) and the Message-Id.
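
As a rough illustration of what the patched WHERE clause does with the example data from this mail, here is a sketch in Python with sqlite3. The schema is reduced to the columns mentioned here, the values for partition_tag, time_num, subject and content are assumed, and the helper name penpals_msgid is made up; it only mirrors the patched argument order ($sid,$rid,$rid) with one reference standing in for %m.

```python
import sqlite3

REF = "<outgoing_message_12345@mydomain.com>"

db = sqlite3.connect(":memory:")
# Reduced schema (assumption: the real amavisd tables have more columns).
db.executescript("""
CREATE TABLE msgs    (partition_tag INTEGER, mail_id TEXT, sid INTEGER,
                      time_num INTEGER, content TEXT, subject TEXT,
                      message_id TEXT);
CREATE TABLE msgrcpt (partition_tag INTEGER, mail_id TEXT, rid INTEGER,
                      ds TEXT);
-- Outgoing list mail: envelope sender id 1 (bounce handler), recipient id 2.
INSERT INTO msgs VALUES (0, 'aaaaa', 1, 1341000000, 'C', 'Hello',
                         '<outgoing_message_12345@mydomain.com>');
INSERT INTO msgrcpt VALUES (0, 'aaaaa', 2, 'P');
""")

# Patched sel_penpals_msgid, with %m reduced to a single placeholder.
PATCHED = (
    "SELECT msgs.time_num, msgs.mail_id, subject, message_id, rid"
    " FROM msgs JOIN msgrcpt USING (partition_tag,mail_id)"
    " WHERE (sid=? OR rid=?) AND msgs.content!='V' AND ds='P'"
    " AND message_id IN (?)"
    " AND rid!=sid"
    " ORDER BY rid=? DESC, msgs.time_num DESC"
)


def penpals_msgid(sid, rid, ref):
    """Hypothetical helper: bind arguments in the patched order."""
    return db.execute(PATCHED, (sid, rid, ref, rid)).fetchall()


# Reply from the user (id 2) to the list address (id 3), with a reference.
match = penpals_msgid(3, 2, REF)
# Same pair, but referencing a Message-Id that was never sent out.
no_ref = penpals_msgid(3, 2, "<unrelated@yourdomain.net>")
# A sender (id 9) who was never a recipient, quoting the real Message-Id.
stranger = penpals_msgid(3, 9, REF)

print(match, no_ref, stranger)
```

In this sketch only the first lookup returns a row: the extra rid test still requires a referenced Message-Id that was actually sent to that address.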

Ideally:
- amavis could store the "intended reply destination" (by storing the e-mail address found in the Reply-To header or, in its absence, the address found in the From header; e.g. in a new "did" field in msgs; and storing/looking up the address in maddr)
- for the penpals feature, the best match would be where From == msgrcpt.rid, To == msgs.did, References IN (msgs.message_id)
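
To make the idea above more concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the "did" column does not exist in the amavisd schema, the function reply_destination and the query are made up for illustration, the schema is reduced to a few columns with assumed values, and only the first address of a Reply-To list is considered.

```python
import sqlite3
from email import message_from_string
from email.utils import parseaddr

OUTGOING = (
    "Return-Path: <bounce-handler@mydomain.com>\n"
    "From: Mailing List <mailinglist@mydomain.com>\n"
    "To: User <user@yourdomain.net>\n"
    "Message-Id: <outgoing_message_12345@mydomain.com>\n"
    "\n"
    "body\n"
)


def reply_destination(raw_message: str) -> str:
    """Address a reply would go to: Reply-To if present, otherwise From."""
    msg = message_from_string(raw_message)
    header = msg.get("Reply-To") or msg.get("From") or ""
    return parseaddr(header)[1].lower()


db = sqlite3.connect(":memory:")
# Reduced schema plus the proposed (hypothetical) msgs.did column.
db.executescript("""
CREATE TABLE maddr   (id INTEGER PRIMARY KEY, email TEXT UNIQUE);
CREATE TABLE msgs    (partition_tag INTEGER, mail_id TEXT, sid INTEGER,
                      did INTEGER, time_num INTEGER, content TEXT,
                      subject TEXT, message_id TEXT);
CREATE TABLE msgrcpt (partition_tag INTEGER, mail_id TEXT, rid INTEGER,
                      ds TEXT);
INSERT INTO maddr VALUES (1, 'bounce-handler@mydomain.com');
INSERT INTO maddr VALUES (2, 'user@yourdomain.net');
INSERT INTO maddr VALUES (3, 'mailinglist@mydomain.com');
INSERT INTO msgrcpt VALUES (0, 'aaaaa', 2, 'P');
""")

# Store the outgoing mail with did looked up from the reply destination.
did = db.execute(
    "SELECT id FROM maddr WHERE email=?", (reply_destination(OUTGOING),)
).fetchone()[0]
db.execute(
    "INSERT INTO msgs VALUES (0, 'aaaaa', 1, ?, 1341000000, 'C', 'Hello', ?)",
    (did, "<outgoing_message_12345@mydomain.com>"),
)

# Proposed best match: From == msgrcpt.rid, To == msgs.did, reference found.
BEST_MATCH = (
    "SELECT msgs.time_num, msgs.mail_id, subject, message_id, rid"
    " FROM msgs JOIN msgrcpt USING (partition_tag,mail_id)"
    " WHERE did=? AND rid=? AND msgs.content!='V' AND ds='P'"
    " AND message_id IN (?)"
    " ORDER BY msgs.time_num DESC"
)
rows = db.execute(
    BEST_MATCH, (3, 2, "<outgoing_message_12345@mydomain.com>")
).fetchall()
print(did, rows)
```

With the reply destination stored, the reply from the user (id 2) to the list address (id 3) finds the outgoing message without widening the match to every sender on the site.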

Regards

Stef

