ANNOUNCE: amavisd-new-2.8.2-rc1 release candidate is available
Mark Martinec via amavis-users
amavis-users at amavis.org
Wed Sep 4 19:24:44 CEST 2013
A preview of the coming version 2.8.2 of amavisd-new is available at:
http://www.ijs.si/software/amavisd/amavisd-new-2.8.2-rc1.tar.bz2
http://www.ijs.si/software/amavisd/amavisd-new-2.8.2-rc1.tar.xz
Release notes are at:
http://www.ijs.si/software/amavisd/release-notes.txt
amavisd-new-2.8.2-RC1 release notes
Contents:
COMPATIBILITY
BUG FIXES
NEW FEATURES
OTHER
WHY REDIS?
COMPATIBILITY
There are no incompatible changes since the previous release.
Version 2.8.2 drops the dependency on the Perl module Redis, and makes the
dependency on modules Convert::TNEF and Convert::UUlib truly optional.
BUG FIXES
- if SQL logging was disabled, the pen pals feature was non-functional even
when a Redis storage backend was available and collecting data; now
pen pals is fully functional with a Redis database backend and no SQL;
- provide our own Redis client code, avoiding the Redis CPAN module's bugs,
its slowness and its lack of IPv6 support.
The most noteworthy Redis CPAN module bug is #38 (failing to re-select
a non-zero-index database after an automatic re-connect to a server).
See: https://github.com/melo/perl-redis/issues/38
https://github.com/melo/perl-redis/issues/28
- fixed a regexp used for parsing a wildcarded signing domain in a DKIM
key declaration and a wildcarded sender pattern in signing options
(an exotic feature rarely used, compatibility with dkim_milter);
- drop hard-coded dependency on modules Convert::TNEF and Convert::UUlib.
The Convert::TNEF was made optional in amavisd-new-2.8.0, but the
program still failed if the module could not be loaded at startup.
Both of these modules are now loaded at run time when first used,
subject to @decoders setting. The use of module Convert::UUlib
(the do_ascii entry) is disabled in a default setting of @decoders,
and the module Convert::TNEF (the do_tnef entry) is not used
if an external TNEF decoder (the do_tnef_ext entry) is available,
or if it is disabled in the @decoders list.
NEW FEATURES
- IP address reputation
When a Redis storage backend is enabled, besides the existing pen pals
functionality, it now also offers information updating and retrieval
on IP address reputation. This function is enabled by default when
@storage_redis_dsn is nonempty, but can be disabled by setting
$enable_ip_repu to false (to 0 or undef), per policy bank if necessary.
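A minimal amavisd.conf sketch of the two ways to switch the feature off.
The policy bank name MYNETS is only an example, and the sketch assumes that
the policy bank key is the variable name without its sigil, as with other
policy bank members:

```perl
# Option 1: disable IP reputation globally.
$enable_ip_repu = 0;

# Option 2 (instead of option 1): keep it enabled globally, but disable it
# for mail handled under one policy bank. 'MYNETS' is just an example name;
# the key is assumed to be the variable name without the '$' sigil.
$policy_bank{'MYNETS'}{enable_ip_repu} = 0;
```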
For each mail message a list of public IP addresses is collected from
its 'Received' trace header fields in a mail header section. A redis
server maintains a database of each IP address encountered. For each
IP address an entry carries the following counters: a number of spam
messages having this IP address in a trace header, a number of ham
messages, a number of banned or infected messages, and a total number
of messages. Also a timestamp of the last encounter is kept (currently
only used for logging purposes). Each entry is subject to automatic
expiry, so that infrequently encountered IP addresses are eventually
automatically purged from a database.
When a new mail message is being processed, a lookup on all its public
IP addresses from a trace is done. For each IP address found in a
database a spam score is computed based on a ratio of ham versus all
messages, and based on a total number of messages. The largest spam
score of all encountered IP addresses is then contributed as a spam
score of a message.
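The "largest score wins" rule can be illustrated with a short stand-alone
Perl sketch. It is not the amavisd code: the function name and the sample
per-IP scores are hypothetical, and the per-IP scores are taken as given
rather than computed from the redis counters.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Given per-IP spam scores (hypothetical values, computed elsewhere),
# return the score contributed to the message: the largest one.
sub message_ip_repu_score {
  my (%score_by_ip) = @_;
  # max() of an empty list is undef, so seed the list with 0; this also
  # keeps the result non-negative.
  return max(0, values %score_by_ip);
}

my $score = message_ip_repu_score(
  '192.0.2.10'   => 0.4,
  '198.51.100.7' => 2.5,
  '2001:db8::25' => 0,
);
printf "IP reputation contributes %.1f\n", $score;   # prints 2.5
```

A message whose trace contains no IP address known to the database gets
a contribution of 0 under this sketch.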
A formula for computing spam score of each IP address is currently
hard-coded, is non-linear and takes into account the total number of
encounters, diluted by the ratio of ham messages versus all messages
seen with this IP address. The computed score cannot be negative,
i.e. the IP reputation can only contribute to spamminess of a message
and cannot serve as a 'whitelisting' negative score.
A time-to-live of each IP entry is assigned dynamically: frequently
encountered IP addresses are given longer expiration times (days),
infrequent IP addresses are short-lived and eventually expire,
typically in a few hours.
It is possible to exclude certain IP addresses or networks from
contributing spam score by listing them in an @ip_repu_ignore_networks
list, e.g.:
@ip_repu_ignore_networks =
qw( 192.0.2.44 192.0.2.45 198.51.100.0/24 2001:db8::1:25 );
This does not preclude a redis lookup or the updating of counts for IP
addresses matching the list; it just resets the resulting score to zero.
The mechanism is appropriate for excluding a site's own mailers (MSA
and MX), or local (e.g. departmental) mailers, which may on occasion
emit a spammy message, but should never receive a score penalty.
There is no need to include private IP address networks in the list,
as these are already exempt from the IP reputation database.
An associated list of lookup tables @ip_repu_ignore_maps (whose only
default entry is the \@ip_repu_ignore_networks) offers more flexibility
if needed, and is a member of policy banks.
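A hedged amavisd.conf sketch of how the list of lookup tables might be
used. The first assignment merely restates the default described above.
The second part is an assumption: it supposes that a plain array of
addresses and networks is accepted as a lookup table here, and that the
policy bank key is the variable name without its sigil. The policy bank
name MYNETS is only an example.

```perl
# The default: a single lookup table, the plain list of networks.
@ip_repu_ignore_maps = ( \@ip_repu_ignore_networks );

# Assumed usage: a different set of ignored networks for mail handled
# under one policy bank ('MYNETS' is just an example name).
$policy_bank{'MYNETS'}{ip_repu_ignore_maps} =
  [ [ qw( 192.0.2.0/24 2001:db8::/64 ) ] ];
```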
Like other self-learning mechanisms (e.g. SpamAssassin's auto-learn,
and AWL), the quality of the result depends on the quality of other
spam-gauging rules - the better spam/ham classification works
(SpamAssassin), the more useful IP reputation becomes. For the purpose
of IP reputation's spam and ham counts, a mail is considered spam if
it is flagged with a contents category CC_SPAM or CC_SPAMMY (i.e. at
tag2_level or above), and is considered ham when its final score is
below 2.0. Intermediate scores are considered unclassified.
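The classification rule can be sketched as a small stand-alone Perl
function. This is a simplification, not the amavisd code: amavisd decides
on the contents category (CC_SPAM or CC_SPAMMY) rather than on the raw
score, and the function name and the tag2_level value of 6.2 used in the
example are assumptions made for illustration.

```perl
use strict;
use warnings;

# Classify a message for the purpose of IP reputation counters.
#   $score      - final spam score of the message
#   $tag2_level - the spam tagging threshold in effect (example value below)
sub ip_repu_class {
  my ($score, $tag2_level) = @_;
  return 'spam' if $score >= $tag2_level;  # at tag2_level or above
  return 'ham'  if $score < 2.0;           # clearly clean
  return 'unclassified';                   # intermediate scores
}

for my $score (0.3, 4.1, 9.8) {
  printf "%.1f => %s\n", $score, ip_repu_class($score, 6.2);
}
```

With a threshold of 6.2 the three sample scores fall into the ham,
unclassified and spam classes respectively.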
A nice feature of the mechanism is that it reacts fairly quickly
to a new influx of unwanted messages from some IP address, whether
foreign or local.
For insight on the IP address reputation behaviour, search the log
for ' redis: IP '. At log level 2 only spammy hits are logged, at
log level 3 also the clean hits are shown. The log entry shows
spam, ham, banned+infected and unclassified counts for an IP address,
a percentage of unwanted (spam+banned+infected) messages out of the
total count, and the associated score.
Apart from starting a redis server on a loopback interface, little setup
is needed: except for changing the 'bind' setting in redis.conf, no other
redis configuration changes are necessary, and the database need not be
initialized. Here is an example configuration in amavisd.conf:
@storage_redis_dsn = (
{ server => '127.0.0.1:6379', db_id => 1 },
);
# list your MX and MSA mailer IP addresses or networks here:
@ip_repu_ignore_networks = qw( 192.0.2.44 2001:db8::/64 );
The redis server needs to support Lua scripting, which is available
since redis version 2.6. Support for IPv6 is available since redis
version 2.8.0.
OTHER
- dropped the dependency on the CPAN module Redis, replacing it with our
own implementation of the client side of the redis protocol
(Amavis::TinyRedis).
It is faster and smaller, and supports opening sessions with a
redis server over IPv6 (or over IPv4 or over a Unix socket).
The redis server supports IPv6 starting with version 2.8.0.
Currently supported options in @storage_redis_dsn are:
server, db_id, password, and ttl.
The 'server' specifies an INET or INET6 socket (a host IP address
or name and a port number) or an absolute path to a Unix socket.
An IPv6 address must be enclosed in square brackets. The default
value is '127.0.0.1:6379'. Match this with your redis configuration.
Option 'db_id' specifies a redis database index (given to a "SELECT"
redis command). Its value is a (small) integer, defaults to 0.
This allows for independent databases to co-exist on the same redis
server, e.g. an amavis database and a SpamAssassin Bayes database.
The 'ttl' option can override a global setting $storage_redis_ttl
on a per-server basis. Its value is an integer, representing a number
of seconds for expiration time of pen pals records. It defaults to
$storage_redis_ttl, which in turn defaults to 16 days (in seconds).
This setting does not affect IP reputation records, whose expiration
time is computed dynamically.
Example:
$storage_redis_ttl = 22*24*3600; # 22 days for pen pals records
@storage_redis_dsn = ( # alternative servers, use the first which works
{ server => '[::1]:6379', db_id => 1 },
{ server => '127.0.0.1:6379', db_id => 1, password => 'abc...' },
{ server => '/tmp/redis.sock', db_id => 1, ttl => 8*24*3600 },
);
Btw, make sure to keep the setting $database_sessions_persistent
at its default value (1, i.e. enabled), otherwise Redis performance
will suffer somewhat.
- store only essential information for pen pals operation to a Redis
storage backend to save memory on a database server; information on
inbound messages is no longer stored there, i.e. only information on
originating messages is kept;
- more informative logging of pen pals query results when using a Redis
storage backend. The redis support code (Lua and protocol handling)
was largely rewritten for efficiency since amavisd-new 2.8.1.
- added LDAP attribute amavisDisclaimerOptions 1.3.6.1.4.1.15312.2.2.1.47
to LDAP.schema; contributed by Quanah Gibson-Mount;
- filter for public IP addresses from a Received trace only once;
- add one digit of precision in the TIMING log report to reported small
elapsed times (below 5 ms);
- documentation README.sql-mysql: added "CREATE INDEX msgs_idx_mail_id..."
with a note on an InnoDB requirement for a foreign key; by Jernej Porenta;
WHY REDIS?
A redis database was chosen initially because SpamAssassin 3.4.0 supports
keeping its Bayes database in a redis server, which makes it very fast,
so this makes a redis database readily available to amavisd too.
Redis has some features that make it suitable for use as a pen pals
database, for Bayes storage, and now for IP reputation:
- automatic expiration of entries based on each key's individual
time-to-live setting makes explicit database maintenance unnecessary;
- being accessible over inet (or Unix) sockets allows several amavisd
hosts to use a common redis server, possibly running on a dedicated host;
- supports Lua scripting, which makes it possible to perform multiple
basic operations in one go as a single application's functional
operation. It reduces multiple network round-trip times to a single
network transaction, reducing network packet rate and latency;
- compared to SQL storage for pen pals (and for a Bayes database),
redis reads are faster, and writes are MUCH faster;
- as an in-memory database with optional periodic disk persistence
it is well suited for pen pals, IP reputation and Bayes storage:
it is fast, and after a redis server restart the data is reloaded
from the last snapshot, thus only losing the last minute or two of
updates when trouble strikes, which is acceptable for these three
databases;
- makes it possible to eliminate SQL r/w storage if its only purpose
was to provide pen pals functionality (and SpamAssassin's Bayes).
Mark