docx decoding (docx2txt.pl)

Dominic Raferd dominic at timedicer.co.uk
Tue Aug 14 15:50:28 CEST 2018


Can Amavis 2.11.0 decode docx files out of the box?

A while ago I installed docx2txt.pl (which, as its name suggests, provides
simple text output from a docx file) and set this in 50-user.conf:
@decoders = (
  ...
  ['doc', \&do_ole, 'docx2txt.pl'],
  ['docx', \&do_ole, 'docx2txt.pl'],
  ...
);

but now I find error messages in the log like this:
amavis[22144]: (22144-01) (!!)collect_results from [26149]
(/usr/local/bin/docx2txt.pl): exit 255 \nUsage:\t/usr/local/bin/docx2txt.pl
[infile.docx|-|-h] [outfile.txt|-]\n\t/usr/local/bin/docx2txt.pl <
infile.docx\n\t/usr/local/bin/docx2txt.pl < infile.docx >
outfile.txt\n\n\tIn second usage, output is dumped on STDOUT.\n\n\tUse '-h'
as the first argument to get this usage information.\n\n\tUse '-' as the
infile name to read the docx file from STDIN.\n\n\tUse '-' as the outfile
name to dump the text on STDOUT.\n\tOutput is saved in infile.txt if second
argument is omitted.\n\nNote:\tinfile.docx can also be a directory name
holding the unzipped content\n\tof concerned .docx file.\n\n

Evidently amavis is not passing the arguments that docx2txt.pl expects, so
docx2txt.pl prints its usage text and exits with status 255. Perhaps it
omits the '-' that is needed as the second argument to send the output to
STDOUT. (When I pass the docx attachment directly to docx2txt.pl, it
processes it fine.)
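
For illustration, a minimal sketch of a direct call that follows the usage
text quoted above (input file as the first argument, '-' as the second so
the text goes to STDOUT). The function name docx_to_stdout, the DOCX2TXT
override variable and the file name sample.docx are assumptions made for
this example; they are not part of amavis or docx2txt.

```shell
#!/bin/sh
# Hypothetical helper: run docx2txt.pl on one file and write the text to STDOUT.
# Per the usage text, '-' as the second argument means "dump the text on STDOUT".
docx_to_stdout() {
  # DOCX2TXT is an illustrative override; the default path is the one in the log.
  "${DOCX2TXT:-/usr/local/bin/docx2txt.pl}" "$1" -
}

# Example call (sample.docx is a placeholder file name):
# docx_to_stdout sample.docx
```

This only shows the argument order that the usage text asks for; it says
nothing about which arguments amavis itself passes through do_ole.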

Is this fixable, or should docx files be handled in a different way? The
default setting for 'doc' files also seems to fail on my installation
because the 'ripole' program is not installed. And how are 'docm' files
handled?