You have :
– An Enterprise mail system that needs to be archived
You need :
– 2 *NIX VMs – choose your own flavor – we use Ubuntu LTS
– An NFS share with adequate space
We found that we needed to archive 3 – 5 years of mail traffic from our mail system. We scanned the market and found that there are a lot of solutions, but they come at quite a hefty price.
As we have access to all the VMs we need and plenty of disk space at our disposal, we set out to design a solution ourselves.
Our mail domain is mail.domain.tld – domain.local is our local AD controlled DNS.
VM no. 1 / mail.archive.domain.local
On *NIX VM no. 1 we installed a basic Postfix / Dovecot instance. We have chosen ‘Maildir’ as repository, more on that later.
Create 4 users (both OS and Dovecot): incoming, outgoing, inside and pickup. (You can get by with a single user for the actual archiving, but we find that splitting the data into 3 roughly equal lumps is quite nice.)
Mount 4 NFS shares as follows. The lines below are in the format printed by the mount command; put the equivalent entries in /etc/fstab.
nas.domain.local:/export/inside on /home/inside type nfs (rw,wsize=16384,vers=4)
nas.domain.local:/export/outgoing on /home/outgoing type nfs (rw,wsize=16384,vers=4)
nas.domain.local:/export/pickup on /home/pickup type nfs (rw,rsize=16384,wsize=16384,vers=4)
nas.domain.local:/export/incoming on /home/incoming type nfs (rw,wsize=16384,vers=4)
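Since the lines above are in the output format of mount rather than fstab syntax, here is a minimal sketch of how the matching /etc/fstab entries could be generated. The helper names fstab_line and fstab_lines_mail_vm are our own invention for illustration, and the trailing "0 0" (no dump, no fsck) is an assumption about what you want for network shares.

```shell
#!/bin/sh
# Sketch: print /etc/fstab lines for the NFS exports used in this post.
# NAS_HOST and MOUNT_ROOT default to the values shown above; the helper
# names are hypothetical, not part of any tool.
NAS_HOST="${NAS_HOST:-nas.domain.local}"
MOUNT_ROOT="${MOUNT_ROOT:-/home}"

# Usage: fstab_line <export-name> <mount-options>
fstab_line() {
    # fstab fields: device  mountpoint  fstype  options  dump  pass
    printf '%s:/export/%s %s/%s nfs %s 0 0\n' \
        "$NAS_HOST" "$1" "$MOUNT_ROOT" "$1" "$2"
}

# The four mounts of mail.archive.domain.local, with the options listed above.
fstab_lines_mail_vm() {
    fstab_line inside   rw,wsize=16384,vers=4
    fstab_line outgoing rw,wsize=16384,vers=4
    fstab_line pickup   rw,rsize=16384,wsize=16384,vers=4
    fstab_line incoming rw,wsize=16384,vers=4
}
```

Calling fstab_lines_mail_vm prints four lines you can review and then append to /etc/fstab by hand.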
VM no. 2 / search.archive.domain.local
On *NIX VM no. 2 we installed maildir-utils – http://manpages.ubuntu.com/manpages/oneiric/man1/mu.1.html. It is available from the Ubuntu repositories, so all you need is :
#apt-get install maildir-utils maildir-utils-extra
Mount 4 NFS shares as follows. The lines below are in the format printed by the mount command; put the equivalent entries in /etc/fstab.
nas.domain.local:/export/inside on /mnt/inside type nfs (ro,rsize=16384,vers=4)
nas.domain.local:/export/outgoing on /mnt/outgoing type nfs (ro,rsize=16384,vers=4)
nas.domain.local:/export/pickup on /mnt/pickup type nfs (rw,rsize=16384,wsize=16384,vers=4)
nas.domain.local:/export/incoming on /mnt/incoming type nfs (ro,rsize=16384,vers=4)
Note that we mount the NFS shares on search.archive.domain.local read-only (except the pickup mount). This is the indexing / search VM, so it has no need for read-write access.
We have created 3 cron jobs. They can be run daily, weekly or monthly (/etc/cron.daily, etc.), as you prefer.
The jobs are as follows :
mu-incoming
#!/bin/sh
# Index /mnt/incoming/Maildir with maildir-utils / mu
mu index --quiet --autoupgrade --maildir=/mnt/incoming
mu-outgoing
#!/bin/sh
# Index /mnt/outgoing/Maildir with maildir-utils / mu
mu index --quiet --autoupgrade --maildir=/mnt/outgoing
mu-inside
#!/bin/sh
# Index /mnt/inside/Maildir with maildir-utils / mu
mu index --quiet --autoupgrade --maildir=/mnt/inside
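If you would rather keep a single job instead of three, the same indexing can be sketched as one loop. This is only an illustration: the function name index_archives and the MU / ARCHIVE_ROOT variables are our own additions, and the flags are simply the ones used in the three jobs above.

```shell
#!/bin/sh
# Sketch: index all three archive maildirs in one pass.
# MU can be overridden for a dry run, e.g. MU="echo mu".
MU="${MU:-mu}"
ARCHIVE_ROOT="${ARCHIVE_ROOT:-/mnt}"

index_archives() {
    for box in incoming outgoing inside; do
        # Same flags as the three separate jobs; stop at the first failure.
        $MU index --quiet --autoupgrade --maildir="$ARCHIVE_ROOT/$box" || return 1
    done
}
```

To use it as a cron job, add a final line calling index_archives, make the file executable, and give it a name without a dot: run-parts, which executes /etc/cron.daily and friends, skips file names containing dots.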
Enterprise mail system / mail.domain.local / mail.domain.tld
On the enterprise mail system, create 3 rules that BCC mail to the following addresses.
Inside senders -> inside recipients – BCC -> inside@mail.archive.domain.local
Outside senders -> inside recipients – BCC -> incoming@mail.archive.domain.local
Inside senders -> outside recipients – BCC -> outgoing@mail.archive.domain.local
( make your own logic / naming, if the above is confusing – it works for us 🙂 )
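How you create these rules depends entirely on your mail system, so we will not guess at its syntax. The decision logic itself can be sketched in a few lines of shell, though. The function names and the assumption that "inside" means exactly one domain (domain.tld) are ours; adapt them if you have several internal domains.

```shell
#!/bin/sh
# Sketch of the BCC decision logic, not a config for any real mail system.
# INSIDE_DOMAIN is an assumption: the single domain counted as "inside".
INSIDE_DOMAIN="${INSIDE_DOMAIN:-domain.tld}"
ARCHIVE_HOST="${ARCHIVE_HOST:-mail.archive.domain.local}"

# Succeeds if the address ends in @INSIDE_DOMAIN (case-sensitive match).
is_inside() {
    case "$1" in
        *@"$INSIDE_DOMAIN") return 0 ;;
        *) return 1 ;;
    esac
}

# Usage: archive_rcpt <sender-address> <recipient-address>
# Prints the archive mailbox that should receive the BCC.
archive_rcpt() {
    if is_inside "$1" && is_inside "$2"; then
        echo "inside@$ARCHIVE_HOST"
    elif is_inside "$2"; then
        echo "incoming@$ARCHIVE_HOST"
    elif is_inside "$1"; then
        echo "outgoing@$ARCHIVE_HOST"
    else
        # Outside -> outside: none of the three rules applies.
        return 1
    fi
}
```

For example, archive_rcpt alice@domain.tld bob@otherdomain.tld prints the outgoing mailbox, because only the sender is inside.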
You now have a working mail archive. But we still need to be able to find an archived mail. Enter maildir-utils!
On search.archive.domain.local, the following commands are handy:
#mu find to:user@domain.tld someone --fields "d f t l"
(d=date, f=from, t=to, l=location)
Finds all mail to user@domain.tld where ‘someone’ also appears in the sender, subject or body.
#mu find from:user@domain.tld someone --fields "d f t l"
Finds all mail from user@domain.tld where ‘someone’ also appears in the sender, subject or body.
#mu find from:user@domain.tld @otherdomain cc:someoneelse@domain.tld --fields "d f t l"
Finds all mail from user@domain.tld to someone @otherdomain.tld AND CC to someoneelse@domain.tld
#mu find to:user@domain.tld date:20140801..20140901 findthistext --fields "d f t l"
Finds all mail to user@domain.tld from 1st of August 2014 to 1st of September 2014 that contains ‘findthistext’ in either sender, subject or body.
In your output you get the location of the actual mail. You can use this to verify that you have found the right mail.
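Since every search above ends with the same --fields argument, a tiny wrapper saves some typing. This is just a convenience sketch; archive_find and date_range are names we made up, and the wrapper passes your search terms to mu find unchanged.

```shell
#!/bin/sh
# Sketch: wrapper around "mu find" with the field list used in this post.
# MU can be overridden for a dry run, e.g. MU="echo mu".
MU="${MU:-mu}"

# Usage: archive_find <search terms...>
archive_find() {
    # d=date, f=from, t=to, l=location
    $MU find "$@" --fields "d f t l"
}

# Usage: date_range <YYYYMMDD> <YYYYMMDD>
# Prints a date term in the form used in the examples above.
date_range() {
    printf 'date:%s..%s\n' "$1" "$2"
}
```

With this in place, the last search becomes archive_find to:user@domain.tld "$(date_range 20140801 20140901)" findthistext.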
#mu view /mnt/inside/Maildir/cur/2358755128.M708023P28292.mail,S=3905,W=6436:2, | more
We are now ready to actually retrieve the mail from the archive and send it on its way to newfound glory!
#cp /mnt/inside/Maildir/cur/2358755128.M708023P28292.mail,S=3905,W=6436:2, /mnt/pickup/Maildir/cur/
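The plain cp above works, but a small helper with a few sanity checks is harder to get wrong. The sketch below assumes the pickup Maildir has the usual tmp/ and cur/ subdirectories; the function name retrieve_mail is our own. It copies into tmp/ first and then renames into cur/, so a mail client should never see a half-copied file.

```shell
#!/bin/sh
# Sketch: copy one archived message into the pickup Maildir.
# Usage: retrieve_mail <path-to-message> [pickup-maildir]
retrieve_mail() {
    src="$1"
    dest="${2:-/mnt/pickup/Maildir}"
    name=$(basename "$src")

    if [ ! -f "$src" ]; then
        echo "no such message: $src" >&2
        return 1
    fi
    if [ ! -d "$dest/tmp" ] || [ ! -d "$dest/cur" ]; then
        echo "not a Maildir: $dest" >&2
        return 1
    fi
    if [ -e "$dest/cur/$name" ]; then
        echo "already retrieved: $name" >&2
        return 1
    fi

    # Copy into tmp/, then rename into cur/ (a rename within the same
    # file system is atomic, a copy is not).
    cp "$src" "$dest/tmp/$name" && mv "$dest/tmp/$name" "$dest/cur/$name"
}
```

Remember to quote the message path when calling it, since Maildir file names contain commas and colons.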
On a mail client configured with the pickup user on mail.archive.domain.local, you should now be able to see the retrieved mail and decide its future.
Some notes about performance.
We now have about 2 years of archived mail, about 10 million messages in all. This stresses the above configuration, so here are a few tweaks that may be needed if you are reaching the system's limits.
– search.archive.domain.local needs a lot of memory to run the mu commands: no less than 16 GB.
– There are some Dovecot / Maildir / NFS tunes – have not found them for this blogpost, but will update when I find them.
– Consider having Dovecot store the mail on local disks first, and then rsync it to an NFS share. Let search.archive.domain.local pick it up from there. This should be much faster, but also doubles the amount of disk space you need.
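We have not built the rsync variant ourselves, so take the following as an untested sketch of the idea rather than a recipe. LOCAL_ROOT and NFS_ROOT are hypothetical paths, and sync_archives is a name we made up; only the plain rsync -a (archive mode) call is standard.

```shell
#!/bin/sh
# Sketch: push locally stored maildirs to the NFS share.
# LOCAL_ROOT and NFS_ROOT are hypothetical paths; adjust to your setup.
# RSYNC can be overridden for a dry run, e.g. RSYNC="echo rsync".
RSYNC="${RSYNC:-rsync}"
LOCAL_ROOT="${LOCAL_ROOT:-/var/vmail}"
NFS_ROOT="${NFS_ROOT:-/mnt/nfs}"

sync_archives() {
    for box in incoming outgoing inside; do
        # Trailing slash on the source: copy the directory's contents,
        # not the directory itself. -a preserves times and permissions.
        $RSYNC -a "$LOCAL_ROOT/$box/" "$NFS_ROOT/$box/" || return 1
    done
}
```

Run from cron on mail.archive.domain.local, this would keep the NFS copy a little behind the live mail store, which is fine for an archive that is only indexed daily or weekly.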
Feel free to comment on the above, and make suggestions for changes. We aim to improve – always!
Thanks for reading!