There are times in technology when something goes wrong. It happens. It’s the cost of progress.
It can be a lot of fun finding what caused things to go wrong and then fixing it.
Follow this trip, for instance. Hopefully it helps someone else with Google-hosted addresses which forward to an internal mailing list server running Mailman.
The story starts with a third-party application that included a Mailman mailing list installation which, with its own custom hooks, worked well. The lists were not used much, perhaps annually, but they did the job.
The organization then moved all of its mail to Google and, because of the custom nature of this application and its mailing lists, those lists remained on this server. That was over a year ago. Now it gets reported that mail to those lists is going into a veritable black hole, completely disappearing, never to be seen again.
Get out the shovel and start digging through the layers. Verify that outbound mail is working from the command line and from the Mailman server (subscribe yourself from the web interface), and verify that inbound email is making it to the server. All of that checks out, but you are left with a log entry in /var/log/mailman/vette which reads:
Apr 23 12:45:56 2013 (3254) Message discarded, msgid: <5176BAAE.firstname.lastname@example.org>, handler: Approve
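If the vette log has more than a handful of entries, it can help to pull the discards out programmatically rather than by eye. The following is only a rough sketch in Python: the pattern is modelled on the single log line quoted above (other Mailman versions may format entries differently), and the names parse_discard and count_discards_by_handler are made up for this illustration.

```python
import re

# Pattern modelled on the one "Message discarded" entry quoted above.
# Assumption: other discard entries in the vette log share this layout.
VETTE_DISCARD = re.compile(
    r"^(?P<timestamp>\w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}) "
    r"\((?P<pid>\d+)\) Message discarded, "
    r"msgid: (?P<msgid>\S+), handler: (?P<handler>\w+)\s*$"
)


def parse_discard(line):
    """Return the fields of a discard entry as a dict, or None for any other line."""
    match = VETTE_DISCARD.match(line)
    return match.groupdict() if match else None


def count_discards_by_handler(lines):
    """Count discard entries per handler name (hypothetical helper)."""
    counts = {}
    for line in lines:
        entry = parse_discard(line)
        if entry:
            counts[entry["handler"]] = counts.get(entry["handler"], 0) + 1
    return counts
```

Feeding it the lines of /var/log/mailman/vette would show at a glance whether the Approve handler is the only one throwing messages away.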
At least we know it is making it to Mailman before being dropped to the floor. Google time! Only to discover that your keywords “mailman”, “vette”, “message discarded” and “handler: approve” come up with a lot of hits. Weeding through them takes a bit of time, but yields no fruit. Refining that search doesn’t change much.
Take a break and accomplish a few other things, then, after 10 minutes, come back. Search again. Read. Research. Close the overabundance of tabs that were left open. Open a few more. Refine that search one last time and the message you were looking for is there: http://mail.python.org/pipermail/mailman-users/2009-September/067226.html
Reading through it, it looks and sounds like what you are experiencing. And there is actually an intelligent answer in that posting. OK. Time to prove we have a match.
Back to Mailman. Check out the Postfix configuration. Being a little rusty with Postfix, and not being the person who built this, it’s a little daunting, since there are references to /etc/mailman/aliases (wha?!). Fiddle with the aliases there, only to find out that it is really using /etc/aliases and some custom version of newaliases. Finally get that working such that any inbound email to the list gets delivered to the root user instead of Mailman.
Fire off a couple of emails to the list and tail /var/log/maillog. Scratch your head as you watch the message get pushed off to Mailman when you explicitly told it not to. Dig, dig, and find that someone created a cronjob to rebuild /etc/aliases every 30 minutes. Look at the clock, it’s 32 minutes after. Fix the aliases file again, run newaliases, restart postfix, and fire off your test email again.
Note that you are now on test email #8, each of which includes the line “Please ignore and discard this message. There is no need to respond.”
Watching the logs, you see the message come through and get saved. Checking out /var/mail/root, you see the “X-BeenThere” header mentioned in the posting. Voilà. That’s the problem!
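To make the check less of an eyeball exercise, the header comparison can be sketched in a few lines of Python using only the standard library. This is an illustration, not Mailman's own code: it assumes the discard happens when an “X-BeenThere” value on the incoming message matches the list's own posting address, and the function names and the addresses you pass in are hypothetical.

```python
from email import message_from_string


def been_there_addresses(raw_message):
    """Return every X-BeenThere value in the message, trimmed and lower-cased."""
    msg = message_from_string(raw_message)
    return [value.strip().lower() for value in msg.get_all("X-BeenThere", [])]


def looks_like_loop(raw_message, list_address):
    """True if the message already claims to have been through list_address.

    Assumption: this mirrors the kind of loop check that caused the discard;
    the real logic lives in Mailman's Approve handler.
    """
    return list_address.strip().lower() in been_there_addresses(raw_message)
```

Pasting one of the saved test messages from /var/mail/root into looks_like_loop, along with the list's address, should return True if the header is what is tripping the discard.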
Clean-up time. Revert /etc/aliases (despite knowing that the cronjob will revert it for you anyhow) and restart Postfix. Back up
And modify the last few lines to include a custom version of the “x-beenthere” check. Move the .pyc and .pyo files to the same location you stored your backup. Restart mailman. Restart postfix just for kicks. Run a test through, number 9, and you watch it come in, get processed, and a slew of emails go out. Hooray.
Notify everyone! Oh, and then you check everything back into Puppet to make sure you don’t have to remember this stuff, and, while you are doing so, start deleting all of the unsolicited responses to your test message.