How to Filter All Incoming (and/or Outgoing) Mail with procmail

Weldon Whipple <weldon@whipple.org>


Many documents tell how an individual user can use procmail to filter incoming e-mail. There are likewise many documents that describe how to filter all incoming mail as it is delivered to local users. This page gives instructions for configuring sendmail to procmail-filter all incoming e-mail--including mail that will be relayed to non-local virtual or aliased users or mailing list subscribers.

These instructions aren't new--they are mentioned briefly in the main procmail man page! However, the procmail man page assumes that you edit your sendmail.cf file directly. Modern sendmail configurations are modified using an mc (macro configuration) file and m4 macros. The current document attempts to show how to use the technique with an mc file.

Introduction

There are still many books and web pages that show users how to invoke procmail from a ~/.forward file if procmail isn't the local delivery agent:


"|IFS=' ' && exec /usr/bin/procmail -f- || exit 75 #weldon"

This no longer works in a "stock" sendmail configuration. If the mail administrator really wants to let users execute arbitrary programs from their .forward file (not a good idea in today's world!), she must relax sendmail's default restrictions with the DontBlameSendmail option in the sendmail.cf file.

It is much easier to just configure procmail as sendmail's local delivery agent. Then a user can create personal procmail rules in a file named .procmailrc in their home directory. As procmail delivers the mail, it detects the file's presence and executes the personal rules.

You can tell if procmail is sendmail's local delivery agent by looking for a line in your sendmail.cf file that begins "Mlocal" and contains a string that looks something like "P=/usr/bin/procmail".
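A quick way to run this check from a shell is sketched below. The function name local_mailer_program is made up for this example, and the pattern assumes that the Mlocal line lists P= first, as stock sendmail.cf files usually do.

```shell
# Print the program (P=...) used by the local mailer in a sendmail.cf file.
# Assumes the definition starts with "Mlocal," followed by "P=<path>,".
local_mailer_program() {
    sed -n 's/^Mlocal,[[:space:]]*P=\([^,]*\).*/\1/p' "$1"
}

# Example (the path is an assumption; adjust it for your system):
#   local_mailer_program /etc/mail/sendmail.cf
```

If the printed path ends in "procmail", procmail is already the local delivery agent.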

As sendmail's local delivery agent, procmail can also run system-wide rules during delivery to local users. (See your procmail man page for the exact location and syntax of that rule file on your OS. It is probably stored as /usr/local/etc/procmailrc, /etc/procmailrc, or /etc/mail/procmailrc.)

This document describes how to configure sendmail to procmail-filter all incoming mail--even mail destined for mailing lists, virtual users, or aliases that don't have local mailboxes.

Overview

We create a new "procmail" delivery agent to deliver to our filtering rules.

Our procmail delivery agent is not the local delivery agent--even though the local delivery agent might (also) use the procmail program. If the procmail program is the local delivery agent, then our installation will have two delivery agents that call the procmail program: the (standard) local delivery agent as well as our new procmail delivery agent.

Sendmail's rule set 0 [zero] (the "parse" rule set) identifies the delivery agent that will deliver to each e-mail recipient. We show how to modify rule set 0 to filter all incoming mail (or some set of mail) through procmail.

Rule set 0 typically selects the SMTP delivery agent to send mail to other servers, the program delivery agent to call programs (such as vacation), or the local delivery agent to deliver to local mailboxes.

We create a new procmail rule file by copying the system-wide procmailrc file and modifying at least one of its rules.

Warning, Achtung, Aviso!

Before attempting this modification, carefully read these warnings! The first two items describe conditions that might cause MX loops--and which must be corrected in order for this technique to work!

  1. These instructions are incompatible with wildcard MX records. If any of the (virtual) domains hosted by the e-mail server have wildcard MX records, sendmail might loop! You can recognize a wildcard MX record if a DNS query returns an asterisk at the beginning of the left-hand-side of a DNS MX record.

    For example, if you type the following at a command prompt:

    % dig mx whipple.org
    
    and one of the returned lines begins with an asterisk (*):
    *.whipple.org.            1D IN MX        10 mail.whipple.org.
    
    then you need to replace the wildcard record with separate MX records for each subdomain before proceeding.

  2. If the virtusertable has any entries with a right-hand side (RHS) of the form: user@somedomain.com, make sure that none of the "somedomain.com" entries refer to domains that are hosted on the local mail server. (Replace the RHS of such entries with the mailbox name of a local user (without trailing "@somedomain.com") or with a name that matches the LHS of an entry in the aliases database.)
  3. This configuration is not for the faint of heart--it may require significant effort to uncover typos and other errors! Be sure to back up your current sendmail.cf, mc and procmailrc files before proceeding.
  4. Before you write me about something that doesn't work, double check your configuration to make sure that what you have typed matches exactly what appears in these instructions.
    I occasionally receive e-mail from "creative" implementers who have tried something different from what appears in these instructions, and want me to help them figure out why their implementation doesn't work. My advice: Start by implementing exactly what appears in these instructions, and make sure it works. Then--if you want something slightly different--try modifying these instructions.
  5. Finally, try this first on a non-production server to avoid lost e-mail. Proceed at your own risk!
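The first two warnings can be checked mechanically. The sketch below is only an aid: find_wildcard_mx and find_local_rhs are hypothetical helper names, the first reads DNS records as text (saved dig output or a zone file), and the second assumes the plain-text virtusertable and local-host-names files. Name servers normally answer a query with the queried name in place of the asterisk, so scanning the zone file itself may be more reliable than scanning a query result.

```shell
# Warning 1: print MX records whose owner name starts with "*." (wildcards).
# Reads DNS records in text form on standard input.
find_wildcard_mx() {
    awk '$1 ~ /^\*\./ {
        for (i = 2; i <= NF; i++)
            if ($i == "MX") { print; next }
    }'
}

# Warning 2: print virtusertable entries whose right-hand side is an address
# in a domain that is hosted locally.
# $1 = virtusertable source file, $2 = local-host-names file
find_local_rhs() {
    awk '
        FILENAME == ARGV[1] {                 # first file: local-host-names
            if ($1 != "" && $1 !~ /^#/) hosted[tolower($1)] = 1
            next
        }
        /^#/ || NF < 2 { next }               # skip comments and short lines
        {
            n = split($2, parts, "@")
            if (n == 2 && (tolower(parts[2]) in hosted)) print
        }
    ' "$2" "$1"
}

# Examples (paths are assumptions):
#   find_wildcard_mx < /path/to/zonefile
#   find_local_rhs /etc/mail/virtusertable /etc/mail/local-host-names
```

Any line printed by either function points at an entry that needs to be changed before you continue.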

Step by Step: Filtering All Incoming Mail

Step 1: Change the mc File that Generates Your sendmail.cf File

Add a procmail pseudo-class to the LOCAL_CONFIG section of your mc file.
If your mc file doesn't already have a LOCAL_CONFIG section, add one. If it does, just add the CPprocmail line to the existing LOCAL_CONFIG section:

LOCAL_CONFIG
CPprocmail

The P class is a list of pseudo top-level domains. These instructions require that a domain like whipple.org be (temporarily) changed to whipple.org.procmail--with "procmail" looking like a top-level domain.

Create a procmail mailer in the mc file.
Find the section of the mc file with the "MAILER" directives--most likely at or near the end of the file. Add a line that reads "MAILER(procmail)" after the other MAILER lines.

The following lines are near the end of my mc file (right before the LOCAL_CONFIG and LOCAL_RULESETS sections--which are at the very end):

FEATURE(`local_procmail')dnl This makes local mailer (Mlocal) use procmail
MAILER(local)
MAILER(smtp)
MAILER(procmail)dnl Add this line to create the procmail mailer

My own (FreeBSD) configuration uses the procmail program for the local mailer (i.e. the local delivery agent). Because the line "MAILER(local)" is preceded by the line "FEATURE(`local_procmail')", my local mailer uses procmail instead of FreeBSD's default mail.local program.

Modify the arguments and mailer flags that the new procmail mailer will use.
Insert the following lines "fairly early" in the mc file (before the MAILER lines and before FEATURES that refer to procmail):

define(`PROCMAIL_MAILER_ARGS', `procmail -m $h $g $u')dnl
define(`PROCMAIL_MAILER_FLAGS', `mSDFMhun')dnl

In some of my installations, I have overlooked the above lines--and the filtering still worked. However, the procmail man page includes the equivalent of the above in its instructions--and they work.
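Before going on, it can help to confirm that the pieces added so far are present and in the right order. The function below is a rough sketch, not part of sendmail: check_mc_file is a hypothetical name, and it only looks for the three items described above (the CPprocmail line, the MAILER(procmail) line, and a PROCMAIL_MAILER_ARGS definition that precedes the first MAILER line).

```shell
# Report problems with the Step 1 additions in an mc file ($1).
# Prints one line per problem and returns non-zero if any were found.
check_mc_file() {
    mc=$1
    problems=0
    grep -q '^CPprocmail' "$mc" ||
        { echo "missing: CPprocmail"; problems=1; }
    # Accept both MAILER(procmail) and MAILER(`procmail')
    grep -q '^MAILER(`\{0,1\}procmail' "$mc" ||
        { echo "missing: MAILER(procmail)"; problems=1; }
    args_line=$(grep -n 'PROCMAIL_MAILER_ARGS' "$mc" | head -n 1 | cut -d: -f1)
    mailer_line=$(grep -n '^MAILER(' "$mc" | head -n 1 | cut -d: -f1)
    if [ -n "$args_line" ] && [ -n "$mailer_line" ] &&
       [ "$args_line" -gt "$mailer_line" ]; then
        echo "PROCMAIL_MAILER_ARGS is defined after the first MAILER line"
        problems=1
    fi
    return "$problems"
}

# Example (the file name is an assumption):
#   check_mc_file /etc/mail/myhost.mc
```

No output means the three items were found where these instructions expect them; it says nothing about the rest of the mc file.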
Add some rules to the LOCAL_RULE_0 section of the mc file.
This is the "trickiest" step so far. Make sure you are alert, and that you put tabs (by pressing the tab key) between the left and right sides of each rule line!

If your mc file doesn't already have a LOCAL_RULESETS section (introduced by a LOCAL_RULESETS line, near the very bottom of the file--after the MAILER lines), add one by typing "LOCAL_RULESETS" on a line by itself. Then add the following LOCAL_RULE_0 lines to the LOCAL_RULESETS section:

LOCAL_RULESETS

LOCAL_RULE_0
R$* < @ $=w > $*    <tab>$#procmail $@ /usr/local/etc/procmailrcs/rc.whipple.org $: $1<@$2.procmail.>$3
R$* < @ $=w. > $*   <tab>$#procmail $@ /usr/local/etc/procmailrcs/rc.whipple.org $: $1<@$2.procmail.>$3
R$* < @$* .procmail. > $*  <tab>$1<@$2.>$3 <tab>Already filtered, map to original address

Be sure to replace <tab> with actual tab characters! (Press the tab key in your editor.)
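Missing tabs are easy to overlook, so a small check may save some debugging. The sketch below assumes that the rules of interest start with "R$" (as the three rules above do); find_bad_rule_lines is a name invented for this example.

```shell
# Print rule lines that contain no tab character, or that still contain
# the literal placeholder "<tab>". $1 = mc file to check.
find_bad_rule_lines() {
    awk '/^R\$/ && (index($0, "\t") == 0 || index($0, "<tab>") > 0) {
        print FILENAME ":" FNR ": " $0
    }' "$1"
}

# Example (the file name is an assumption):
#   find_bad_rule_lines /etc/mail/myhost.mc
```

Every line it prints needs a real tab between the left-hand and right-hand sides of the rule.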

Change the path that follows the $@ in the first two rules (/usr/local/etc/procmailrcs/rc.whipple.org above) to the fully qualified path of a new procmail rc file that the special procmail mailer will call. Choose a name that is meaningful to you.

The new procmail rc file should be different from procmail's default system-wide procmailrc file! Otherwise, the file will be called twice--once by our new procmail mailer and again by the procmail local mailer. Because the logic of the two files is slightly different, a single file probably won't work for both mailers.

Here is what the above three rules do:

  1. The first rule checks for any e-mail whose recipient domain is in class w (local-host-names). It selects our new procmail mailer, specifies the name of our procmail rule file, and appends ".procmail" to the end of the e-mail address.
    If the domain whipple.org is listed in local-host-names and the recipient is weldon@whipple.org, the first rule changes the e-mail address to weldon@whipple.org.procmail.
  2. The second rule is like the first, but handles the (unlikely) case where the e-mail address ends in a period (weldon@whipple.org. instead of weldon@whipple.org).
  3. The third rule matches on the second pass--after our procmail filter accepts the mail and calls sendmail a second time--when the e-mail address ends in ".procmail" (like weldon@whipple.org.procmail).
    The "To" address passed to our procmail rules has a ".procmail" suffix (e.g. weldon@whipple.org.procmail). Our procmail rules will end with a default rule that calls sendmail a second time (recursively). It is the recursive call to sendmail--passing the e-mail address with ".procmail" suffix--that matches the third rule above.

    The third rule "un-does" what the first rules did, restoring weldon@whipple.org.procmail to weldon@whipple.org and falling through to the "system" rule set 0, so that delivery can continue as it would have without our "detour."
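The round trip that the three rules perform can be imitated with plain string handling. This is only an illustration of the address changes, not sendmail's rule engine: tag_address and untag_address are invented names, the list of local domains is passed as arguments in place of class w, and domain comparison here is case-sensitive, which sendmail's is not.

```shell
# Rules 1 and 2: add the ".procmail" pseudo top-level domain to an address
# whose domain is local. $1 = address, remaining arguments = local domains.
tag_address() {
    addr=$1
    shift
    user=${addr%@*}
    domain=${addr##*@}
    domain=${domain%.}                 # rule 2: tolerate a trailing period
    for d in "$@"; do
        if [ "$domain" = "$d" ]; then
            printf '%s@%s.procmail\n' "$user" "$domain"
            return 0
        fi
    done
    printf '%s\n' "$addr"              # not a local domain: leave untouched
}

# Rule 3: on the second pass, restore the original address.
untag_address() {
    addr=${1%.}
    printf '%s\n' "${addr%.procmail}"
}

# Example:
#   tag_address weldon@whipple.org whipple.org
#   untag_address weldon@whipple.org.procmail
```

To see what your real configuration does with an address, sendmail's address test mode (sendmail -bt, then a line such as "3,0 weldon@whipple.org") is the authoritative check.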

Step 2: Create a New procmail rc File

Begin by copying the system procmailrc file to the file named in the first and second rules of LOCAL_RULE_0 (above).

We will make two kinds of modifications to the copied file.

  1. Add the following rule at the bottom of the file:
    :0 w
    ! -oi -f "$@"
    
    This is the rule that recursively invokes sendmail (honest!) with the "pseudo" recipient e-mail address with a ".procmail" suffix. Without this rule, the procmail filtering would return to the system rule set 0 and fail because the domain whipple.org.procmail is not in local-host-names ...

    The procmail man page gives an alternate--slightly more complex--rc file:

    # The following two lines go near the top of the rc file--before any rules
    SENDER = "<$1>"                 # fix for empty sender addresses
    SHIFT = 1                       # remove it from $@
    
    # All the procmail rules go here
    # .
    # .
    
    # The very last rule (at the bottom of the rc file):
    :0 w
    ! -oi -f "$SENDER" "$@"
    
    If you follow the syntax from the procmail man page (and include "$SENDER" after the -f switch), be certain not to overlook assigning the SENDER variable at the top of the procmailrc file and setting SHIFT = 1 to remove the first argument from the argument list. When I made that mistake, I found that the e-mail was delivered not only to the original recipient (to whom the mail was addressed), but also to the sender and to the root user!
    FWIW, Martin McCarthy, in The Procmail Companion (Addison-Wesley, 2002), p. 144, uses the simpler format that I show first--without the $SENDER variable.
  2. Scan the rc file for rules that deliver to a user's local mailbox. You might need to replace the action with a recursive call to sendmail (as above). Thus, an old rule like:
    ## Let these through unconditionally!
    :0 
    * 2147483647^0 ^TO_WHIPPLE-L@rootsweb\.com
    * 2147483647^0 ^From:.*mom\.whipple@whipple\.org
    $HOME/mbox
    
    might become:
    ## Let these through unconditionally!
    :0 
    * 2147483647^0 ^TO_WHIPPLE-L@rootsweb\.com
    * 2147483647^0 ^From:.*mom\.whipple@whipple\.org
    ! -oi -f "$SENDER" "$@"
    
    On second thought, the above is probably not necessary at all. E-mail me if you can find instances where this is necessary.
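The difference between the two forms of the final rule, and the mistake described above, is easier to see when the arguments are written out. The functions below only imitate how the arguments are assembled; they do not call procmail or sendmail, and all of the names are invented for this illustration. The first argument stands for the envelope sender and the rest for the recipients, as passed by the procmail mailer.

```shell
# Print each argument in brackets so that empty arguments stay visible.
show_args() {
    out=
    for a in "$@"; do out="${out}[${a}]"; done
    printf '%s\n' "$out"
}

# Simple form:   ! -oi -f "$@"
simple_form() { show_args -oi -f "$@"; }

# Man-page form: SENDER = "<$1>" and SHIFT = 1, then ! -oi -f "$SENDER" "$@"
manpage_form() {
    sender="<$1>"
    shift
    show_args -oi -f "$sender" "$@"
}

# The mistake: "$SENDER" in the last rule, but SENDER never set and no SHIFT.
broken_form() { show_args -oi -f "" "$@"; }

# Example:
#   broken_form mom@example.net weldon@whipple.org.procmail
```

In the broken form the -f switch receives an empty value and the sender's address lands among the recipients, which is consistent with the extra deliveries I saw.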

Step 3: Finish Up

  1. Generate a sendmail.cf file from your mc file.
    Watch closely for syntax errors and fix them before proceeding.
  2. Rename the system-wide procmailrc file.
    You will probably want to rename the old system-wide procmailrc file, so that sendmail's local mailer doesn't call it. (If you specifically want two system-wide procmail filters, however, you can leave it in place.)
  3. Copy the new cf file to /etc/mail/sendmail.cf.
    Your OS or distribution might store the sendmail.cf in a different location.
  4. Restart sendmail.
    Check the maillog, send some test messages and make sure that sendmail works properly.
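As a rough outline, the commands for generating and installing the new cf file might look like the ones printed by the function below. It only prints the commands so that you can review them first. Everything specific in it is an assumption: the function name, the location of sendmail's m4 directory (it differs between systems), the name of the temporary cf file, and the test address. If your mc file already includes cf.m4 itself, leave that argument out of the m4 command. How to restart sendmail depends on your OS, so that step is left as a comment.

```shell
# Print (do not run) a possible command sequence for Step 3.
# $1 = your mc file, $2 = directory that contains sendmail's m4/cf.m4
step3_commands() {
    mc=$1
    cfdir=$2
    new=./sendmail.cf.new
    cat <<EOF
cp /etc/mail/sendmail.cf /etc/mail/sendmail.cf.bak
m4 $cfdir/m4/cf.m4 $mc > $new
echo '3,0 weldon@whipple.org' | sendmail -bt -C $new
cp $new /etc/mail/sendmail.cf
# now restart sendmail the way your OS expects
EOF
}

# Example (both arguments are assumptions):
#   step3_commands myhost.mc /usr/share/sendmail/cf
```

The sendmail -bt line runs the test address through rule sets 3 and 0 of the new file, so you can see whether the procmail mailer is selected before the file goes live.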

Variations

You might need to vary the configuration shown above. If you are a secondary MX server (for example), the owner of the primary mail server might want you to do preliminary procmail filtering. Alternatively, you might want to filter for only one or two of the virtual domains that your server hosts. Modify the above steps according to the instructions below for slightly different results.

1. Filtering for Relay Domains

Secondary mail servers are typically relay servers for (some other) primary mail server: if the primary mail server is down, the originating mail server sends mail to the secondary, which queues it until the primary server is back up. Then the secondary sends the mail to the primary.

One way of specifying domains for which your server is secondary mail server is with the relay-domains file (typically stored in the same directory as local-host-names--in the /etc/mail directory). Domains are listed in the relay-domains file in the same format as they are in local-host-names--generally one domain per line. The domains listed in relay-domains populate the R class ($=R).

To filter mail that is being relayed through your server (in addition to your server's local domains), modify LOCAL_RULE_0 to read something like:

LOCAL_RULE_0
R$* < @ $=w > $*    <tab>$#procmail $@ /usr/local/etc/procmailrcs/rc.whipple.org $: $1<@$2.procmail.>$3
R$* < @ $=w. > $*   <tab>$#procmail $@ /usr/local/etc/procmailrcs/rc.whipple.org $: $1<@$2.procmail.>$3
R$* < @ $=R > $*    <tab>$#procmail $@ /usr/local/etc/procmailrcs/rc.whipple.org $: $1<@$2.procmail.>$3
R$* < @ $=R. > $*   <tab>$#procmail $@ /usr/local/etc/procmailrcs/rc.whipple.org $: $1<@$2.procmail.>$3
R$* < @$* .procmail. > $*  <tab>$1<@$2.>$3 <tab>Already filtered, map to original address

The third and fourth rules reference $=R (the R class--relay domains) instead of $=w (local domains). They are otherwise identical to the first two rules.
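Since the rules for each class differ only in the class name, they can be generated, which also guarantees real tab characters. The helper below is a sketch under that assumption; print_local_rule_0 is an invented name, and the rc file path in the example is just the one used in this document.

```shell
# Print a LOCAL_RULE_0 section with real tabs between the rule sides.
# $1 = procmail rc file, remaining arguments = class names (w, R, ...)
print_local_rule_0() {
    rc=$1
    shift
    printf 'LOCAL_RULE_0\n'
    for class in "$@"; do
        printf 'R$* < @ $=%s > $*\t$#procmail $@ %s $: $1<@$2.procmail.>$3\n' "$class" "$rc"
        printf 'R$* < @ $=%s. > $*\t$#procmail $@ %s $: $1<@$2.procmail.>$3\n' "$class" "$rc"
    done
    printf 'R$* < @$* .procmail. > $*\t$1<@$2.>$3\tAlready filtered, map to original address\n'
}

# Example:
#   print_local_rule_0 /usr/local/etc/procmailrcs/rc.whipple.org w R
```

Paste the output into the LOCAL_RULESETS section of the mc file with an editor that does not convert tabs to spaces.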

2. Filtering for a Few Local Domains

Maybe you don't want to filter all the domains your server hosts. With this variation, we create a new file /etc/mail/filtered-domains, which will list the domains we want to filter.

In the LOCAL_CONFIG section of your mc file, add a line like:

LOCAL_CONFIG
F{FilteredDomains} -o /etc/mail/filtered-domains

List the domains you want to filter in /etc/mail/filtered-domains, one per line.

Then modify LOCAL_RULE_0 to read:

LOCAL_RULE_0
R$* < @ $={FilteredDomains} > $*    <tab>$#procmail $@ /usr/local/etc/procmailrcs/rc.whipple.org $: $1<@$2.procmail.>$3
R$* < @ $={FilteredDomains}. > $*   <tab>$#procmail $@ /usr/local/etc/procmailrcs/rc.whipple.org $: $1<@$2.procmail.>$3
R$* < @$* .procmail. > $*  <tab>$1<@$2.>$3 <tab>Already filtered, map to original address

3. Copying All Mail to an Archive

A solution similar to this one is addressed by sendmail FAQ 4.20, indicating that it is immoral to archive all mail. (I have to agree that it probably is immoral ... but I like challenges.) Homme Bitter and Per Hedeland posted another solution, but I can't make it work. The solution below is not optimal, to be sure, but it seems to pass the test cases I have come up with. As Per said in his post, if you really want to do this, a milter is probably a better solution!

Modify LOCAL_RULE_0 to match any e-mail recipient:

LOCAL_RULE_0
R$* < @ $+ .procmail. > $*  <tab>$@ $1<@$2.>$3   <tab>Already archived, map back
R$* < @ $+ .procmail > $*   <tab>$@ $1<@$2.>$3   <tab>Already archived, map back
R$* < @ $+. > $*    <tab>$#procmail $@ /etc/procmailrcs/rc.archive $: $1<@$2.procmail.>$3
R$* < @ $+ > $*     <tab>$#procmail $@ /etc/procmailrcs/rc.archive $: $1<@$2.procmail.>$3

Notice that the rule order is significantly different. This is by design. The first two rules' RHS is introduced by a $@ "rewrite-and-return" prefix to avoid infinite archiving attempts. Also, I named the procmail rc file rc.archive for this example.

Here is the rc.archive procmail rc file that I got to work with the above LOCAL_RULE_0 (be sure to test it on a non-production server if you implement it!):

VERBOSE=0
LOGABSTRACT=yes
LOGFILE=/var/log/procmail-log
FORMAIL=/usr/bin/formail
EOL="
"
LOG="$EOL Entering rc.archive ... $EOL"

SENDER = "<$1>"                 # fix for empty sender addresses
SHIFT = 1                       # remove it from $@

## My box requires this next rule to make sure that each e-mail in
## the archive begins with "From " ... (???) You might not need it ...
:0 fhw
| ${FORMAIL} -I "From " -a "From "

## Add conditions after the next line for selective archiving.
:0 c:
/var/mail/Archive

:0 w
! -oi -f "$SENDER" "$@"

After you get the above to work, you can add conditions to the rule that saves to /var/mail/Archive, to archive based on the sender and recipient headers. (You may also be able to test the value in $SENDER--the envelope sender?)
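While testing, a simple way to confirm that messages are reaching the archive is to count the "From " separator lines before and after sending a test message. The function name is invented for this sketch, and the count assumes the usual mbox convention that body lines starting with "From " are escaped.

```shell
# Count the messages in an mbox file by counting "From " separator lines.
# $1 = mbox file, for example the archive written by rc.archive
count_archived() {
    grep -c '^From ' "$1"
}

# Example:
#   count_archived /var/mail/Archive
```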

4. Filtering Outgoing Mail with Procmail

After I wrote the first version of this document, I purchased a copy of Craig Hunt's sendmail Cookbook (O'Reilly, 2004, ISBN 0-596-00471-0). In recipe 6.8 ("Filtering Outbound Mail with procmail," pp. 226-29), Craig shows how to filter all outgoing mail with procmail, using sendmail's mailertable feature.

If my variation 3 ("Copying All Mail to an Archive") doesn't work for you, you might want to try Craig's recipe.

Sendmail's mailertable feature is a database hook into sendmail's rule set 0--the same rule set we've been modifying with our LOCAL_RULE_0 throughout this document. As you read Craig's recipe, much of it might seem familiar if you have read--and understood--my document.

Appendix: Minimizing the Generation of Bounce Messages

Background

When configuring a mail server to block incoming mail, it is always preferable (IMHO) to block it early--during the SMTP conversation, if at all possible. If the mail server can refuse the mail during the SMTP conversation, then it won't be responsible for generating and sending a bounce message to the (often non-existent) envelope sender. (Instead, the incoming mail server that connected to my server can create the bounce message, if one is needed.)

In the early days of the virtusertable feature, it was "cool"--and easy--for hosting companies to offer "wildcard" e-mail addresses: e-mail sent to <anybody>@mydomain.com could be delivered to a single mailbox, or even forwarded to the owner's personal e-mail address. Unfortunately, "dictionary" spammers found that they could send to anybody at a wildcarded domain and deliver the mail successfully. Hosting companies have responded by replacing the forwarding wildcard virtusertable entry with a wildcard that blocks by default:


@whipple.org    error:5.7.0:550 Unknown user

Then they add entries for specific addresses like webmaster, service, info, abuse, as well as entries that map to actual local users in /etc/passwd (weldon@whipple.org, tom@whipple.org, dick@whipple.org, harry@whipple.org, ...).

The Problem

Filtering all incoming mail with procmail as described in this document has a problem: LOCAL_RULE_0 rewrites the recipient address by appending ".procmail" and calls the special procmail mailer, ending the SMTP conversation with the incoming mail server--before the address is passed through the virtusertable and found to be an "unknown user." The address isn't passed through virtusertable until the second invocation of sendmail. By that time our server's first sendmail instance has already returned a 2xx status to the incoming server (accepting the mail). When virtusertable finally rejects the e-mail, our server is forced to generate a bounce message. If the e-mail was spam (probable), the purported sender doesn't exist, and our bounce message will likely generate a double bounce, and so on.

The Bat Book (3rd ed., p. 203) shows the "flow of rules through the parse rule set 0" as follows. A close look will help you understand the problem we face.

  1. Basic canonicalization (list syntax, delete local host, etc.)
  2. LOCAL_RULE_0 [This is where our procmail mailer is called]
  3. FEATURE(ldap_routing)
  4. FEATURE(virtusertable) [Where our blocking wildcard is]
  5. Addresses of the form "user@$=w" passed to local delivery agent [This is where "normal" local delivery occurs.]
  6. FEATURE(mailertable)
  7. UUCP, BITNET_RELAY, etc.
  8. LOCAL_NET_CONFIG
  9. SMART_HOST
  10. SMTP, local, etc. delivery agents. ["Leftover" local addresses]

In order to block mail to unknown users during the SMTP conversation, we need to detect them before our LOCAL_RULE_0 calls our procmail mailer.

A Solution

I have successfully solved the problem by adding tagged "To:" entries to the access db. Here is how I did it:

1. Modify sendmail.cf

My objective is to blacklist--at an earlier stage in the SMTP conversation--recipients that don't exist on my server, so I added the following lines to my sendmail macro configuration file:


FEATURE(`access_db')dnl 
FEATURE(`blacklist_recipients')dnl

Then I regenerated my sendmail.cf file.

2. Add tagged "To:" entries to the access db

Next, while looking through virtusertable and aliases, I added tagged "To:" entries to the access database, allowing mail to valid users for each domain I host. Here is a segment for one domain:


To:postmaster@whipple.org               OK
To:mailer-daemon@whipple.org            OK
To:abuse@whipple.org                    OK
To:webmaster@whipple.org                OK
To:weldon@whipple.org                   OK
To:tom@whipple.org                      OK
To:dick@whipple.org                     OK
To:harry@whipple.org                    OK
# ...
# Additional entries for other valid whipple.org recipients

After enumerating all legal recipients, I added a wildcard to block all other recipients at whipple.org:

To:whipple.org                          ERROR:5.7.0:550 Unknown user
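Typing these entries by hand for many domains is tedious, so they can be generated from a list of valid local parts. The function below is a sketch that assumes one user name per line on standard input (collected however you like from virtusertable and aliases); make_access_entries is a hypothetical name.

```shell
# Read valid local parts on standard input and print tagged "To:" entries
# for the domain given as $1, followed by the blocking wildcard entry.
make_access_entries() {
    domain=$1
    while read -r user; do
        case $user in
            ''|'#'*) continue ;;       # skip blank lines and comments
        esac
        printf 'To:%s@%s\tOK\n' "$user" "$domain"
    done
    printf 'To:%s\tERROR:5.7.0:550 Unknown user\n' "$domain"
}

# Example (users.txt is an assumed file with one user name per line):
#   make_access_entries whipple.org < users.txt >> /etc/mail/access
```

Review the generated lines before appending them to the access file: any valid recipient missing from the list will be rejected.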

3. Regenerate access.db

I used the following makemap command:


# cd /etc/mail
# makemap hash access < access

Your installation's command for generating access.db might vary from the above command.

Analysis: Why access.db Works when virtusertable.db Doesn't

Before sendmail calls rule set 0 (the parse rule set), it calls some "check_*" rules in response to information provided by the incoming server early in the SMTP conversation. We will mention three of those rules here.

check_relay (also Local_check_relay)
    Called by sendmail when the incoming server connects to ours.
check_mail (also Local_check_mail)
    Called when the incoming server issues the MAIL FROM: command.
check_rcpt (also Local_check_rcpt)
    Called each time the incoming server issues a RCPT TO: command.

The order of the three check_* rule sets above can be reversed if your mc file uses FEATURE(`delay_checks'). Even so, all three are called before LOCAL_RULE_0.

The following fragment shows part of what can happen in the early parts of an SMTP conversation:

Incoming server connects to ours
    sendmail calls check_relay rule set, which checks tagged "Connect:" entries in access db.
Incoming server issues EHLO SMTP command
    (sendmail's response not part of this example)
Incoming server issues MAIL FROM SMTP command
    sendmail calls check_mail rule set, which checks tagged "From:" entries in access db.
Incoming server issues RCPT TO SMTP command
    sendmail calls check_rcpt rule set, which checks tagged "To:" entries in access db.
(More SMTP commands)
    (More sendmail responses, including LOCAL_RULE_0, if present)

Sendmail's checking of tagged "To:" entries in access db occurs before our LOCAL_RULE_0, successfully identifying (and rejecting) mail to non-existent local users before calling our special procmail mailer.
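The effect of the tagged entries can be previewed against the plain-text access file before the map is rebuilt. The function below is a simplified stand-in for the real lookup, written for illustration only: lookup_rcpt is an invented name, and it tries just the full address and then the bare domain, whereas sendmail's own lookup covers more cases.

```shell
# Print the action that the access file ($1) assigns to a recipient ($2).
# Tries "To:user@domain" first, then "To:domain".
lookup_rcpt() {
    awk -v rcpt="$2" '
        BEGIN { rcpt = tolower(rcpt); dom = rcpt; sub(/^.*@/, "", dom) }
        /^#/ || NF < 2 { next }
        {
            key = tolower($1)
            action = $0
            sub(/^[^ \t]+[ \t]+/, "", action)      # drop the key column
            if (key == ("to:" rcpt))      full = action
            else if (key == ("to:" dom))  domain = action
        }
        END {
            if (full != "")        print full
            else if (domain != "") print domain
            else                   print "(no entry)"
        }
    ' "$1"
}

# Example:
#   lookup_rcpt /etc/mail/access nobody@whipple.org
```

With the entries shown earlier, a listed address yields OK and any other address at whipple.org yields the "Unknown user" error, which is the decision sendmail can then make during the SMTP conversation.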

(Feel free to send your solutions to this puzzle ...)

Feedback

This document is a work in progress. Please send corrections, ideas and other suggestions to me.