C. Mail Systems

This appendix details mail systems in general because the range of OSS mail products can sometimes be confusing and the terminology used is not always clear.

The Internet Mail Model is based on a number of logical components, each of which has a specific job to do and communicates with the others through the use of open protocols. This model is the one used by OSS systems. The model can be best described with the help of some diagrams.

[Mail system diagram 1]

This diagram shows the path for the delivery of a single email. The mail is generated by a Mail User Agent (MUA). It is then passed to a mail server which has to decide whether it can deliver the mail locally or whether the mail must be passed to another server. The mail is passed from server to server until one of them decides that it can deliver the mail locally, which it then does. When this delivery is complete, the mail is available for a MUA to read. The final MUA has the responsibility of retrieving the mail as well as passing it to a Mail User Interface (MUI) to display it to the user.

How each mail server decides whether to deliver locally or not could be the subject of another chapter. Briefly, each server consults a local configuration file or files together with information from DNS servers (principally the MX records). This is all then used to decide what is considered local. For non-local mail the server then uses this information to determine the address of the next mail server to send the mail to.
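The decision described above can be sketched in a few lines. This is a minimal illustration, not the logic of any particular MTA: the function name next_hop and the LOCAL_DOMAINS setting are hypothetical, and the MX records are assumed to have been fetched from DNS already and passed in as (preference, hostname) pairs.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical configuration: the domains this server treats as local.
LOCAL_DOMAINS: Set[str] = {"example.org"}


def next_hop(recipient: str,
             mx_records: List[Tuple[int, str]],
             local_domains: Set[str] = LOCAL_DOMAINS) -> Optional[str]:
    """Return None for local delivery, otherwise the host to relay to.

    mx_records holds (preference, hostname) pairs for the recipient's
    domain; the record with the lowest preference value is tried first.
    """
    domain = recipient.rsplit("@", 1)[1].lower()
    if domain in local_domains:
        return None  # this server delivers the mail itself
    if not mx_records:
        # With no MX record, the domain name itself is used as the target.
        return domain
    return min(mx_records)[1]
```

A real MTA would also try the remaining MX hosts in order of preference if the first one could not be reached.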

Each mail server in general has the structure shown in Figure 2.

The Mail Transport Agent (MTA) accepts connections from other mail servers and MUAs via the Simple Mail Transfer Protocol (SMTP). If the mail is not for local delivery it is then sent by the MTA to another server. If the mail is for local delivery it is passed to a Mail Delivery Agent (MDA). The MDA is responsible for storing the mail in the user's mailstore. The mailstore is simply a way of storing data; for instance, a file, a series of separate files or even an SQL database. The precise storage structure is defined by whatever the MDA supports.

When a user wants to view her mail she uses a MUA which either retrieves the mail directly or contacts a server-side component which retrieves the mail from the mailstore and passes it to the MUA. Such server-side components do not fit into the traditional MTA/MDA/MUA model and we shall call them Mail Access Agents (MAA). This term is, however, not in common use.

The MUA communicates with a MAA using an open protocol which is usually either the Post Office Protocol (POP) or the Internet Message Access Protocol (IMAP). POP normally deletes mail from the mailstore when it is passed to the client, whereas IMAP normally leaves it there. IMAP also allows the MUA to alter the mailstore, for example by deleting mail or moving it from one directory to another.

[Mail system diagram 2]

The MUA may store the mail locally on the machine it is running on. This normally happens if POP is used. This local storage then allows future access to be independent of the server, which is particularly useful for machines that are not permanently connected to the network. IMAP, on the other hand, normally operates without local copies but it can also operate in what is called disconnected mode, which maintains a local copy, thereby allowing mail to be manipulated without a network connection. In this mode, the local and server mailstores are synchronised when a network connection is made. Unfortunately, not all MUAs fully support disconnected IMAP.

Sometimes a program other than a MUA retrieves the mail and stores it locally for a MUA to access without having to connect to the server itself. Such programs pull mail onto their machine in contrast to a standard MTA which has mail pushed to it by other MTAs. This can be useful if users do not want to allow connections to their machines from the Internet, or are operating behind a firewall. An example of such a program is fetchmail.

The difficulty with this model is that the applications available do not always map directly onto it. Applications very often perform more than one of the functions; for instance an MTA may incorporate the MDA function, and the popular MTA Sendmail can even be used as a MUA in some circumstances.

As mail is passed from the originating MUA through the various servers to the final MUA, a series of headers is added which records details of the journey and also controls the processing of the mail by both the intervening servers and the final MUA. Some of these are Multipurpose Internet Mail Extensions (MIME) headers which are used for a range of control purposes, including support for non-ASCII character sets, support for embedded content such as images, and support for attachments. When a MUA attaches a file it records its type as a MIME header and it is then the responsibility of the final MUA to be able to decode it.
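The role of MIME headers for attachments can be illustrated with Python's standard email package. This is only a sketch: the addresses, the file name report.pdf and the placeholder file content are invented for the example. The first function plays the part of the originating MUA, the second the part of the final MUA.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_message() -> bytes:
    """Originating MUA: attach a file and record its type in a MIME header."""
    msg = EmailMessage()
    msg["From"] = "alice@example.org"
    msg["To"] = "bob@example.net"
    msg["Subject"] = "Report"
    msg.set_content("Please find the report attached.")
    # Placeholder bytes stand in for a real file; the type is declared here.
    msg.add_attachment(b"%PDF-1.4 placeholder", maintype="application",
                       subtype="pdf", filename="report.pdf")
    return msg.as_bytes()


def list_parts(raw: bytes):
    """Final MUA: decompose the message and read each part's declared type."""
    msg = message_from_bytes(raw, policy=policy.default)
    return [(part.get_content_type(), part.get_filename())
            for part in msg.walk() if not part.is_multipart()]
```

Calling list_parts(build_message()) yields one entry for the text body and one for the attachment, each with the content type taken from its Content-Type header.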

Parts of this model are discussed in more detail below.

C.1. MTA

Most MTAs allow the administrator to control where mail is to be accepted from. This is often done by limiting the range of IP addresses that the MTA will accept SMTP connections from. This is extremely valuable in preventing spammers using the MTA as a relay and clogging the network bandwidth to the MTA.
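A check of this kind might look as follows. The sketch assumes the trusted address space is expressed as a list of networks; the name TRUSTED_NETWORKS and the address ranges are hypothetical, and real MTAs express the same idea in their own configuration syntax.

```python
import ipaddress

# Hypothetical trusted address space for this MTA.
TRUSTED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("10.0.0.0/8"),
]


def may_relay(client_ip: str, networks=TRUSTED_NETWORKS) -> bool:
    """Return True if the connecting client lies inside a trusted network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in networks)
```

For example, may_relay("192.0.2.17") is accepted while may_relay("203.0.113.5") is refused.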

There is a set of about 20 extensions to SMTP called Extended SMTP or ESMTP. These extensions allow, among other things, faster transfer of mail between compliant MTAs, using the PIPELINING extension.

Another extension enables Transport Layer Security (TLS) encryption between compliant MTAs, and another, SMTP-AUTH, allows users to be authenticated using a range of techniques. Both extensions are useful when the MTA would not normally allow a client to connect because its IP address is outside its trusted address space. This could happen for instance if a laptop user dials in from a random site on the Internet. See C.4.2 below.
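A server announces the extensions it supports in its reply to the client's EHLO command, one keyword per line. The sketch below parses such a reply into a dictionary so that a client can see whether pipelining, TLS and authentication are on offer. The function name parse_ehlo and the sample reply are illustrative assumptions, not output from a real server.

```python
def parse_ehlo(reply: str) -> dict:
    """Map each advertised ESMTP keyword to its list of parameters."""
    features = {}
    lines = reply.splitlines()
    for line in lines[1:]:            # the first line is the server greeting
        if not line.startswith("250"):
            continue
        fields = line[4:].split()     # skip the "250-" or "250 " prefix
        if not fields:
            continue
        features[fields[0].upper()] = fields[1:]
    return features


# Hypothetical reply from a server offering several extensions.
SAMPLE_REPLY = "\n".join([
    "250-mail.example.org",
    "250-PIPELINING",
    "250-SIZE 10240000",
    "250-STARTTLS",
    "250 AUTH PLAIN LOGIN CRAM-MD5",
])
```

With this sample, parse_ehlo(SAMPLE_REPLY)["AUTH"] lists the authentication methods the server is prepared to accept.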

The original model assumed that the owner of a mail account had a login account on the mail server. This meant that the MTA could interrogate the local password file to authenticate users. This model is too restrictive and modern MTAs must now support Virtual Users where the account owner details are held in a database, often independently of the normal login account details. This means that a user could have a password for mail and a different one for logging in.

The database could be LDAP-supported, an SQL database or a flat file. MySQL is the preferred SQL server as it is efficient and fast in what is basically a read-only application. PostgreSQL and Oracle could also be used.
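The idea of Virtual Users can be sketched with a small table that is independent of the system password file. The example uses Python's built-in sqlite3 module purely as a stand-in for MySQL or another SQL server; the table layout, the column names and the choice of PBKDF2 for password hashing are all assumptions made for illustration.

```python
import hashlib
import hmac
import os
import sqlite3


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash so the clear password need not be stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def create_store() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical virtual_users table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE virtual_users ("
               "address TEXT PRIMARY KEY, salt BLOB, pw_hash BLOB, maildir TEXT)")
    return db


def add_user(db, address: str, password: str, maildir: str) -> None:
    salt = os.urandom(16)
    db.execute("INSERT INTO virtual_users VALUES (?, ?, ?, ?)",
               (address, salt, hash_password(password, salt), maildir))


def authenticate(db, address: str, password: str):
    """Return the user's mailstore location, or None if the login fails."""
    row = db.execute("SELECT salt, pw_hash, maildir FROM virtual_users "
                     "WHERE address = ?", (address,)).fetchone()
    if row is None:
        return None
    salt, pw_hash, maildir = row
    if hmac.compare_digest(pw_hash, hash_password(password, salt)):
        return maildir
    return None
```

The workload is almost entirely lookups of this kind, which is the sense in which the application is basically read-only.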

An LDAP-supported database is recommended as it offers better support for distribution.

Default LDAP implementations often use the Berkeley Database products from Sleepycat Software.

Sometimes a machine may only connect to a mail server intermittently. This may happen in the case of home workers or laptop users, for instance. It may also happen in small offices where the cost of a permanent connection cannot be justified. In these circumstances the central MTA cannot forward mail as it normally would, and it needs to store it until a connection is made. Similar comments hold for the MTA (if there is one) on the client machine, or, in the case of a small office, the gateway mail server. These MTAs need to be able to support such situations and they are often called Smart Hosts when they do.

Delivery from a Smart Host may be carried out using SMTP or POP3.

Delivery via SMTP is straightforward, and security on the receiving machine can be improved by accepting inbound connections from the Smart Host only.

Delivery via POP3 is achieved either by using a MUA or the fetchmail application. Fetchmail will download mail to a local mailstore as mentioned above or deliver to a local MTA if required, for instance where multiple mail accounts are involved.

Both of these methods work well but have the drawback that they do not allow the use of blacklists to prevent spam from open relays and other undesirable sources being accepted. Tools like SpamAssassin can eliminate most of the spam, but the processing costs are much higher, and increased bandwidth is used downloading the mail for examination.
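The text above does not say how a blacklist is consulted. One widely used mechanism, assumed here, is a DNS-based blacklist: the octets of the connecting client's address are reversed and the result is looked up under the list's zone. The sketch below only builds the name to be queried; the zone dnsbl.example.org is hypothetical.

```python
import ipaddress


def dnsbl_query_name(client_ip: str, zone: str = "dnsbl.example.org") -> str:
    """Build the DNS name to look up for an IPv4 client address.

    If the name resolves, the address is on the list; the lookup itself
    is left out of this sketch.
    """
    octets = str(ipaddress.IPv4Address(client_ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone
```

A check of this kind depends on the address of the machine that makes the SMTP connection. Behind a Smart Host the only connecting address the receiving machine sees is that of the Smart Host or POP3 server, not that of the original sender, which is why the check cannot usefully be applied there.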

C.2. MUA

The MUA and MUI together make up the package that most users think of as “the mailer”. This is the client software that runs either on a web server or directly on a desktop machine to allow people to send and receive mail. Some sort of storage is normally provided so that mail can be filed into “folders” or “local mailboxes” for future reference.

The MUA handles protocols such as SMTP for mail submission and IMAP or POP for mail retrieval and filing. It understands the format of mail messages and can decompose MIME messages into their component parts.

Where there is a requirement for strong end-to-end security, the MUA is also responsible for the encryption and signing of messages. There are two competing standards for this: S/MIME, which is based on X.509 certificates, and PGP/GPG which is based on a different certificate format with a web-of-trust model rather than a hierarchy-of-trust.

Most of the OSS MUAs support digital signing using the GNU Privacy Guard (GPG). Only a few support S/MIME signatures. Business and Government bodies have opted for the S/MIME standard and its use must therefore be supported.

C.3. Mailstore

Unix mail systems originally assumed that the owner of a mail account had access to the machine hosting the mail server and could read a file containing their mail, or alternatively that mail would be delivered to the machine the user usually logged into to work. This was fine for environments with a small number of users who also needed a real login account on a machine with a mail server, but it is not generally practicable or secure.

The original format for storing mail was a single file per user with new mail being appended to the end. This file could get very large and reading through it to read a random piece of mail soon became inefficient. This format is often known as “mbox” and is still used by some MUAs, in particular for mail stored locally for the user. A modification to this was to hold each piece of mail as a separate file in a directory structure, which allows more efficient random access. One variant of this structure is called “mh” and a particular one with certain defined sub-directories and access procedures is called “maildir”.
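The difference between the two layouts can be seen with Python's standard mailbox module, which supports both mbox and maildir. The sketch below stores the same messages in each format inside a temporary directory and reports what ends up on disk. The addresses, subjects and file names are invented for the example.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage


def make_message(subject: str) -> EmailMessage:
    msg = EmailMessage()
    msg["From"] = "alice@example.org"
    msg["To"] = "bob@example.org"
    msg["Subject"] = subject
    msg.set_content("Body of " + subject)
    return msg


def compare_layouts(subjects):
    """Store the same messages in both formats and describe the result."""
    with tempfile.TemporaryDirectory() as root:
        mbox_path = os.path.join(root, "inbox.mbox")
        maildir_path = os.path.join(root, "Maildir")

        mbox = mailbox.mbox(mbox_path)                        # one file, mail appended
        maildir = mailbox.Maildir(maildir_path, create=True)  # one file per mail
        for subject in subjects:
            mbox.add(make_message(subject))
            maildir.add(make_message(subject))
        mbox.close()      # close() flushes pending changes to disk
        maildir.close()

        reopened = mailbox.mbox(mbox_path)
        mbox_messages = len(reopened)
        reopened.close()

        return {
            "mbox_is_single_file": os.path.isfile(mbox_path),
            "mbox_messages": mbox_messages,
            "maildir_subdirs": sorted(os.listdir(maildir_path)),
            "maildir_new_files": len(os.listdir(os.path.join(maildir_path, "new"))),
        }
```

The mbox store remains a single file however many messages it holds, whereas the maildir store contains the defined sub-directories cur, new and tmp, with each newly delivered message held as its own file under new.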

Sometimes these structures were held on the mail server and exported to the clients using, for instance, NFS. This allowed the mail to be held centrally, which meant that it could be backed up properly, but introduced locking problems with the single file structure. However, using NFS has not proved popular, perhaps due to the lack of good NFS clients on Windows.

Not all MTAs support these different access methods directly, thus the need for MAAs. A MUA which cannot access the mailstore directly will have to use a MAA component using POP or IMAP.

Both POP3 and IMAP send passwords as clear text by default. IMAP can use hashed passwords if the MUA supports them. Use of TLS encrypted links is possible if the MAA and MUA support it, is advisable on local networks, and should be mandatory for remote access.

MTAs sometimes communicate with MDAs using the Local Mail Transfer Protocol or LMTP. Most of the MTAs and MDAs support this.

C.4. Roaming Users

The problem with roaming users is that they can connect from unpredictable IP addresses, so the normal methods used by MTAs to decide whether to accept incoming mail will prevent them sending mail via the Administration's mail server. The MTAs have to restrict access to themselves from unknown clients to prevent their use by spammers as a third-party relay.

Three general techniques are available to get round this problem.

C.4.1. Virtual Private Networks (VPNs)

In a VPN, the remote machine can be allocated an address that can be included in the MTA's trusted address space. The problem is that full access to the internal network will be available to anyone who gets access to the remote machine, a significant risk with laptops, unless the access keys are encrypted with a password entered whenever the connection is initiated. Unfortunately, users sometimes configure their machines to remember passwords.

C.4.2. SMTP-AUTH and TLS

The SMTP-AUTH extension to SMTP allows a MTA to be configured to request a password to authenticate the remote user. The principal authentication methods are PLAIN, LOGIN and CRAM-MD5.

PLAIN requires the password to be held in clear on the client, but it can be stored encrypted on the server. If the SMTP connection is unencrypted then the password passes over the network in clear, merely base64-encoded.

LOGIN is less efficient than PLAIN as it requires three network interactions rather than one and, like PLAIN, the username and password travel in clear over the network.

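CRAM-MD5 does not send the password over the network. Instead, the server issues a challenge and the client returns its username together with a keyed MD5 digest of that challenge, so only the digest is exposed. However, the password must be held in clear text on both the client and server. It requires only two network interactions.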
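The difference between PLAIN and CRAM-MD5 can be sketched with the standard library. With PLAIN the credentials are only base64-encoded, so anyone who sees the exchange can decode them. With CRAM-MD5 the client answers a server challenge with a keyed MD5 digest, which the server can check only because it also knows the clear password. The function names and the lookup_password callback are hypothetical.

```python
import base64
import hashlib
import hmac


def plain_response(user: str, password: str) -> str:
    """PLAIN: base64 of NUL + user + NUL + password (empty authorisation id)."""
    raw = b"\0" + user.encode() + b"\0" + password.encode()
    return base64.b64encode(raw).decode("ascii")


def cram_md5_response(user: str, password: str, challenge_b64: str) -> str:
    """CRAM-MD5: send the user name and an HMAC-MD5 digest of the challenge."""
    challenge = base64.b64decode(challenge_b64)
    digest = hmac.new(password.encode(), challenge, hashlib.md5).hexdigest()
    return base64.b64encode(f"{user} {digest}".encode()).decode("ascii")


def cram_md5_verify(response_b64: str, challenge_b64: str, lookup_password) -> bool:
    """Server side: recompute the digest from the stored clear password."""
    user, _, digest = base64.b64decode(response_b64).decode().rpartition(" ")
    password = lookup_password(user)   # hypothetical callback into the user database
    if password is None:
        return False
    expected = hmac.new(password.encode(), base64.b64decode(challenge_b64),
                        hashlib.md5).hexdigest()
    return hmac.compare_digest(digest, expected)
```

Decoding the output of plain_response recovers the password directly, whereas decoding the output of cram_md5_response reveals only the user name and a digest that is valid for that one challenge.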

Not all MUAs support SMTP-AUTH and those that do may only support a limited number of methods. For instance Outlook Express only uses LOGIN.

Unlike a VPN, this approach enables only the sending of mail, so other services would not be compromised if the remote machine were stolen.

ESMTP also allows a TLS session to be negotiated between the client and server. This connection encrypts data on the network and can also authenticate the client machine. Authentication requires a client certificate matching one held on the server.

C.4.3. POP-before-SMTP

This method takes advantage of the fact that the POP and IMAP protocols require password authentication.

After a successful POP or IMAP connection to read mail, the MAA maintains an authenticated login database with the client IP address, date and time. When the client tries to send mail that is not for the local domain, the MTA checks whether the client IP address is in its trusted address space. If not, it then checks the authenticated login database for the IP address. If no authenticated login has been recorded from the client's IP address, or the last authenticated connection didn't happen recently enough, the MTA refuses to relay the message. The time period is configurable and typically defaults to 20 minutes. This method needs the MAA and the MTA to co-operate. For this reason, not all combinations work.
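The checks described above can be sketched as a small class shared by the MAA and the MTA. This is a minimal illustration under stated assumptions: the class name PopBeforeSmtp is hypothetical, the authenticated login database is an in-memory dictionary, and the 20 minute window is taken from the typical default mentioned in the text.

```python
import ipaddress
import time


class PopBeforeSmtp:
    """Allow relaying for clients that authenticated via POP or IMAP recently."""

    def __init__(self, trusted_networks, window_seconds=20 * 60):
        self._trusted = [ipaddress.ip_network(n) for n in trusted_networks]
        self._window = window_seconds
        self._last_login = {}   # client IP address -> time of last login

    def record_login(self, client_ip, now=None):
        """Called by the MAA after a successful POP or IMAP login."""
        self._last_login[client_ip] = time.time() if now is None else now

    def may_relay(self, client_ip, now=None):
        """Called by the MTA for mail that is not for the local domain."""
        now = time.time() if now is None else now
        addr = ipaddress.ip_address(client_ip)
        if any(addr in net for net in self._trusted):
            return True         # inside the trusted address space
        last = self._last_login.get(client_ip)
        return last is not None and now - last <= self._window
```

The optional now argument exists only so that the behaviour can be demonstrated without waiting: a login recorded at time 0 permits relaying at 600 seconds but not at 1,500 seconds.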

This method has the disadvantage that users must check for incoming mail before sending. Some users may find this difficult unless the MUA automatically does it for them.

C.5. Performance

In general an MTA uses little processor power; hosts running nothing else are usually limited by the network bandwidth or disk performance. IMAP and POP servers require more processor power and IMAP requires a little more RAM than POP. However, none of this is likely to be an issue on current hardware.

Anti-virus scanners require a lot of RAM and processor power, especially if MIME attachments are allowed.

Even so, performance limits are usually set by traffic rather than by the number of accounts.

Below are some performance examples culled from reports to mailing lists and from the experience of netproject's consultants. These are included to give some idea of what is required:

Site 1 – 2 x Pentium III Xeon 2.4 GHz, 4 GB RAM, 3 x 36 GB SCSI RAID 5

Virtual Users with lookup on MySQL.

Postfix 2.0.6, Courier-IMAP 1.7, MySQL 4.12, RAV-Antivirus, Mailman 2.1, Red Hat Linux 8.0, no SSL.

Around 4,800 users.

Site 2 – Athlon 1200, 1 GB RAM, RAID 5

Postfix + Courier-IMAP (anti-virus scan on another machine), no SSL.

8,500 Users.

Site 3 – Pentium 133, 40 MB RAM, IDE disk

Debian GNU/Linux, Courier-MTA + Courier-IMAP + SpamAssassin (the latter for one user only).

Typically 18 POP3 users and 7 IMAP users at any time.

Processor about 20% occupied.

Site 4 – dual Pentium II 450 Xeon, 256 MB RAM

MySQL, Courier-MTA, Courier-IMAP, sqwebmail, SSL.

50 users, mainly POP3.

Site 5 – Pentium II 400, 256 MB RAM

Courier-MTA + SpamAssassin, Red Hat Linux 8.0.

300 mailboxes, around 4,000 messages per day.

Site 6 – Pentium III 677 MHz, 512 MB RAM, 2 x IDE disk

FreeBSD 4.7, Exim 4.05, OpenLDAP 2.1.5, Cyrus 2.1.11, Mailman 2.1, Apache 1.3.26.

The machine is principally a busy webserver, but it also handles several thousand mails per day without any noticeable additional load.