As mentioned, the router picks up message files from a specific directory. Normally, message file names can be arbitrary valid file names, and indeed this is convenient when debugging. However, because the router daemon scans its own current directory, miscellaneous output from the router process may show up in this directory (e.g. profiling data, or core dumps (unthinkable as that is)). Furthermore, it is useful to be able to hide files from the router scanning (indeed the router may wish to do so itself).
When the router process scans for message files, then, it only considers file names that have a certain format. Specifically, the message file name must start with a digit. This method was chosen to accommodate the message file names generated by the standard submission interface library routines, which are strings of digits representing the message file's inode number.
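The naming rule above can be illustrated with a short sketch. This is not the router's actual code; the function names are hypothetical, and restricting the test to ASCII digits is an assumption.

```python
import os

def is_message_file_name(name):
    """True if the name starts with a digit, as the router requires."""
    # Assumption: only the ASCII digits 0-9 count as "a digit".
    return bool(name) and name[0] in "0123456789"

def candidate_message_files(directory):
    """List the file names in a directory that the scan would consider."""
    return sorted(n for n in os.listdir(directory) if is_message_file_name(n))
```

Under this rule, profiling output or a core dump in the same directory is skipped simply because its name does not begin with a digit.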
A message file contains three sections: the message envelope, the message header, and the message body (in that order). The message body is separated from the previous sections by a blank line. The message body may be empty, and either of the message envelope or message header may be empty. The restriction on the latter situation is that one of those sections must contain destination information for the message.
The message envelope and the message header have very similar syntax. The only difference is that while the message header must adhere to RFC822, the message envelope header fields are terminated by whitespace (`` '') instead of a colon (``:''). The semantics of the two message file sections are quite different, and will be covered later.
The message envelope headers are used to carry meta-information about the message. The goal is to carry transport-envelope information separately from the message (RFC-822) headers and body. The message starts with a set of envelope headers (a * prefix denotes an optional entry):
        *external \n
        *rcvdfrom %s (%s) \n
        *bodytype %s \n
        *with %s \n
        *identinfo %s \n
    Either:
        from <%s> \n
    Or:
        channel error \n
        *envid %s \n
        *notaryret %s \n
    Then for each recipient pairs of:
        *todsn [NOTIFY=...] [ORCPT=...] \n
        to <%s> \n
    Just before the data starts, a magic entry:
        env-end \n
The header fields recognized by ZMailer in the message envelope are:
not used. Compatibility with the sendmail feature
sets the channel corresponding to the message origin(*), usually as ``channel error''
arbitrary comment
separator between the envelope and the RFC822 headers
alias to env-end
ESMTP DSN ENVID value
keyword indicating the external origin of a message
a source address(*)
sets the full name of the local sender
The SMTP server's ident lookup result; this does not guarantee anything about the sender, though.
requests using this mail id for the local sender
ESMTP DSN RET=word, either ``FULL'' or ``HDRS''
An optional envelope entry, which sets ``Received:'' header's ``from'' field value.
This should only be used on messages that originate through ``trusted'' mechanisms, and especially must not be used when the message is originated by some John Doe in the system. (E.g. this is reserved for smtpserver and friends, not for arbitrary users.)
Normal recipient address list; usually used in the form of one address listed in angle brackets:
    to <user@somewhere>
ESMTP DSN recipient parameters. Note: this must be before the recipient ``to'' line for which this gives the extra parameters.
Optional envelope entry telling who the message originating user was. The system is extremely suspicious of this entry, and will check it against the system account database unless the spool file owner uid is known to belong to a trusted user.
This optional envelope entry tells the router what filename the sending client expects the subsystems to use as a feedback channel for reports concerning the file.
This ``filename'' is located in the $POSTOFFICE/public/ directory, and has been preopened by the same uid that created the message spool file.
An optional envelope entry that will define ``Received:'' header's optional ``via'' tag telling what physical transport mechanism was used.
Usually this entry is not used. (For exceptions, see the rmail and listexpand utilities.)
An optional envelope entry that will define ``Received:'' header's optional ``with'' tag telling what protocol was used.
Unlike what RFC-822 specifies, ZMailer supports only one ``with'' instance.
The (*)'s beside the descriptions indicate that the field is privileged. That is, the action will only happen if ZMailer trusts the owner of the message file (see Security). As with a normal RFC822 header, other fields are allowed (though they will be ignored), and case is not significant in the field name. The router will do appropriate checks for the fields that require it.
With this knowledge, we can now appreciate the minimal message file:
--------------------
to bond
--------------------
This will cause an empty message to be sent to bond. A slightly more sophisticated version is:
--------------------
from m
to bond
via courier
env-end
From: M
To: Bond
Subject: do get a receipt, 007!

You are working for the Government, remember?
--------------------
Notice that no blank line separates the message envelope from the message header; the env-end entry alone marks the boundary. A more sophisticated example in the same vein:
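To make the three-section layout concrete, here is a minimal sketch that splits message file text into envelope lines, header lines, and body. It is an illustration, not the router's parser: the function name is hypothetical, and it assumes that envelope lines are recognized by a field name not ending in a colon, that env-end (when present) closes the envelope, and that folded continuation lines do not occur in the envelope.

```python
def split_message(text):
    """Split message file text into (envelope_lines, header_lines, body)."""
    # The body is separated from the preceding sections by a blank line.
    head, _, body = text.partition("\n\n")
    envelope, header = [], []
    in_envelope = True
    for line in head.splitlines():
        if in_envelope:
            fields = line.split(None, 1)
            name = fields[0] if fields else ""
            if name.lower() == "env-end":
                in_envelope = False       # explicit end-of-envelope marker
                continue
            if ":" not in name:
                envelope.append(line)     # whitespace-terminated field name
                continue
            in_envelope = False           # colon-terminated: RFC822 header begins
        header.append(line)
    return envelope, header, body
```

Applied to the minimal message file `to bond`, this yields one envelope line, no header lines, and an empty body.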
--------------------
from ps/d-ops
to <007@sis.mod.uk>
env-end
From: M <d-ops@sis.mod.uk>
Sender: Moneypenny <ps/d-ops@sis.mod.uk>
To: James Bond <007@sis.mod.uk>
Subject: where are you???!
Classification: Top Secret
Priority: Flash

We have another madman on the loose. Contact "Q" for usual routine.
--------------------
For ZMailer to pay attention to the Classification: header, the router must recognize it in the message header and take appropriate action. In general the router can extract most of the information in the message header, and make use of it if the information is lacking in the envelope. The envelope headers in the above message are superfluous, since the same information is contained in the message header. Using the following envelope headers would be exactly equivalent to using the ones shown above (assuming the local host is sis.mod.uk):
--------------------
From Moneypenny <ps/d-ops@sis.mod.uk>
To James Bond <007@sis.mod.uk>
...
--------------------
ZMailer will extract the appropriate address information from whatever the field values are, as long as they obey the defined syntax (indicated in the list of recognized envelope fields above). ZMailer will complain in case of unexpected errors in the envelope headers.
The message body is not interpreted by ZMailer itself. As far as the router is concerned, it can be arbitrary data. However, certain Transport Agents may impose limitations on the message body data. For example, SMTP only deals with ASCII data and guarantees only a limited line length.
A message control file is a file created by the router to contain all the information necessary for delivery of a message submitted in a corresponding message file. It is interpreted by the scheduler, which needs to know at all times which messages are pending to go where, and how. It is also interpreted by one or more Transport Agents, possibly concurrently, that extract the delivery information relevant to their purpose.
The concurrency aspect means that the Transport Agents must cooperate on a locking protocol to ensure that delivery to a particular destination is attempted by only one Transport Agent at a time, and a status protocol to ensure unique success or failure of delivery for each destination. There are potentially many ways to implement such protocols, but, in the spirit of simplicity, ZMailer uses a control file as a form of shared memory. Specific locations within each control file are reserved for flags that indicate a specific state for their associated destination address. The rest is taken care of by the I/O semantics when multiple processes update the same file.
Apart from necessary envelope and control information, a control file also contains the new message header for the message, which contains the header addresses as rewritten by the router. Since a message may have several destinations with incompatible address format requirements, there may be several corresponding groups of message headers. This will be illustrated by the sample control file shown in the following subsection.
A control file consists of a sequence of fields. Each field starts at the beginning of a line (i.e. at byte 0 or after a Newline), and is identified by the appearance of a specific character in that location. This id character is normally followed by a byte containing a tag value (semaphore flag), followed by the field value.
Here is a simple control file produced by a test message, just before it was removed by the Scheduler:
--------------------
i 24700
o 72
l <88Jan10.003129est.24700@bay.csri.toronto.edu>
e Rayan Zachariassen <rayan>
s local - rayan
r+local - rayan 2003
m
Received: by bay.csri.toronto.edu id 24700; Sun, 10 Jan 88 00:31:29 EST
From: Rayan Zachariassen <rayan>
To: rayan, rayan@ephemeral
Subject: a test
Message-Id: <88Jan10.003129est.24700@bay.csri.toronto.edu>
Date: Sun, 10 Jan 88 00:31:24 EST

s local - rayan@bay.csri.toronto.edu
r+smtp ephemeral.ai.toronto.edu rayan@ephemeral.ai.toronto.edu 2003
m
Received: by bay.csri.toronto.edu id 24700; Sun, 10 Jan 88 00:31:29 EST
From: Rayan Zachariassen <rayan@csri.toronto.edu>
To: rayan@csri.toronto.edu, rayan@ephemeral.ai.toronto.edu
Subject: a test
Message-Id: <88Jan10.003129est.24700@bay.csri.toronto.edu>
Date: Sun, 10 Jan 88 00:31:24 EST

--------------------
The id character values are defined in the mail.h system header file, which currently contains:
    #define _CF_MESSAGEID   'i' /* inode number of file containing message */
    #define _CF_BODYOFFSET  'o' /* byte offset into message file of body */
    #define _CF_BODYFILE    'b' /* alternate message file for new body */
    #define _CF_SENDER      's' /* sender triple (channel, host, user) */
    #define _CF_RECIPIENT   'r' /* recipient n-tuple, n >= 3 */
    #define _CF_DSNRETMODE  'R' /* DSN message body return control */
    #define _CF_XORECIPIENT 'X' /* one of XOR set of recipient n-tuples */
    #define _CF_RCPTNOTARY  'N' /* DSN parameters for previous recipient */
    #define _CF_DSNENVID    'n' /* DSN 'MAIL FROM<> ENVID=XXXX' data */
    #define _CF_ERRORADDR   'e' /* return address for error messages */
    #define _CF_DIAGNOSTIC  'd' /* diagnostic message for ctlfile offset */
    #define _CF_MSGHEADERS  'm' /* message header for preceeding recipients */
    #define _CF_LOGIDENT    'l' /* identification string for log entries */
    #define _CF_OBSOLETES   'x' /* message id of message obsoleted by this */
    #define _CF_VERBOSE     'v' /* log file name for verbose log (mail -v) */
    #define _CF_TURNME      'T' /* trigger scheduler to attempt delivery now */
    #define _CF_RCVFROM     'F' /* Where-from we are coming ? */
There is one field per line, except for _CF_MSGHEADERS which has some special semantics described below. The following describes the fields in detail:
This field identifies the message file corresponding to this control file. It is the name of the message file in the QUEUE directory ($POSTOFFICE/queue/). This is typically the same as the inode number for that file, but need not be. It is used by Transport Agents when copying the message body, and by the Scheduler when unlinking the file after all the destination addresses have been processed. For example:
    i 21456
Specifies the byte offset of the message body in the message file. It is used by Transport Agents in order to copy the message body quickly, without parsing the message file. For example:
    o 466
Alternate message file for new body.
Gives an address to which delivery errors should be sent. The address must be an RFC822 mailbox. For example:
    e "Operations Directorate" <d-ops@sis.mod.uk>
The field value is an uninterpreted string which should prefix all log messages and accounting records associated with this message. This value is typically the message id string. For example:
    l <88Jan6.103158gmt.24694@sis.mod.uk>
This field specifies an originator (sender) address triple, in the sequence: previous channel, previous host, return address. It remains the current sender address until the next instance of this field. Since there can only be one sender of a message, multiple instances of the field will correspond to different return address formats as produced by the crossbar algorithm in the router. For example:
    s smtp sis.mod.uk @lab.sis.mod.uk:q@deadly-sun.lab.sis.mod.uk
    s uucp sisops lab.sis.mod.uk!deadly-sun.lab.sis.mod.uk!q
This field specifies a destination (recipient) address triple, in the sequence: next channel, next host, address for next host. Optional information to be passed to the Transport Agent may be placed after the mandatory fields; this currently refers to the delivery privilege of the destination address. Since the optional values of this field are only interpreted by the Transport Agent, changes in what the router writes must be coordinated with the code of the Transport Agents that might interpret this field. For example:
    r local - bond 0
    r uucp uunet sisops!bond -2
DSN message body return control.
One of XOR set of recipient n-tuples.
DSN parameters for previous recipient.
DSN MAIL FROM<> ENVID=XXXX data.
Apart from a message body, a Transport Agent needs the message headers to construct the message it delivers. These message headers are stored as the value of this field. Since message headers obviously can span lines, the syntax for this field is somewhat different than for the others. The field id is immediately followed by a newline, which is followed by a complete set of message headers. These are terminated (in the usual fashion) by an empty line, which also terminates this field. In the following example, the last line of text is followed by an empty line, after which another field may start:
    m
    From: M
    To: Bond
    Subject: do get a receipt, 007!

This field is not written by the router. It is written by the Scheduler or Transport Agents to remember errors associated with specific addresses. The field value has two parts, the first being the byte offset in the control file of the destination (recipient) address causing the error, and the rest of the line being an error message. The Transport Agents discover these errors and report them to the Scheduler. The Scheduler will collect them and report them to the error return address (if any) after all the destinations have been processed. For example:
    d 878 No such local user: 'bond'.
Message id of message obsoleted by this.
Log file name for verbose log (mail -v):
    v ../public/v_some_magic_tmpfile
Trigger scheduler to attempt delivery now. (ETRN)
Where-from we are coming ?
It should be noted, that in sender and recipient fields the first two field values (channel and host) cannot contain embedded spaces, but the third field value (the address) may. Therefore, in the presence of extra fields, parsing within Transport Agents must be cautious and not assume that an address does not contain spaces.
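A cautious way to split such a field value is sketched below. The function name and the `extra_fields` parameter are hypothetical; the sketch assumes the caller knows how many optional trailing values follow the address (for example one, the delivery privilege), and that the value passed in is the text after the id character and tag byte.

```python
def split_address_field(value, extra_fields=0):
    """Split 'channel host address [extras...]' without assuming the
    address is free of spaces."""
    # Channel and host never contain spaces, so split them off the front.
    channel, host, rest = value.split(" ", 2)
    if extra_fields:
        # Peel the known number of optional values off the end instead of
        # splitting the address itself on whitespace.
        parts = rest.rsplit(" ", extra_fields)
        address, extras = parts[0], parts[1:]
    else:
        address, extras = rest, []
    return channel, host, address, extras
```

Splitting from both ends leaves whatever remains in the middle, spaces included, as the address.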
As mentioned, the second byte of most fields is used for concurrency control and status indication. This tag byte can contain several values that indicate current or previous activity. The fields where this is relevant are the destination (recipient) address and diagnostic fields. The tag values are defined in the ``mail.h'' file mentioned previously, as follows:
    #define _CFTAG_NORMAL ' '           /* what the router sets it to be */
    #define _CFTAG_LOCK   '~'           /* that line is being processed, lock it */
    #define _CFTAG_OK     '+'           /* positive outcome of processing */
    #define _CFTAG_NOTOK  '-'           /* something went wrong */
    #define _CFTAG_DEFER  _CFTAG_NORMAL /* try again later */
The extract above is self-explanatory.
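As an illustration of where the tag byte lives, the sketch below rewrites the byte immediately after a field's id character. It is not the Transport Agents' real locking code: the function name is hypothetical, and the read-then-write shown here is not atomic, so it demonstrates only the file layout, not a complete locking protocol.

```python
TAG_NORMAL, TAG_LOCK, TAG_OK, TAG_NOTOK = b" ", b"~", b"+", b"-"

def swap_tag(path, field_offset, expected, new):
    """Replace the tag byte of the field starting at field_offset,
    but only if it currently equals `expected`. Returns True on change."""
    with open(path, "r+b") as f:
        tag_pos = field_offset + 1      # id char at the offset, tag right after
        f.seek(tag_pos)
        if f.read(1) != expected:
            return False                # someone else holds or finished it
        f.seek(tag_pos)
        f.write(new)                    # overwrite in place; file size unchanged
        return True
```

Because exactly one byte is overwritten in place, all other byte offsets in the control file (such as those recorded in diagnostic fields) stay valid.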
A message control file will normally contain a preamble that specifies information about the associated message file, the message body offset, an error return address, and a log entry tag. After this comes a repeated sequence of: sender address field, recipient address fields, and the message header corresponding to these recipients. After as many of these groups as are necessary, any diagnostic fields will be appended to the end of the control file. The restrictions on the sequence of addresses and message headers, are that a sender address field must precede any recipient address field, and a recipient address field must (immediately) precede any message header field, and no sender or recipient addresses may follow the last message header field.
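The layout just described can be read with a small parser, sketched here under a few assumptions: the function name is hypothetical, every field other than the message header field occupies exactly one line, and the message header block ends at the first empty line.

```python
def parse_control_file(text):
    """Return a list of (id_char, tag, value) tuples in file order."""
    fields = []
    lines = text.split("\n")
    i = 0
    while i < len(lines):
        line = lines[i]
        i += 1
        if not line:
            continue                      # stray empty line between fields
        if line[0] == "m":
            # The id is followed by a newline, then headers up to an empty line.
            headers = []
            while i < len(lines) and lines[i] != "":
                headers.append(lines[i])
                i += 1
            i += 1                        # skip the terminating empty line
            fields.append(("m", line[1:2], "\n".join(headers)))
        else:
            # id character, tag byte, then the field value.
            fields.append((line[0], line[1:2], line[2:]))
    return fields
```

A caller could then walk the result in order, remembering the most recent sender field for each recipient field it meets, which mirrors the grouping rule stated above.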
The transport agent interface follows a master-slave model: the TA informs the scheduler that it is ready for work, and the scheduler then sends it one job description and waits for diagnostics. Once the job is finished, the TA notifies the scheduler that it is ready for a new job.
A short sample session looks like this:
    (start the transport agent)
    #hungry                --> (TA to scheduler)
    spoolid \t hostspec    <-- (scheduler to TA)
    diagnostics            --> (TA to scheduler)
    #hungry                --> (TA to scheduler)
    ...
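The exchange above could be driven from the Transport Agent side by a loop like the following sketch. The function name and the `deliver` callback are hypothetical, and treating end of input as the signal to stop is an assumption, not something the protocol description states.

```python
def run_agent(jobs_in, out, deliver):
    """Announce readiness, then handle one job per input line.

    jobs_in and out are text streams; deliver(spoolid, hostspec) is a
    hypothetical callback returning an iterable of diagnostic lines.
    """
    out.write("#hungry\n")
    out.flush()
    for line in jobs_in:
        # A job description is 'spoolid<TAB>hostspec'.
        spoolid, _, hostspec = line.rstrip("\n").partition("\t")
        for diagnostic in deliver(spoolid, hostspec):
            out.write(diagnostic + "\n")
        out.write("#hungry\n")            # ready for the next job
        out.flush()
```

In a real agent `jobs_in` and `out` would presumably be the standard input and output connected to the scheduler; flushing after each `#hungry` keeps the two sides in step.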
Normal diagnostic output is of the form:
    id / offset \t notarydata \t status message

id
    is the inode number of the message file,
offset
    is a byte offset within its control file where the address being reported on is kept,
notarydata
    is a Ctrl-A separated tuple of delivery-status-notification information for the message,
status
    is one of: ok, ok2, ok3, error, error2, deferred, retryat,
message
    is descriptive text associated with the report. The text is terminated by a linefeed.
Any other format (as might be produced by subprocesses) is passed to standard output for logging in the scheduler log. The retryat response will assume the first word of the text is a numeric parameter, either an incremental time in seconds if prefixed by ``+'', or otherwise an absolute time in seconds since UNIX epoch.
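A reader of these reports might look like the sketch below. The function name and the returned dictionary keys are hypothetical; the sketch assumes that whitespace around the id, slash, and offset is not significant, and it returns None for any line that does not fit the format so that the caller can pass it on to the log.

```python
STATUSES = {"ok", "ok2", "ok3", "error", "error2", "deferred", "retryat"}

def parse_diagnostic(line, now):
    """Parse 'id/offset<TAB>notarydata<TAB>status message'.

    `now` is the current time in seconds since the UNIX epoch, used to
    resolve a relative retryat parameter.
    """
    parts = line.rstrip("\n").split("\t", 2)
    if len(parts) != 3:
        return None
    ident, slash, offset = (p.strip() for p in parts[0].partition("/"))
    if not slash or not ident.isdecimal() or not offset.isdecimal():
        return None
    status, _, message = parts[2].strip().partition(" ")
    if status not in STATUSES:
        return None
    report = {"id": int(ident), "offset": int(offset),
              "notary": parts[1].strip().split("\x01"),   # Ctrl-A separated
              "status": status, "message": message}
    if status == "retryat":
        words = message.split(None, 1)
        word = words[0] if words else ""
        if word.startswith("+") and word[1:].isdecimal():
            report["retry_time"] = now + int(word[1:])    # relative seconds
        elif word.isdecimal():
            report["retry_time"] = int(word)              # absolute epoch time
        else:
            report["retry_time"] = None                   # parameter unreadable
    return report
```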
The exit status is a code from <sysexits.h>.
    postmaster: root
    postoffice: root
    MAILER-DAEMON: root
    mailer: postmaster
    postmast: postmaster
    proto: postmaster
    sync: postmaster
    sys: postmaster
    daemon: postmaster
    bin: postmaster
    uucp: postmaster
    ingress: postmaster
    audit: postmaster
    autoanswer: "|@MAILBIN@/autoanswer.pl"
    nobody: /dev/null
    no-one: /dev/null
    junk-trap: /dev/null
    #test-gw: "|/..."
    #test.gw: "|/..."
Doing expansion lists in sendmail(8) style is not recommended, although it can certainly be done. ZMailer has a better mechanism for the simple tasks that sendmail(8) systems handle this way: place the file containing the recipient addresses into the directory $MAILVAR/lists/. This directory must have protection of 2775 or stricter, and the list file must have protection of 664 or stricter, for the *-request/owner-*/*-owner auto-aliases to work. For comparison, sendmail style lists look like this:
    listname: "/usr/lib/sendmail -fowner-listname listname-dist"
    owner-listname: root # Well, what would you suggest for a sample ?
    listname-owner: owner-listname
    listname-request: root
    listname-dist: ":include:/dev/null"
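The protection requirement stated earlier for $MAILVAR/lists/ and its list files can be checked with a sketch like this one. It rests on an assumption about what "or stricter" means, namely that no permission bit outside the stated limit is set; ZMailer's own test may differ. The function names are hypothetical.

```python
import os
import stat

def no_looser_than(mode, limit):
    """True if the permission bits of `mode` are a subset of `limit`."""
    # Assumption: "stricter" means no bit is set outside the limit.
    return stat.S_IMODE(mode) & ~limit == 0

def list_file_ok(directory, name):
    """Check a lists directory (2775) and one list file in it (664)."""
    dir_mode = os.stat(directory).st_mode
    file_mode = os.stat(os.path.join(directory, name)).st_mode
    return no_looser_than(dir_mode, 0o2775) and no_looser_than(file_mode, 0o664)
```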