This section gives a sketchy overview of the VM internals for developers and programmers.
• Folder Internals: Structure of the folders
• Message Internals: Structure of the message data structure
• Summary Internals: Details of summary generation
• Threading Internals: Details of message threads handling
• Sorting Internals: Details of how messages are sorted
• User Interaction: Handling of the user interaction
• Coding Systems: How VM handles character coding
• MIME Display: How VM displays MIME messages
• MIME Composition: How MIME messages are composed
• Virtual Folder Internals: Details of virtual folders and selectors
• Extents and Overlays: How VM deals with XEmacs and GNU Emacs differences
• Timers and Concurrency: How VM runs asynchronous timers
VM stores mail folders in the Unix ‘mbox’ format (in all its variants). Internal to Emacs, the mbox is loaded into a text buffer (the Folder buffer) and individual messages are identified by remembering markers into the text buffer. See Message Internals.
The Unix mbox format is described in the RFC 4155 specification of the Internet Engineering Task Force. The mail folder is a text file consisting of a sequence of messages, with each message consisting of a series of headers followed by a message body. The beginning of each message is delineated by a separator line starting with the string “From ” and the end of the message by a blank line. The leading separator line in a VM folder is of the form “From VM ...” where the “...” records the time at which VM first saw the message. The format of the individual messages is as per the RFC 2822 specification, except that Line-Feed characters may be used to delineate the end of lines in the "Unix" format.
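As a rough illustration of the separator convention described above, the following sketch counts the messages in a piece of mbox text by looking for lines that begin with “From ”. The function name my-mbox-count-messages is hypothetical and is not part of VM, and the sketch ignores the differences between the mbox variants.

```elisp
(defun my-mbox-count-messages (text)
  "Return the number of \"From \" separator lines in TEXT.
Hypothetical helper for illustration; not a VM function."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((case-fold-search nil)        ; the separator is case-sensitive
          (count 0))
      ;; A separator line starts at the beginning of a line.
      (while (re-search-forward "^From " nil t)
        (setq count (1+ count)))
      count)))

(my-mbox-count-messages
 (concat "From VM Mon Jan  1 00:00:00 2024\n"
         "Subject: first\n\nbody one\n\n"
         "From VM Mon Jan  1 00:00:01 2024\n"
         "Subject: second\n\nbody two\n\n"))
```

The call at the end feeds the function a two-message folder built by hand.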
Three variants of the mbox format are recognized by VM, called From_, BellFrom_ and From_-with-Content-Length.
In a From_ type mbox, every message has a leading and trailing separator line, as indicated above. In a BellFrom_ type mbox, the trailing separator line can be missing. (This is so that mboxes in the old System V format can be handled.) In a From_-with-Content-Length type mbox, the From separator line stores the length of the message. So, no trailing separator line is required.
In addition to these mbox formats, VM also handles the MMDF format and the Emacs Rmail’s Babyl format. The variable vm-folder-type stores the type of the folder being used.
To every message, VM adds a header with the field name “X-VM-v5-Data:” and stores in it the information about the message it wishes to remember between sessions.
The first message of the VM folder file contains additional headers used by VM for remembering information between sessions. Among them are the values of the variables vm-visible-headers and vm-invisible-header-regexp that were in effect when the folder was saved. The messages in the folder would have their headers arranged according to these variables.
Internal to Emacs, VM stores the folder as simply a text buffer. However, it remembers a variety of data about the message contents in the buffer through internal variables.
• vm-message-list. A list of message data structures for all the messages in the buffer.
• vm-folder-type. The type of the current folder indicating how the messages are stored: one of 'babyl, 'From_, 'BellFrom_, 'From_-with-Content-Length and 'mmdf.
• vm-folder-access-method. The method for accessing the server message store: 'pop for POP folders, 'imap for IMAP folders, and nil for all other folders.
• vm-folder-access-data. A vector of data for accessing the server message store. The first two elements of the vector are the maildrop specification for the mail server and a reference to the process connecting to the mail server. For the 'pop access method, that is all there is. But, for the 'imap access method, the vector has 9 other entries detailing various pieces of data about the IMAP server.
• vm-folder-read-only. A boolean flag indicating whether the folder is read-only. If so, no modifications are allowed, including attribute changes. However, messages can be fetched from external storage for viewing.
• vm-virtual-folder-definition. If the current folder is virtual, then this variable holds the data constituting its definition.
• vm-real-buffers. If the current folder is virtual, then this variable is a list of all the real folder buffers involved in constructing it.
• vm-virtual-buffers. A list of all the virtual folder buffers that the current buffer is involved in.
• vm-component-buffers. An a-list containing all the folder buffers (real or virtual) that make up the components of the current virtual folder, and a flag indicating whether those folders were visited as part of visiting the virtual folder. When the virtual folder is closed, all the folders purposely visited will also be closed.
• vm-summary-buffer. The Summary buffer of the folder. (If the Summary buffer gets killed for any reason, the value of this variable becomes <killed buffer>, which is unfortunate. Therefore, most interactive commands of VM check for a killed Summary buffer and reset this variable to nil in such a case. So, in the middle of code, this variable can be regarded as a valid buffer pointer.)
• vm-presentation-buffer-handle. The message Presentation buffer of the folder. (The same proviso applies as for vm-summary-buffer.)
• vm-presentation-buffer. This seems to be a copy of vm-presentation-buffer-handle. Its purpose is unknown.
The running state of the folder buffer is represented in a number of buffer-local variables:
• vm-message-pointer. A sublist of vm-message-list starting from the current message that the cursor is on. So, the first element of vm-message-pointer is the current message.
• vm-last-message-pointer. Whenever the cursor is moved, the previous value of vm-message-pointer is remembered in this variable.
• vm-summary-pointer. The message struct of the message which has the summary pointer in the Summary buffer.
• vm-fetched-messages. List of external messages whose bodies were fetched for viewing or other operations.
• vm-fetched-message-count. The number of messages in vm-fetched-messages. An attempt is made to keep this below the vm-fetched-message-limit.
• vm-mime-decoded. The MIME decoding state of the current message display: undecoded if the message is shown in undecoded plain text form, decoded if the message is shown decoded, and buttons if the message is shown as a series of buttons for all its MIME components. The D command cycles through these states.
• vm-system-state. The state of VM in a Folder buffer or Presentation buffer: previewing if a message is being previewed, showing if a full message is being shown, and reading if message reading is in progress. A message edit buffer is in state editing.
A message composition buffer may be in one of these states:
• vm-spooled-mail-waiting. VM periodically checks if there is new mail in the spool files of the current folder and sets this flag to t if there is new mail.
• vm-undo-record-list. A list of undo records describing the actions to be performed if an undo operation is invoked. Each undo record has an action, the message, if any, to which the action applies, and any arguments needed for the action.
• vm-undo-record-pointer. A pointer into the vm-undo-record-list indicating the current position of the undoing cycle.
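The idea of a “pointer” that is a sublist of the message list can be sketched with ordinary Lisp lists. In this hedged example the symbols msg-1 to msg-4 stand in for message data structures, and the function name is hypothetical.

```elisp
(defun my-message-pointer-demo ()
  "Show a message pointer as a tail of the message list.
The symbols used as messages are stand-ins for message vectors."
  (let* ((message-list (list 'msg-1 'msg-2 'msg-3 'msg-4))
         ;; The pointer is the tail of the list that starts at the
         ;; current message, not a copy of it.
         (message-pointer (memq 'msg-3 message-list)))
    (list (car message-pointer)                  ; the current message
          (cadr message-pointer)                 ; the next message
          ;; The pointer shares structure with the full list.
          (eq message-pointer (nthcdr 2 message-list)))))
```

Because the pointer is a tail of the same list, moving to the next message amounts to taking its cdr, and no separate index has to be kept in step with the list.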
The variable vm-folder-access-data is a vector storing data about the state of the mail server (for POP and IMAP servers). It contains the following items:
• pop-maildrop-spec or imap-maildrop-spec. MAILDROP specification of the server folder.
• pop-process or imap-process. The Emacs process being used to communicate with the server for this folder. (Each folder uses a separate process to avoid unwanted interference.)
• imap-uid-validity. The UIDVALIDITY value of the IMAP folder.
• imap-read-write. A boolean flag indicating whether the folder is writable.
• imap-can-delete. A boolean flag indicating whether the folder allows deletions.
• imap-body-peek. A boolean flag indicating whether the folder allows the BODY.PEEK form of the IMAP FETCH command.
• imap-permanent-flags. The list of permanent flags that have been stored in the folder.
• imap-mailbox-count. The number of messages in the folder.
• imap-recent-count. The number of messages in the folder that are considered “recent” by the server.
• imap-retrieved-count. The number of messages present in the folder when messages were last retrieved. This would have been the value of imap-mailbox-count at that time.
• imap-uid-list. The list of UIDs and flags of the messages in the folder, using cons cells of the form (msg-num . uid . size . flags list). The cons cells (size . flags list) are shared with imap-flags-obarray below.
• imap-uid-obarray. An obarray that binds all the UIDs of messages in the folder to their message sequence numbers.
• imap-flags-obarray. An obarray that binds all the UIDs of messages in the folder to cons cells of the form (size . flags list). These cons cells are the same as those occurring in the imap-uid-list field. So, any updates will be shared through both the views.
The two obarrays, imap-uid-obarray and imap-flags-obarray, bind exactly the same set of UIDs. Jointly, they are referred to as uid-and-flags-data. The reason for their separation is historical.
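The sharing of (size . flags list) cells between the UID list and the flags obarray can be sketched as follows. This is only an illustration under assumptions: the variable names, the UID “4711” and the flag strings are made up, and VM's own code builds these structures from server responses.

```elisp
(require 'obarray)

(defun my-shared-flags-demo ()
  "Show that one (SIZE . FLAGS) cell is visible through two views.
All names and values here are illustrative, not VM's own."
  (let* ((flags-obarray (obarray-make))
         (cell (cons 1024 (list "\\Seen")))
         ;; One entry shaped like (MSG-NUM . (UID . (SIZE . FLAGS))).
         (uid-list (list (cons 1 (cons "4711" cell)))))
    ;; Bind the UID to the very same cell in the obarray.
    (set (intern "4711" flags-obarray) cell)
    ;; Update the flags through the obarray view ...
    (setcdr (symbol-value (intern-soft "4711" flags-obarray))
            (list "\\Seen" "\\Flagged"))
    ;; ... and read them back through the list view.
    (cdr (cdr (cdr (car uid-list))))))
```

Since both views hold the same cons cell rather than copies, a flag change made through one is seen through the other without any extra synchronization.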
The message data structure is a vector containing various pieces of data about the message, some of which is permanent and some that is calculated during a VM session. The data is organized into four sub-vectors:
The attributes vector and cached data vector are stored in the folder on disk as the X-VM-v5-Data header of each message.
The location data vector holds the data about the location of the various parts of the message in the folder buffer. Every folder buffer or folder-like buffer (such as a Presentation buffer) has variables that contain message data structures. The location data is normally expected to refer to locations in that very buffer. However, this condition is not actually required. (See below.)
• start. Marker for the starting position of the message, at which a leading separator line begins.
• headers. Marker for the position in the buffer where the headers of the message start.
• vheaders. Marker for the position in the buffer where the visible headers of the message start. (The headers are rearranged in such a way that all the visible headers are towards the end of the headers region.)
• text. Marker for the position in the buffer where the text of the message starts.
• text-end. Marker for the position in the buffer where the text of the message ends.
• end. Marker for the position in the buffer where the message ends.
Unfortunately, in the current versions of VM, the folder buffer to which the location data point is not itself part of this vector. This information is inferred from the context (which makes the code brittle). The Folder buffer of the message can be obtained from the soft data vector but the location data could also point to a Presentation buffer.
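The reason markers are used for location data, rather than plain integers, can be seen in a small sketch. The function name and the sample text are hypothetical; the point is only that a marker keeps following its text when earlier parts of the buffer change.

```elisp
(defun my-marker-demo ()
  "Return a marker's position before and after an earlier insertion."
  (with-temp-buffer
    (insert "From VM ...\nSubject: hello\n\nbody text\n")
    (goto-char (point-min))
    (search-forward "body")
    (let* ((text-start (copy-marker (match-beginning 0)))
           (before (marker-position text-start)))
      ;; Insert a new header line just after the separator line.
      (goto-char (point-min))
      (forward-line 1)
      (insert "X-Test: 1\n")
      ;; The marker has moved along with the text it marks.
      (list before
            (marker-position text-start)
            (buffer-substring text-start (+ text-start 4))))))
```

The third element of the result is the text found at the marker after the insertion, which is still the start of the body.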
The soft data vector contains other calculated data about the message that is specific to a VM session.
• number. The message number as an integer.
• padded-number. The message number as a padded string.
• mark. Flag that indicates if the message has been marked (via vm-mark-message).
• su-start. The position in the Summary buffer where the summary line of the message starts.
• su-end. The position in the Summary buffer where the summary line of the message ends.
• real-message-sym. If the message is in a virtual folder, then its corresponding “real message” is the underlying message in another folder which is described by a message data structure similar to the current one. The real message data structures are represented by uninterned symbols written as “<<>>”. This field stores the symbol representing the real message of the current message. If the current message is a real message then this field contains its own symbol. The use of symbols for this purpose avoids the possibility of circular data structures.
• mirrored-message-sym. This is similar to the real-message-sym, except that it points to the message directly mirrored by the current virtual folder message.
• reverse-link-sym. Reference to the previous message in the message list, also represented by an uninterned symbol written as “<–”.
• message-type. A symbol indicating the type of the message according to its folder type, one of BellFrom_, From_ and From_-with-Content-Length.
• message-id-number. A number that uniquely identifies the message within a VM session.
• buffer. The Folder buffer of the message. (Messages in Presentation buffers also have this field set to the corresponding Folder buffer.)
• thread-indentation. Indentation level of the message in its message thread.
• thread-list. List of symbols from vm-thread-obarray that give this message’s lineage.
• thread-subtree. List of messages that form the subtree under this message in a threaded summary display.
• babyl-frob-flag.
• saved-virtual-mirror-data. Saved mirror data, if the message was switched from unmirrored to mirrored.
• virtual-summary. Summary for unmirrored virtual message.
• mime-layout. MIME layout information: types, ids, positions, etc. of all MIME entities. (See below.)
• mime-encoded-header-flag. Flag that indicates if the headers of the message are MIME encoded.
• su-summary-mouse-track-overlay. The overlay on the summary of this message used for selection by mouse.
• message-access-method. The access method to be used for the message, inherited from its real folder.
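The use of uninterned symbols as indirect references, as described for real-message-sym above, can be sketched like this. The two-element vectors are stand-ins for message data structures and the function name is hypothetical.

```elisp
(defun my-real-message-demo ()
  "Link a virtual message to a real one through an uninterned symbol.
The two-slot vectors are simplified stand-ins: [NAME REAL-MESSAGE-SYM]."
  (let* ((real-sym (make-symbol "<<>>"))
         (real-msg (vector "real" real-sym))
         (virtual-msg (vector "virtual" real-sym)))
    ;; The symbol's value is the real message itself.
    (set real-sym real-msg)
    ;; Following the symbol from either message reaches the real one.
    (list (eq (symbol-value (aref virtual-msg 1)) real-msg)
          (eq (symbol-value (aref real-msg 1)) real-msg))))
```

The real message refers to itself only through the symbol's value cell, so the vector does not contain itself directly; code that walks or prints the vector sees a symbol in that slot rather than a nested copy of the message.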
All the hard-wired message attributes are stored in the attributes vector. They also get saved as part of the X-VM-v5-Data header field when the folder is saved to disk.
The cached data vector holds the data that is cached for the message and stored on the disk as part of the X-VM-v5-Data header field. Even though this vector is only supposed to have data that can be calculated from the message itself, the fields pop-uidl, imap-uid and imap-uid-validity form an exception. They are really hard data that cannot be calculated from anything else.
Some of the data deals with information from message headers. The header fields can have MIME-encoded words in them. The strings stored in the cached-data vector, however, are MIME-decoded versions of the header fields, but they also have text properties that store the names of the original character sets used in the header fields. This allows the strings to be quickly re-encoded for storage on disk.
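Attaching a character set name to a decoded string can be sketched with text properties. The property name my-charset and the function name are assumptions made for this illustration; the property name VM actually uses is not stated here.

```elisp
(defun my-charset-property-demo ()
  "Attach a charset name to a decoded header string and read it back.
The property name `my-charset' is an assumption for illustration."
  (let ((subject (propertize "caf\u00e9" 'my-charset "iso-8859-1")))
    ;; The string still compares equal to the plain decoded text,
    ;; while the property records how to re-encode it.
    (list (string= subject "caf\u00e9")
          (get-text-property 0 'my-charset subject))))
```

Keeping the charset on the string itself means the cached value can be written back to disk without consulting the original header again.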
vm-sort-messages-by-delivery-date is set to t).
number and thread-indent, as well as MIME-decoded strings with text properties.
body-to-be-retrieved below.
The mirror data is extra data shared by virtual messages if vm-virtual-mirror is non-nil.
The MIME layout of a message, stored in the soft data of the message, is in turn a vector containing various pieces of data. Such a vector is used not only for the overall message, but for all its MIME parts and subparts as well.
• type. A list of strings consisting of the MIME type of the part along with its attributes. This comes from the “Content-Type” header. The type could be of the form ‘type/subtype’. Quotation marks are stripped from attribute values. An example is ("multipart/mixed" "boundary=----_=_NextPart_001_01AFE588.63E23840").
• qtype. Like type, but the quotation marks are not stripped.
• encoding. The MIME encoding used for the part. It comes from the “Content-Transfer-Encoding” header.
• id. The id obtained from the “Content-ID” header of the part.
• description. A description string obtained from the “Content-Description” header of the part.
• disposition. A list of strings obtained from the “Content-Disposition” header of the part. Quotation marks are stripped from attribute values. (An example is ("attachment" "filename=mydocument.doc").)
• qdisposition. Like disposition, but the quotation marks are not stripped.
• header-start, header-end, body-start and body-end. Markers into the content buffer delineating the headers/body of the MIME part.
• parts. A list of MIME layouts for the individual subparts of this part.
• cache. A symbol that is unique to this MIME part. Other data is stored as properties of this symbol:
  • vm-mime-display-external-generic. This property stores the id of the process used to externally display the MIME part as well as the name of the temporary file used.
  • vm-mime-display-internal-image-xxxx. This property stores the name of the temporary file where the image is stored. For an image represented as image strips, it actually stores a list with a number of other data items.
  • vm-image-modified. This property stores a boolean flag indicating that the image has been modified.
  • vm-mime-display-internal-audio/basic. This property stores the name of the temporary file where the audio clip is stored.
  • vm-message-garbage.
• message-symbol. A reference to the message that contains the MIME part. Represented as a symbol (that is, an interned key into a hash table). This is a different symbol from the real-message-sym of the message.
• display-error. If the display of a MIME part fails, its error string is stored here.
• layout-is-converted. Flag indicating that MIME type conversion has been performed on this part. See MIME type conversion.
• unconverted-layout. If the MIME type conversion has been performed on this part, then this holds the original unconverted layout.
Every Folder buffer has a vm-message-list and a vm-message-pointer list containing message data vectors. Every Presentation buffer also uses a vm-message-pointer list with a single message (the one being presented). The message data vector in the Presentation buffer has its own location data, but shares all other components with the message in the Folder buffer. This allows the Presentation buffer to, for example, change the attributes of the message without having to switch context to the Folder buffer.
Virtual folders, which contain only references to messages in other folders, store just a single message body in the Folder buffer. However, they have message descriptors for all the messages in vm-message-list. All the message descriptors use the same location data vector, because only one message body can be stored in the Folder buffer, but have separate soft data vectors. (This allows, for instance, virtual folders to have their own threads, which could in general be different from the threads in the underlying folders.) The other sub-vectors are shared with the underlying real folders. (In particular, the tokenized summary line is the same in the virtual folders and their underlying folders.)
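The sharing of sub-vectors between two message data vectors can be sketched as follows. The layout used here (slot 0 for location data, slot 1 for attributes) and the function name are assumptions made for the example and are much simpler than VM's message vectors.

```elisp
(defun my-shared-subvector-demo ()
  "Two message vectors with separate location data but shared attributes.
Slot 0 holds location data, slot 1 holds attributes (assumed layout)."
  (let* ((attributes (vector nil))      ; slot 0: a stand-in deleted flag
         (folder-msg (vector (vector 1 100) attributes))
         (presentation-msg (vector (vector 1 80) attributes)))
    ;; Change an attribute through the Presentation copy.
    (aset (aref presentation-msg 1) 0 t)
    (list
     ;; The change is visible through the Folder copy.
     (aref (aref folder-msg 1) 0)
     ;; The location data vectors are distinct objects.
     (eq (aref folder-msg 0) (aref presentation-msg 0)))))
```

This is why an attribute change made from a Presentation buffer needs no switch of buffer: both message vectors point at the same attributes object.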
Generating a summary is quite a time-consuming operation. VM uses a variety of tricks to speed up the generation of summaries.
The format of the summary lines is specified in the variable vm-summary-line-format. The information that needs to go into the summary lines is divided into two classes:
A tokenized summary line is a list whose elements can be strings, representing fixed information in a message, and tokens, representing variable information. VM calculates a tokenized summary line for each message and caches it in the cached-data vector. The following forms of tokens are used in tokenized summary lines:
• number. Stands for the message number in the linear order of the summary.
• mark. Stands for an indicator of message mark (whether the message is marked at present).
• thread-indent. Stands for the indentation to be used for the message’s summary depending on its position in the message thread.
• group-begin, group-end. Brackets used to denote groups of items that might have particular formatting constraints.
The function vm-tokenized-summary-insert converts a tokenized summary line into a string and inserts it in the summary buffer. The minibuffer message “Generating summary...” is used to show the progress of generating summary lines from tokenized summaries.
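The conversion of a tokenized summary line into a string can be sketched as below. This is not the code of vm-tokenized-summary-insert: the function my-render-tokens is hypothetical, tokens are represented as bare symbols for simplicity, and the group tokens are ignored.

```elisp
(defun my-render-tokens (tokens number marked indent)
  "Turn the tokenized summary line TOKENS into a string.
NUMBER, MARKED and INDENT supply the values for the variable tokens.
Representing tokens as bare symbols is an assumption of this sketch."
  (mapconcat
   (lambda (tok)
     (cond ((stringp tok) tok)                        ; fixed information
           ((eq tok 'number) (format "%3d" number))
           ((eq tok 'mark) (if marked "*" " "))
           ((eq tok 'thread-indent) (make-string indent ?\s))
           (t "")))                                   ; group tokens ignored
   tokens ""))

(my-render-tokens '(number mark " " thread-indent "Re: lunch") 7 t 2)
```

The fixed strings can be cached once per message, while the number, mark and indentation are filled in each time the line is displayed, which is what makes renumbering cheap.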
Buffer local variables in each Folder buffer responsible for maintaining summary information:
• vm-summary-pointer. The message selected by the cursor in the Summary window.
• vm-summary-redo-start-point. A pointer into the vm-message-list indicating the first message for which the summary line must be redisplayed. All the messages from here on are assumed to require a summary redisplay. The assumption is usually valid because the message numbers of all the succeeding messages might have changed. But, if message numbers are not included in the summary lines, then this results in unnecessary work.
• vm-messages-needing-summary-update. The list of messages for which summary lines must be redisplayed. Messages are included in this list by calling the function vm-mark-for-summary-update.
• vm-numbering-redo-start-point. A pointer into vm-message-list indicating the first message whose message number needs to be recalculated.
• vm-numbering-redo-end-point. A pointer into vm-message-list indicating the last message whose message number needs to be recalculated.
The beginning and the ending positions of each message summary line are stored in the message’s soft data vector. See Message Internals. The positions within the summary line have text properties set, which give the data about the message:
Message threads required for threaded summaries are calculated using message IDs, which are unique at the time each message is originally composed. However, VM may need to deal with multiple copies of the same message received via possibly different routes. So, message IDs are not unique for messages inside VM.
Messages composed as replies generally have an “In-Reply-To” header. The message mentioned in this header is referred to as the parent of the message. In addition, messages also arrive with a “References” header which lists all the ancestors of the message, with the oldest message being listed first. The last message listed in the “References” header is the direct parent of the message. It is important to keep in mind that not all the messages listed in the “References” header need be present in the VM folder.
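Picking the direct parent out of a “References” header can be sketched as follows. The helper my-parent-from-references is hypothetical and assumes the header value is a whitespace-separated list of message IDs; it does not handle comments or other irregularities that real headers may contain.

```elisp
(defun my-parent-from-references (references)
  "Return the last message ID listed in the REFERENCES header string.
Hypothetical helper; returns nil if REFERENCES lists no IDs."
  ;; The oldest ancestor comes first, so the direct parent is last.
  (car (last (split-string references "[ \t\n]+" t))))

(my-parent-from-references
 "<root@example.org> <mid@example.org> <parent@example.org>")
```

The example addresses are placeholders; any of the listed ancestors may be absent from the folder, so a caller still has to check whether the returned ID is known.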
Thread trees are constructed using the “In-Reply-To” headers and “References” headers. Jamie Zawinski has done a good analysis of the information contained in these headers which can be found on the web. VM’s threading algorithm is currently based on these ideas. These trees are called reference-based threads.
In addition, VM also allows threads to be built using the subject headers via the option vm-thread-using-subject. Subject-based threading is used in addition to reference-based threading. So, in a subject-based thread, the root message would be the oldest message with that subject and, below it, would be reference-based threads all of which share the same subject. The roots of these reference-based threads are referred to as the “members” of the subject thread. Subject threading is only one level deep, whereas reference threading can be arbitrarily deep.
Threads are built using two hash tables, vm-thread-obarray and vm-thread-subject-obarray. The former keeps track of the thread obtained by following parent and reference chains. The latter keeps track of messages with the “same subject”. To prevent messages from jumping from one thread to another within the same VM session, the subject used is not the message’s own subject, but rather the subject of the oldest message in the thread. This subject is retained even if the oldest message is expunged.
The message IDs are interned in vm-thread-obarray and the following information is stored for each message ID:
nil
.
nil
The vm-thread-subject-obarray interns each subject string found in the folder and maps it to a vector containing the following elements:
id-sym
is not included as a member.
Building threads involves calculating all the data stored with the vm-thread-obarray and vm-thread-subject-obarray. These two collections of data are calculated in sequence, because the subject threads are based on the reference threads.
After the threads are built, the thread-list, thread-indentation and thread-subtree fields of the soft data vector are calculated on demand and cached. (See Soft data vector.) These fields cannot be calculated without building threads first.
When new messages are assimilated, they are added to the threads that might have been already built, and the thread-related fields in the soft data vector are erased so that they will be recalculated. The thread-subtree field is erased for all the ancestors of the assimilated message. The thread-list and thread-indentation fields are erased for all the descendants of the assimilated message.
Before messages in the folder are expunged, they are unthreaded. This involves removing them from their respective thread trees. It also involves the erasure of the thread-subtree field of all their ancestors and the thread-list and thread-indentation fields of the descendants.
The code for threading has to be robust in the presence of erroneous information in the message headers. We have no control over the mail clients that produce those messages and faulty information should not lead to VM hanging or producing errors. It should just do the best job it can in the presence of imperfect information.
It is possible that the information in the headers gives rise to cycles in the thread trees. Kyle Jones’s original implementation allowed these cycles to exist, but all functions that traversed the thread trees were protected to detect cycles. However, since thread trees are updated when new messages are received or existing messages are expunged, this led to unstable results.
Following Jamie Zawinski’s recommendation, VM now avoids cycles in thread trees. Loop detection is still carried out during traversal as a double safeguard.
VM gives priority to the parent information contained in the “In-Reply-To” headers in preference to the information in the “References” headers. However, if an “In-Reply-To” header gives rise to a cycle, it is ignored, and then “References” headers might be used to fill in the missing information.
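One way to refuse a parent link that would close a cycle is sketched below. It is not VM's implementation: the function names are hypothetical, and a hash table from message ID to parent ID stands in for the data kept in vm-thread-obarray.

```elisp
(defun my-would-create-cycle-p (child parent parents)
  "Return non-nil if making PARENT the parent of CHILD would form a cycle.
PARENTS is a hash table mapping each message ID to its parent ID."
  (let ((node parent) (found nil) (seen nil))
    ;; Walk up from PARENT; meeting CHILD on the way means a cycle.
    ;; SEEN guards against looping if PARENTS is already cyclic.
    (while (and node (not found) (not (member node seen)))
      (push node seen)
      (if (equal node child)
          (setq found t)
        (setq node (gethash node parents))))
    found))

(defun my-set-parent (child parent parents)
  "Record PARENT for CHILD in PARENTS unless that would create a cycle.
Return PARENT if the link was recorded and nil if it was ignored."
  (unless (my-would-create-cycle-p child parent parents)
    (puthash child parent parents)))
```

With links b to a and c to b already recorded, a request to make c the parent of a is ignored, which corresponds to dropping an “In-Reply-To” header that would give rise to a cycle.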
Sorting of messages in VM is carried out using the Emacs built-in sorting function, which is generic in the comparison operation to be used for sorting. The required comparison operation is expressed as a sequence of basic comparison operations such as comparison by date, by author, by subject etc. The dynamic variable vm-key-functions is bound to a list of comparison functions before calling the Emacs sort function.
The function vm-sort-compare-xxxxxx uses the functions listed in vm-key-functions to do the overall comparison. It compares the given messages using the key functions in sequence. If the first key function decides that one of the messages precedes the other, then the comparison is over. If the messages are found to be equivalent according to the first key function then the second key function is tried and, if they are still equivalent, then the next key function is tried and so on. This is called the lexicographic combination of the given key functions.
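The lexicographic combination of key functions can be sketched as follows. Everything here is illustrative: the function names are hypothetical, messages are represented as (AUTHOR . SUBJECT) pairs, and the convention that a key function returns one of the symbols <, > or = is an assumption of the sketch rather than a description of VM's key functions.

```elisp
(defun my-compare-strings (x y)
  "Return `<', `>' or `=' according to the order of strings X and Y."
  (cond ((string< x y) '<) ((string< y x) '>) (t '=)))

(defun my-by-author (a b) (my-compare-strings (car a) (car b)))
(defun my-by-subject (a b) (my-compare-strings (cdr a) (cdr b)))

(defun my-lexicographic-less-p (a b key-functions)
  "Return non-nil if A sorts before B under KEY-FUNCTIONS tried in order."
  (let ((result '=))
    ;; Move on to the next key only while the items are still equivalent.
    (while (and key-functions (eq result '=))
      (setq result (funcall (car key-functions) a b)
            key-functions (cdr key-functions)))
    (eq result '<)))

(defun my-sort-demo ()
  "Sort three stand-in messages by author, then by subject."
  (sort (list '("bob" . "zeta") '("alice" . "beta") '("alice" . "alpha"))
        (lambda (a b)
          (my-lexicographic-less-p a b '(my-by-author my-by-subject)))))
```

The two messages from the same author are equivalent under the first key, so their relative order is settled by the second key.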
Sorting by threads is special. When messages are to be sorted by threads, all the messages belonging to a thread should appear together. The required effect is achieved by using vm-sort-compare-thread as the first key function in the sequence. This function checks to see if the two messages belong to the same thread. If they do then the farthest ancestors of the two messages that share the same parent are returned so that the remaining comparison operations can be applied to these ancestors. The rationale is that these ancestors are the roots of the thread subtrees that the two messages belong to. So, the relative ordering of the messages should be the same as the relative ordering of these ancestors. If the two messages belong to different threads then the thread roots of the two messages are returned, again with the same rationale.
Threaded summaries can be sorted by any key, e.g., by author (full-name). It is most common to sort them by “activity,” i.e., the order of the most recent message in the thread or subthread. Sorting them by “date” means using the date of the root message of the thread or subthread.
For each mail folder, VM creates three kinds of buffers in Emacs: the Folder buffer, the Presentation buffer and the Summary buffer. All three types of buffers have the same user interface as far as possible: the same key bindings, menu bars, tool bars and also the same commands. The functions implementing the commands must therefore work irrespective of which of the three buffers they are invoked in. This makes VM quite different from most Emacs modes.
VM stores the identity of the Folder buffer in a buffer-local variable vm-mail-buffer in each of the other types of buffers. Conversely, each Folder buffer uses buffer-local variables vm-summary-buffer and vm-presentation-buffer to store the identity of the other buffers.
Whenever a VM command is invoked by the user, VM calls a function called vm-select-folder-buffer-and-validate, which sets the current buffer to the Folder buffer. It also stores the identity of the buffer with the user’s focus in a global variable called vm-user-interaction-buffer. Thus, at every point during the command execution, VM has knowledge of all the buffers involved as well as the buffer in which the command execution was initiated.
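The way a buffer-local variable lets a command find the Folder buffer from any of the related buffers can be sketched as follows. The names my-mail-buffer and my-select-folder-buffer are hypothetical stand-ins, and the sketch leaves out the validation and the recording of the user interaction buffer.

```elisp
(defvar-local my-mail-buffer nil
  "In a Summary-like buffer, the associated Folder-like buffer.")

(defun my-select-folder-buffer ()
  "Return the Folder-like buffer for the current buffer.
A simplified stand-in for the selection step; not VM's function."
  (or my-mail-buffer (current-buffer)))

(defun my-buffer-link-demo ()
  "Resolve the folder buffer from a summary buffer and from itself."
  (let ((folder (generate-new-buffer " *my-folder*"))
        (summary (generate-new-buffer " *my-summary*")))
    (unwind-protect
        (progn
          (with-current-buffer summary
            (setq my-mail-buffer folder))
          ;; The same call works from either buffer.
          (list (eq folder (with-current-buffer summary
                             (my-select-folder-buffer)))
                (eq folder (with-current-buffer folder
                             (my-select-folder-buffer)))))
      (kill-buffer folder)
      (kill-buffer summary))))
```

A command written against this helper does not need to know which kind of buffer it was invoked in.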
[More to be filled in on vm-display etc.]
The default menu bar of VM contains VM-specific menus, replacing the standard Emacs menus. This is achieved by setting the buffer-specific menu bar to one in which the Emacs menus are undefined (at least in GNU Emacs).
VM computes its standard menu bar and stores it internally: vm-mode-menu-map. The menu bar also has a menu, or a menu item, to switch back to the standard Emacs menu bar.
The computed menu bar is then installed depending on the setting of vm-use-menus. If the user selects the action to revert to the standard Emacs menu bar, the installation is easily reverted.
menu-bar
.
When the user picks a menu item to revert to the Emacs menu bar, the function vm-menu-toggle-menubar is invoked, which installs a fresh menu bar retaining the standard Emacs menus. The same function is used to reinstall the dedicated VM menu bar when needed.
A Coding System is a way of encoding characters as bit patterns. See Coding System Basics in the Emacs Lisp manual. US-ASCII is a coding system for English. Other coding systems are used to encode the various languages of the world, e.g., iso-latin-1 for Western European languages, and hebrew-iso-8bit for Hebrew. Emacs also uses its own internal coding system for characters, which can encode all character sets currently in existence. But the internal coding system can vary between different versions of Emacs.
Emacs defines a property called mime-charset for each implemented coding system, which is the official preferred name of the MIME character set that it corresponds to. For example, iso-latin-1 corresponds to the MIME charset iso-8859-1, and hebrew-iso-8bit corresponds to the MIME charset iso-8859-8. The Emacs function coding-system-get can be used to extract the mime-charset property of a coding system.
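Looking up the MIME charset of a coding system can be sketched with coding-system-get, using the keyword form :mime-charset of the property name. The wrapper my-mime-charset is hypothetical and only converts the result to a string.

```elisp
(defun my-mime-charset (coding-system)
  "Return the MIME charset name of CODING-SYSTEM as a string, or nil."
  (let ((charset (coding-system-get coding-system :mime-charset)))
    (and charset (symbol-name charset))))

(my-mime-charset 'iso-latin-1)
```

For iso-latin-1 this yields the name iso-8859-1, matching the correspondence described above.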
VM stores all the known coding systems and the corresponding MIME charsets in its internal variables vm-mime-mule-coding-to-charset-alist and vm-mime-mule-charset-to-coding-alist.
MIME messages specify the character set that their content is in, in the Content-Type header. VM uses this information to decode the content to the Emacs internal coding system. This is done using the function decode-coding-region. Conversely, VM encodes the outgoing messages into the default or chosen MIME character set using the function encode-coding-region.
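The two region functions can be tried out in a scratch buffer. The sketch below is only an illustration for GNU Emacs, not VM's own code; the function name demo-encode-decode-roundtrip is hypothetical. It encodes some text into a given coding system and then decodes it again.

```elisp
;; Hypothetical helper (not part of VM): encode TEXT in a temporary
;; buffer, then decode the resulting bytes again.
(defun demo-encode-decode-roundtrip (text coding-system)
  "Encode TEXT with CODING-SYSTEM, decode it back, and return the result."
  (with-temp-buffer
    (insert text)
    ;; After this call the buffer holds the encoded bytes, as an
    ;; outgoing message body would.
    (encode-coding-region (point-min) (point-max) coding-system)
    ;; This is the step applied to incoming content: bytes in the
    ;; declared character set become Emacs internal characters.
    (decode-coding-region (point-min) (point-max) coding-system)
    (buffer-string)))

(demo-encode-decode-roundtrip "caf\u00e9" 'iso-latin-1)
```

The round trip returns the original text only if the coding system can represent every character in it.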
The headers of email messages can only be in US-ASCII. So header fields in other character sets are encoded using either base-64 or quoted-printable encoding (which give ASCII strings) and annotated with the name of the original character set. Such annotations look like =?charset?B?. They can apply to individual words or sequences of words appearing in the headers. Note that the annotation ?B? signifies base-64 encoding of the byte stream. Similarly, the annotation ?Q? might be used to denote the quoted-printable encoding.
VM decodes such strings using the function decode-coding-string. Conversely, the headers of outgoing messages are encoded using encode-coding-string.
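The following sketch shows how the string functions combine with base-64 for a single annotated header word. It is a simplified illustration, not VM's implementation: the names demo-encode-b-word and demo-decode-b-word are hypothetical, only the ?B? form is handled, and the charset name is assumed to be usable directly as an Emacs coding system name (VM consults its own alists instead).

```elisp
;; Hypothetical helpers (not part of VM).
(defun demo-encode-b-word (text coding-system)
  "Encode TEXT as a base-64 annotated header word using CODING-SYSTEM."
  (format "=?%s?B?%s?="
          (coding-system-get coding-system 'mime-charset)
          (base64-encode-string (encode-coding-string text coding-system) t)))

(defun demo-decode-b-word (word)
  "Decode WORD if it is a base-64 annotated header word, else return it as is."
  (if (string-match "\\`=\\?\\([^?]+\\)\\?[Bb]\\?\\([^?]*\\)\\?=\\'" word)
      (let ((coding (intern (downcase (match-string 1 word))))
            (bytes (base64-decode-string (match-string 2 word))))
        ;; Assumption: the charset name doubles as a coding system name.
        (if (coding-system-p coding)
            (decode-coding-string bytes coding)
          word))
    word))

(demo-encode-b-word "caf\u00e9" 'iso-latin-1)   ; => "=?iso-8859-1?B?Y2Fm6Q==?="
(demo-decode-b-word "=?iso-8859-1?B?Y2Fm6Q==?=")
```

Words that do not carry the annotation, or that name an unknown charset, are returned unchanged.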
Next: MIME Display, Previous: Coding Systems, Up: Internals [Contents][Index]
A virtual folder is characterized by its definition, which is stored in the buffer-local variable virtual-folder-definition. The form of the definition is as given in vm-virtual-folder-alist. See vm-virtual-folder-alist. It is a collection of clauses, with each clause listing a collection of folders and a collection of virtual selectors.
Each virtual selector X has a corresponding Lisp function ‘vm-vs-X’, whose purpose is to check whether a given message matches the selector. The arguments for ‘vm-vs-X’ are a message data structure m and all the arguments for the virtual selector X.
For example, the virtual selector author has a string argument, representing the author name. The corresponding Lisp function is defined as:
(defun vm-vs-author (m author-name)
  (or (string-match author-name (vm-su-full-name m))
      (string-match author-name (vm-su-from m))))
The definition checks to see if the given author-name pattern occurs in the full name of the author (vm-su-full-name) or the email address of the author (vm-su-from).
The author selector is then registered in four places:
• vm-virtual-selector-function-alist, which contains pairs of the form ‘(SELECTOR . FUNCTION)’. For the author selector, the pair is (author . vm-vs-author).
• author is given a property vm-virtual-selector-arg-type indicating the type of argument it requires:
(put 'author 'vm-virtual-selector-arg-type 'string)
• vm-supported-interactive-virtual-selectors, which contains lists of strings, each string being the name of a virtual selector. For the author selector, the list is ("author"). Including the selector in this variable allows it to be used in creating interactive virtual folders (search folders).
• author is given a property vm-virtual-selector-clause indicating the prompt string for interactive use:
(put 'author 'vm-virtual-selector-clause "with author matching")
Evidently, the last two registrations are only needed for interactive selectors that can be used with the V C command.
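The way a selector name is mapped to its function through an alist of ‘(SELECTOR . FUNCTION)’ pairs can be sketched without loading VM. Everything prefixed with demo- below is hypothetical: messages are modelled as property lists rather than VM message structures, and demo-selector-function-alist merely stands in for vm-virtual-selector-function-alist.

```elisp
;; Hypothetical stand-ins for vm-su-full-name and vm-su-from.
(defun demo-full-name (m) (plist-get m :full-name))
(defun demo-from (m) (plist-get m :from))

;; Same shape as vm-vs-author, but using the stand-in accessors.
(defun demo-vs-author (m author-name)
  (or (string-match author-name (demo-full-name m))
      (string-match author-name (demo-from m))))

;; Stand-in for vm-virtual-selector-function-alist.
(defvar demo-selector-function-alist
  '((author . demo-vs-author)))

(defun demo-apply-selector (selector m)
  "Apply SELECTOR, a list (NAME ARG...), to message M; return t or nil."
  (let ((fn (cdr (assq (car selector) demo-selector-function-alist))))
    (unless fn
      (error "Unknown selector: %s" (car selector)))
    ;; The message comes first, followed by the selector's own arguments.
    (and (apply fn m (cdr selector)) t)))

(demo-apply-selector '(author "Mary")
                     '(:full-name "Mary Smith" :from "mary@example.com"))
```

Adding a new selector in this model takes one function and one alist entry, which mirrors the first of the four registrations.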
Next: MIME Composition, Previous: Virtual Folder Internals, Up: Internals [Contents][Index]
The MIME layout of a message is stored in the mime-layout field of the Soft data vector of the message. (See MIME layout.) The MIME layout is in general a tree structure of “MIME parts”. The function vm-decode-mime-layout is responsible for traversing the tree structure at each MIME part and displaying it appropriately.
The function vm-decode-mime-layout goes through the following sequence of decisions:
• If the MIME part is of a multipart type, then the subparts are displayed as needed. If it is a single part, it proceeds as follows.
• vm-mime-auto-displayed-content-types but not listed in the corresponding exceptions.)
• vm-mime-internal-content-types but not listed in the corresponding exceptions.)
• vm-mime-external-content-types-alist and it is invoked to display the MIME part.
MIME parts of type ‘message/external-body’ need special treatment. If auto-display has not been requested for them, they are displayed as buttons, but the button caption may use information from the child part (the actual object that is in the external-body) such as its type and description. If auto-display has been requested for a message/external-body part, then the child part is fetched from the external source and stored in an internal buffer. It may be auto-displayed if it is appropriate to do so, or shown in turn as a button.
MIME buttons are displayed as regions of text displaying button labels. In addition, they have an overlay/extent placed on them, which has a number of properties associated with it:
• vm-button: Always t.
• vm-mime-layout: Gives the layout of the MIME part.
• vm-mime-function: The function that carries out the action represented by pressing the button.
• vm-mime-disposable: Set to true if the button should be removed when it is replaced by the MIME object.
• face: Set to the value of vm-mime-button-face.
• local-map (FSF Emacs) or keymap (XEmacs): Set to a keymap that includes vm-mime-reader-map, binding the $ keys.
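A button of this kind can be mimicked in GNU Emacs with an ordinary overlay. The sketch below is not VM's code: demo-insert-mime-button is a hypothetical name, the layout and action are arbitrary placeholder values, and the highlight face stands in for vm-mime-button-face.

```elisp
;; Hypothetical helper (not part of VM), GNU Emacs only.
(defun demo-insert-mime-button (label layout action)
  "Insert LABEL at point and cover it with an overlay carrying button data.
Return the overlay."
  (let ((start (point)))
    (insert label)
    (let ((ov (make-overlay start (point))))
      (overlay-put ov 'vm-button t)
      (overlay-put ov 'vm-mime-layout layout)
      (overlay-put ov 'vm-mime-function action)
      (overlay-put ov 'vm-mime-disposable nil)
      ;; Assumption: any face will do here; VM uses vm-mime-button-face.
      (overlay-put ov 'face 'highlight)
      ov)))

(with-temp-buffer
  (let ((ov (demo-insert-mime-button "[image/jpeg, mary.jpeg]"
                                     'placeholder-layout #'ignore)))
    (overlay-get ov 'vm-mime-layout)))
```

A command bound in the button's keymap would typically find the overlay at point, read its vm-mime-function property, and call that function.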
Next: Extents and Overlays, Previous: MIME Display, Up: Internals [Contents][Index]
A MIME message is composed just like a normal message. When objects are attached using commands like vm-attach-file, attachment buttons are created in the message composition buffer. An attachment button is a region of text that looks like:
[Attachment mary.jpeg, image/jpeg]
Various text properties are associated with an attachment button, allowing it to be turned into an actual attachment when the message is sent.
The representation of the attachment buttons differs in GNU Emacs and XEmacs. In GNU Emacs, the region of text is given text properties that represent the metadata about the object. In XEmacs, the region of text is given an extent, which is then given properties representing the metadata. The reason for the different representations is that in GNU Emacs, only text properties are preserved under killing and yanking.
The following properties are defined for attachment buttons:
• vm-mime-object: The object denoting the MIME attachment. It is either t indicating that the attachment is another MIME object in a VM folder. In the last case, the vm-mime-layout property describes the rest of the metadata.
• vm-mime-type: A string denoting the MIME type of the object. (Note that it is a single string, unlike the type component of a MIME layout.)
• vm-mime-parameters: A list of strings denoting the parameters of the MIME type.
• vm-mime-description: A string for the MIME description of the object.
• vm-mime-disposition: A list describing the MIME disposition.
• vm-mime-encoded: A boolean indicating whether the object has MIME headers.
• vm-mime-encoding: The MIME encoding used, if it is already encoded.
• vm-mime-forward-local-refs: Whether or not references to local external-body objects should be forwarded as is.
• fontified: Standard text property.
• duplicable: Set to t in XEmacs allowing the extent to be preserved under killing and yanking.
• front-nonsticky and rear-nonsticky: Standard stickiness of text properties in GNU Emacs.
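The GNU Emacs representation, in which the metadata lives in text properties of the button text, can be sketched as follows. This is an illustration rather than VM's code: demo-insert-attachment-button is a hypothetical name, only a few of the properties are set, and the vm-mime-object value is assumed here to be a file name string.

```elisp
;; Hypothetical helper (not part of VM), GNU Emacs only.
(defun demo-insert-attachment-button (file type)
  "Insert an attachment button for FILE of MIME TYPE at point.
Return the buffer position where the button starts."
  (let ((start (point)))
    (insert (format "[Attachment %s, %s]"
                    (file-name-nondirectory file) type))
    (add-text-properties
     start (point)
     (list 'vm-mime-object file      ; assumption: a file name string
           'vm-mime-type type
           'vm-mime-parameters nil
           'vm-mime-encoded nil
           ;; Text typed right after the button does not inherit its data.
           'rear-nonsticky t))
    start))

(with-temp-buffer
  (let ((start (demo-insert-attachment-button "/tmp/mary.jpeg" "image/jpeg")))
    ;; The properties travel with the text when it is copied.
    (get-text-property 0 'vm-mime-type
                       (buffer-substring start (point-max)))))
```

Because the metadata is attached to the characters themselves, copying or killing and yanking the button text carries the attachment description along with it.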
When a composed message is sent, the attachment buttons are replaced by actual attachment objects. In FSF Emacs, the attachment buttons are first converted into “fake” overlays before MIME encoding, in a function called vm-mime-fake-attachment-overlays. This allows the next stage to treat both FSF Emacs and XEmacs using the same logic.
The function vm-mime-encode-composition then encodes the composition buffer, by selecting each attachment button and replacing it with the corresponding object. The bodies of ‘external-body’ objects are also retrieved at this stage. Unless the objects were already MIME-encoded, they are MIME-encoded and made into MIME parts by adding suitable headers. The message itself is given MIME headers describing its content and then handed to Emacs message-sending functions.
When another message is yanked or “included” in a message composition, the handling of attachments depends on the variable vm-include-mime-attachments. If the variable is nil, then the attachments are displayed as token buttons in plain text that appear similar to:
[DELETED ATTACHMENT mary.jpg, image/jpeg]
The function vm-decode-mime-layout is employed to generate the yanked text along with such token buttons.
If vm-include-mime-attachments is t, then first the vm-decode-mime-layout function is employed to generate proper MIME buttons for all the attachments. In a second step, the MIME buttons are replaced by attachment buttons using a function called vm-mime-convert-to-attachment-buttons. These attachment buttons are then handled as described above.
Next: Timers and Concurrency, Previous: MIME Composition, Up: Internals [Contents][Index]
XEmacs and GNU Emacs differ in how they represent non-textual properties in buffers. The web page on “XEmacs vs GNU Emacs” describes the situation as follows:
XEmacs uses "extents" to represent all non-textual aspects of buffers; GNU Emacs 19 uses two distinct objects, "text properties" and "overlays", which divide up the functionality between them. Extents are a superset of the union of the functionality of the two GNU Emacs data types. The full GNU Emacs 19 interface to text properties and overlays is supported in XEmacs (with extents being the underlying representation).
Extents can be made to be copied into strings, and then restored, by kill and yank. Thus, one can specify this behavior on either "extents" or "text properties", whereas in GNU Emacs 19 text properties always have this behavior and overlays never do.
While extents and overlays look similar on the surface, they differ fundamentally in that extents are attached to text and, so, can be killed and yanked, whereas overlays are not attached to text. XEmacs has implemented GNU-like text properties on top of extents. So, text properties may work more uniformly in both the Emacsen, but VM was developed in the early days of the forking and does not use these common features.
The file vm-misc.el contains definitions whereby both extents and overlays can be treated as a single type of “VM extents”. Wherever such VM extents can be used, there is some uniformity in the code but, in other places, there is not. (Independently, the XEmacs team has developed the fsf-compat package by which FSF-style overlays are implemented on top of extents. This package is not compatible with the way VM deals with the two types.)
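The general idea of treating extents and overlays as one type can be sketched with a few wrapper functions that test which primitives are available. The demo- names below are hypothetical and are not the definitions found in vm-misc.el. The XEmacs primitives make-extent, set-extent-property and extent-property are only called when they exist.

```elisp
;; Hypothetical wrappers (not the definitions in vm-misc.el).
(defun demo-make-extent (start end)
  "Create an extent (XEmacs) or an overlay (GNU Emacs) from START to END."
  (if (fboundp 'make-extent)
      (make-extent start end)
    (make-overlay start end)))

(defun demo-set-extent-property (e prop value)
  "Set property PROP of extent or overlay E to VALUE."
  (if (fboundp 'set-extent-property)
      (set-extent-property e prop value)
    (overlay-put e prop value)))

(defun demo-extent-property (e prop)
  "Return the value of property PROP of extent or overlay E."
  (if (fboundp 'extent-property)
      (extent-property e prop)
    (overlay-get e prop)))

(with-temp-buffer
  (insert "hello")
  (let ((e (demo-make-extent 1 3)))
    (demo-set-extent-property e 'vm-button t)
    (demo-extent-property e 'vm-button)))
```

Wrappers of this kind hide only the differences in naming. The behavioural differences, such as whether the object survives killing and yanking, still have to be handled by the calling code.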
Another major difference between extents and overlays is that the beginning and ending of overlays are markers. This has some advantages. However, if a buffer has many overlays, normal editing operations must update all the overlay markers, which can be time-consuming.
The major applications of extents and overlays in VM are the following:
Previous: Extents and Overlays, Up: Internals [Contents][Index]
VM has been designed as mainly a sequential program. However, there are three timer tasks that get scheduled to occur at regular intervals:
vm-flush-itimer-function
Stores message attributes in the folder so that they will be saved when an auto-save is done. This is controlled by the variable vm-flush-interval.
vm-get-mail-itimer-function
Moves new mail from maildrops into the folder. This is controlled by the variable vm-auto-get-new-mail.
vm-check-mail-itimer-function
Checks the maildrops for any new mail. This is controlled by the variable vm-mail-check-interval.
These timer tasks are scheduled using the itimer package in XEmacs and the timer package in GNU Emacs.
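In GNU Emacs, a repeating task of this kind can be scheduled with run-at-time from the timer package. The sketch below is not VM's scheduling code: all demo- names are hypothetical, and the task merely counts its invocations where VM would flush message attributes.

```elisp
;; Hypothetical periodic task (not part of VM), GNU Emacs only.
(defvar demo-flush-count 0
  "Number of times the demo task has run.")

(defvar demo-flush-timer nil
  "The timer object for the demo task, or nil if it is not scheduled.")

(defun demo-flush-function ()
  "Stand-in for a periodic task such as flushing message attributes."
  (setq demo-flush-count (1+ demo-flush-count)))

(defun demo-start-flush-timer (interval)
  "Run `demo-flush-function' every INTERVAL seconds, replacing any old timer."
  (demo-stop-flush-timer)
  (setq demo-flush-timer
        (run-at-time interval interval #'demo-flush-function)))

(defun demo-stop-flush-timer ()
  "Cancel the demo timer if it is scheduled."
  (when (timerp demo-flush-timer)
    (cancel-timer demo-flush-timer))
  (setq demo-flush-timer nil))

(demo-start-flush-timer 90)
```

Timer functions run only when Emacs is waiting for input, so such tasks interleave with commands rather than running truly in parallel with them.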