Most of the time, Emacs can recognize which coding system to use for any given file--once you have specified your preferences.
Some coding systems can be recognized or distinguished by which byte sequences appear in the data. However, there are coding systems that cannot be distinguished, not even potentially. For example, there is no way to distinguish between Latin-1 and Latin-2; they use the same byte values with different meanings.
Emacs handles this situation by means of a priority list of coding systems. Whenever Emacs reads a file, if you do not specify the coding system to use, Emacs checks the data against each coding system, starting with the first in priority and working down the list, until it finds a coding system that fits the data. Then it converts the file contents assuming that they are represented in this coding system.
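As a rough illustration of this detection step, the standard function detect-coding-string reports which coding systems fit a given byte sequence. The sample bytes below are an arbitrary choice made for illustration, and the result depends on your current priority list.

```elisp
;; A unibyte string holding the bytes 74 72 E8 73.  Byte E8 is a
;; different letter in Latin-1 than in Latin-2, so the bytes alone
;; do not say which of the two was intended.
(let ((bytes "tr\350s"))
  ;; Returns a list of candidate coding systems, most preferred first.
  (detect-coding-string bytes)
  ;; With a non-nil second argument, returns only the top candidate.
  (detect-coding-string bytes t))
```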
The priority list of coding systems depends on the selected language environment (see section Language Environments). For example, if you use French, you probably want Emacs to prefer Latin-1 to Latin-2; if you use Czech, you probably want Latin-2 to be preferred. This is one of the reasons to specify a language environment.
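A language environment can also be selected from Lisp, for instance in an init file. This is a minimal sketch; "Latin-2" is used here only as an example environment name.

```elisp
;; Select the Latin-2 language environment, which (among other
;; things) resets the priority list of coding systems.
(set-language-environment "Latin-2")
```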
However, you can alter the priority list in detail with the command M-x prefer-coding-system. This command reads the name of a coding system from the minibuffer, and adds it to the front of the priority list, so that it is preferred to all others. If you use this command several times, each use adds one element to the front of the priority list.
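The same command can be called from Lisp with a coding system symbol. The sketch below assumes the coding system names iso-latin-1 and iso-latin-2 and shows that the most recent call takes precedence.

```elisp
;; Each call puts its argument at the front of the priority list,
;; so the call made last wins.
(prefer-coding-system 'iso-latin-1)
(prefer-coding-system 'iso-latin-2)   ; iso-latin-2 is now tried first
```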
Sometimes a file name indicates which coding system to use for the file. The variable file-coding-system-alist specifies this correspondence. There is a special function modify-coding-system-alist for adding elements to this list. For example, to read and write all `.txt' files using the coding system chinese-iso-8bit, you can execute this Lisp expression:
(modify-coding-system-alist 'file "\\.txt\\'" 'chinese-iso-8bit)
The first argument should be file, the second argument should be a regular expression that determines which files this applies to, and the third argument says which coding system to use for these files.
You can specify the coding system for a particular file using the `-*-...-*-' construct at the beginning of a file, or a local variables list at the end (see section Local Variables in Files). You do this by defining a value for the "variable" named coding. Emacs does not really have a variable coding; instead of setting a variable, it uses the specified coding system for the file. For example, `-*-mode: C; coding: latin-1;-*-' specifies use of the Latin-1 coding system, as well as C mode. If you specify the coding explicitly in the file, that overrides file-coding-system-alist.
The variable auto-coding-alist is the strongest way to specify the coding system for certain patterns of file names; this variable even overrides `-*-coding:-*-' tags in the file itself. Emacs uses this feature for tar and archive files, to prevent Emacs from being confused by a `-*-coding:-*-' tag in a member of the archive and thinking it applies to the archive file as a whole.
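As a sketch of how such a rule might be added, each element of auto-coding-alist pairs a file-name regular expression with a coding system. The `.blob' extension below is a hypothetical example, not something Emacs defines.

```elisp
;; Hypothetical rule: treat every `.blob' file as raw bytes, even if
;; its contents happen to include a `-*-coding:-*-' tag.
(add-to-list 'auto-coding-alist '("\\.blob\\'" . no-conversion))
```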
Once Emacs has chosen a coding system for a buffer, it stores that coding system in buffer-file-coding-system and uses that coding system, by default, for operations that write from this buffer into a file. This includes the commands save-buffer and write-region. If you want to write files from this buffer using a different coding system, you can specify a different coding system for the buffer using set-buffer-file-coding-system (see section Specifying a Coding System).
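From Lisp, this might look like the following sketch. The second form relies on the variable coding-system-for-write, which is not discussed in this section and is shown here as an assumption about how to override the coding system for a single write; the output file name is a placeholder.

```elisp
;; Change the coding system used for later saves of the current buffer.
(set-buffer-file-coding-system 'iso-latin-1)

;; Or override it for one write only by binding coding-system-for-write.
;; The file name is a placeholder chosen for illustration.
(let ((coding-system-for-write 'iso-latin-2))
  (write-region (point-min) (point-max) "/tmp/example-latin2.txt"))
```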
When you send a message with Mail mode (see section Sending Mail), Emacs has four different ways to determine the coding system to use for encoding the message text. It tries the buffer's own value of buffer-file-coding-system, if that is non-nil. Otherwise, it uses the value of sendmail-coding-system, if that is non-nil. The third way is to use the default coding system for new files, which is controlled by your choice of language environment, if that is non-nil. If all of these three values are nil, Emacs encodes outgoing mail using the Latin-1 coding system.
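The fallback order described above can be sketched in Lisp as follows. The function name is hypothetical and this is not how Emacs itself implements the choice; in particular, using the default value of buffer-file-coding-system as "the default coding system for new files" is an assumption.

```elisp
(require 'sendmail)  ; defines sendmail-coding-system

;; Hypothetical helper, not part of Emacs: mirrors the four-step
;; fallback order for the current buffer.
(defun my-guess-mail-coding-system ()
  (or buffer-file-coding-system
      sendmail-coding-system
      ;; Assumed stand-in for "the default coding system for new files".
      (default-value 'buffer-file-coding-system)
      'iso-latin-1))
```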
When you get new mail in Rmail, each message is translated automatically from the coding system it is written in--as if it were a separate file. This uses the priority list of coding systems that you have specified.
For reading and saving Rmail files themselves, Emacs uses the coding system specified by the variable rmail-file-coding-system. The default value is nil, which means that Rmail files are not translated (they are read and written in the Emacs internal character code).