
Character Type

A character in Emacs Lisp is nothing more than an integer. In other words, characters are represented by their character codes. For example, the character A is represented as the integer 65.
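Because a character is just an integer, ordinary integer functions accept it. The following is a minimal sketch; it assumes only the standard functions `integerp`, `=`, `+` and `char-to-string`.

```elisp
;; A character literal evaluates to its integer code.
(integerp ?A)          ; => t
(= ?A 65)              ; => t
(+ ?A 1)               ; => 66, the code of the character B
(char-to-string 66)    ; => "B"
```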

Individual characters are not often used in programs. It is far more common to work with strings, which are sequences composed of characters. See section String Type.

Characters in strings, buffers, and files are currently limited to the range of 0 to 255. If an arbitrary integer is used as a character for those purposes, only the lower eight bits are significant. Characters that represent keyboard input have a much wider range.
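The phrase "only the lower eight bits are significant" can be illustrated with `logand`. This sketch shows the arithmetic only; it makes no claim about how any particular Emacs function treats an out-of-range argument.

```elisp
;; Keep only the lower eight bits of an integer.
(logand 321 255)    ; => 65, because 321 = 256 + 65
(logand ?A 255)     ; => 65, a code already in range is unchanged
```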

Since characters are really integers, the printed representation of a character is a decimal number. This is also a possible read syntax for a character, but writing characters that way in Lisp programs is a very bad idea. You should always use the special read syntax formats that Emacs Lisp provides for characters. These syntax formats start with a question mark.

The usual read syntax for alphanumeric characters is a question mark followed by the character; thus, `?A' for the character A, `?B' for the character B, and `?a' for the character a.

For example:

?Q => 81

?q => 113
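Upper-case and lower-case letters have different codes, so `?Q' and `?q' are different characters. A short sketch, assuming the standard case table, in which `upcase` and `downcase` accept a character argument:

```elisp
;; The two cases of an ASCII letter differ by 32.
(- ?q ?Q)        ; => 32
(upcase ?q)      ; => 81, the same as ?Q
(downcase ?Q)    ; => 113, the same as ?q
```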

You can use the same syntax for punctuation characters, but it is often a good idea to add a `\' to prevent Lisp mode from getting confused. For example, `?\ ' is the way to write the space character. If the character is `\', you must use a second `\' to quote it: `?\\'.

You can express the characters control-g, backspace, tab, newline, vertical tab, formfeed, return, and escape as `?\a', `?\b', `?\t', `?\n', `?\v', `?\f', `?\r', `?\e', respectively. Those values are 7, 8, 9, 10, 11, 12, 13, and 27 in decimal. Thus,

?\a => 7                 ; C-g
?\b => 8                 ; backspace, BS, C-h
?\t => 9                 ; tab, TAB, C-i
?\n => 10                ; newline, LFD, C-j
?\v => 11                ; vertical tab, C-k
?\f => 12                ; formfeed character, C-l
?\r => 13                ; carriage return, RET, C-m
?\e => 27                ; escape character, ESC, C-[
?\\ => 92                ; backslash character, \
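The same escape sequences can also be written inside string constants, where each one stands for a single character. A brief sketch using the standard functions `string-to-char` and `length`:

```elisp
;; An escape sequence denotes one character, in or out of a string.
(= ?\t 9)                  ; => t
(string-to-char "\n")      ; => 10
(length "a\tb")            ; => 3
(= (string-to-char "\e") ?\e)   ; => t
```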

These sequences, which start with a backslash, are also known as escape sequences, because the backslash plays the role of an escape character; they have nothing to do with the character ESC.

Control characters may be represented using yet another read syntax. This consists of a question mark followed by a backslash, caret, and the corresponding non-control character, in either upper or lower case. For example, either `?\^I' or `?\^i' may be used as the read syntax for the character C-i, the character whose value is 9.

Instead of the `^', you can use `C-'; thus, `?\C-i' is equivalent to `?\^I' and to `?\^i':

?\^I => 9

?\C-I => 9
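The equivalences described above can be checked directly. The last line is an aside rather than something stated in the text: in ASCII, the control version of a letter has the code formed by the low five bits of the letter's code.

```elisp
;; All of these read syntaxes denote the same character, code 9.
(= ?\^I ?\^i)      ; => t
(= ?\C-i ?\^I)     ; => t
(= ?\C-i ?\t)      ; => t
;; ASCII arithmetic: keep the low five bits of the letter.
(logand ?I 31)     ; => 9
(logand ?i 31)     ; => 9
```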

For use in strings and buffers, you are limited to the control characters that exist in ASCII, but for keyboard input purposes, you can turn any character into a control character with `C-'. The character codes for these characters include the 2**22 bit as well as the code for the non-control character. Ordinary terminals have no way of generating non-ASCII control characters, but you can generate them straightforwardly using an X terminal.
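As a sketch of a non-ASCII control character, consider Control-1. ASCII has no such character, so its code lies outside the range 0 to 255. The example assumes the standard functions `event-basic-type` and `event-modifiers`, and it deliberately avoids printing the code itself, since the position of the control bit depends on the Emacs version.

```elisp
;; Control-1 does not exist in ASCII, so it needs a modifier bit.
(> ?\C-1 255)                              ; => t
(event-basic-type ?\C-1)                   ; => 49, the code of 1
(memq 'control (event-modifiers ?\C-1))    ; => non-nil
```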

The DEL character can be thought of, and written, as Control-?:

?\^? => 127

?\C-? => 127
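Both spellings of DEL read as the same integer. The last line is an aside on the ASCII arithmetic behind the `^?' notation: the code of `?' is 63, and inverting the 2**6 bit gives 127.

```elisp
;; DEL written with either control syntax.
(= ?\^? 127)        ; => t
(= ?\C-? ?\^?)      ; => t
;; 63 with the 2**6 bit inverted is 127.
(logxor ?\? 64)     ; => 127
```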

For control characters that appear in files or strings, we recommend the `^' syntax; for control characters that represent keyboard input, we prefer the `C-' syntax. The choice does not affect the meaning of the program, but it may guide the understanding of people who read it.

A meta character is a character typed with the META key. The integer that represents such a character has the 2**23 bit set (which on most machines makes it a negative number). We use high bits for this and other modifiers to make possible a wide range of basic character codes.

In a string, the 2**7 bit indicates a meta character, so the meta characters that can fit in a string have codes in the range from 128 to 255, and are the meta versions of the ordinary ASCII characters. (In Emacs versions 18 and older, this convention was used for characters outside of strings as well.)
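The string convention amounts to simple bit arithmetic, sketched here for the letter a. This illustrates the numbers only; it does not construct or inspect a string.

```elisp
;; Setting the 2**7 bit of an ASCII code gives its in-string meta code.
(logior ?a 128)     ; => 225, meta version of a (code 97)
;; Clearing the bit recovers the ordinary ASCII character.
(logand 225 127)    ; => 97
```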

The read syntax for meta characters uses `\M-'. For example, `?\M-A' stands for M-A. You can use `\M-' together with octal codes, `\C-', or any other syntax for a character. Thus, you can write M-A as `?\M-A', or as `?\M-\101'. Likewise, you can write C-M-b as `?\M-\C-b', `?\C-\M-b', or `?\M-\002'.
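The alternative spellings read as the same integer, which can be checked with `=`. The sketch also isolates the meta bit with `logxor`; no value is shown for that form because the position of the bit depends on the Emacs version. `event-basic-type` is assumed to be available.

```elisp
;; Different spellings of the same meta characters.
(= ?\M-A ?\M-\101)         ; => t
(= ?\M-\C-b ?\C-\M-b)      ; => t
(= ?\M-\C-b ?\M-\002)      ; => t
;; The meta bit by itself (value is version-dependent).
(logxor ?\M-a ?a)
;; Removing the modifier gives back the plain character.
(event-basic-type ?\M-a)   ; => 97
```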

The shift modifier indicates the case of a character in special circumstances. The case of an ordinary letter is indicated by its character code as part of ASCII, but ASCII has no way to represent whether a control character is upper case or lower case. Emacs uses the 2**21 bit to indicate that the shift key was used for typing a control character. This distinction is possible only when you use X terminals or other special terminals; ordinary terminals do not indicate the distinction to the computer in any way.

The X Window System defines three other modifier bits that can be set in a character: hyper, super and alt. The syntaxes for these bits are `\H-', `\s-' and `\A-'. Thus, `?\H-\M-\A-x' represents Alt-Hyper-Meta-x. Numerically, the bit values are 2**18 for alt, 2**19 for super and 2**20 for hyper.
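Each modifier is an independent bit, so the order in which the prefixes are written does not matter. The sketch below assumes the standard functions `event-basic-type` and `event-modifiers`; it tests for each modifier with `memq` instead of relying on the order of the returned list.

```elisp
;; Prefix order is irrelevant; the resulting integer is the same.
(= ?\H-\M-\A-x ?\A-\M-\H-x)                    ; => t
;; The underlying character is x, code 120.
(event-basic-type ?\H-\M-\A-x)                 ; => 120
;; Each modifier appears in the modifier list.
(memq 'alt   (event-modifiers ?\H-\M-\A-x))    ; => non-nil
(memq 'hyper (event-modifiers ?\H-\M-\A-x))    ; => non-nil
(memq 'meta  (event-modifiers ?\H-\M-\A-x))    ; => non-nil
```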

Finally, the most general read syntax consists of a question mark followed by a backslash and the character code in octal (up to three octal digits); thus, `?\101' for the character A, `?\001' for the character C-a, and `?\002' for the character C-b. Although this syntax can represent any ASCII character, it is preferred only when the precise octal value is more important than the ASCII representation.

?\012 => 10        ?\n => 10         ?\C-j => 10

?\101 => 65        ?A => 65           
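To go the other way and find the octal digits for a given character, the `%o' conversion of `format` can be used, as in this brief sketch:

```elisp
;; Octal syntax and the ordinary syntax read as the same integer.
(= ?\101 ?A)           ; => t
(= ?\012 ?\n)          ; => t
;; Print a character code in octal.
(format "%o" ?A)       ; => "101"
(format "%o" ?\C-j)    ; => "12"
```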

A backslash is allowed, and harmless, preceding any character without a special escape meaning; thus, `?\+' is equivalent to `?+'. There is no reason to use a backslash before most such characters. However, any of the characters `()\|;'`"#.,' should be preceded by a backslash to avoid confusing the Emacs commands for editing Lisp code. Whitespace characters such as space, tab, newline and formfeed should also be preceded by a backslash, although it is cleaner to use one of the easily readable escape sequences, such as `\t', instead of an actual control character such as a tab.
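A few quoted punctuation characters, as a sketch. In each case the backslash leaves the value unchanged and only keeps the Lisp editing commands from misparsing the character.

```elisp
;; A harmless backslash before a character with no escape meaning.
(= ?\+ ?+)     ; => t
;; Punctuation that should be quoted for the sake of the editor.
(= ?\( 40)     ; => t
(= ?\) 41)     ; => t
(= ?\; 59)     ; => t
;; Backslash followed by a space is the space character.
(= ?\  32)     ; => t
```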
