Escape processing

The descriptions of the effect of reprocessing string and character literals make use of several forms of escape.

Each form of escape is characterised by:

  • an escape sequence: a sequence of characters, which always begins with \
  • an escaped value: either a single character or an empty sequence of characters

In the definitions of escapes below:

  • An octal digit is any of the characters in the range 0..=7.
  • A hexadecimal digit is any of the characters in the ranges 0..=9, a..=f, or A..=F.
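
As an illustration only, the two digit classes can be sketched as Rust predicates. The function names are hypothetical and are not part of any specification or library; the second predicate accepts the same characters as the standard library's char::is_ascii_hexdigit.

```rust
/// Hypothetical helper: true for the characters '0' through '7'.
fn is_octal_digit(c: char) -> bool {
    matches!(c, '0'..='7')
}

/// Hypothetical helper: true for '0'..='9', 'a'..='f', or 'A'..='F'.
/// Accepts the same characters as `char::is_ascii_hexdigit`.
fn is_hexadecimal_digit(c: char) -> bool {
    matches!(c, '0'..='9' | 'a'..='f' | 'A'..='F')
}
```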

Simple escapes

Each sequence of characters occurring in the first column of the following table is an escape sequence.

In each case, the escaped value is the character given in the corresponding entry in the second column.

Escape sequence    Escaped value
\0                 U+0000 NUL
\t                 U+0009 HT
\n                 U+000A LF
\r                 U+000D CR
\"                 U+0022 QUOTATION MARK
\'                 U+0027 APOSTROPHE
\\                 U+005C REVERSE SOLIDUS

Note: the escaped value therefore has a Unicode scalar value which can be represented in a byte.
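
A minimal sketch of the simple-escape table as a lookup function follows. The name simple_escape and the choice to take the whole escape sequence as a string slice are assumptions made for illustration; this is not how any particular lexer is implemented.

```rust
/// Hypothetical helper: maps a complete simple escape sequence
/// (including the leading backslash) to its escaped value.
/// Returns `None` if `seq` is not one of the seven simple escapes.
fn simple_escape(seq: &str) -> Option<char> {
    match seq {
        "\\0" => Some('\u{0000}'),  // NUL
        "\\t" => Some('\u{0009}'),  // HT
        "\\n" => Some('\u{000A}'),  // LF
        "\\r" => Some('\u{000D}'),  // CR
        "\\\"" => Some('\u{0022}'), // QUOTATION MARK
        "\\'" => Some('\u{0027}'),  // APOSTROPHE
        "\\\\" => Some('\u{005C}'), // REVERSE SOLIDUS
        _ => None,
    }
}
```

Every value returned is below U+0100, which matches the note above that the escaped value fits in a byte.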

8-bit escapes

The escape sequence consists of \x followed by two hexadecimal digits.

The escaped value is the character whose Unicode scalar value is the result of interpreting the final two characters in the escape sequence as a hexadecimal integer, as if by u8::from_str_radix with radix 16.

Note: the escaped value therefore has a Unicode scalar value which can be represented in a byte.
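
The following sketch shows one way the 8-bit escape rule could be checked and evaluated. The function name eight_bit_escape is hypothetical. The explicit digit check is needed because u8::from_str_radix also accepts a leading + sign, which the escape sequence does not allow.

```rust
/// Hypothetical helper: evaluates a complete 8-bit escape such as `\xff`.
/// Returns `None` if `seq` is not `\x` followed by exactly two
/// hexadecimal digits.
fn eight_bit_escape(seq: &str) -> Option<char> {
    let digits = seq.strip_prefix("\\x")?;
    // Reject anything other than exactly two hexadecimal digits
    // (this also rules out the "+f" form that from_str_radix accepts).
    if digits.len() != 2 || !digits.chars().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    let value = u8::from_str_radix(digits, 16).ok()?;
    // Every u8 value is a valid Unicode scalar value.
    Some(char::from(value))
}
```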

7-bit escapes

The escape sequence consists of \x followed by an octal digit then a hexadecimal digit.

The escaped value is the character whose Unicode scalar value is the result of interpreting the final two characters in the escape sequence as a hexadecimal integer, as if by u8::from_str_radix with radix 16.
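
A comparable sketch for 7-bit escapes is given below, again with a hypothetical function name. Requiring the first digit to be octal limits the result to the range U+0000 to U+007F.

```rust
/// Hypothetical helper: evaluates a complete 7-bit escape such as `\x7f`.
/// Returns `None` unless `seq` is `\x`, then an octal digit, then a
/// hexadecimal digit.
fn seven_bit_escape(seq: &str) -> Option<char> {
    let digits = seq.strip_prefix("\\x")?;
    let mut chars = digits.chars();
    let first = chars.next()?;
    let second = chars.next()?;
    if chars.next().is_some() {
        return None; // more than two characters after `\x`
    }
    if !matches!(first, '0'..='7') || !second.is_ascii_hexdigit() {
        return None;
    }
    let value = u8::from_str_radix(digits, 16).ok()?;
    Some(char::from(value))
}
```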

Unicode escapes

The escape sequence consists of \u{, followed by a hexadecimal digit, followed by a sequence of characters each of which is a hexadecimal digit or _, followed by }, with the restriction that there are no more than six hexadecimal digits in the entire escape sequence.

The escaped value is the character whose Unicode scalar value is the result of interpreting the hexadecimal digits contained in the escape sequence as a hexadecimal integer, as if by u32::from_str_radix with radix 16.
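
The Unicode escape rule can be sketched as follows. The function name unicode_escape is hypothetical. Returning None when the digits do not denote a Unicode scalar value (a surrogate, or a value above U+10FFFF) is an assumption of this sketch: the text above only defines the escaped value when such a character exists.

```rust
/// Hypothetical helper: evaluates a complete Unicode escape such as
/// `\u{1F600}` or `\u{00_41}`.
fn unicode_escape(seq: &str) -> Option<char> {
    let body = seq.strip_prefix("\\u{")?.strip_suffix('}')?;
    // The first character after `\u{` must be a hexadecimal digit
    // (so an empty body or a leading underscore is rejected).
    if !body.chars().next()?.is_ascii_hexdigit() {
        return None;
    }
    // The remaining characters may be hexadecimal digits or underscores.
    if !body.chars().all(|c| c.is_ascii_hexdigit() || c == '_') {
        return None;
    }
    // Underscores carry no value; only the digits are interpreted.
    let digits: String = body.chars().filter(|&c| c != '_').collect();
    if digits.len() > 6 {
        return None;
    }
    let value = u32::from_str_radix(&digits, 16).ok()?;
    // Assumption: values that are not Unicode scalar values yield None.
    char::from_u32(value)
}
```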

String continuation escapes

The escape sequence consists of \ followed immediately by LF, and all following whitespace characters before the next non-whitespace character.

For this purpose, the whitespace characters are HT (U+0009), LF (U+000A), CR (U+000D), and SPACE (U+0020).

The escaped value is an empty sequence of characters.

The Reference says this behaviour may change in future; see the Reference's section on string continuation escapes.
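
As a sketch of the current rule, the function below recognises a string continuation escape at the start of its input and returns what follows it. Both function names are hypothetical, and the sketch makes no attempt to anticipate the possible future change mentioned above.

```rust
/// Hypothetical helper: the four characters treated as whitespace
/// for the purpose of string continuation escapes.
fn is_continuation_whitespace(c: char) -> bool {
    matches!(c, '\t' | '\n' | '\r' | ' ')
}

/// Hypothetical helper: if `input` begins with a string continuation
/// escape (a backslash immediately followed by LF), returns the input
/// that remains after the escape sequence. The escaped value itself is
/// empty, so nothing is produced for the skipped characters.
fn skip_string_continuation(input: &str) -> Option<&str> {
    let rest = input.strip_prefix("\\\n")?;
    Some(rest.trim_start_matches(is_continuation_whitespace))
}
```

Note that the skipped whitespace may itself contain further LF characters, so a continuation can span several blank lines.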