String and byte literal pretokens

Single-quoted literal

Grammar
Single_quoted_literal = {
    SQ_PREFIX ~
    "'" ~ SQ_CONTENT ~ "'" ~
    SUFFIX ?
}

SQ_PREFIX = { "b" ? }

SQ_CONTENT = {
    "\\" ~ ANY ~ ( !"'" ~ ANY ) * |
    !"'" ~ ANY
}

Pretoken kind

SingleQuotedLiteral

Attributes
prefix: from SQ_PREFIX
literal content: from SQ_CONTENT
suffix: from SUFFIX (may be none)
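
As an illustration of how the Single_quoted_literal grammar behaves, the following is a minimal sketch of a hand-written matcher with the same ordered-choice, no-backtracking behaviour. The names `SingleQuoted` and `match_single_quoted` are hypothetical. SUFFIX is not defined in this section, so the sketch stops after the closing quote and returns the remaining input in `rest`, where a suffix would be attempted.

```rust
/// Attributes recovered from a single-quoted literal (hypothetical type).
#[derive(Debug, PartialEq)]
struct SingleQuoted<'a> {
    prefix: &'a str,
    content: &'a str,
    /// Input left after the closing quote; SUFFIX would be tried here.
    rest: &'a str,
}

/// Sketch of Single_quoted_literal without the optional SUFFIX.
fn match_single_quoted(input: &str) -> Option<SingleQuoted<'_>> {
    // SQ_PREFIX = { "b" ? }
    let (prefix, after_prefix) = match input.strip_prefix('b') {
        Some(r) => ("b", r),
        None => ("", input),
    };
    // Opening quote.
    let body = after_prefix.strip_prefix('\'')?;

    let mut chars = body.char_indices();
    let (_, first) = chars.next()?;
    let content_len = if first == '\\' {
        // First alternative: "\\" ~ ANY ~ ( !"'" ~ ANY ) *
        // If nothing follows the backslash, the second alternative would
        // match the backslash alone, but the closing quote would then fail,
        // so returning None here gives the same overall result.
        let (i, escaped) = chars.next()?;
        let after_escape = i + escaped.len_utf8();
        // Greedy run of non-quote characters.
        match body[after_escape..].find('\'') {
            Some(offset) => after_escape + offset,
            None => body.len(),
        }
    } else if first != '\'' {
        // Second alternative: !"'" ~ ANY
        first.len_utf8()
    } else {
        // Empty content: neither alternative matches.
        return None;
    };

    let content = &body[..content_len];
    // Closing quote. The chosen alternative is not revisited if this fails.
    let rest = body[content_len..].strip_prefix('\'')?;
    Some(SingleQuoted { prefix, content, rest })
}
```

Under this sketch the input `'\''` yields the content `\'`, while `'ab'` and `'\'` do not match: once the first alternative of SQ_CONTENT has succeeded, the second is never tried.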

(Non-raw) double-quoted literal

Grammar
Double_quoted_literal_2015 = { DQ_PREFIX_2015 ~ DQ_REMAINDER }
Double_quoted_literal_2021 = { DQ_PREFIX_2021 ~ DQ_REMAINDER }

DQ_PREFIX_2015 = { "b" ? }
DQ_PREFIX_2021 = { ( "b" | "c" ) ? }

DQ_REMAINDER = {
    "\"" ~ DQ_CONTENT ~ "\"" ~
    SUFFIX ?
}
DQ_CONTENT = {
    (
        "\\" ~ ANY |
        !"\"" ~ ANY
    ) *
}

Pretoken kind

DoubleQuotedLiteral

Attributes
prefix: from DQ_PREFIX_2015 or DQ_PREFIX_2021
literal content: from DQ_CONTENT
suffix: from SUFFIX (may be none)
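
The double-quoted grammar can be sketched in the same way. This is an illustrative matcher only: `PrefixRules`, `DoubleQuoted` and `match_double_quoted` are hypothetical names, the two enum variants simply select between the `_2015` and `_2021` prefix definitions, and SUFFIX (defined elsewhere) is again left to the caller via `rest`.

```rust
/// Selects DQ_PREFIX_2015 or DQ_PREFIX_2021 (hypothetical type).
#[derive(Debug, PartialEq, Clone, Copy)]
enum PrefixRules {
    Rules2015,
    Rules2021,
}

/// Attributes recovered from a double-quoted literal (hypothetical type).
#[derive(Debug, PartialEq)]
struct DoubleQuoted<'a> {
    prefix: &'a str,
    content: &'a str,
    /// Input left after the closing quote; SUFFIX would be tried here.
    rest: &'a str,
}

/// Sketch of Double_quoted_literal_* without the optional SUFFIX.
fn match_double_quoted(input: &str, rules: PrefixRules) -> Option<DoubleQuoted<'_>> {
    // DQ_PREFIX_2015 = { "b" ? }, DQ_PREFIX_2021 = { ( "b" | "c" ) ? }
    let (prefix, after_prefix) = match input.chars().next() {
        Some('b') => input.split_at(1),
        Some('c') if rules == PrefixRules::Rules2021 => input.split_at(1),
        _ => input.split_at(0),
    };
    // Opening quote.
    let body = after_prefix.strip_prefix('"')?;

    // DQ_CONTENT = { ( "\\" ~ ANY | !"\"" ~ ANY ) * }
    let mut chars = body.char_indices();
    loop {
        match chars.next()? {
            // A backslash consumes the following character, whatever it is.
            // Running out of input here means no closing quote can follow.
            (_, '\\') => {
                chars.next()?;
            }
            // The first unescaped quote ends the content.
            (i, '"') => {
                return Some(DoubleQuoted {
                    prefix,
                    content: &body[..i],
                    rest: &body[i + 1..],
                });
            }
            _ => {}
        }
    }
}
```

Note that the content is returned verbatim: the escape in `b"a\"b"` is skipped over for the purpose of finding the closing quote, but it is not interpreted.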

Raw double-quoted literal

Grammar
Raw_double_quoted_literal_2015 = { RAW_DQ_PREFIX_2015 ~ RAW_DQ_REMAINDER }
Raw_double_quoted_literal_2021 = { RAW_DQ_PREFIX_2021 ~ RAW_DQ_REMAINDER }

RAW_DQ_PREFIX_2015 = { "r" | "br" }
RAW_DQ_PREFIX_2021 = { "r" | "br" | "cr" }

RAW_DQ_REMAINDER = {
    HASHES¹ ~
    "\"" ~ RAW_DQ_CONTENT ~ "\"" ~
    HASHES² ~
    SUFFIX ?
}
RAW_DQ_CONTENT = {
    ( !("\"" ~ HASHES²) ~ ANY ) *
}
HASHES = { "#" {0, 255} }

These definitions require an extension to the Parsing Expression Grammar formalism. Each expression marked as HASHES² fails unless the text it matches is the same as the text matched by the (only) successful match of the expression marked as HASHES¹ within the same attempt to match the current pretoken nonterminal.

See Grammar for raw string literals for a discussion of alternatives to this extension.
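
To make the hash-matching requirement concrete, here is a rough sketch of a matcher for the raw double-quoted grammar. It rests on an assumption: the HASHES² condition is treated as a backreference, so the closing delimiter is exactly a `"` followed by the same text that HASHES¹ matched. The names `RawPrefixRules`, `RawDoubleQuoted` and `match_raw_double_quoted` are hypothetical, and SUFFIX handling is left to the caller via `rest`.

```rust
/// Selects RAW_DQ_PREFIX_2015 or RAW_DQ_PREFIX_2021 (hypothetical type).
#[derive(Debug, PartialEq, Clone, Copy)]
enum RawPrefixRules {
    Rules2015,
    Rules2021,
}

/// Attributes recovered from a raw double-quoted literal (hypothetical type).
#[derive(Debug, PartialEq)]
struct RawDoubleQuoted<'a> {
    prefix: &'a str,
    content: &'a str,
    /// Input left after the closing hashes; SUFFIX would be tried here.
    rest: &'a str,
}

/// Sketch of Raw_double_quoted_literal_* without the optional SUFFIX.
/// Assumption: HASHES² is read as "exactly the text matched by HASHES¹".
fn match_raw_double_quoted(input: &str, rules: RawPrefixRules) -> Option<RawDoubleQuoted<'_>> {
    let prefixes: &[&'static str] = match rules {
        RawPrefixRules::Rules2015 => &["r", "br"],
        RawPrefixRules::Rules2021 => &["r", "br", "cr"],
    };
    let (prefix, after_prefix) = prefixes
        .iter()
        .copied()
        .find_map(|p| input.strip_prefix(p).map(|r| (p, r)))?;

    // HASHES¹ = "#" {0, 255}: greedy, but never more than 255 hashes.
    let hash_count = after_prefix
        .bytes()
        .take_while(|b| *b == b'#')
        .count()
        .min(255);
    let hashes = &after_prefix[..hash_count];

    // Opening quote. With more than 255 hashes this sees a '#' and fails.
    let body = after_prefix[hash_count..].strip_prefix('"')?;

    // RAW_DQ_CONTENT stops at the first quote followed by the same hashes.
    let terminator = format!("\"{}", hashes);
    let end = body.find(terminator.as_str())?;
    Some(RawDoubleQuoted {
        prefix,
        content: &body[..end],
        rest: &body[end + terminator.len()..],
    })
}
```

With two opening hashes, a `"#` inside the literal does not end it, and backslashes have no special role, so `r"a\"` is a complete literal whose content is `a\`.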

Pretoken kind

RawDoubleQuotedLiteral

Attributes
prefix: from RAW_DQ_PREFIX_2015 or RAW_DQ_PREFIX_2021
literal content: from RAW_DQ_CONTENT
suffix: from SUFFIX (may be none)

Reserved or unterminated literal

Grammar
Unterminated_literal_2015 = { "r\"" | "br\"" | "b'" }
Reserved_literal_2021 = { IDENT ~ ( "\"" | "'" ) }

Pretoken kind

Reserved

Attributes

(none)

Note: I believe that, in the Unterminated_literal_2015 definition, only the b' form is strictly needed: if the definition matches using one of the other subexpressions, the input will eventually be rejected anyway (given that the corresponding string literal nonterminal didn't match).

Note: Reserved_literal_2021 catches both reserved forms and unterminated b' literals.
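
The two rules can be compared with a small sketch. IDENT is not defined in this section, so the helper `ascii_ident_len` below is a deliberately simplified, ASCII-only stand-in and should be read as an assumption. All three function names are hypothetical. The functions test each rule in isolation and say nothing about where the rules sit relative to the literal rules.

```rust
/// Length of a simplified ASCII identifier at the start of `input`,
/// or 0 if there is none. Assumption: this stands in for IDENT, which
/// is defined elsewhere and is not ASCII-only.
fn ascii_ident_len(input: &str) -> usize {
    let mut chars = input.chars();
    match chars.next() {
        Some(c) if c == '_' || c.is_ascii_alphabetic() => {}
        _ => return 0,
    }
    1 + chars
        .take_while(|c| *c == '_' || c.is_ascii_alphanumeric())
        .count()
}

/// Unterminated_literal_2015 = { "r\"" | "br\"" | "b'" }
fn matches_unterminated_2015(input: &str) -> bool {
    ["r\"", "br\"", "b'"].iter().any(|p| input.starts_with(*p))
}

/// Reserved_literal_2021 = { IDENT ~ ( "\"" | "'" ) }
fn matches_reserved_2021(input: &str) -> bool {
    let n = ascii_ident_len(input);
    // The identifier is ASCII here, so `n` is also a valid byte offset.
    n > 0 && matches!(input[n..].chars().next(), Some('"') | Some('\''))
}
```

Both functions accept input beginning `b'`, which is the overlap the second note describes. Input beginning `k"` is accepted only by the 2021 rule.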

Reserved guard (Rust 2024)

Grammar
Reserved_guard_2024 = { "##" | "#\"" }

Pretoken kind

Reserved

Attributes

(none)

Note: This definition is listed here near the double-quoted string literals because these forms were reserved during discussions about introducing string literals formed like #"…"#.