Numeric literal pretokens

The following nonterminals are common to the definitions below:

Grammar
DECIMAL_DIGITS = { ('0'..'9' | "_") * }
HEXADECIMAL_DIGITS = { ('0'..'9' | 'a' .. 'f' | 'A' .. 'F' | "_") * }
LOW_BASE_PRETOKEN_DIGITS = { DECIMAL_DIGITS }
DECIMAL_PART = { '0'..'9' ~ DECIMAL_DIGITS }

RESTRICTED_E_SUFFIX = { ("e"|"E") ~ "_"+ ~ !XID_START ~ XID_CONTINUE }

Float literal

Grammar
Float_literal_1 = {
    FLOAT_BODY_WITH_EXPONENT ~ SUFFIX ?
}
Float_literal_2 = {
    FLOAT_BODY_WITHOUT_EXPONENT ~ SUFFIX ? |
    FLOAT_BODY_WITH_FINAL_DOT ~ !"." ~ !IDENT_START
}

FLOAT_BODY_WITH_EXPONENT = {
    DECIMAL_PART ~ ("." ~ DECIMAL_PART ) ? ~
    ("e"|"E") ~ ("+"|"-") ? ~ EXPONENT_DIGITS
}
EXPONENT_DIGITS = { "_" * ~ '0'..'9' ~ DECIMAL_DIGITS }

FLOAT_BODY_WITHOUT_EXPONENT = {
    DECIMAL_PART ~ "." ~ DECIMAL_PART
}

FLOAT_BODY_WITH_FINAL_DOT = {
    DECIMAL_PART ~ "."
}
Pretoken kind

FloatLiteral

Attributes
body    from FLOAT_BODY_WITH_EXPONENT, FLOAT_BODY_WITHOUT_EXPONENT, or FLOAT_BODY_WITH_FINAL_DOT
suffix  from SUFFIX (may be none)

Note: The !"." subexpression makes sure that forms like 1..2 aren't treated as starting with a float. The !IDENT_START subexpression makes sure that forms like 1.some_method() aren't treated as starting with a float.
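The two negative lookaheads can be illustrated with a minimal Python sketch. It models only the FLOAT_BODY_WITH_FINAL_DOT alternative of Float_literal_2, not the whole pretokeniser. IDENT_START is defined elsewhere; the ASCII approximation used here and the function name final_dot_float_prefix are assumptions made for illustration.

```python
import re

# DECIMAL_PART: one decimal digit, then any mix of digits and underscores.
DECIMAL_PART = r"[0-9][0-9_]*"

# Assumption: IDENT_START is approximated by ASCII letters and "_".
# The real nonterminal is defined elsewhere and is Unicode-aware.
IDENT_START = r"[A-Za-z_]"

# FLOAT_BODY_WITH_FINAL_DOT ~ !"." ~ !IDENT_START
# The two (?!...) groups are lookaheads: they inspect the next character
# without consuming it.
FINAL_DOT_FLOAT = re.compile(rf"{DECIMAL_PART}\.(?!\.)(?!{IDENT_START})")


def final_dot_float_prefix(text):
    """Return the leading float-with-final-dot match in text, or None."""
    m = FINAL_DOT_FLOAT.match(text)
    return m.group(0) if m else None


print(final_dot_float_prefix("1. "))              # matches "1."
print(final_dot_float_prefix("1..2"))             # None: the dot is followed by "."
print(final_dot_float_prefix("1.some_method()"))  # None: the dot is followed by IDENT_START
```

In the two rejected cases the characters remain available to other pretoken nonterminals, so 1 can still be matched as an integer literal.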

Note: The Reserved_float_empty_exponent pretoken nonterminal is placed between Float_literal_1 and Float_literal_2 in priority order (which is why there are two pretoken nonterminals producing FloatLiteral).

Reserved float

Grammar
Reserved_float_empty_exponent = {
    DECIMAL_PART ~ ("." ~ DECIMAL_PART ) ? ~
    ("e"|"E") ~ ("+"|"-")
}
Reserved_float_e_suffix_restriction = {
    DECIMAL_PART ~ "." ~ DECIMAL_PART ~
    RESTRICTED_E_SUFFIX
}
Reserved_float_based = {
    (
        ("0b" | "0o") ~ LOW_BASE_PRETOKEN_DIGITS |
        "0x" ~ HEXADECIMAL_DIGITS
    )  ~  (
        ("e"|"E") ~ ("+"|"-" | EXPONENT_DIGITS) |
        "." ~ !"." ~ !IDENT_START
    )
}
Pretoken kind

Reserved

Attributes

(none)

Note: The Reserved_float_empty_exponent pretoken nonterminal is placed between Float_literal_1 and Float_literal_2 in priority order. This ordering makes sure that forms like 123.4e+ are reserved, rather than being accepted by FLOAT_BODY_WITHOUT_EXPONENT.
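The effect of this ordering can be sketched in Python. The sketch assumes that the pretokeniser uses the first nonterminal, in priority order, that matches at the current position. It also approximates SUFFIX and IDENT_START (both defined elsewhere) with ASCII identifier rules. The names RULES and classify are hypothetical, and only the three nonterminals discussed in the note are modelled.

```python
import re

DECIMAL_PART = r"[0-9][0-9_]*"
EXPONENT_DIGITS = r"_*[0-9][0-9_]*"
# Assumptions: ASCII-only stand-ins for nonterminals defined elsewhere.
IDENT_START = r"[A-Za-z_]"
SUFFIX = r"[A-Za-z_][A-Za-z0-9_]*"

FLOAT_BODY_WITH_EXPONENT = (
    rf"{DECIMAL_PART}(?:\.{DECIMAL_PART})?[eE][+-]?{EXPONENT_DIGITS}"
)

# (pretoken kind, pattern), listed in the priority order the note describes.
RULES = [
    # Float_literal_1
    ("FloatLiteral", rf"{FLOAT_BODY_WITH_EXPONENT}(?:{SUFFIX})?"),
    # Reserved_float_empty_exponent
    ("Reserved", rf"{DECIMAL_PART}(?:\.{DECIMAL_PART})?[eE][+-]"),
    # Float_literal_2
    ("FloatLiteral",
     rf"{DECIMAL_PART}\.{DECIMAL_PART}(?:{SUFFIX})?"
     rf"|{DECIMAL_PART}\.(?!\.)(?!{IDENT_START})"),
]


def classify(text, rules=RULES):
    """Return (kind, matched text) for the first rule that matches, or None."""
    for kind, pattern in rules:
        m = re.match(pattern, text)
        if m:
            return kind, m.group(0)
    return None


print(classify("123.4e+5"))  # ('FloatLiteral', '123.4e+5')
print(classify("123.4e+"))   # ('Reserved', '123.4e+')
# With Float_literal_2 tried before the reserved form, 123.4 followed by
# the suffix e would be accepted instead:
print(classify("123.4e+", [RULES[2], RULES[1]]))  # ('FloatLiteral', '123.4e')
```

Float_literal_1 has to come before the reserved form as well; otherwise 123.4e+5 would be caught by the reserved pattern at 123.4e+.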

See e-suffix-restriction for discussion of Reserved_float_e_suffix_restriction.

Reserved integer

Grammar
Reserved_integer_e_suffix_restriction = {
    ( INTEGER_BINARY_LITERAL |
      INTEGER_OCTAL_LITERAL |
      INTEGER_DECIMAL_LITERAL ) ~
    RESTRICTED_E_SUFFIX
}
Pretoken kind

Reserved

Attributes

(none)

See e-suffix-restriction for discussion.

Integer literal

Grammar
Integer_literal = {
    ( INTEGER_BINARY_LITERAL |
      INTEGER_OCTAL_LITERAL |
      INTEGER_HEXADECIMAL_LITERAL |
      INTEGER_DECIMAL_LITERAL ) ~
    SUFFIX ?
}

INTEGER_BINARY_LITERAL = { "0b" ~ LOW_BASE_PRETOKEN_DIGITS }
INTEGER_OCTAL_LITERAL = { "0o" ~ LOW_BASE_PRETOKEN_DIGITS }
INTEGER_HEXADECIMAL_LITERAL = { "0x" ~ HEXADECIMAL_DIGITS }
INTEGER_DECIMAL_LITERAL = { DECIMAL_PART }
Pretoken kind

IntegerLiteral

Attributes
base    See below
digits  from LOW_BASE_PRETOKEN_DIGITS, HEXADECIMAL_DIGITS, or DECIMAL_PART
suffix  from SUFFIX (may be none)

The base attribute is determined from the following table, depending on which nonterminal participated in the match:

INTEGER_BINARY_LITERAL       binary
INTEGER_OCTAL_LITERAL        octal
INTEGER_HEXADECIMAL_LITERAL  hexadecimal
INTEGER_DECIMAL_LITERAL      decimal

Note: See rfc0879 for the reason we accept all decimal digits in binary and octal pretokens; the inappropriate digits are rejected in reprocessing.
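A minimal Python sketch of this two-stage treatment follows. The pretoken pattern takes any decimal digit after 0b or 0o, and a separate check flags the digits that do not belong to the base. The function names are hypothetical, and has_inappropriate_digit is only a stand-in for the relevant part of reprocessing, which is specified elsewhere and may reject pretokens for other reasons too.

```python
import re

# "0b" ~ LOW_BASE_PRETOKEN_DIGITS and "0o" ~ LOW_BASE_PRETOKEN_DIGITS, where
# LOW_BASE_PRETOKEN_DIGITS is any run of decimal digits and underscores.
LOW_BASE = re.compile(r"0(?P<marker>[bo])(?P<digits>[0-9_]*)")

ALLOWED = {"binary": "01", "octal": "01234567"}


def pretokenise_low_base(text):
    """Return (base, digits) for a leading 0b/0o body, or None."""
    m = LOW_BASE.match(text)
    if m is None:
        return None
    base = "binary" if m["marker"] == "b" else "octal"
    return base, m["digits"]


def has_inappropriate_digit(base, digits):
    """Hypothetical stand-in for the reprocessing check on digit range.

    Underscores are ignored; any other character outside the base's digit
    set counts as inappropriate.
    """
    return any(d not in ALLOWED[base] for d in digits if d != "_")


base, digits = pretokenise_low_base("0b0123")
print(base, digits)                           # binary 0123
print(has_inappropriate_digit(base, digits))  # True: 2 and 3 are not binary digits
```

Because the pretoken covers all of 0b0123, the bad digits are reported against a single literal instead of being split off into a following token.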

Note: The INTEGER_DECIMAL_LITERAL nonterminal is listed last in the Integer_literal definition in order to resolve ambiguous cases like the following:

  • 0b1e2 (which isn't 0 with suffix b1e2)
  • 0b0123 (which is rejected, not accepted as 0 with suffix b0123)
  • 0xy (which is rejected, not accepted as 0 with suffix xy)
  • 0x· (which is rejected, not accepted as 0 with suffix x·)
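The role of the ordering can be sketched in Python by modelling the Integer_literal choice in isolation and comparing it with a variant that tries the decimal alternative first. SUFFIX is defined elsewhere, so the ASCII identifier pattern here is an assumption, and the names ALTERNATIVES and match_integer_literal are hypothetical. Other pretoken nonterminals, including the reserved forms, are not modelled.

```python
import re

# Assumption: SUFFIX approximated by an ASCII identifier.
SUFFIX = r"(?P<suffix>[A-Za-z_][A-Za-z0-9_]*)?"

# Alternatives in the order used by Integer_literal (decimal last).
ALTERNATIVES = [
    ("binary", r"0b(?P<digits>[0-9_]*)"),
    ("octal", r"0o(?P<digits>[0-9_]*)"),
    ("hexadecimal", r"0x(?P<digits>[0-9a-fA-F_]*)"),
    ("decimal", r"(?P<digits>[0-9][0-9_]*)"),
]


def match_integer_literal(text, alternatives=ALTERNATIVES):
    """Return (base, digits, suffix) from the first matching alternative."""
    for base, body in alternatives:
        m = re.match(body + SUFFIX, text)
        if m:
            return base, m["digits"], m["suffix"]
    return None


DECIMAL_FIRST = [ALTERNATIVES[3]] + ALTERNATIVES[:3]

print(match_integer_literal("0b0123"))                 # ('binary', '0123', None)
print(match_integer_literal("0b0123", DECIMAL_FIRST))  # ('decimal', '0', 'b0123')
print(match_integer_literal("0xy"))                    # ('hexadecimal', '', 'y')
print(match_integer_literal("0xy", DECIMAL_FIRST))     # ('decimal', '0', 'xy')
```

With the decimal alternative last, the prefixed forms are matched with their real base, so the problems noted above (out-of-range digits, an empty digit sequence) stay visible to later processing instead of being absorbed into a suffix.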