Numeric literal pretokens

The following nonterminals are common to the definitions below:

Grammar

DECIMAL_DIGITS = { ('0'..'9' | "_") * }
HEXADECIMAL_DIGITS = { ('0'..'9' | 'a'..'f' | 'A'..'F' | "_") * }
LOW_BASE_PRETOKEN_DIGITS = { DECIMAL_DIGITS }
DECIMAL_PART = { '0'..'9' ~ DECIMAL_DIGITS }

RESTRICTED_E_SUFFIX = { ("e"|"E") ~ "_"+ ~ !XID_START ~ XID_CONTINUE }
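
As a rough illustration, the shared nonterminals above can be transliterated into regular expressions. This is only a sketch: the helper name matches_fully is hypothetical, and XID_START and XID_CONTINUE are approximated with ASCII classes ([A-Za-z] and [A-Za-z0-9_]), which is an assumption and narrower than the Unicode properties.

```python
import re

# Regex transliterations of the shared nonterminals.
DECIMAL_DIGITS = r"[0-9_]*"
HEXADECIMAL_DIGITS = r"[0-9a-fA-F_]*"
LOW_BASE_PRETOKEN_DIGITS = DECIMAL_DIGITS
DECIMAL_PART = r"[0-9]" + DECIMAL_DIGITS

# Assumption: XID_START ~ [A-Za-z] and XID_CONTINUE ~ [A-Za-z0-9_] (ASCII only).
# The "_" inside the negative lookahead mimics the PEG behaviour of "_"+,
# which consumes every underscore and never gives one back.
RESTRICTED_E_SUFFIX = r"[eE]_+(?![A-Za-z_])[A-Za-z0-9_]"


def matches_fully(pattern, text):
    """Return True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None
```

Under this ASCII approximation, RESTRICTED_E_SUFFIX amounts to e or E, one or more underscores, then a digit: matches_fully(RESTRICTED_E_SUFFIX, "e_1") holds, while "e_a" and "e1" do not match.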

Float literal

Grammar

Float_literal_1 = {
    FLOAT_BODY_WITH_EXPONENT ~ SUFFIX ?
}
Float_literal_2 = {
    FLOAT_BODY_WITHOUT_EXPONENT ~ SUFFIX ? |
    FLOAT_BODY_WITH_FINAL_DOT ~ !"." ~ !IDENT_START
}

FLOAT_BODY_WITH_EXPONENT = {
    DECIMAL_PART ~ ("." ~ DECIMAL_PART ) ? ~
    ("e"|"E") ~ ("+"|"-") ? ~ EXPONENT_DIGITS
}
EXPONENT_DIGITS = { "_" * ~ '0'..'9' ~ DECIMAL_DIGITS }

FLOAT_BODY_WITHOUT_EXPONENT = {
    DECIMAL_PART ~ "." ~ DECIMAL_PART
}

FLOAT_BODY_WITH_FINAL_DOT = {
    DECIMAL_PART ~ "."
}

Pretoken kind

FloatLiteral

Attributes


`body`	from `FLOAT_BODY_WITH_EXPONENT`, `FLOAT_BODY_WITHOUT_EXPONENT`, or `FLOAT_BODY_WITH_FINAL_DOT`
`suffix`	from `SUFFIX` (may be none)

Note: The !"." subexpression makes sure that forms like 1..2 aren't treated as starting with a float. The !IDENT_START subexpression makes sure that forms like 1.some_method() aren't treated as starting with a float.
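
The effect of the two negative lookaheads can be sketched with regular expressions. The function float_body_prefix is a hypothetical helper that covers only the body part of Float_literal_2 (the suffix is ignored), and IDENT_START is assumed to be approximated by the ASCII class [A-Za-z_].

```python
import re

DECIMAL_PART = r"[0-9][0-9_]*"

# FLOAT_BODY_WITHOUT_EXPONENT: digits, a dot, more digits.
WITHOUT_EXPONENT = re.compile(rf"{DECIMAL_PART}\.{DECIMAL_PART}")

# FLOAT_BODY_WITH_FINAL_DOT followed by !"." and !IDENT_START.
# Assumption: IDENT_START is approximated as ASCII [A-Za-z_].
WITH_FINAL_DOT = re.compile(rf"({DECIMAL_PART}\.)(?!\.)(?![A-Za-z_])")


def float_body_prefix(text):
    """Return the float body found at the start of `text`, or None."""
    m = WITHOUT_EXPONENT.match(text)
    if m:
        return m.group(0)
    m = WITH_FINAL_DOT.match(text)
    if m:
        return m.group(1)
    return None
```

With these definitions, float_body_prefix("1. + 2") returns "1.", while both "1..2" and "1.some_method()" return None, so those inputs are left for other rules to handle.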

Note: The Reserved_float_empty_exponent pretoken nonterminal is placed between Float_literal_1 and Float_literal_2 in priority order (which is why there are two pretoken nonterminals producing FloatLiteral).

Reserved float

Grammar

Reserved_float_empty_exponent = {
    DECIMAL_PART ~ ("." ~ DECIMAL_PART ) ? ~
    ("e"|"E") ~ ("+"|"-")
}
Reserved_float_e_suffix_restriction = {
    DECIMAL_PART ~ "." ~ DECIMAL_PART ~
    RESTRICTED_E_SUFFIX
}
Reserved_float_based = {
    (
        ("0b" | "0o") ~ LOW_BASE_PRETOKEN_DIGITS |
        "0x" ~ HEXADECIMAL_DIGITS
    )  ~  (
        ("e"|"E") ~ ("+"|"-" | EXPONENT_DIGITS) |
        "." ~ !"." ~ !IDENT_START
    )
}

Pretoken kind

Reserved

Attributes

(none)

Note: The Reserved_float_empty_exponent pretoken nonterminal is placed between Float_literal_1 and Float_literal_2 in priority order. This ordering makes sure that forms like 123.4e+ are reserved, rather than having their leading part accepted via FLOAT_BODY_WITHOUT_EXPONENT.
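
The priority order described in this note can be sketched as a first-match-wins loop over three rules. This is an illustration rather than the full pretokeniser: only these three nonterminals are modelled, the helper first_match is hypothetical, and SUFFIX and IDENT_START are assumed to be ASCII identifiers (they are defined elsewhere).

```python
import re

DEC = r"[0-9][0-9_]*"                # DECIMAL_PART
EXP_DIGITS = r"_*[0-9][0-9_]*"       # EXPONENT_DIGITS
SUFFIX = r"[A-Za-z_][A-Za-z0-9_]*"   # assumption: ASCII identifier

# (nonterminal, pretoken kind, regex), listed in priority order.
RULES = [
    ("Float_literal_1", "FloatLiteral",
     rf"{DEC}(?:\.{DEC})?[eE][+-]?{EXP_DIGITS}(?:{SUFFIX})?"),
    ("Reserved_float_empty_exponent", "Reserved",
     rf"{DEC}(?:\.{DEC})?[eE][+-]"),
    ("Float_literal_2", "FloatLiteral",
     rf"{DEC}\.{DEC}(?:{SUFFIX})?|{DEC}\.(?!\.)(?![A-Za-z_])"),
]


def first_match(text, rules=RULES):
    """Return (nonterminal, kind, matched text) for the first rule that matches."""
    for name, kind, pattern in rules:
        m = re.match(pattern, text)
        if m:
            return name, kind, m.group(0)
    return None
```

Here first_match("123.4e+") yields the Reserved kind. If the reserved rule is dropped, as in first_match("123.4e+", [RULES[0], RULES[2]]), the same input produces a FloatLiteral covering 123.4e under the assumed SUFFIX, which is the outcome the ordering is meant to prevent.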

See e-suffix-restriction for discussion of Reserved_float_e_suffix_restriction.

Reserved integer

Grammar

Reserved_integer_e_suffix_restriction = {
    ( INTEGER_BINARY_LITERAL |
      INTEGER_OCTAL_LITERAL |
      INTEGER_DECIMAL_LITERAL ) ~
    RESTRICTED_E_SUFFIX
}

Pretoken kind

Reserved

Attributes

(none)

See e-suffix-restriction for discussion.

Integer literal

Grammar

Integer_literal = {
    ( INTEGER_BINARY_LITERAL |
      INTEGER_OCTAL_LITERAL |
      INTEGER_HEXADECIMAL_LITERAL |
      INTEGER_DECIMAL_LITERAL ) ~
    SUFFIX ?
}

INTEGER_BINARY_LITERAL = { "0b" ~ LOW_BASE_PRETOKEN_DIGITS }
INTEGER_OCTAL_LITERAL = { "0o" ~ LOW_BASE_PRETOKEN_DIGITS }
INTEGER_HEXADECIMAL_LITERAL = { "0x" ~ HEXADECIMAL_DIGITS }
INTEGER_DECIMAL_LITERAL = { DECIMAL_PART }

Pretoken kind

IntegerLiteral

Attributes


`base`	See below
`digits`	from `LOW_BASE_PRETOKEN_DIGITS`, `HEXADECIMAL_DIGITS`, or `DECIMAL_PART`
`suffix`	from `SUFFIX` (may be none)

The base attribute is determined from the following table, depending on which nonterminal participated in the match:


`INTEGER_BINARY_LITERAL`	binary
`INTEGER_OCTAL_LITERAL`	octal
`INTEGER_HEXADECIMAL_LITERAL`	hexadecimal
`INTEGER_DECIMAL_LITERAL`	decimal
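
A minimal sketch of how the three attributes might be extracted follows. The function integer_literal and its dictionary result are hypothetical, and SUFFIX is assumed to be an ASCII identifier since it is defined elsewhere.

```python
import re

SUFFIX = r"[A-Za-z_][A-Za-z0-9_]*"  # assumption: ASCII identifier

# (base, regex) pairs in the order used by Integer_literal.
# Group 1 captures the `digits` attribute.
ALTERNATIVES = [
    ("binary", r"0b([0-9_]*)"),             # INTEGER_BINARY_LITERAL
    ("octal", r"0o([0-9_]*)"),              # INTEGER_OCTAL_LITERAL
    ("hexadecimal", r"0x([0-9a-fA-F_]*)"),  # INTEGER_HEXADECIMAL_LITERAL
    ("decimal", r"([0-9][0-9_]*)"),         # INTEGER_DECIMAL_LITERAL
]


def integer_literal(text):
    """Return the base, digits and suffix attributes, or None if nothing matches."""
    for base, body in ALTERNATIVES:
        m = re.match(rf"{body}({SUFFIX})?", text)
        if m:
            # A missing suffix is reported as None.
            return {"base": base, "digits": m.group(1), "suffix": m.group(2)}
    return None
```

For example, integer_literal("0o777i64") gives base octal, digits 777 and suffix i64, and integer_literal("42") gives base decimal, digits 42 and no suffix.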

Note: See rfc0879 for the reason we accept all decimal digits in binary and octal pretokens; the inappropriate digits are rejected in reprocessing.

Note: The INTEGER_DECIMAL_LITERAL nonterminal is listed last in the Integer_literal definition in order to resolve ambiguous cases like the following:

0b1e2 (which isn't 0 with suffix b1e2)

0b0123 (which is rejected, not accepted as 0 with suffix b0123)

0xy (which is rejected, not accepted as 0 with suffix xy)

0x· (which is rejected, not accepted as 0 with suffix x·)
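
The effect of listing INTEGER_DECIMAL_LITERAL last can be sketched by running the same ordered choice with two different orderings. The helper split_literal and the two list names are hypothetical, SUFFIX is assumed to be an ASCII identifier, and the later rejection of forms such as 0b0123 is outside this sketch.

```python
import re

SUFFIX = r"[A-Za-z_][A-Za-z0-9_]*"  # assumption: ASCII identifier
BINARY = r"0b[0-9_]*"
OCTAL = r"0o[0-9_]*"
HEXADECIMAL = r"0x[0-9a-fA-F_]*"
DECIMAL = r"[0-9][0-9_]*"

SPEC_ORDER = [BINARY, OCTAL, HEXADECIMAL, DECIMAL]     # decimal listed last
DECIMAL_FIRST = [DECIMAL, BINARY, OCTAL, HEXADECIMAL]  # hypothetical reordering


def split_literal(text, bodies):
    """Return (literal without suffix, suffix or None) using ordered choice."""
    for body in bodies:
        m = re.match(rf"({body})({SUFFIX})?", text)
        if m:
            return m.group(1), m.group(2)
    return None
```

With SPEC_ORDER, split_literal("0b1e2", SPEC_ORDER) gives ("0b1", "e2"), whereas DECIMAL_FIRST gives ("0", "b1e2"). Likewise "0xy" splits as ("0x", "y") under SPEC_ORDER but as ("0", "xy") when the decimal alternative is tried first.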