Fine-grained tokens
Tokenising produces fine-grained tokens.
Each fine-grained token has a kind. Most kinds of fine-grained token also have attributes.
The possible kinds of fine-grained token are listed in the table below together with their attributes.
| Kind | Attributes |
|---|---|
| Whitespace | |
| Line_comment | style, body |
| Block_comment | style, body |
| Character_literal | represented character, suffix |
| Byte_literal | represented byte, suffix |
| String_literal | represented string, suffix |
| Byte_string_literal | represented bytes, suffix |
| C_string_literal | represented bytes, suffix |
| Raw_string_literal | represented string, suffix |
| Raw_byte_string_literal | represented bytes, suffix |
| Raw_c_string_literal | represented bytes, suffix |
| Float_literal | body, suffix |
| Integer_literal | base, digits, suffix |
| Raw_lifetime_or_label | name |
| Lifetime_or_label | name |
| Raw_ident | represented ident |
| Ident | represented ident |
| Punctuation | mark |
These attributes have the following types:
| Attribute | Type |
|---|---|
| base | binary / octal / decimal / hexadecimal |
| body | sequence of characters |
| digits | sequence of characters |
| mark | single character |
| name | sequence of characters |
| represented byte | single byte |
| represented bytes | sequence of bytes |
| represented character | single character |
| represented ident | sequence of characters |
| represented string | sequence of characters |
| style | non-doc / inner doc / outer doc |
| suffix | sequence of characters |
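
As a rough illustration, the two tables above could be modelled in Rust as an enum with one variant per kind. The type names (`FineGrainedToken`, `Base`, `CommentStyle`), the field names, and the `suffix` helper are assumptions made for this sketch and are not defined by this document. Mapping the attribute types to `String`, `Vec<u8>`, `char`, and `u8` is likewise an assumption.

```rust
/// Possible values of the `base` attribute (hypothetical type name).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Base {
    Binary,
    Octal,
    Decimal,
    Hexadecimal,
}

/// Possible values of the `style` attribute (hypothetical type name).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CommentStyle {
    NonDoc,
    InnerDoc,
    OuterDoc,
}

/// One variant per kind in the table; fields mirror the listed attributes.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum FineGrainedToken {
    Whitespace,
    LineComment { style: CommentStyle, body: String },
    BlockComment { style: CommentStyle, body: String },
    CharacterLiteral { represented_character: char, suffix: String },
    ByteLiteral { represented_byte: u8, suffix: String },
    StringLiteral { represented_string: String, suffix: String },
    ByteStringLiteral { represented_bytes: Vec<u8>, suffix: String },
    CStringLiteral { represented_bytes: Vec<u8>, suffix: String },
    RawStringLiteral { represented_string: String, suffix: String },
    RawByteStringLiteral { represented_bytes: Vec<u8>, suffix: String },
    RawCStringLiteral { represented_bytes: Vec<u8>, suffix: String },
    FloatLiteral { body: String, suffix: String },
    IntegerLiteral { base: Base, digits: String, suffix: String },
    RawLifetimeOrLabel { name: String },
    LifetimeOrLabel { name: String },
    RawIdent { represented_ident: String },
    Ident { represented_ident: String },
    Punctuation { mark: char },
}

impl FineGrainedToken {
    /// Returns the `suffix` attribute for kinds that have one, else `None`.
    pub fn suffix(&self) -> Option<&str> {
        use FineGrainedToken::*;
        match self {
            CharacterLiteral { suffix, .. }
            | ByteLiteral { suffix, .. }
            | StringLiteral { suffix, .. }
            | ByteStringLiteral { suffix, .. }
            | CStringLiteral { suffix, .. }
            | RawStringLiteral { suffix, .. }
            | RawByteStringLiteral { suffix, .. }
            | RawCStringLiteral { suffix, .. }
            | FloatLiteral { suffix, .. }
            | IntegerLiteral { suffix, .. } => Some(suffix.as_str()),
            _ => None,
        }
    }
}
```

Using struct-like variants keeps each kind's attributes attached to that kind, so a kind without attributes (such as `Whitespace`) carries no data at all.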
Note: At this stage:
- Both `_` and keywords are treated as instances of Ident.
- There are explicit tokens representing whitespace and comments.
- Single-character tokens are used for all punctuation.
- A lifetime (or label) is represented as a single token (which includes the leading `'`).
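
Two of these notes can be sketched in a few lines of Rust. The reduced `Token` type and both helper functions are hypothetical names introduced only for this illustration, and representing `_` with the ident text `"_"` is an assumption. The sketch shows that a multi-character operator such as `>>=` becomes one Punctuation token per character, and that `_` and keywords get the same kind as any other identifier.

```rust
/// Hypothetical, reduced token type covering only the kinds used below.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Token {
    Ident(String),
    Punctuation(char),
}

/// Hypothetical helper: emits one Punctuation token per character,
/// so no multi-character operator tokens exist at this stage.
fn punctuation_tokens(operator: &str) -> Vec<Token> {
    operator.chars().map(Token::Punctuation).collect()
}

/// Hypothetical helper: `_` and keywords are not special-cased;
/// they produce an Ident token like any other identifier.
fn ident_token(text: &str) -> Token {
    Token::Ident(text.to_string())
}
```

Under these assumptions, `punctuation_tokens(">>=")` yields three tokens with marks `>`, `>`, and `=`, while `ident_token("fn")` and `ident_token("_")` both yield `Token::Ident` values.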