Fine-grained tokens

Tokenising produces fine-grained tokens.

Each fine-grained token has a kind. Most kinds of fine-grained token also have attributes.

The possible kinds of fine-grained token are listed in the table below together with their attributes.

| Kind | Attributes |
|------|------------|
| Whitespace | |
| Line_comment | style, body |
| Block_comment | style, body |
| Character_literal | represented character, suffix |
| Byte_literal | represented byte, suffix |
| String_literal | represented string, suffix |
| Byte_string_literal | represented bytes, suffix |
| C_string_literal | represented bytes, suffix |
| Raw_string_literal | represented string, suffix |
| Raw_byte_string_literal | represented bytes, suffix |
| Raw_c_string_literal | represented bytes, suffix |
| Float_literal | body, suffix |
| Integer_literal | base, digits, suffix |
| Raw_lifetime_or_label | name |
| Lifetime_or_label | name |
| Raw_ident | represented ident |
| Ident | represented ident |
| Punctuation | mark |

These attributes have the following types:

| Attribute | Type |
|-----------|------|
| base | binary / octal / decimal / hexadecimal |
| body | sequence of characters |
| digits | sequence of characters |
| mark | single character |
| name | sequence of characters |
| represented byte | single byte |
| represented bytes | sequence of bytes |
| represented character | single character |
| represented ident | sequence of characters |
| represented string | sequence of characters |
| style | non-doc / inner doc / outer doc |
| suffix | sequence of characters |
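
As an illustration only, the kinds and attribute types listed above could be modelled in Rust roughly as follows. This is a minimal sketch, not the representation used by any particular implementation: the names `FineToken`, `Base`, `CommentStyle` and `underscore_token` are hypothetical, and the choice of `String`, `Vec<u8>`, `char` and `u8` for "sequence of characters", "sequence of bytes", "single character" and "single byte" is an assumption.

```rust
/// Possible values of the `base` attribute (hypothetical type name).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Base {
    Binary,
    Octal,
    Decimal,
    Hexadecimal,
}

/// Possible values of the `style` attribute (hypothetical type name).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CommentStyle {
    NonDoc,
    InnerDoc,
    OuterDoc,
}

/// One variant per kind of fine-grained token; the fields are that kind's attributes.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum FineToken {
    Whitespace,
    LineComment { style: CommentStyle, body: String },
    BlockComment { style: CommentStyle, body: String },
    CharacterLiteral { represented_character: char, suffix: String },
    ByteLiteral { represented_byte: u8, suffix: String },
    StringLiteral { represented_string: String, suffix: String },
    ByteStringLiteral { represented_bytes: Vec<u8>, suffix: String },
    CStringLiteral { represented_bytes: Vec<u8>, suffix: String },
    RawStringLiteral { represented_string: String, suffix: String },
    RawByteStringLiteral { represented_bytes: Vec<u8>, suffix: String },
    RawCStringLiteral { represented_bytes: Vec<u8>, suffix: String },
    FloatLiteral { body: String, suffix: String },
    IntegerLiteral { base: Base, digits: String, suffix: String },
    RawLifetimeOrLabel { name: String },
    LifetimeOrLabel { name: String },
    RawIdent { represented_ident: String },
    Ident { represented_ident: String },
    Punctuation { mark: char },
}

/// Builds the token for a lone `_`, which at this stage is an `Ident`.
pub fn underscore_token() -> FineToken {
    FineToken::Ident { represented_ident: "_".to_string() }
}
```

Using one enum variant per kind keeps each kind's attributes attached to it, so a `match` over `FineToken` has to account for every row of the first table.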

Note: At this stage:

  • Both _ and keywords are treated as instances of Ident.
  • There are explicit tokens representing whitespace and comments.
  • Single-character tokens are used for all punctuation.
  • A lifetime (or label) is represented as a single token (which includes the leading ').
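
To make the first three points concrete, the sketch below writes out by hand the fine-grained tokens one would expect for the source text `let _ = x >> y; // done`. It is not a tokeniser. The `Tok` type and `fine_grained_example` function are hypothetical and cover only the kinds this example needs. Three details are assumptions rather than things stated above: each run of whitespace becomes a single token, the comment body excludes the leading `//`, and the comment's style attribute (non-doc here) is left out for brevity.

```rust
/// Reduced, hypothetical token type with only the kinds used in this example.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Tok {
    Whitespace,
    LineComment { body: String },
    Ident(String),
    Punctuation(char),
}

/// Hand-written token sequence for the source text `let _ = x >> y; // done`.
fn fine_grained_example() -> Vec<Tok> {
    use Tok::*;
    let ident = |s: &str| Ident(s.to_string());
    vec![
        ident("let"), // a keyword is still an Ident at this stage
        Whitespace,
        ident("_"), // so is `_`
        Whitespace,
        Punctuation('='),
        Whitespace,
        ident("x"),
        Whitespace,
        Punctuation('>'), // `>>` is two single-character tokens
        Punctuation('>'),
        Whitespace,
        ident("y"),
        Punctuation(';'),
        Whitespace,
        // Assumption: the body excludes the leading `//`.
        LineComment { body: " done".to_string() },
    ]
}
```

The `>>` operator appears as two adjacent `Punctuation` tokens with no `Whitespace` between them.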