Common definitions

Some grammar definitions which are needed on the following pages appear below.

Sets of characters

The following special terminals specify sets of Unicode characters:

Grammar
ANY
PATTERN_WHITE_SPACE
XID_START
XID_CONTINUE

ANY matches any Unicode character.

PATTERN_WHITE_SPACE matches any character which has the Pattern_White_Space Unicode property. These characters are:

U+0009(horizontal tab, '\t')
U+000A(line feed, '\n')
U+000B(vertical tab)
U+000C(form feed)
U+000D(carriage return, '\r')
U+0020(space, ' ')
U+0085(next line)
U+200E(left-to-right mark)
U+200F(right-to-left mark)
U+2028(line separator)
U+2029(paragraph separator)

Note: This set doesn't change in updated Unicode versions.

XID_START matches any character which has the XID_Start Unicode property (as of Unicode 16.0.0).

XID_CONTINUE matches any character which has the XID_Continue Unicode property (as of Unicode 16.0.0).

Identifier-like forms

Grammar
IDENT = { IDENT_START ~ XID_CONTINUE * }
IDENT_START = { XID_START | "_" }
SUFFIX = { IDENT }