Machine-readable escape-processing grammar
The machine-readable Pest grammar for escape processing is presented here for convenience.
See Parsing Expression Grammars for an explanation of the notation.
ANY, EOI, PATTERN_WHITE_SPACE, XID_START, and XID_CONTINUE are built in to Pest and so are not defined below.
TAB, CR, LF, DOUBLEQUOTE, and BACKSLASH are treated as special terminals in this writeup,
but they are not built in to Pest, so they are defined below using character-sequence terminals that contain escapes.
These definitions use Pest's silent rules (written _{ ... }), which produce no tokens or pairs of their own.
LITERAL_COMPONENTS = {
    LITERAL_COMPONENT *
}

LITERAL_COMPONENT = {
    !BACKSLASH ~ ANY |
    BACKSLASH ~ ESCAPE_BODY
}

ESCAPE_BODY = {
    SIMPLE_ESCAPE_BODY |
    UNICODE_ESCAPE_BODY |
    HEXADECIMAL_ESCAPE_BODY |
    STRING_CONTINUATION_ESCAPE_BODY
}

SIMPLE_ESCAPE_BODY = {
    "0" | "t" | "n" | "r" | DOUBLEQUOTE | "'" | BACKSLASH
}

UNICODE_ESCAPE_BODY = {
    "u" ~ "{" ~ ( HEXADECIMAL_DIGIT ~ "_" * ){1,6} ~ "}"
}

HEXADECIMAL_ESCAPE_BODY = {
    "x" ~ HEXADECIMAL_DIGIT ~ HEXADECIMAL_DIGIT
}

STRING_CONTINUATION_ESCAPE_BODY = {
    LF ~ ( TAB | LF | CR | " " ) *
}

HEXADECIMAL_DIGIT = {
    '0'..'9' | 'a'..'f' | 'A'..'F'
}

TAB = _{ "\u{0009}" }
CR = _{ "\u{000d}" }
LF = _{ "\u{000a}" }
DOUBLEQUOTE = _{ "\"" }
BACKSLASH = _{ "\\" }
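
For readers who want to check their reading of the grammar above without setting up Pest, the following is a minimal hand-written sketch of the same rules in plain Rust. It is an illustration only and not part of the writeup: it does not use the pest crate, all function names (literal_components, is_fully_matched, and the per-rule helpers) are hypothetical, and it only reports how much of the input matches rather than computing the represented characters. It assumes the usual PEG behaviour of ordered choice and greedy repetition.

```rust
// HEXADECIMAL_DIGIT: '0'..'9' | 'a'..'f' | 'A'..'F'
fn is_hex(c: char) -> bool {
    c.is_ascii_hexdigit()
}

// SIMPLE_ESCAPE_BODY: one of 0 t n r " ' \
fn simple_escape_body(s: &[char]) -> Option<usize> {
    match s.first().copied() {
        Some('0' | 't' | 'n' | 'r' | '"' | '\'' | '\\') => Some(1),
        _ => None,
    }
}

// UNICODE_ESCAPE_BODY: "u" ~ "{" ~ ( HEXADECIMAL_DIGIT ~ "_" * ){1,6} ~ "}"
fn unicode_escape_body(s: &[char]) -> Option<usize> {
    if s.len() < 2 || s[0] != 'u' || s[1] != '{' {
        return None;
    }
    let mut i = 2;
    let mut digits = 0;
    // Greedily take up to six digits, each followed by any number of underscores.
    while digits < 6 && i < s.len() && is_hex(s[i]) {
        i += 1;
        digits += 1;
        while i < s.len() && s[i] == '_' {
            i += 1;
        }
    }
    if digits >= 1 && i < s.len() && s[i] == '}' {
        Some(i + 1)
    } else {
        None
    }
}

// HEXADECIMAL_ESCAPE_BODY: "x" followed by exactly two hexadecimal digits
fn hexadecimal_escape_body(s: &[char]) -> Option<usize> {
    if s.len() >= 3 && s[0] == 'x' && is_hex(s[1]) && is_hex(s[2]) {
        Some(3)
    } else {
        None
    }
}

// STRING_CONTINUATION_ESCAPE_BODY: LF ~ ( TAB | LF | CR | " " ) *
fn string_continuation_escape_body(s: &[char]) -> Option<usize> {
    if s.first() != Some(&'\n') {
        return None;
    }
    let mut i = 1;
    while i < s.len() && matches!(s[i], '\t' | '\n' | '\r' | ' ') {
        i += 1;
    }
    Some(i)
}

// ESCAPE_BODY: ordered choice, tried in the same order as the grammar.
fn escape_body(s: &[char]) -> Option<usize> {
    simple_escape_body(s)
        .or_else(|| unicode_escape_body(s))
        .or_else(|| hexadecimal_escape_body(s))
        .or_else(|| string_continuation_escape_body(s))
}

// LITERAL_COMPONENT: any non-backslash character, or a backslash plus an escape body.
fn literal_component(s: &[char]) -> Option<usize> {
    match s.first().copied() {
        None => None, // ANY needs at least one character
        Some('\\') => escape_body(&s[1..]).map(|n| n + 1),
        Some(_) => Some(1),
    }
}

// LITERAL_COMPONENTS: returns the number of characters matched from the start.
fn literal_components(input: &str) -> usize {
    let chars: Vec<char> = input.chars().collect();
    let mut pos = 0;
    while let Some(n) = literal_component(&chars[pos..]) {
        pos += n;
    }
    pos
}

// True when LITERAL_COMPONENTS consumes the whole input.
fn is_fully_matched(input: &str) -> bool {
    literal_components(input) == input.chars().count()
}
```

Under this sketch, is_fully_matched(r"\u{1F_600}") is true, while is_fully_matched(r"\u{_1}") is false, because each underscore run must follow a hexadecimal digit. Since LITERAL_COMPONENTS is a plain repetition, an ill-formed escape does not make the rule fail; matching simply stops in front of the offending backslash, which is why the sketch compares the matched length with the input length.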