SICStus Prolog supports wide characters (up to 31 bits wide), interpreted as a superset of UNICODE.
Each character in the code set has to be classified as belonging to one of the character categories, such as small-letter, digit, etc. This classification is called the character-type mapping, and it is used for defining the syntax of tokens.
Only character codes 0..255 can be part of tokens, i.e. the ISO 8859/1 (Latin 1) subset of UNICODE [1]. This restriction may be lifted in the future.
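The symbol-char category consists of the characters:
+ - * / \ ^ < > = ~ : . ? @ # $ &
In addition, the non-ASCII character codes 161..191, 215, and 247 belong to this character type.
The punctuation-char category consists of the characters:
% ( ) , [ ] { | }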
Other characters are unclassified and may only appear in comments.
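The character-type mapping can be pictured as a lookup from character code to category. The following is a minimal Python sketch, not SICStus code. It covers only the two character lists given above and assumes they are the symbol-char and punctuation-char categories that the grammar refers to. The function name `char_type` is hypothetical.

```python
# Characters taken from the two lists in the text. The category names are an
# assumption based on the nonterminals used in the token grammar.
SYMBOL_CHARS = set("+-*/\\^<>=~:.?@#$&")
EXTRA_SYMBOL_CODES = set(range(161, 192)) | {215, 247}  # 161..191, 215, 247
PUNCTUATION_CHARS = set("%(),[]{|}")


def char_type(code):
    """Return 'symbol-char', 'punctuation-char', or None for any other code.

    None only means "not covered by this sketch"; the full mapping has more
    categories (letters, digits, layout characters, and so on).
    """
    if not 0 <= code <= 255:
        return None  # only codes 0..255 can be part of tokens
    ch = chr(code)
    if ch in SYMBOL_CHARS or code in EXTRA_SYMBOL_CODES:
        return "symbol-char"
    if ch in PUNCTUATION_CHARS:
        return "punctuation-char"
    return None
```

For instance, `char_type(ord("+"))` and `char_type(215)` both give `"symbol-char"`, while `char_type(ord("%"))` gives `"punctuation-char"`.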
token                 ::= name
                        | natural-number
                        | unsigned-float
                        | variable
                        | string
                        | punctuation-char
                        | layout-text
                        | full-stop

name                  ::= quoted-name
                        | word
                        | symbol
                        | solo-char
                        | [ ?layout-text ]
                        | { ?layout-text }

word                  ::= small-letter ?alpha...

symbol                ::= symbol-char...        { except in the case of a full-stop or where the first 2 chars are `/*' }

natural-number        ::= digit...
                        | base-prefix alpha...  { where each alpha must be digits of the base indicated by base-prefix, treating a,b,... and A,B,... as 10,11,... }
                        | 0 ' char-item         { yielding the character code for char }

unsigned-float        ::= simple-float
                        | simple-float exp exponent

simple-float          ::= digit... . digit...

exp                   ::= e | E

exponent              ::= digit... | sign digit...

sign                  ::= - | +

variable              ::= underline ?alpha...
                        | capital-letter ?alpha...

string                ::= " ?string-item... "

string-item           ::= quoted-char           { other than `"' or `\' }
                        | ""
                        | \ escape-sequence

quoted-atom           ::= ' ?quoted-item... '

quoted-item           ::= quoted-char           { other than `'' or `\' }
                        | ''
                        | \ escape-sequence

backquoted-atom       ::= ` ?backquoted-item... `

backquoted-item       ::= quoted-char           { other than ``' or `\' }
                        | ``
                        | \ escape-sequence

layout-text           ::= layout-text-item...

layout-text-item      ::= layout-char | comment

comment               ::= /* ?char... */        { where ?char... must not contain `*/' }
                        | % ?char... <LFD>      { where ?char... must not contain <LFD> }

full-stop             ::= .                     { the following token, if any, must be layout-text }

char                  ::= layout-char
                        | printing-char

printing-char         ::= alpha
                        | symbol-char
                        | solo-char
                        | punctuation-char
                        | quote-char

alpha                 ::= capital-letter | small-letter | digit | underline

escape-sequence       ::= b                     { backspace, character code 8 }
                        | t                     { horizontal tab, character code 9 }
                        | n                     { newline, character code 10 }
                        | v                     { vertical tab, character code 11 }
                        | f                     { form feed, character code 12 }
                        | r                     { carriage return, character code 13 }
                        | e                     { escape, character code 27 }
                        | d                     { delete, character code 127 }
                        | a                     { alarm, character code 7 }
                        | other-escape-sequence

quoted-name           ::= quoted-atom
                        | backquoted-atom

base-prefix           ::= 0b                    { indicates base 2 }
                        | 0o                    { indicates base 8 }
                        | 0x                    { indicates base 16 }

char-item             ::= quoted-item

other-escape-sequence ::= x alpha... \          { treating a,b,... and A,B,... as 10,11,... in the range [0..15], hex character code }
                        | o digit... \          { in the range [0..7], octal character code }
                        | <LFD>                 { ignored }
                        | \                     { stands for itself }
                        | '                     { stands for itself }
                        | "                     { stands for itself }
                        | `                     { stands for itself }

quoted-char           ::= <SPC>
                        | printing-char
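To illustrate the three forms of natural-number in the grammar, here is a minimal Python sketch that maps a token's text to its integer value. It is an illustration under stated assumptions, not the SICStus reader: the function name `natural_number_value` is hypothetical, and for the `0'` form it handles only a plain character or the doubled quote `''`, not backslash escape sequences.

```python
import re

# base-prefix alternatives from the grammar
BASES = {"0b": 2, "0o": 8, "0x": 16}


def natural_number_value(token):
    """Return the integer denoted by a natural-number token (sketch only)."""
    # 0 ' char-item : yields the character code of the item
    if token.startswith("0'"):
        item = token[2:]
        if item == "''":                      # doubled quote stands for '
            return ord("'")
        if len(item) == 1 and item not in "'\\":
            return ord(item)
        raise ValueError("unsupported char-item: %r" % item)

    # base-prefix alpha... : every alpha must be a digit of the given base
    prefix = token[:2]
    if prefix in BASES:
        digits = token[2:]
        if not re.fullmatch(r"[0-9A-Za-z]+", digits):
            raise ValueError("bad digits: %r" % digits)
        # int() raises ValueError if a digit is out of range for the base
        return int(digits, BASES[prefix])

    # digit...
    if re.fullmatch(r"[0-9]+", token):
        return int(token)
    raise ValueError("not a natural-number: %r" % token)
```

With this sketch, `natural_number_value("0xFF")` gives 255 and `natural_number_value("0'a")` gives 97, while `"0b12"` is rejected because 2 is not a binary digit.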
[1] Characters outside this range can still be included in quoted atoms and strings by using escape sequences (see ref-syn-syn-esc).
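As an illustration of how the escape-sequence rules in the grammar let a quoted atom or string carry characters outside 0..255, the following Python sketch decodes the backslash escapes in the body of a quoted item. It is a simplified model, not the SICStus implementation: the name `decode_escapes` is hypothetical, doubled quotes are not handled, and Python's `chr` limits the result to valid Unicode code points rather than the full 31-bit range.

```python
# Single-letter escapes and their character codes, as listed in the grammar.
SIMPLE_ESCAPES = {"b": 8, "t": 9, "n": 10, "v": 11, "f": 12,
                  "r": 13, "e": 27, "d": 127, "a": 7}
SELF_ESCAPES = "\\'\"`"          # characters that stand for themselves
HEX_DIGITS = "0123456789abcdefABCDEF"
OCT_DIGITS = "01234567"


def decode_escapes(body):
    """Decode backslash escape sequences in the body of a quoted item."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        i += 1                                # skip the backslash
        if i >= len(body):
            raise ValueError("dangling backslash")
        esc = body[i]
        if esc in SIMPLE_ESCAPES:
            out.append(chr(SIMPLE_ESCAPES[esc]))
            i += 1
        elif esc in SELF_ESCAPES:
            out.append(esc)
            i += 1
        elif esc == "\n":                     # \<LFD> is ignored
            i += 1
        elif esc in "xo":                     # x alpha... \  or  o digit... \
            end = body.find("\\", i + 1)      # closing backslash
            if end == -1:
                raise ValueError("unterminated numeric escape")
            digits = body[i + 1:end]
            valid = HEX_DIGITS if esc == "x" else OCT_DIGITS
            if not digits or any(d not in valid for d in digits):
                raise ValueError("bad digits: %r" % digits)
            out.append(chr(int(digits, 16 if esc == "x" else 8)))
            i = end + 1
        else:
            raise ValueError("unknown escape: \\%s" % esc)
    return "".join(out)
```

For example, the body `\x3bb\` decodes to the single character with code 955 (Greek small letter lambda), which could not appear literally in a token under the 0..255 restriction.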