The OmegaConf grammar¶
OmegaConf uses an ANTLR-based grammar to parse string expressions, where the lexer rules define the tokens used by the parser rules. Currently, the grammar is mainly used to parse interpolations, as detailed below.
Interpolation strings¶
An interpolation string is any string containing the ${ character sequence (denoting the start of an interpolation), and is parsed using the text rule of the grammar:
text: (interpolation | ANY_STR | ESC | ESC_INTER | TOP_ESC | QUOTED_ESC)+;
Such a string is either a single interpolation or a concatenation of multiple fragments, each of which is an interpolation or a regular string (with special handling of escaped characters, see Escaping in interpolation strings below). These are all examples of interpolation strings:
${foo.bar}
https://${host}:${port}
Hello ${name}
${a}${oc.env:B}${c}
Interpolation types¶
An interpolation as found in the rule above can either be a config node interpolation (e.g., ${host}) or a call to a resolver (e.g., ${oc.env:B}).
This is reflected in the following parser rules:
interpolation: interpolationNode | interpolationResolver;

interpolationNode:
      INTER_OPEN                                               // ${
      DOT*
      (configKey | BRACKET_OPEN configKey BRACKET_CLOSE)
      (DOT configKey | BRACKET_OPEN configKey BRACKET_CLOSE)*
      INTER_CLOSE;                                             // }

interpolationResolver:
      INTER_OPEN                                               // ${
      resolverName COLON sequence?
      BRACE_CLOSE;                                             // }
The following are all valid examples of config node interpolations according to the interpolationNode rule (note in particular that it supports both dot and bracket notations to access child nodes):
${host}
${.sibling}
${..uncle.cousin}
${some_list[3]}
${some_deep_dict[key1][subkey2].subsubkey3}
The following are examples of resolver calls matching the interpolationResolver rule:
${oc.env:B}
${my_resolver_without_args:}
${oc.select: missing, default}
Resolver arguments must be provided in a comma-separated list as per the following sequence parser rule:
sequence: (element (COMMA element?)*) | (COMMA element?)+;
Note that this rule currently supports empty arguments to preserve backward compatibility with OmegaConf 2.0, but this usage is deprecated (see #572).
Element types¶
As seen in the sequence rule above, each resolver argument is parsed by an element rule, which currently supports four main types of arguments:
element: quotedValue | listContainer | dictContainer | primitive ;
A quotedValue is a quoted string that may contain almost anything between either double or single quotes (including interpolations, which are resolved at evaluation time).
For instance:
"Hello World!"
'Hello ${name}!'
"I ${can: ${nest}, ${interpolations}, 'and quotes'}"
The quotedValue parser rule is formally defined as:
quotedValue: (QUOTE_OPEN_SINGLE | QUOTE_OPEN_DOUBLE) text? MATCHING_QUOTE_CLOSE;
listContainer and dictContainer are respectively lists and dictionaries, using a familiar syntax:
List examples: [], [1, 2, 3], [${a}, ${oc.env:B}, c]
Dict examples: {}, {a: 1, b: 2}, {a: ${a}, b: ${oc.env:B}}
Their corresponding parser rules are:
listContainer: BRACKET_OPEN sequence? BRACKET_CLOSE;
dictContainer: BRACE_OPEN (dictKeyValuePair (COMMA dictKeyValuePair)*)? BRACE_CLOSE;
Regarding dictionaries, note that although values can be any element, keys are more restricted, and in particular quoted strings and interpolations are currently not allowed as dictionary keys (see the definition of dictKey in the grammar).
Finally, a primitive is everything else that is allowed, including in particular (see the full grammar for details):
Unquoted strings (that support only a subset of characters, contrary to quoted ones): foo, foo_bar, hello world 123
Integer numbers: 123, -5, +1_000_000
Floating point numbers (with special case-independent keywords for infinity and NaN): 0.1, 1e-3, inf, -INF, nan
Other special keywords (also case-independent): null, true, false, NULL, True, fAlSe. IMPORTANT: None is not a special keyword and will be parsed as an unquoted string; you must use the null keyword instead (as in YAML).
Interpolations (thus allowing for nested interpolations)
Escaped characters¶
Some characters need to be escaped, with varying escaping requirements depending on the situation. In general, however, you can use the following rule of thumb: you only need to escape characters that otherwise have a special meaning in the current context.
Escaping in interpolation strings¶
In order to define fields whose value is an interpolation-like string, interpolations can be escaped with \${. For instance:
>>> from omegaconf import OmegaConf
>>> c = OmegaConf.create({"path": r"\${dir}", "dir": "tmp"})
>>> print(c.path) # does *not* interpolate into the `dir` node
${dir}
If you want a literal \ to be followed by a resolved interpolation, this backslash needs to be escaped into \\ to differentiate it from an escaped interpolation:
>>> c = OmegaConf.create({"path": r"C:\\${dir}", "dir": "tmp"})
>>> print(c.path) # *does* interpolate into the `dir` node
C:\tmp
Note that we use Python raw strings here to make the code more readable; otherwise all \ characters would need to be duplicated due to how Python handles escaping in regular string literals.
Finally, since the \ character has no special meaning unless followed by ${, it does not need to be escaped anywhere else:
>>> c = OmegaConf.create({"path": r"C:\foo_${dir}", "dir": "tmp"})
>>> print(c.path) # a single \ is preserved...
C:\foo_tmp
>>> c = OmegaConf.create({"path": r"C:\\foo_${dir}", "dir": "tmp"})
>>> print(c.path) # ... and multiple \\ too (no escape sequence)
C:\\foo_tmp
Escaping in unquoted strings¶
Unquoted strings can be found in a number of contexts, including dictionary keys/values, list elements, etc. As a result, escape sequences are required for some special characters (\\, \[, \], \{, \}, \(, \), \:, \=, \,), for instance:
C\:\\$\{dir\} resolves to the string "C:\${dir}"
\[a\, b\, c\] resolves to the string "[a, b, c]"
In addition, leading and trailing whitespace must be escaped in unquoted strings if it should not be stripped (inner whitespace is always preserved):
>>> c = OmegaConf.create({"esc": r"${oc.decode: \ hi u \ }"})
>>> c.esc # one leading whitespace and two trailing ones
' hi u  '
>>> # Tabs are handled similarly (NB: r-strings can't be used below)
>>> c = OmegaConf.create({"esc": "${oc.decode:\t\\\thi u\t\\\t\t}"})
>>> c.esc # one leading tab and two trailing ones
'\thi u\t\t'
Because escaping in unquoted strings can lead to hard-to-read expressions, it is recommended to switch to quoted strings instead of relying heavily on the above escape sequences.
Escaping in quoted strings¶
As can be seen from the definition of the quotedValue parser rule above, quoted strings are just text fragments surrounded by quotes, and are thus very similar to interpolation strings. As a result, the \${ escape sequence can also be used to escape interpolations in quoted strings (as described in Escaping in interpolation strings):
"\${dir}" resolves to the string "${dir}"
"C:\\${dir}" resolves to the string "C:\<value of dir>"
However, one key difference with interpolation strings is that quotes of the same type as the enclosing quotes must be escaped, unless they are within a nested interpolation. For instance:
'\'Hi you\', I said' resolves to the string "'Hi you', I said"
"'Hi ${concat: 'y', "o", u}', I said" also resolves to the string "'Hi you', I said" if concat is a custom resolver concatenating its inputs. The main point to pay attention to in this example is that the quoted strings 'y' and "o" found within the resolver interpolation ${concat: ...} do not need to be escaped, regardless of existing quotes outside of this interpolation.