2. Language Reference¶
This section gives a systematic overview of the NaBL2 language.
2.1. Lexical matters¶
2.1.1. Identifiers¶
Most identifiers in NaBL2 fall into one of two categories, which we will refer to as:
- Lowercase identifiers, that start with a lowercase character, and
must match the regular expression
[a-z][a-zA-Z0-9\_]*. - Uppercase identifiers, that start with an uppercase character, and
must match the regular expression
[A-Z][a-zA-Z0-9\_]*.
2.1.2. Comments¶
Comments in NaBL2 follow the C-style:
// ... single line ...for single-line comments/* ... multiple lines ... */for multi-line comments
Multi-line comments can be nested, and run until the end of the file
when the closing */ is omitted.
2.2. Modules¶
module module-id
section*
NaBL2 specifications are organized in modules. A module is identified
by a module identifier. Module identifiers consist of one or more
names seperated by slashes, as in {name "/"}+. The names
must match the regular expression
[a-zA-Z0-9\_][a-zA-Z0-9\_\.\-]*.
Every module is defined in its own file, with the extensions
.nabl2. The module name and the file paths must coincide.
Example. An empty module
analysis/main, defined in a file.../analysis/main.nabl2.module analysis/main // work on this
Modules consist of sections for imports, signatures, and rule definitions. The rest of this section describes imports, and subsequents sections deal with signatures and rules.
2.2.1. Imports¶
imports
module-ref*
A module can import definitions from other modules be importing the
other module. Imports are specified in an imports section, which
lists the modules being imported. A module reference can be:
- A module identifier, which imports a single module with that name.
- A wildcard, which imports all modules with a given prefix. A
wildcard is like a module identifier, but with a dash as the last
part, as in
{name "/"}+ "/-".
A wildcard import does not work recursively. For example,
analysis/- would imports analysis/functions, and
analysis/classes, but not analysis/lets/recursive.
Example. A main module importing several submodules.
module main imports builtins functions/- classes/- types
2.3. Signatures¶
signatures
signature*
Signatures contain definitions and parameters used in the specification. In the rest of this section, signatures for terms, name binding, functions and relations, and constraint rules are described.
2.3.1. Terms¶
Terms in NaBL2 are multi-sorted, and are defined in the sorts and
constructors signatures.
Sorts¶
sorts
sort-id*
Available since version 2.3.0
The sorts signature lists the sorts that are available. Sort are
identified by uppercase identifiers.
Example. Module declaring one sort Type.
module example
signature
sorts Type
Constructors¶
constructors
ctor-def*
Constructors are defined in a constructors signature, and
identified by uppercase identifiers. Constructor definitions are
written as follows:
- Nullary constructors are defined using
ctor-id ":" sort-id. - N-ary constructors are defined using
ctor-id ":" {sort-ref "*"}+ "->" sort-id.
Sort references can refer to sorts defined in the signature, or to several builtin sorts. One can refer to the following sorts:
- User-defined sorts using its
sort-id. - Tuples using
"(" {sort-ref "*"}* ")". - Lists using
"list(" sort-ref ")". - Maps using
"map(" sort-ref "," sort-ref ")". - Generic terms using the
"term"keyword. The term sort contains all possible terms, and can be seen as a supertype of all other sorts. - Strings using the
"string"keyword. - Scopes using the
"scope"keyword. - Occurrences using the
"occurrence"keyword. - Sort variables are written using lowercase identifiers.
For example, a module specifying the types for a language with numbers, functions, and records identified by scopes, might look like this:
module example
signature
sorts Type
constructors
NumT : Type
FunT : Type * Type -> Type
RecT : scope -> Type
2.3.2. Name binding¶
Two signatures are relevant for name binding. One describes namespaces, that are used for occurrences, and one describes the parameters for name resolution.
Namespaces¶
namespaces
namespace-def*
Namespaces are defined in the namespaces signature. Namespaces are
identified by uppercase identifiers. A namespace definition has the
following form: namespace-id (":" sort-ref)?
properties?. The optional ":" sort-ref indicates the sort
used for the types of occurrences in this namespace.
Other properties of occurrences in this namespace, are specified as a
block of the form "{" {(prop-id ":" sort-ref) ","}*
"}". Properties are identified by lowercase identifiers, and type
is a reserved property keyword that cannot be used.
The following example defines three namespaces: 1) for modules,
without a type or properties, 2) for classes, which has a property
to record the body of the class, and 3) for variables, which has a
type property, of sort Type. For completeness the sort
declaration for Type is shown as well.
module example
signature
sorts Type
namespaces
Module
Class { body : term }
Var : Type
Name resolution¶
name resolution
labels
label-id*
order
{label-order ","}*
well-formedness
label-regexp
Name resolution parameters are specified in a name-resolution
signature. Note that this block can only be specified once per
project.
Edge labels are specified using the labels keyword, followed by a
list of uppercase label identifiers. The label "D" is reserved and
signifies a declaration in the same scope.
The specificity order on labels is specified using the order
keyword, and a comma-separated list of label-ref "<"
label-ref pairs. Label references refer to a label identifier, or the
special label D.
Finally, the well-formedness predicate for paths is specified as a
regular expression over edge labels, after the well-formedness
keyword. The regular expression has the following syntax:
- A literal label using its
label-id. - Empty sequence using
"e". - Concatenation with
regexp regexp. - Optional (zero or one) with
regexp "?". - Closure (zero or more) with
regexp "*". - Non-empty (one or more) with
regexp "+". - Logical or with
regexp "|" regexp. - Logical and with
regexp "&" regexp. - Empty language using
"0", i.e., this will not match on anything. - Parenthesis, written as
"(" regexp ")", can be used to group complex expressions.
The following example shows the default parameters, that are used if no parameters are specified:
name resolution
labels
P I
order
D < P,
D < I,
I < P
well-formedness
P* I*
2.3.3. Functions and relations¶
Functions¶
functions
Relations¶
relations
( relation-option* relation-id (":" sort-ref "*" sort-ref)? "{" {variance-pattern ","}* "}" )*
relation-option = "reflexive" | "irreflexive"
| "symmetric" | "anti-symmetric"
| "transitive" | "anti-transitive"
variance-pattern = ctor-id "(" {variance ","}* ")"
| "[" variance "]"
| "(" {variance ","}* ")"
variance = "="
| "+"relation-id?
| "-"relation-id?
2.3.4. Rules¶
constraint generator
rule-def*
The type signatures for constraint generation rules are defined in a
constraint generator signature. Rule signatures describe the sort
being matched, the sorts of any parameters, and optionally the sort of
the type. A rule signature is written as rule-id? "[["
sort-ref "^" "(" {sort-ref ","}* ")" (":" sort-ref)? "]]". Rules are
identified by uppercase identifiers.
The following example shows a module that defines a default rule
for expressions, and rules for recursive and parallel bindings. The
rule for expressions has one scope parameter, and expressions are
assigned a type of sort Type. The bind rules are named, and
match on the same AST sort Bind. They take two scope
parameters, and do not assign any type to the bind construct.
module example
signature
constraint generator
[[ Expr ^ (scope) : Type ]]
BindPar[[ Bind ^ (scope, scope) ]]
BindRec[[ Bind ^ (scope, scope) ]]
NaBL2 supports higher-order rules. In those cases, the
rule-id is extended with a list of parameters, written as
rule-id "(" {rule-id ","}* ")".
For example, the rule that applies some rule, given as a parameter
X, to the elements of a list has signature Map1(X)[[ a ^ (b)
]]. Note that we use variables a and b for the AST and
parameter sort respectively, since the map rule is polymorphic.
2.4. Rules¶
rules
rule*
The rules section of a module defines syntax directed constraint generation rules.
2.4.1. Init rule¶
init ^ ( {parameter ","}* ) (":" type)? := {clause ","}+ .
init ^ ( {parameter ","}* ) (":" type)? .
Constraint generation starts by applying the default rule to the
top-level constructor. The init rule, which must be specified
exactly once, provides the initial values to the parameters of the
default rule.
Init rules come in two variants. The first variant outputs rule
clauses. These can create new scopes, or defined constraints on
top-level declarations. If the rule has no clauses, the rule can be
closed without a clause definition. For example, init ^ (). is
shorthand for init ^ () := true.
In the example module below, the default rule takes one scope parameter. The init rule creates a new scope, which will be used as the initial value for constraint generation.
module example
rules
init ^ (s) := new s.
[[ t ^ (s) ]].
2.4.2. Generation rules¶
rule-id? [[ pattern ^ ( {parameter ","}* ) (":" type)? ]] := {clause ","}+ .
rule-id? [[ pattern ^ ( {parameter ","}* ) (":" type)? ]] .
Variables not matched in the pattern, bound to parameters, or new scopes, are automatically inferred to be unification variables.
pattern = ctor-id "(" {pattern ","}* ")"
| "(" {pattern ","}* ")"
| "[" {pattern ","}* "]"
| "[" {pattern ","}* "|" pattern "]"
| "_"
| var-id
Recursive calls¶
2.5. Constraints¶
2.5.1. Error messages¶
clause = "true"
| "false" message?
message = "|" message-kind message-content message-position?
message-kind = "error" | "warning" | "note"
message-content = "\"" chars "\""
| "$[" (chars | "[" term "]")* "]"
message-position = "@" var-id
2.5.2. Term equality¶
clause = term "==" term message?
| term "!=" term message?
term = ctor-id "(" {term ","}* ")"
| "(" {term ","}* ")"
| "[" {term ","}* "]"
| "[" {term ","}* "|" term "]"
| namespace-id? "{" term ("@" var-id)? "}"
2.5.3. Name binding¶
Scope graph¶
clause = occurrence "->" scope
| occurrence "<-" scope
| scope "-"label-id"->" scope
| occurrence "="label-id"=>" scope
| occurrence "<="label-id"=" scope
| "new" var-id*
Name resolution¶
clause = occurrence "|->" occurrence message?
| occurrence "?="Label"=>" scope message?
| occurrence ":" type priority? message?
| occurrence"."prop-id ":=" term priority? message?
priority = "!"*
2.5.4. Set¶
clause = "distinct"("/"set-proj)? set-expr message?
| set-expr "subseteq"("/"set-proj)? set-expr message?
set-expr = "0"
| "(" set-expr "union"("/"set-proj)? set-expr ")"
| "(" set-expr "isect"("/"set-proj)? set-expr ")"
| "(" set-expr "minus"("/"set-proj)? set-expr ")"
| "D(" scope ")"("/"namespace-id)?
| "R(" scope ")"("/"namespace-id)?
| "V(" scope ")"("/"namespace-id)?
| "W(" scope ")"("/"namespace-id)?
set-proj = "name"
2.5.5. Functions and relations¶
clause = term "<"relation-id?"!" term message?
| term "<"relation-id?"?" term message?
| term "is" function-ref "of" term message?
function-ref = function-id
| relation-id".lub"
| relation-id".glb"
2.5.6. Symbolic¶
clause = "?-" term
| "!-" term