Quoted Identifiers in Ballerina

Hinduja Balasubramaniyam
Nov 13, 2020

Identifiers matter in a programming language as much as names matter to people. An identifier is a sequence of characters (or symbols) that uniquely identifies a programming entity such as a variable, type, label, or function.

For example,

int x = 5;

Here, x is an identifier used to uniquely represent a variable.

A language's lexical structure specifies which characters and sequence patterns developers may use in identifiers. To keep implementation and execution simple, most languages forbid certain words and character sets in identifiers.

if (x != 0) {
    // do something
}

Here, the word if defines the control structure of the program, so it is treated as a keyword.

These forbidden words are called reserved words. A reserved word is either a keyword that gives meaning to the program context or a word set aside for future use in the language (for forward compatibility).

Either way, this is an unavoidable limitation for programmers, because it restricts how they can name things. The same applies to the character sets allowed in an identifier: in most languages, only alphanumeric characters are commonly allowed.

Ballerina, a programming language with a developer-first approach, addresses this problem with its Quoted Identifiers feature. A quoted identifier allows an arbitrary non-empty string to be treated as an identifier. In particular, it lets the user use a Ballerina reserved keyword as an identifier.

This is achieved by preceding the identifier with a single quote mark (').

For example,

int 'int = 5;
string 'if = "keyword as an identifier";
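As a slightly fuller sketch, the two declarations above can be placed in a small program. The `main` function, the `doubled` variable, and the use of `io:println` are additions for illustration only; the point is that a keyword-named variable needs the quote every time it is referenced, not just where it is declared.

```ballerina
import ballerina/io;

public function main() {
    // Keyword-named variables, declared with a leading single quote.
    int 'int = 5;
    string 'if = "keyword as an identifier";

    // The quote is also required wherever the variables are used.
    int doubled = 'int * 2; // 'doubled' is a hypothetical helper variable
    io:println(doubled);
    io:println('if);
}
```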

The feature thus lets users use reserved words as identifiers. In addition, quoted identifiers allow a wide variety of characters.

The character sets that can be used in a quoted identifier are as follows:

  • Alphanumeric characters
string 'sampleVariable123 = "sample";
  • Underscore
int 'sample_variable = 2;
  • ASCII special characters with a preceding \ escape character
string '\{http\:\/\/test\.com\}_name = "Jack";
  • Unicode characters
string 'string_ɱȇşşağę = "Hello World";
  • Characters specified with hexadecimal Unicode code points
string 'unicode_\u{2324} = "John Doe";
//the expected identifier is 'unicode_⌤

With quoted identifiers, Ballerina gives programmers a much wider range of possible identifiers.

Quoted identifiers are not limited to primitive variables; they can be used to name various other entities in a Ballerina program.
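As an illustration, the sketch below uses quoted identifiers for record fields, a function name, and a parameter name. The `Shipment` type, its fields, and the `'check` function are all hypothetical names chosen for this example, and the set of entities shown is not meant to be exhaustive.

```ballerina
import ballerina/io;

// Hypothetical record type whose field names are Ballerina keywords.
type Shipment record {|
    string 'type;
    string 'from;
|};

// A function and its parameter can carry quoted names as well.
function 'check(Shipment 'record) returns string {
    return 'record.'type + " from " + 'record.'from;
}

public function main() {
    Shipment s = {'type: "parcel", 'from: "Colombo"};
    io:println('check(s));
}
```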

Because Ballerina currently runs on the JVM, it is subject to the JVM specification's restrictions on some ASCII special characters. Ballerina overcomes this limitation through its own encoding scheme based on UTF-8 Unicode values. With this approach, users can include JVM-reserved special characters such as . ; [ \ in their quoted identifiers by preceding each one with a \ .

This feature is fully available as of Ballerina Swan Lake Preview 4. Because Ballerina provides special support for networking, its programs are especially likely to need a wide range of identifiers. With the Quoted Identifier feature, we can now enjoy that freedom in naming.
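One networking-flavoured case where this helps is a JSON payload whose keys happen to be Ballerina keywords. The sketch below is only an illustration: the `Event` type and the payload are made up, and it assumes the `cloneWithType` function from the Swan Lake lang library. The leading quote is not part of the field name, so the field `'type` corresponds to the JSON key "type".

```ballerina
import ballerina/io;

// Hypothetical record mirroring a JSON payload with keyword-named keys.
type Event record {|
    string 'type;
    string 'from;
|};

public function main() returns error? {
    json payload = {"type": "login", "from": "10.0.0.7"};

    // Convert the JSON value into the record type; fails with an error
    // if the payload does not match the shape of Event.
    Event event = check payload.cloneWithType(Event);

    io:println(event.'type);
    io:println(event.'from);
}
```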

Hinduja Balasubramaniyam

Software Engineer at WSO2, BSc. (Hons.) in Information Technology, University of Moratuwa.