Domain specific languages for everyone

It is a widely held view that domain specific languages (DSLs) trade some of the flexibility of general purpose languages, which can express any program, for productivity and conciseness when writing programs in a particular domain [1]. However, many shy away from creating a DSL for their problem domain. Why is that? Can we find ways to simplify the creation of new DSLs?

DSLs are used in a variety of domains, ranging from HTML for web pages through regular expressions to SQL for database applications, so they are part of the day-to-day work of almost every programmer. Since domain specific languages focus on a well-defined class of problems, they are easier to use than general purpose languages such as C++, Java or Python, especially for domain experts with little or no programming knowledge. DSLs add another layer of abstraction to a problem domain, reducing the syntactic overhead compared to general purpose languages. In my opinion, this makes writing DSL code fast and easy, as its syntax is usually much simpler than that of a general purpose language. Beginners in particular can profit from the simpler syntax. One example is DSLs for data science frameworks, which can allow novice programmers to develop data intensive applications without much knowledge of the underlying frameworks. However, creating a DSL generally requires a lot of expertise, so domain experts are usually not in a position to build one themselves. Instead, they have to work together with specialists who provide them with a usable and useful DSL, a costly and complicated process that scares many away.

How can the creation of new DSLs be made more accessible, both for experienced and inexperienced programmers? How can DSL rules be produced from source code automatically, with as little input and knowledge required from the user as possible? Would a generic tool that allows anyone to construct their own DSL increase the acceptance of DSLs? What would its main requirements be? Such a tool might build a DSL from source code supplied by the user plus a little additional information. To make the creation of new rules fast and easy, the tool might point out the parts of the user's code that are likely to become variables in the DSL rules. These proposals could be generated from information found in the abstract syntax tree of the input Python code.
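To make the idea of syntax-tree-based proposals concrete, here is a minimal sketch using Python's standard `ast` module. It is not the heuristic the tool actually uses; the function name `propose_variables` and the choice to treat literals and read identifiers as candidates are assumptions made purely for illustration.

```python
import ast


def propose_variables(source: str) -> list[tuple[str, int, int]]:
    """Return (text, line, column) for each candidate, in source order.

    Assumption for this sketch: string/number literals and identifiers
    that are read (not assigned) are the parts a user most likely wants
    to turn into variables of a DSL rule.
    """
    tree = ast.parse(source)
    proposals = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Constant):
            value = node.value
            # bool is a subclass of int, so exclude it explicitly.
            if isinstance(value, (str, int, float)) and not isinstance(value, bool):
                proposals.append((repr(value), node.lineno, node.col_offset))
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            proposals.append((node.id, node.lineno, node.col_offset))
    # ast.walk gives no ordering guarantee, so sort by position.
    return sorted(proposals, key=lambda p: (p[1], p[2]))


snippet = 'result = df.groupby("country").head(5)'
candidates = propose_variables(snippet)
for text, line, column in candidates:
    print(f"line {line}, column {column}: {text}")
```

For the Pandas-style snippet above, the sketch proposes the data frame name, the column literal and the row count, which are exactly the pieces one would expect to vary between uses of such a rule. Method names like `groupby` are attributes rather than names in the tree, so they stay fixed.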

I gave it a try and developed the tool described above. The project was built on top of NLDSL [2], a library for creating DSLs that can be translated into executable Python code. NLDSL streamlines the creation of DSLs by bundling the syntax description and code generation into annotated Python functions. DSLs created with NLDSL can be translated into executable Python code interactively and in real time, directly from the source code in the editor. NLDSL focuses on creating DSLs for data science by supporting chains or pipelines of operations, as they are often found in popular libraries like Pandas and Spark. However, NLDSL is not limited to chains of operations and can be used for other domains too. It provides code completion for its DSL code and allows mixing DSL code with general purpose Python code, supporting a wide range of use cases.

The resulting prototype is a VSCode extension that helps users create a new DSL. It takes care of all the work with NLDSL by automatically creating the annotated Python functions that define the grammar rules. The plugin is composed of a TypeScript client and a Python server, communicating over WebSocket, a TCP-based network protocol. The TypeScript client gathers the required information about the input source code, together with additional user input, and sends this data to the Python server. From this data, the server creates an annotated Python function that represents a new grammar rule. The new rule is written to a file, and the DSL, represented by a Python module, is installed by the server. The DSL and its newly created grammar rule are then ready to use with the NLDSL infrastructure for in-editor code generation.
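The server-side step can be sketched roughly as follows. This is a simplified stand-in under stated assumptions, not the extension's real code: the JSON message layout (`rule`, `code`, `variables`), the function name `build_rule_source` and the generated plain function are all hypothetical, the WebSocket transport is left out, and the NLDSL annotation that the real tool attaches to the function is omitted because its exact form is not given here.

```python
import json


def build_rule_source(payload: str) -> str:
    """Turn a (hypothetical) client message into Python source for one rule.

    Assumed message layout: {"rule": name, "code": original snippet,
    "variables": [{"text": selected source text, "name": parameter name}]}.
    """
    request = json.loads(payload)
    # Escape literal braces so str.format only sees our placeholders.
    template = request["code"].replace("{", "{{").replace("}", "}}")
    params = []
    for var in request["variables"]:
        # Replace the first occurrence of the selected text with a placeholder.
        template = template.replace(var["text"], "{" + var["name"] + "}", 1)
        params.append(var["name"])
    signature = ", ".join(params)
    kwargs = ", ".join(f"{p}={p}" for p in params)
    return (
        f"def {request['rule']}({signature}):\n"
        f"    return {template!r}.format({kwargs})\n"
    )


payload = json.dumps({
    "rule": "top_rows",
    "code": 'df.groupby("country").head(5)',
    "variables": [
        {"text": '"country"', "name": "column"},
        {"text": "5", "name": "count"},
    ],
})
source = build_rule_source(payload)
print(source)

# Load the generated function and use it to produce target code.
namespace = {}
exec(source, namespace)
generated = namespace["top_rows"](column='"city"', count=3)
print(generated)
```

In the actual extension the generated source would be written to the DSL's Python module and installed, rather than executed in place as in this sketch. Matching selections by text is also a simplification; sending exact source positions from the client would be more robust.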

This tool greatly reduces the time and knowledge required to generate rules for a new DSL. New rules can be created within seconds. The user needs only very basic knowledge of DSLs in general and NLDSL in particular. They do not have to deal with any of the inner workings of NLDSL, and only need to know a few basic rules about the grammar that NLDSL supports.

For more information on this tool, feel free to contact me at fabian@inctec.de. If you are interested in this topic in general, check out the parallel and distributed systems group of Heidelberg University at https://pvs.ifi.uni-heidelberg.de/home.

[1] Voelter, Markus: DSL Engineering. Designing, Implementing and Using Domain-Specific Languages. CreateSpace Independent Publishing Platform, 2013, p. 30.
[2] NLDSL Project: https://pypi.org/project/nldsl/ (30.09.2020).

    Fabian Schenk

    Software engineer working on data intensive applications. He studied computer science with a focus on distributed systems. Loves climbing, bouldering and video games.