Basic Introduction to Domain Specific Languages

Most programming today is done in general purpose languages (GPLs) such as C++ and Java. These are good for writing programs that span multiple domains or problem spaces, but they lack features tailored to any one narrow task. This is where domain-specific languages (DSLs, sometimes called application-oriented languages) come in. A DSL is a programming language or executable specification language whose notation is tailored to a very specific domain. DSLs are based on the concepts and features of that domain, and as a result they give up generality in exchange for expressiveness in their target area. Examples of DSLs in use today are HTML, LaTeX and SQL.
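To make the trade-off between generality and expressiveness concrete, here is a minimal sketch that computes the same result twice: once with hand-written Python (standing in for the GPL) and once with SQL run through Python's standard `sqlite3` module. The table name `staff`, the sample rows and both function names are made up for illustration.

```python
import sqlite3

# Hypothetical sample data: (name, department, salary)
rows = [
    ("Ada", "engineering", 120),
    ("Grace", "engineering", 130),
    ("Linus", "support", 90),
]


def total_by_department_gpl(records):
    """General purpose approach: spell out *how* to compute the totals."""
    totals = {}
    for _name, department, salary in records:
        totals[department] = totals.get(department, 0) + salary
    return sorted(totals.items())


def total_by_department_sql(records):
    """DSL approach: state *what* is wanted and let the SQL engine do the work."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staff (name TEXT, department TEXT, salary INTEGER)")
    conn.executemany("INSERT INTO staff VALUES (?, ?, ?)", records)
    query = (
        "SELECT department, SUM(salary) FROM staff "
        "GROUP BY department ORDER BY department"
    )
    result = conn.execute(query).fetchall()
    conn.close()
    return result


print(total_by_department_gpl(rows))  # [('engineering', 250), ('support', 90)]
print(total_by_department_sql(rows))  # [('engineering', 250), ('support', 90)]
```

The SQL version says nothing about loops or dictionaries. It only uses the vocabulary of its domain (tables, grouping, sums), which is exactly what it cannot do outside that domain.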

The advantages of DSLs over GPLs are that they allow very concise code, which expresses its purpose and the domain's idioms clearly. Furthermore, they allow faster programming within the domain the DSL is designed for, as a lot of the complexity of programming is abstracted away. For example, with an internal DSL, a library could be used to hide the complexity of algorithms from the user. As DSLs reduce the amount of programming expertise needed, less experienced programmers are able to use the algorithms without having to understand their implementation in a GPL. Perhaps the most important advantage is that DSLs incorporate domain knowledge: the language directly supports the domain's concepts and notation.

Disadvantages of DSLs include the cost of creating, maintaining and supporting both the language itself and the tools around it. Users must also be trained in its use, and all of this can be expensive in time and money. Another drawback is the limited availability of resources and support. GPLs have large user bases, which means large ecosystems of tools are built around them. Because a DSL is created for a specialist domain, its user base will be small, so potentially less support is available, and the tools supporting it will often be proprietary and limited. Issues may also arise from a proliferation of non-standard DSLs: skills learned for one DSL can become useless when you move to a different company in the same domain that uses an alternative DSL. Finally, where a DSL relies on full code generation, the generated code will probably be less efficient than hand-written code. Although this is an issue, it will probably only affect a small number of use cases.

DSLs are not a new concept. For example APT, a DSL for programming numerically controlled machines, was developed in 1957-1958. But DSLs have become more popular in recent times. There are two main types of DSL used today: internal (also called embedded) and external.

Internal DSLs often take the form of APIs in a GPL, so that the internal DSL is part of, and managed by, a general purpose language. They are often used in the Ruby and Lisp communities. They are becoming a popular way to create a programming language, as they get around some of the complexities of an external DSL. A large problem with a DSL is creating the language itself, as you need to create and maintain the infrastructure and ecosystem that support it, such as a compiler or interpreter. Building your DSL on top of a GPL often avoids these issues.

When designing an internal DSL, a host language and its constructs are used to implement the new language. This means that the new language has to be syntactically compatible with the GPL, as it will be compiled by the same compiler or run by the same interpreter. An internal DSL can be either an extension or a reduction of the host GPL.
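As a rough illustration of using host constructs to build the new language, here is a small sketch of an internal DSL written as a fluent (method chaining) API in Python. The `Pipeline` class and its methods are hypothetical names invented for this example and do not come from any particular library.

```python
class Pipeline:
    """Hypothetical internal DSL for describing text clean-up steps."""

    def __init__(self):
        self._steps = []

    def strip(self):
        self._steps.append(str.strip)
        return self  # returning self is what makes the chained syntax possible

    def lower(self):
        self._steps.append(str.lower)
        return self

    def replace(self, old, new):
        self._steps.append(lambda text: text.replace(old, new))
        return self

    def run(self, text):
        # Apply the recorded steps in the order they were declared.
        for step in self._steps:
            text = step(text)
        return text


# The "script" below is ordinary Python, handled by the ordinary interpreter.
slugify = Pipeline().strip().lower().replace(" ", "-")
print(slugify.run("  Domain Specific Languages "))  # domain-specific-languages
```

No parser or compiler had to be written here. The price is that the "language" can only look like whatever Python's own syntax permits.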

A DSL can be an extension to a host GPL, such that the abstractions made through the DSL are available to the host GPL. This means that the full GPL is available to the user, so the host GPL's features don't need to be reimplemented in the DSL. This approach does limit the syntax of the DSL, as it has to be compromised to fit the syntactic rules of the host language.
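Both sides of that trade-off can be seen in a short sketch. Here a filter DSL is built with Python operator overloading; the names `Field`, `Condition` and `select` are assumptions made up for this example. The full host language stays available (any Python function can be wrapped as a condition), but the syntax has to bend to Python's rules.

```python
class Condition:
    """A predicate over a row (a dict), combinable with & like a tiny query language."""

    def __init__(self, predicate):
        self.predicate = predicate

    def __and__(self, other):
        return Condition(lambda row: self.predicate(row) and other.predicate(row))

    def __call__(self, row):
        return self.predicate(row)


class Field:
    """Names a column; comparisons build Condition objects instead of booleans."""

    def __init__(self, name):
        self.name = name

    def __gt__(self, value):
        return Condition(lambda row: row[self.name] > value)

    def __eq__(self, value):
        return Condition(lambda row: row[self.name] == value)


def select(rows, where):
    return [row for row in rows if where(row)]


people = [
    {"name": "Ann", "age": 34, "city": "Leeds"},
    {"name": "Bob", "age": 17, "city": "Leeds"},
    {"name": "Cat", "age": 52, "city": "York"},
]
age, city = Field("age"), Field("city")

# DSL syntax. The parentheses are a compromise forced by the host language:
# Python's & binds more tightly than > and ==.
adults_in_leeds = select(people, where=(age > 18) & (city == "Leeds"))

# Extension in action: an arbitrary host-language function mixes in freely.
starts_with_c = Condition(lambda row: row["name"].startswith("C"))
older_c_names = select(people, where=(age > 18) & starts_with_c)

print([p["name"] for p in adults_in_leeds])  # ['Ann']
print([p["name"] for p in older_c_names])    # ['Cat']
```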

When the DSL is a reduction of the host GPL, the new language is specialized to the domain. It may therefore be important to hide the host GPL constructs that are not relevant to the domain. The DSL ends up being a filtered subset of the host GPL mixed with the new domain-specific concepts.
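One way to picture a reduction is a formula language that reuses Python's expression syntax but only lets a whitelisted subset through. The sketch below uses the standard `ast` module for this. The function names and the choice of allowed constructs are my own assumptions, and it is an illustration of the idea rather than a hardened sandbox.

```python
import ast
import operator

# The only host-language operators this hypothetical formula DSL exposes.
_ALLOWED_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expression, variables):
    """Evaluate an arithmetic formula over named domain variables."""
    tree = ast.parse(expression, mode="eval")  # reuse the host language's parser
    return _eval(tree.body, variables)


def _eval(node, variables):
    # Numeric literals are allowed (booleans, strings and so on are not).
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    # Names are allowed only if they are domain variables supplied by the caller.
    if isinstance(node, ast.Name) and node.id in variables:
        return variables[node.id]
    if isinstance(node, ast.BinOp) and type(node.op) in _ALLOWED_BINARY:
        left = _eval(node.left, variables)
        right = _eval(node.right, variables)
        return _ALLOWED_BINARY[type(node.op)](left, right)
    # Everything else in the host language (calls, attributes, ...) is hidden.
    raise ValueError(f"construct not allowed in this DSL: {type(node).__name__}")


order = {"price": 3, "quantity": 4, "shipping": 5}
print(evaluate("price * quantity + shipping", order))  # 17
```

Anything outside the whitelist, such as a function call, is rejected with a `ValueError` even though it is perfectly valid Python.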

[Figure: DSL diagram]

The other main architecture is the external DSL. This is where the language is parsed independently of the host GPL, essentially following the typical compiler architecture. The language is independent from the rest of the program (XML is often used), which often means more effort is involved in building it, since a full parser needs to be written. In contrast with an internal DSL, an external DSL allows a fully custom syntax. This often means the end user will find it easier to write, as the language can be closely tailored to the idioms of the domain. External DSLs can be implemented through interpretation or code generation. Using an interpreter is often easier, but code generation may be the only option, for example when runtime performance is important. The generated code is often in a high-level language. Examples of external DSLs used today are regular expressions, SQL and XML.
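To show the two implementation routes side by side, here is a toy external DSL for moving a robot on a grid. The language (`move <n>`, `turn left|right`, `#` comments) and all function names are invented for this sketch. The script is plain text with its own syntax, parsed independently of Python, and the parsed program is then either interpreted directly or translated into Python source.

```python
SCRIPT = """
# walk two steps north, then three steps east
move 2
turn right
move 3
"""


def parse(source):
    """Turn DSL text into a list of (operation, argument) tuples."""
    program = []
    for number, raw in enumerate(source.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        parts = line.split()
        if len(parts) == 2 and parts[0] == "move" and parts[1].isdecimal():
            program.append(("move", int(parts[1])))
        elif len(parts) == 2 and parts[0] == "turn" and parts[1] in ("left", "right"):
            program.append(("turn", parts[1]))
        else:
            raise SyntaxError(f"line {number}: cannot parse {raw!r}")
    return program


def interpret(program):
    """Route 1: execute the parsed program directly."""
    x, y, dx, dy = 0, 0, 0, 1  # start at the origin, facing north
    for op, arg in program:
        if op == "move":
            x, y = x + dx * arg, y + dy * arg
        elif arg == "right":
            dx, dy = dy, -dx
        else:
            dx, dy = -dy, dx
    return x, y


def generate(program):
    """Route 2: emit Python source that no longer needs the parser at run time."""
    lines = ["def run():", "    x, y, dx, dy = 0, 0, 0, 1"]
    for op, arg in program:
        if op == "move":
            lines.append(f"    x, y = x + dx * {arg}, y + dy * {arg}")
        elif arg == "right":
            lines.append("    dx, dy = dy, -dx")
        else:
            lines.append("    dx, dy = -dy, dx")
    lines.append("    return x, y")
    return "\n".join(lines)


program = parse(SCRIPT)
print(interpret(program))  # (3, 2)

namespace = {}
exec(generate(program), namespace)  # load the generated high-level code
print(namespace["run"]())  # (3, 2)
```

Even in a toy like this, the parser and its error handling are extra work that an internal DSL would get for free from the host language.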

[Figure: Internal vs External]

Finally, there are also language workbenches. A language workbench is a specialized IDE for defining and building DSLs. It lets you define the abstract syntax and structure of a DSL, provides an editor in which people write DSL scripts, and includes a generator which translates the DSL into an executable representation.
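A workbench itself is a full tool, so it can't be shown in a few lines, but two of the pieces it manages can be sketched by hand: an abstract syntax and a generator. In the sketch below the abstract syntax of a hypothetical state machine DSL is modelled with dataclasses, and the generator turns it into a lookup table that can be executed. All class and function names are assumptions for illustration.

```python
from dataclasses import dataclass, field


# Abstract syntax: the structure of a DSL script, independent of how it is written.
@dataclass
class Transition:
    event: str
    target: str


@dataclass
class State:
    name: str
    transitions: list = field(default_factory=list)


@dataclass
class Machine:
    start: str
    states: list


def generate(machine):
    """Generator: translate the abstract syntax into an executable lookup table."""
    return {
        (state.name, transition.event): transition.target
        for state in machine.states
        for transition in state.transitions
    }


def run(machine, events):
    table = generate(machine)
    current = machine.start
    for event in events:
        # Events with no matching transition leave the state unchanged.
        current = table.get((current, event), current)
    return current


# In a workbench this structure would be produced by the editor, not typed by hand.
door = Machine(
    start="closed",
    states=[
        State("closed", [Transition("open", "opened")]),
        State("opened", [Transition("close", "closed")]),
    ],
)
print(run(door, ["open", "close", "open"]))  # opened
```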

This is just a basic introduction to DSLs; I'll be exploring this subject in much more depth in the near future. DSLs are a fuzzy concept, as what you class as a DSL can be extremely broad. In the next section I'll explore what distinguishes a DSL from a framework with a normal command-query API. I'll also be exploring more DSL concepts, such as the Semantic Model of a DSL, and running through some examples.

Article sources:

[1] – http://en.wikipedia.org/wiki/Domain-specific_language

[2] – Fowler, Martin. Domain-specific languages. Pearson Education, 2010.

[3] – Van Deursen, Arie, and Paul Klint. “Domain-specific language design requires feature descriptions.” CIT. Journal of Computing and Information Technology 10.1 (2002): 1-17.

[4] – Husselmann, Alwyn V. Data-parallel Structural Optimisation in Agent-based Modelling. May 2014.

[5] –  http://philcalcado.com/research-on-dsls/domain-specific-languages-dsls/internal-dsls/

[6] – Van Deursen, Arie, Paul Klint, and Joost Visser. “Domain-Specific Languages: An Annotated Bibliography.” Sigplan Notices 35.6 (2000): 26-36.

[7] – Mernik, Marjan, Jan Heering, and Anthony M. Sloane. “When and how to develop domain-specific languages.” ACM computing surveys (CSUR) 37.4 (2005): 316-344.

[8] – http://martinfowler.com/articles/languageWorkbench.html
