3. The Data Interchange Language

This chapter outlines the syntax of None source code at the data interchange format level from the perspective of nesting arbitrary expressions.

3.1. Syntax Tree Elements

The None parser has been designed for minimalism and recognizes only five types of syntax tree elements:

  • Comments
  • Numbers
  • Symbols
  • Strings
  • Lists

All further understanding of types is done with additional parsing at later stages that depend on language context and evaluation scope.

3.1.1. Comments

Comments are not skipped by the parser but stored as strings and stripped before the expansion stage. A comment token is recognized by its ; prefix and scanned until the next newline character. Here are some examples of valid comments:

;mostly harmless
;123
;"test"
;(an unresolved tension
;; ;(another comment); ;;

Comments are first-class tokens and kept by the parser until the next processing stage. In naked notation, they contribute to syntax. For more information, see Pitfalls of Naked Notation.

3.1.2. Numbers

Numbers are atomic elements and internally stored as 64-bit floating point numbers, which gives them 53-bit integer precision (52 stored mantissa bits plus one implicit bit). Here are some examples of valid numbers:

; positive and negative integers
0 23 42 -303 12 -1
; positive and negative floating point numbers
0.0 1.0 3.14159 -2.0 0.000003
; numbers in scientific notation
1.234e+24 -1e-12
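
The integer range of a 64-bit float can be checked with a short sketch. It is written in Python, whose float type is also an IEEE 754 double; the helper name is_exact is hypothetical and not part of None. Every integer up to 2**53 survives the round trip through a double, while larger odd integers do not:

```python
LIMIT = 2 ** 53  # largest power of two below which all integers are exact


def is_exact(n):
    """Return True if the integer n survives a round trip through a double."""
    return int(float(n)) == n


print(is_exact(LIMIT))      # exactly representable
print(is_exact(LIMIT + 1))  # rounds to a neighbouring even integer
```

Scripts that need larger exact integers would therefore have to represent them in some other way, for example as strings.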

3.1.3. Symbols

Symbols are atomic elements and internally stored as strings. They may contain any UTF-8 encoded character except whitespace, any character from the set ;()[]{}, and, in naked notation, the character / (more on that later). Any symbol that parses as a number is also excluded. Here are some examples of valid symbols:

; classic underscore notation
some_identifier _some_identifier
; hyphenated
some-identifier
; mixed case
SomeIdentifier
; fantasy operators
&+ >~ >>= and= str+str
; numbered
_300
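
The notation-independent part of these rules can be sketched as a small Python predicate. This is an illustration only: the name is_symbol and the NUMBER pattern are assumptions made for this sketch, and the pattern merely approximates the number forms shown in the previous section rather than the parser's actual number grammar. The extra restriction for naked notation is left out.

```python
import re

# Assumed approximation of the number syntax: integers, decimals and
# scientific notation with an optional sign.
NUMBER = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?")
FORBIDDEN = set(";()[]{}")


def is_symbol(token):
    """Return True if token satisfies the symbol rules described above."""
    if not token:
        return False
    if any(ch.isspace() or ch in FORBIDDEN for ch in token):
        return False
    # Anything that parses as a number is a number, not a symbol.
    return NUMBER.fullmatch(token) is None


print(is_symbol(">>="), is_symbol("_300"), is_symbol("300"))
```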

3.1.4. Strings

Strings are atomic elements and stored as such, scoped by " (double quotes). The \ escape character can be used to describe various control characters. Here are some examples of valid strings:

"single-line string"
"multi-
line
string"
"newline: \n, tab: \t, backslash: \\, double quote: \"."
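
How such a literal might be decoded can be sketched in Python. The function unescape and the ESCAPES table are hypothetical names, and the table contains only the four escapes that appear in the example above; the real parser may accept more.

```python
# Only the escapes shown in the example above; an assumption of this sketch.
ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", '"': '"'}


def unescape(literal):
    """Decode a double-quoted literal, including its surrounding quotes."""
    chars = iter(literal[1:-1])  # drop the enclosing double quotes
    out = []
    for ch in chars:
        if ch == "\\":
            # The character after a backslash selects the replacement;
            # an escape unknown to this sketch raises KeyError.
            ch = ESCAPES[next(chars)]
        out.append(ch)
    return "".join(out)


print(unescape(r'"tab: \t, double quote: \"."'))
```

Line breaks inside the quotes need no special handling in this sketch, since they are ordinary characters of the string body.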

3.1.5. Lists

Lists are the only nesting type, scoped by the parenthesis characters ( and ). Lists can be empty or contain an unlimited number of elements, separated by whitespace. They typically describe expressions in None. Here are some examples of valid lists:

; empty list
()
; list containing a symbol, a string, a number, and an empty list
(print "hello world" 303 ())
; three nested lists
((()))
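
To make the five element types concrete, here is a minimal reader sketched in Python. It is not None's parser: the names TOKEN and read are hypothetical, every atom is kept as raw text instead of being classified as number or symbol, and the sketch additionally assumes that a double quote never appears inside a symbol. Comments are kept as elements, as described in the section on comments.

```python
import re

# One alternative per token kind; whitespace between tokens is skipped.
TOKEN = re.compile(r'''
    (?P<comment>;[^\n]*)
  | (?P<string>"(?:\\.|[^"\\])*")
  | (?P<open>\()
  | (?P<close>\))
  | (?P<atom>[^\s;()\[\]{}"]+)
''', re.VERBOSE | re.DOTALL)


def read(source):
    """Parse coated source text into nested Python lists of raw tokens."""
    stack = [[]]  # stack[0] collects the top-level elements
    for match in TOKEN.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "open":
            stack.append([])
        elif kind == "close":
            if len(stack) == 1:
                raise ValueError("unbalanced closing parenthesis")
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            # Comments, strings and atoms are stored as their source text.
            stack[-1].append(text)
    if len(stack) != 1:
        raise ValueError("unbalanced opening parenthesis")
    return stack[0]


print(read('(print "hello world" 303 ())'))
```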

3.2. Naked & Coated Expressions

Every None source file is parsed as a single expression, where the head element is the language name and all remaining elements are expressions.

The classic notation (what we will call coated notation) uses the syntax known to Lisp and Scheme users as restricted S-expressions:

(none ; None scripts and modules use this header

; there must not be any tokens outside the parentheses guarding the
; top level expression.

; statement expressions go here
(print "Hello World")
...

; the value returned when loaded as a module.
; scripts usually return null
null)

As a modern alternative, None offers a naked notation where the scope of expressions is implicitly balanced by indentation, an approach used by Python, Haskell, YAML, Sass and many other languages.

This source parses to the same expression as the coated example:

none
; a single element in a single line without sub-expressions is assumed to
; be a complete expression.

; multiple elements in a single line are transformed to an expression
print "Hello World"
...

; return null
null

3.2.1. Mixing Modes

Naked expressions can contain coated expressions, but coated expressions can only contain other coated expressions:

; compute the value of (1 + 2 + (3 * 4)) and print the result
(print
    (+ 1 2
        (* 3 4)))

; the same expression in naked notation.
; indented expressions are spliced into the parent expression:
print
    + 1 2
        * 3 4

; any part of a naked expression can be coated
print
    + 1 2 (* 3 4)

; but a coated expression cannot contain naked parts
print
    (+ 1 2
        * 3 4) ; parsed as (+ 1 2 * 3 4), a syntax error

; correct version:
print (+ 1 2 (* 3 4))

Because it is more convenient for users without specialized editors to write in naked notation, and balancing parentheses can be challenging for beginners, the author suggests using coated notation sparingly and in good taste. Purists and enthusiasts may, however, prefer to keep only the top level naked, as in most Lisp-like languages, and work exclusively with coated expressions otherwise.

Therefore None’s reference documentation describes all available symbols in coated notation, while code examples make ample use of naked notation.

3.3. Pitfalls of Naked Notation

As naked notation giveth the user the freedom to care less about parentheses, it also taketh away. In the following section we will discuss the various difficulties that can arise and how to solve them.

3.3.1. Single Element Expressions

Special care must be taken when expressions with single elements are used.

Here is a coated expression printing the number 42:

(print 42)

The naked equivalent declares two elements in a single line, which are implicitly wrapped in a single expression:

print 42

A single element on its own line is always taken as-is:

print
    42

What happens when we want to call functions without arguments? Consider this example:

; a coated expression in a new scope, printing a new line,
; followed by the number 42
(do
    (print)
    (print 42))

A naive naked transcription would probably look like this, and be very wrong:

do
    ; surprisingly, the new line is never printed, why?
    print
    print 42

Translating the naked expression back to coated reveals what is going on:

(do
    ; interpreted as a symbol, not as an expression
    print
    (print 42))

The straightforward fix to this problem would be to explicitly wrap the single element in parentheses:

do
    (print)
    print 42

Nudists might however want to use a line comment symbol that forces the line to be wrapped in an expression and therefore has the same effect:

do
    print ;
    print 42

This is possible because comments are first-class tokens and work as “empty symbols” that are kept until the next parsing stage. This allows ; to be used as a helper to express complex trees in naked notation.

3.3.2. Wrap-Around Lines

There are often situations when a high number of elements in an expression interferes with best practices of formatting source code and exceeds the line column limit (typically 80 or 100).

In coated expressions, the problem is easily corrected:

; import many symbols from an external module into the active namespace
(import-from "OpenGL"
    glBindBuffer GL_UNIFORM_BUFFER glClear GL_COLOR_BUFFER_BIT
    GL_STENCIL_BUFFER_BIT GL_DEPTH_BUFFER_BIT glViewport glUseProgram
    glDrawArrays glEnable glDisable GL_TRIANGLE_STRIP)

The naked approach interprets each new line as a nested expression:

; produces runtime errors
import-from "OpenGL"
    glBindBuffer GL_UNIFORM_BUFFER glClear GL_COLOR_BUFFER_BIT
    GL_STENCIL_BUFFER_BIT GL_DEPTH_BUFFER_BIT glViewport glUseProgram
    glDrawArrays glEnable glDisable GL_TRIANGLE_STRIP

; coated equivalent of the expression above; each line is interpreted
; as a function call and fails.
(import-from "OpenGL"
    (glBindBuffer GL_UNIFORM_BUFFER glClear GL_COLOR_BUFFER_BIT)
    (GL_STENCIL_BUFFER_BIT GL_DEPTH_BUFFER_BIT glViewport glUseProgram)
    (glDrawArrays glEnable glDisable GL_TRIANGLE_STRIP))

An easy fix is to put each element on a separate line, which is not the worst solution:

; correct solution using single element lines
import-from "OpenGL"
    glBindBuffer
    GL_UNIFORM_BUFFER
    glClear
    GL_COLOR_BUFFER_BIT
    GL_STENCIL_BUFFER_BIT
    GL_DEPTH_BUFFER_BIT
    glViewport
    glUseProgram
    glDrawArrays
    ; comments should go on a separate line
    glEnable
    glDisable
    GL_TRIANGLE_STRIP

A terse approach would be to make use of the \ (splice-line) control character, which is only available in naked notation and splices the line starting at the next token into the active expression:

; correct solution using splice-line, postfix style
import-from "OpenGL" \
    glBindBuffer GL_UNIFORM_BUFFER glClear GL_COLOR_BUFFER_BIT \
    GL_STENCIL_BUFFER_BIT GL_DEPTH_BUFFER_BIT glViewport glUseProgram \
    glDrawArrays glEnable glDisable GL_TRIANGLE_STRIP

Unlike in other languages, \ splices at the token level rather than the character level, and can therefore also be placed at the beginning of nested lines, where the parent is still the active expression:

; correct solution using splice-line, prefix style
import-from "OpenGL"
    \ glBindBuffer GL_UNIFORM_BUFFER glClear GL_COLOR_BUFFER_BIT
    \ GL_STENCIL_BUFFER_BIT GL_DEPTH_BUFFER_BIT glViewport glUseProgram
    \ glDrawArrays glEnable glDisable GL_TRIANGLE_STRIP

3.3.3. Tail Splicing

While naked notation is ideal for writing nested expressions that accumulate at the tail:

; coated
(a b c
    (d e f
        (g h i))
    (j k l))

; naked
a b c
    d e f
        g h i
    j k l

...there are complications when additional elements need to be spliced back into the parent expression:

(a b c
    (d e f
        (g h i))
    j k l)

A simple but valid approach would be to make use of the single-element rule again and put each tail element on a separate line:

a b c
    d e f
        g h i
    j
    k
    l

But we can also reuse the splice-line control character to this end:

a b c
    d e f
        g h i
    \ j k l

3.3.4. Left-Hand Nesting

When using infix notation, conditional blocks or functions producing functions, expressions arise that nest at the head level rather than the tail:

((((a b)
    c d)
        e f)
            g h)

While this expression tree is easy to describe in coated notation, pure naked expressions will need to make use of line comment characters and optional splice-line characters to trade parentheses for additional indentation:

;removing these
    ;comments is
        ;a syntax error
            a b
            \ c d
        \ e f
    \ g h

Once again, the first-class token nature of ; is put to use in order to be able to start left-hand nesting expressions.

A more complex tree which also requires splicing elements back into the parent expression can be realized with the same combo of line comment and splice-line control character:

; coated
(a
    ((b
        (c d)) e)
    f g
    (h i))

; naked
a
    ;
        b
            c d
        e
    \ f g
    h i

While this example demonstrates the versatile usefulness of splice-line control and line comments as empty symbols, less seasoned users may prefer to express similar trees in partial coated notation instead.
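
The rules discussed in this section can be condensed into a small converter, sketched here in Python. It is an illustration under simplifying assumptions, not the actual parser: all names are hypothetical, lines may contain only symbols, numbers, a trailing ; comment and a leading \ (splice-line), and neither strings, coated sub-expressions nor the postfix form of \ are handled. A line becomes a list when it has more than one token or any indented children; a comment counts as a token and is stripped afterwards.

```python
class Comment(str):
    """A comment token: kept while parsing, stripped afterwards."""


def tokenize(text):
    """Split one line into tokens; a ; starts a comment running to line end."""
    head, sep, tail = text.partition(";")
    tokens = head.split()
    if sep:
        tokens.append(Comment(sep + tail))
    return tokens


def parse_block(lines, pos, indent):
    """Parse sibling lines starting at pos; return (elements, next position)."""
    out = []
    while pos < len(lines) and lines[pos][0] >= indent:
        line_indent, tokens = lines[pos]
        pos += 1
        children = []
        if pos < len(lines) and lines[pos][0] > line_indent:
            children, pos = parse_block(lines, pos, lines[pos][0])
        if tokens[0] == "\\":
            out.extend(tokens[1:] + children)  # splice into the parent
        elif len(tokens) > 1 or children:
            out.append(tokens + children)      # the line becomes a list
        else:
            out.append(tokens[0])              # single element, taken as-is
    return out, pos


def strip_comments(node):
    """Remove comment tokens from a tree, keeping the lists they forced."""
    if isinstance(node, list):
        return [strip_comments(n) for n in node if not isinstance(n, Comment)]
    return node


def parse_naked(source):
    """Return the list of top-level elements described by naked source."""
    lines = [(len(raw) - len(raw.lstrip(" ")), tokenize(raw.strip()))
             for raw in source.splitlines() if raw.strip()]
    elements, _ = parse_block(lines, 0, 0)
    return strip_comments(elements)


def to_coated(node):
    """Render a parsed tree in coated notation."""
    if isinstance(node, list):
        return "(" + " ".join(to_coated(n) for n in node) + ")"
    return node


print(to_coated(parse_naked("do\n    print ;\n    print 42")[0]))
```

Feeding the left-hand nesting examples from this section to parse_naked yields the same trees as their coated counterparts, which makes the sketch a convenient way to check how a given indentation is grouped.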

3.4. Block Comments

While all comments are recorded at the parser stage, they are stripped before the macro expansion stage. In addition to ; single-line comments, None recognizes and strips a special kind of multi-line comment.

A list beginning with a symbol that starts with ### (triple hash) describes a block comment. Block comments have to remain syntactically consistent. Here are some examples of valid block comments:

; block comments in coated notation
(###this comment
    will be removed)
(###
    ; commenting out whole sections
    (function ()
        true)
    (function ()
        false))

; block comments in naked notation
###this comment
    will be removed

###
    ; commenting out whole sections
    function ()
        true
    function ()
        false
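
The stripping step can be sketched in Python on a tree that has already been parsed into nested lists. The function name strip_block_comments and the list-of-strings representation are assumptions of this sketch; it only illustrates the rule that a list whose head symbol starts with ### is dropped as a whole.

```python
def strip_block_comments(node):
    """Return a copy of the tree without lists headed by a ### symbol."""
    if not isinstance(node, list):
        return node
    kept = []
    for child in node:
        is_block_comment = (
            isinstance(child, list)
            and len(child) > 0
            and isinstance(child[0], str)
            and child[0].startswith("###")
        )
        if not is_block_comment:
            kept.append(strip_block_comments(child))
    return kept


tree = ["none",
        ["###this", "comment", ["will", "be", "removed"]],
        ["print", "42"],
        ["###", ["function", [], "true"], ["function", [], "false"]]]
print(strip_block_comments(tree))
```

Because the comment is removed as a list, its contents must still form a well-balanced tree, which is one way to read the requirement that block comments remain syntactically consistent.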