syntactik/Syntactik

Join the chat at https://gitter.im/syntactik/Syntactik

Syntactik - if you want a better XML or JSON, but don't want a change.

Overview

Syntactik is a people-friendly markup language and preprocessor with code reuse features, designed to be semantically compatible with XML and JSON.

  • Define data using clean and intuitive syntax.
  • Create aliases and reuse them as code fragments or document templates.
  • Compile Syntactik documents into XML or JSON files.
  • Validate Syntactik documents against XML schema.

Syntactik uses indents to define document structure, similarly to Python and YAML. An inline, whitespace-agnostic syntax is also available when a compact form is needed.

The purpose of the language is:

  • to expand the audience of people working with data structures.
  • to improve the productivity of individuals working with XML and JSON.
  • to create new use cases by introducing a data-oriented markup language with people-friendly syntax and code reuse features.

Command line tool

The executable file slc.exe is a command line tool that compiles the specified files, stores the results in the output directory, and validates the output XML against an XML schema if XSD files are listed in the options. The command line format is:

slc [options] [inputFiles]

Options:

  • -i=DIR Input directory with s4x, s4j and xsd files
  • -o=DIR Output directory
  • -r Turns on recursive search of files in the input directories

You can specify one or more input directories or files of type .s4x, .s4j and .xsd. If neither directories nor files are given, the compiler takes them from the current directory. If s4x files are found, then s4j files are ignored.

Language design objectives

  • Must be people friendly
  • Minimal number of syntax rules
  • Semantic compatibility with XML and JSON
  • Must have code reuse features
  • Simple implementation of parser. No dependencies on parser generators.

Language design principles

  • Syntactik uses indents to define document structure (like Python, YAML, etc.).
  • Inline and whitespace agnostic syntax is also available to satisfy needs of advanced users.
  • Quotes are not required to define literals. Quotes should be used on special occasions like inline definitions, strings with special (escaped) symbols, string interpolation, etc.
  • Syntactik supports basic primitives of both XML (like namespaces, elements, and attributes) and JSON (like collections and lists).
  • Syntactik has code reuse features in the form of parameterized aliases and string interpolation.

Example

'''Namespace definitions
!#ipo = http://www.example.com/ipo 
''' "xsi" is a namespace prefix.
!#xsi = http://www.w3.org/2001/XMLSchema-instance

"""Document "PurchaseOrder" starts on the next line. 
You can define several documents per file."""
!PurchaseOrder: 
    ipo.purchaseOrder: ''' Xml element "purchaseOrder" with namespace prefix "ipo"
        @orderDate == 2016-12-20 '''This is attribute
        shipTo:
            @xsi.type = ipo:EU-Address
            @export-code = 1
            ipo.name = Helen Zoe
            ipo.street = 47 Eden Street
            ipo.city = Cambridge
            ipo.postcode = 126            
        billTo:
            $Address.US.NJ '''Alias with compound name
        Items:
            $Items.Jewelry.Necklaces.Lapis:
                %quantity == 2 '''Argument of the alias
            item:
                @partNum = 748-OT
                productName = Diamond heart
                quantity = 1
                price = 248.90
                ipo.comment = Valentine's day packaging.
                shipDate = 2016-12-24
'''Alias definitions                
!$Address.US.NJ:
    @xsi.type = ipo:US-Address
    ipo.name = Robert Smith
    ipo.street = 8 Main St
    ipo.city = Fort Lee
    ipo.state = NJ
    ipo.zip = 07024            

!$Items.Jewelry.Necklaces.Lapis:
    item:
        @partNum = 833-AA
        productName = Lapis necklace
        quantity := !%quantity == 1 '''Parameter with default value
        price = 99.95
        ipo.comment = Need this for the holidays!
        shipDate = 2016-12-12                

Text encoding

  • Currently the parser supports UTF-8.
  • Syntactik is case sensitive.
  • Indents are defined by tab (ASCII 0x09, \t) or space (ASCII 0x20) characters. The parser checks the consistency of indents. It returns an error if indents are defined by both tabs and spaces in the same file. The parser also checks that the same number of tabs or spaces is used for each indent.
  • Syntactik supports "Windows" (ASCII 0x0D 0x0A, \r\n) and "Linux" (ASCII 0x0A, \n) newline symbols.

Comments

Single line comments start with '''

'''The whole line is a comment
name == John Smith '''The rest of the line is a comment

Multiline comments start and end with """.

""" 
If the comment is too long, it can be continued on the 
next line.
"""
name == John Smith """ The multiline comment can start anywhere
and continue as far as needed.
"""

Module

Syntactik has a notion of a module. A module is physically represented by a file with the extension "s4x" or "s4j". A module can include namespace definitions, documents, alias definitions, elements, attributes, aliases, and namespace scopes.

s4x-module

s4x-module is physically represented by a file with extension "s4x" (meaning "syntactik for XML"). Text in s4x-module semantically represents XML nodes.

s4j-module

s4j-module is physically represented by a file with extension "s4j" (meaning "syntactik for JSON"). Text in s4j-module semantically represents JSON name/value pairs.

The difference between the S4X and S4J semantics is explained later in this document.

Name/value pair

The name/value pair abstraction is a cornerstone of the Syntactik language. This abstraction consists of 3 parts:

  • Name
  • Assignment
  • Value

Any part can be omitted. For example, an omitted name or value means that the name or value is empty. The assignment can be omitted only if the value is omitted.

Name

A name can be either open or quoted.

Open name

An open name is defined using open string rules. Open name starts with an optional prefix followed by an identifier.

Prefix

A prefix defines a type of the language construct. If it is omitted, then the language construct is an element. This is the list of the prefixes and the corresponding language constructs:

Prefix Type
(none) Element
! Document
!$ Alias Definition
!# Namespace Definition
!% Parameter
# Namespace Scope
@ Attribute
$ Alias
% Argument

The prefix symbols ! @ # $ % are easy to find: on a US keyboard layout they sit on the number keys 1 to 5.

Identifier

IDs are defined by the same rules as for an XML element name:

  • Element id is case-sensitive
  • Element id must start with a letter or underscore
  • Element id can contain letters, digits, hyphens, underscores, and periods
  • Element names cannot contain spaces
  • See the full spec for the XML Name here http://www.w3.org/TR/REC-xml/#NT-Name

Compound identifier

An identifier with one or more dots is called a compound identifier. Dots split a compound identifier into several parts. In an element, attribute or namespace scope, a compound identifier is used to specify a namespace prefix. For example, in the following code, the attribute and the element have the namespace prefix ipo.

@ipo.export-code = 1
ipo.name = Robert Smith

Only the first dot is used to define the namespace prefix. All following dots are part of the name. So if an identifier starts with ., then the name has no namespace prefix, and all following dots are included in the name. The following example shows how to define attributes and elements that have no namespace prefix and have a dot in the name:

''' Attribute "name.first"
@.name.first = Robert
''' Elements "name.first" and "name.last"
.name.first = Robert
''' Quoted name is used to create the name with dot
"name.last" = Smith

All language constructs except a namespace definition can have a compound identifier. The compound IDs are used in alias definitions to structure them in the same way classes are structured using namespaces in programming languages:

!$Items.Jewelry.Necklaces.Lapis:
...
!$Items.Jewelry.Necklaces.Pearl:
...
!$Items.Jewelry.Diamonds.Heart:
...
!$Items.Jewelry.Diamonds.Uncut:
...
!$Items.Jewelry.Rings.Amber:
...
!$Items.Jewelry.Earrings.Jade:
...

Code editors can take advantage of compound IDs to provide a better code completion.
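
The first-dot rule that applies to elements, attributes and namespace scopes can be sketched in Python. This is an illustration only, not code from the Syntactik parser: the function name split_compound_id is hypothetical, and the input is assumed to be the identifier with any prefix symbol such as @ already removed.

```python
def split_compound_id(identifier):
    """Split an open-name identifier into (namespace_prefix, name).

    Only the first dot separates the prefix from the name; any later
    dots stay in the name. A leading dot means "no namespace prefix".
    """
    prefix, dot, rest = identifier.partition(".")
    if not dot:
        # Simple identifier: no dot at all, so no namespace prefix.
        return "", identifier
    return prefix, rest


print(split_compound_id("ipo.name"))     # prefix "ipo", name "name"
print(split_compound_id(".name.first"))  # no prefix, name "name.first"
```

The sketch does not apply to alias names, where the dots only structure the name and do not denote a namespace prefix.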

Simple identifier

The term simple identifier is used as the opposite of compound identifier: the identifier has no dots in it, so it is not divided into parts and represents a single-word identifier. In the example below the name is a simple identifier:

name = John Smith

Quoted name

A quoted name is defined by a single quoted or a double quoted string. A pair with a quoted name always represents an element. Elements with a quoted name don't have a namespace prefix.

Value

There are three types of values: literal, block and pair.

Literal

In Syntactik, literals can have the following meaning:

  • string (in s4x and s4j-modules)
  • number (only in s4j-modules)
  • boolean (only in s4j-modules)
  • null (only in s4j-modules)

Strings are defined using string literals. Number, boolean and null literals are defined by JSON literals. The omitted value represents the empty string.

Block

A block is a syntax construct that defines a parent/child relationship between pairs. Blocks define the tree structure of the document. A block starts with the object assignment : or one of the auxiliary assignments (::, :::, =: or =::) followed by indented name/value pairs. As mentioned in the section "Language design principles", Syntactik uses indents to define document structure. Lines of code that have the same indentation belong to the same block.

In the following example, the element billTo has a block with one attribute and five elements.

billTo:
    @xsi.type = ipo:US-Address
    ipo.name = Robert Smith
    ipo.street = 8 Main St
    ipo.city = Fort Lee
    ipo.state = NJ
    ipo.zip = 07024

Object

Syntactik has a notion of an object. An object is a set of name/value pairs where each pair corresponds to an XML element, an XML attribute, a JSON name/value pair or a JSON array item. Objects are defined by blocks that start with the object assignment :, choice :: or array assignment :::.

Empty object

An empty object is an object that doesn't contain any name/value pairs. The empty block represents the empty object.

Empty block

A block can be empty. In this case, it represents the empty object. In the example below, the element items has the empty block.

items:
comments = this is the empty order

Pair value

A pair value represents a single pair. It can be confused with a block with one name/value pair, but a pair value has a different meaning: the value of the pair is another pair. It is used when a literal alias or a literal parameter has to be specified as the literal value of the pair. See pair assignment for more details.

Assignment

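Syntactik has three types of assignment operators, one for each value type:

  • literal assignment (= and ==) for a literal value
  • block assignment (: and the auxiliary assignments) for a block value
  • pair assignment (:=) for a pair value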

Literal assignment

A literal assignment is used when the right side of the assignment is a literal. There are two subtypes of a literal assignment.

Free open string assignment

The equality sign = is used for free open strings and quoted strings, and for their multiline versions:

name = John Smith
address = 1100 Main St
    Prescott, AZ, 86303
description = 'Text in single quotes' '''Here could be your comment

Open string assignment

The double equality sign == is used for open strings and quoted strings, and for their multiline versions:

name == John Smith '''Open strings end on quotes so you can put comments at the end of the line
address == "1100 Main St\nPrescott, AZ, 86303" '''Address string has the new line symbol \n
comment == This is a long string which continues
    on the next line, but still, it is a single line string.

Block assignment

A block assignment is used to define a block value of a pair. Block can start with the object assignment : or one of the auxiliary assignments (::, :::, =:, =::).

Object assignment

An object assignment is used when the right side of the assignment is an object. Colon : is used for the assignment of an object value. In the following example, the element address has an object value which consists of three elements: street, city, and state.

address:
    street = 1100 Main St
    city = Prescott
    state = AZ

Auxiliary assignments

Auxiliary assignments are the block assignments other than the object assignment: ::, :::, =: and =::.

Pair assignment

The pair assignment := is used when the right side of the assignment is a name/value pair. Assigning pair value makes sense when the literal value is stored in alias or parameter. The pair assignment := must be used whenever a literal alias or a literal parameter is assigned as a value. In the example below, the alias $ZipCodes.NJ.Fort_Lee is assigned as a value to the element 'zipCode'.

zipCode := $ZipCodes.NJ.Fort_Lee

In the next example, the element quantity takes its value from parameter !%quantity with the default value 1. In other words, the name/value pair !%quantity = 1 is assigned to the element quantity.

item:
	@partNum = 238-KK
	productName = Amber ring
	quantity := !%quantity = 1
	price = 89.90
	ipo.comment = With no inclusions, please.

A literal alias or a literal parameter can't be assigned with a literal assignment because they are not literals but pairs.

Assignment naming convention

There are 8 types of assignments in total: = == : :: ::: := =: =::. It is easier to remember the meaning of an assignment if you keep the following rules in mind:

  • Assignments starting with the equality sign = assign a literal value.
  • Assignments starting with the colon : assign an object value.
  • Pair assignment := is a hybrid which assigns literal value stored in a pair.

String literals

Syntactik has two types of string literals: quoted string literals and open string literals.

Quoted string literals

Quoted string literals use single or double quotes to define a literal. So there are single quoted strings and double quoted strings.

Open string literals

Open string literals do not use quotes to define a literal. An open string literal can be an open string or a free open string.

Multiline string literal

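There is also a multiline version of each string type, so in total there are eight variations of string literals:

  • open string and folded open string
  • free open string and multiline open string
  • single quoted string and multiline single quoted string
  • double quoted string and multiline double quoted string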

Open string

An open string defines a literal without the help of quotes. Open string rules apply to the literals that are defined after literal assignment ==. Open string rules also apply to an open name. If the literal value is omitted, then the open string represents the empty string.

There are several restrictions on the open strings:

In the example below names items, productName, quantity, price, and literals Diamond heart, 1, 248.90 are defined by the open strings:

items:
    productName == Diamond heart
    quantity == 1
    price == 248.90 '''It's possible to add the comment here because open string ends on a single quote

There is a multiline version of the open string called folded open string. There is also a free open string which is a less restrictive version of an open string.

Control symbols

The structure of a Syntactik document is defined by spaces and/or tabs, newlines and the following control symbols: ' " = : ( ) ,. When the parser processes a text, it searches specifically for these symbols. Other symbols don't change the state of the parser. It is important to remember these seven symbols because they are the only visible symbols that can't be included in open names and open strings.
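
As a rough illustration of why these symbols matter, the Python sketch below finds the first control symbol in a piece of text. The names CONTROL_SYMBOLS and first_control_symbol are hypothetical, and the sketch only scans characters; it makes no claim about how the actual parser is organized.

```python
# The seven control symbols listed above: ' " = : ( ) ,
CONTROL_SYMBOLS = frozenset("'\"=:(),")


def first_control_symbol(text):
    """Return the index of the first control symbol in `text`, or -1 if none."""
    for index, char in enumerate(text):
        if char in CONTROL_SYMBOLS:
            return index
    return -1


print(first_control_symbol("Diamond heart"))            # no control symbol
print(first_control_symbol("https://www.google.com/"))  # the colon after "https"
```

A value such as a URL contains a colon, so under this rule it can't be written as an open string; a less restrictive literal form or a quoted string is needed for it.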

Free open string

A free open string defines a literal without the help of quotes like an open string but with fewer restrictions. Free open string rules apply to literals that are defined after the literal assignment = if they do not begin with the quotes ' ". If the literal value is omitted, then the free open string represents the empty string. There are just three constraints for the free open strings:

  • it can't start or end with whitespace because parser ignores leading and trailing whitespaces. If a string value starts or ends with the whitespace, then quoted string has to be used instead.
  • it can't start with the quotes ' ". Only quoted string literals start with the quotes.
  • it can’t have any non-visible symbols that require escaping. Double quoted string has to be used in this case. If only the new line symbol has to be escaped then multiline open string or folded open string can be used.

A free open string starts after literal assignment = and continues till the end of the line. There is no way to terminate a free open string before the end of the line.

In the example below all literals are defined by the free open strings.

url = https://www.google.com/
equalitySign = =
emptyString = ''' This is comment. Free open string can't start with quotes.

There is a multiline version of the free open string called multiline open string.

An indented open string can be used as a workaround for the first two constraints.

Single quoted string

A single quoted string defines a literal using single quote '. A single quoted string is used to create a quoted name or a quoted string literal. A single quoted string can’t include a single quote or any special non-visible symbols that require escaping (like newline). In the following example, all literals are defined using a single quoted string.

dirName == 'c:\Windows'
factOfTheDay == '"Stranger Things" is a science fiction-horror television series.'

There is a multiline version of the single quoted string called multiline single quoted string.

Double quoted string

A double quoted string defines a literal using double quotes ". A double quoted string is used to create a quoted name or a quoted string literal. A double quoted string can include escape sequences.

Escape sequences

A double quoted string and multiline double quoted string support JSON escape sequences:

Unicode character value Escape sequence Meaning
\u0022 \" quotation mark "
\u005C \\ reverse solidus \
\u002F \/ solidus /
\u0008 \b backspace
\u000C \f formfeed
\u000A \n newline
\u000D \r carriage return
\u0009 \t horizontal tab
\uxxxx \uxxxx Unicode escape sequence (four hexadecimal digits)
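
Because these are JSON's escape sequences, the decoding step can be approximated with Python's standard json module. The sketch below is an assumption-laden illustration, not the Syntactik implementation: decode_escapes is a hypothetical name, the input is the text between the double quotes, and string interpolation markers (\$ and \!%) are not handled because json rejects them as invalid escapes.

```python
import json


def decode_escapes(body):
    """Decode JSON escape sequences in the text between the double quotes."""
    # json.loads expects a complete JSON string, so the quotes are re-added.
    return json.loads('"' + body + '"')


# The raw string keeps the backslash and the letter n as two characters,
# the way they would appear in a source file.
address = decode_escapes(r"1100 Main St\nPrescott, AZ, 86303")
print(address)
```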

String interpolation

A double quoted string supports string interpolation. A string interpolation allows you to insert a value of an alias or parameter in the string. String interpolation starts with \$ (for alias) or \!% (for parameter) followed by an identifier of an alias or parameter. The parentheses () can surround the identifier. The parentheses are optional if the interpolation is followed by space or another symbol that can’t be interpreted as part of the identifier.

Example:

line1 = "The customer name is \!%CustomerName"
line2 = "The customer's phone number is \$(Phone.AreaCode)-\$(Phone.LocalNumber)-\$Phone.Extention"
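
The substitution rules above can be sketched with a regular expression in Python. Everything here is illustrative: the names interpolate, aliases and parameters are hypothetical, the identifier pattern is simplified to ASCII letters, and looking values up in plain dictionaries is an assumption rather than the way the compiler resolves aliases and parameters.

```python
import re

# Simplified identifier: starts with a letter or underscore, then letters,
# digits, underscores, periods and hyphens (ASCII only in this sketch).
_ID = r"[A-Za-z_][A-Za-z0-9_.\-]*"

# Matches \$name, \$(name), \!%name and \!%(name).
_INTERPOLATION = re.compile(
    r"\\(?P<kind>\$|!%)(?:\((?P<paren>" + _ID + r")\)|(?P<bare>" + _ID + r"))"
)


def interpolate(text, aliases, parameters):
    """Replace every interpolation in `text` with a value from the lookup tables."""
    def replace(match):
        name = match.group("paren") or match.group("bare")
        table = aliases if match.group("kind") == "$" else parameters
        return table[name]

    return _INTERPOLATION.sub(replace, text)


print(interpolate(r"Hello, \!%(CustomerName)!", {}, {"CustomerName": "Robert"}))
```

Without parentheses the identifier match is greedy, so a hyphen or period that follows the name would be read as part of it; that is the situation in which the parentheses are needed.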

Multiline open string

In a multiline open string, each line represents a line of text. The second, third, etc. lines must be indented by precisely one indent. Extra indents will be included in the line text. For example:

Comment = This is line number 1. 
    This is line number 2. 

The parser ignores the leading whitespaces in the first line and trailing spaces of the last line.

If the first line doesn’t have any symbols, then it will be ignored, and the string will start from the next line. The multiline string terminator === is used to specify the end of the string. If it's omitted, then the string will end based on indents, and the last newline symbol will be ignored. The multiline string terminator must have the same indent as the name of the current pair.

Text = 
    This is line number 1. 
    This is line number 2. 
    New line symbol at the end of this line is preserved because === ends this multiline string.
===

Folded open string

Sometimes strings are too long and hard to read and edit. The folded open string can help in this case. A folded open string is introduced by the literal assignment ==. In the folded open string, a single newline (\r\n or \n) becomes a space. Two consecutive newline symbols are treated as one, three consecutive newlines are treated as two, and so on. For example:

Comment ==
    Line 1 starts here  
    and continues here  
    still line 1  
    finally line 1 end here because next line is empty. 

    Line 2 starts here  
    and continues here.
    Still line 2.
    Line 2 ends.
===

Please note that in the example above the first line is empty and, therefore, is ignored. Also, the output string will end with a newline symbol because the multiline string terminator === is used to end the folded open string. Indented lines (the second, third, and so on) of a folded open string have no terminating symbol; each one continues until the end of the line.
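
The newline folding rule itself is small enough to sketch in Python. This is a hedged illustration, not the parser's code: fold is a hypothetical name, and the input is assumed to be the string body after the indentation has already been stripped from each line.

```python
import re


def fold(text):
    """Fold newlines: one newline becomes a space, a run of n becomes n - 1."""
    text = text.replace("\r\n", "\n")  # treat Windows newlines like \n

    def replace(match):
        count = len(match.group(0))
        return " " if count == 1 else "\n" * (count - 1)

    return re.sub(r"\n+", replace, text)


print(fold("Line 1 starts here\nand continues here.\n\nLine 2 starts here."))
```

How trailing spaces at the end of a folded line are treated is not covered by the sketch; it leaves them untouched.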

Indented open string

An indented open string is a special use case of a multiline open string or a folded open string when the first line of a multiline string is empty (whitespaces are ignored). In this case, all visible symbols are allowed in the string. Each line can start with and include any visible symbols. It can be used as a workaround for the following limitations:

element ==
    "this single line string starts with " and includes = : that are not allowed in the open string"

Multiline single quoted string

A multiline single quoted string can be used to define multiline strings. It works like a hybrid of single quoted string and multiline open string. The first line of the string starts after the first single quote '. The second and the next lines of the string must be indented. The string ends with the single closing quote. Single quote or any special non-visible symbols that require escaping are not allowed in the string.

LoremIpsum = 'Lorem ipsum dolor sit amet, consectetur ""adipiscing"" elit. Pellentesque a congue. 
    Quisque mollis ut odio sed facilisis. Donec dictum ullamcorper lectus, convallis volutpat at'

Multiline double quoted string

A multiline double quoted string allows the use of string interpolation and escape sequences in a multiline string. At the same time, it handles newlines like a folded open string.

CheckoutOfferMessage = "\tHi, \$(CustomerName)! Welcome to the check out page!     
    Get a \$OfferName upon approval for the \$CCName. 
    
    The Cart is a temporary place to store a list of your items and 
    reflects each item's most recent price."

JSON literals

Besides strings, JSON recognizes numbers and the literals true, false and null. To be semantically compatible with JSON, Syntactik automatically recognizes JSON literals in open strings inside s4j-modules. JSON literals can't be defined in a quoted string. The example below shows cases when the literal is recognized as a JSON literal or a string:

json_literal_number = 123
string == '123'
json_literal_true = true
string == "true"
json_literal_true == true
json_literal_false = false
string == 'null'
json_literal_null == null
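
One way to picture the recognition step is a small classifier over unquoted values. The Python sketch below is an assumption, not the compiler's logic: classify_open_literal is a hypothetical name, and the number pattern follows the grammar of the JSON specification, so a value with a leading zero such as 07024 is classified as a string here.

```python
import re

# Number grammar from the JSON specification: optional minus, integer part
# without leading zeros, optional fraction, optional exponent.
_JSON_NUMBER = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?")


def classify_open_literal(text):
    """Return "number", "boolean", "null" or "string" for an unquoted value."""
    if text in ("true", "false"):
        return "boolean"
    if text == "null":
        return "null"
    if _JSON_NUMBER.fullmatch(text):
        return "number"
    return "string"


for value in ("123", "true", "null", "Diamond heart"):
    print(value, "->", classify_open_literal(value))
```

Quoted values never reach such a classifier, since a quoted string is always a string.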

Indent

Indents are used to define blocks and multiline string literals. The Syntactik parser works with both tab and space indents, like most computer formats and languages. But unlike other formats, Syntactik enforces a single style of indentation within a file. Therefore there can't be indents defined by both tabs and spaces in the same file. The parser also checks that the same number of tabs or spaces is used to define one indent. The first indent that the parser finds in the file defines the indent symbol (space or tab) and the indent multiplicity (the number of symbols used for one indent).
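
The consistency check described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the parser's implementation: check_indents is a hypothetical name, the first indented line is assumed to be exactly one indent deep, and continuation lines of multiline strings (which may legitimately carry extra whitespace as text) are not treated specially.

```python
def check_indents(source):
    """Return (indent_symbol, multiplicity), or raise ValueError on inconsistency.

    The first indented line fixes both the indent symbol and how many of
    them make up one indent; every later indented line must agree.
    """
    symbol, multiplicity = None, None
    for number, line in enumerate(source.splitlines(), start=1):
        body = line.lstrip(" \t")
        if not body:
            continue  # blank lines carry no indentation information
        indent = line[: len(line) - len(body)]
        if not indent:
            continue
        if len(set(indent)) > 1:
            raise ValueError(f"line {number}: tabs and spaces are mixed")
        if symbol is None:
            symbol, multiplicity = indent[0], len(indent)
        elif indent[0] != symbol:
            raise ValueError(f"line {number}: expected {symbol!r} indents")
        elif len(indent) % multiplicity != 0:
            raise ValueError(f"line {number}: indent is not a multiple of {multiplicity}")
    return symbol, multiplicity


print(check_indents("address:\n    street = 1100 Main St\n    city = Prescott\n"))
```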

Namespace definition

Overview

Syntax feature Options
Prefix !#
Identifier simple
Value literal
Can be child of Module (only s4x), Document, Alias Definition

Description

Namespaces can be declared in the Module, Document or Alias Definition. Definition of a namespace starts with !# followed by a simple identifier and a string literal. A simple identifier represents an XML namespace prefix, and a string literal represents an XML namespace name. For example:

!#xsi = http://www.w3.org/2001/XMLSchema-instance
!#ipo = http://www.example.com/myipo

Namespace prefix

A namespace prefix is a simple identifier. A namespace declared in the module is visible in all documents and alias definitions in this module. A namespace declared in a document or alias definition is visible only inside that document or alias definition, and it overrides the namespace with the same namespace prefix declared in the module.

Document

Overview

Syntax feature Options
Prefix !
Identifier simple or compound
Value block or literal (only in s4j-module)
Can be child of Module
Can be parent of Namespace Definition, Element, Attribute, Alias, Namespace Scope

Description

A document starts with the exclamation mark ! followed by identifier and block.

!PurchaseOrder: '''Document "PurchaseOrder" starts here
    ipo.purchaseOrder:
        @orderDate = 1999-12-01
        shipTo:
            @xsi.type = ipo:EU-Address
            @export-code = 1
            ipo.name = Helen Zoe
            ipo.street = 47 Eden Street
            ipo.city = Cambridge
            ipo.postcode = 126

s4x-document

An s4x-document represents an XML file. It is defined in an s4x-module. An s4x-document must have one root element.

s4j-document

An s4j-document represents a JSON file. It is defined in an s4j-module.

Because a literal value on its own is a valid JSON document, an s4j-document pair can have a literal assignment and a literal value.

!document = This is a valid JSON document.

Module document

If a module has any elements, attributes, aliases or namespace scopes as direct children, then a document with the same name as the module's file is implicitly declared. This implicitly declared document is called a module document. All those pairs become direct children of the module document.

''' Module's file name is "purchaseOrder.s4x"
''' The document "purchaseOrder" is implicitly declared.
ipo.purchaseOrder:
    @orderDate = 1999-12-01
    shipTo:
        @xsi.type = ipo:EU-Address
        @export-code = 1
        ipo.name = Helen Zoe
        ipo.street = 47 Eden Street
        ipo.city = Cambridge
        ipo.postcode = 126

Element

Overview

Syntax feature Options
Prefix none
Identifier simple, compound, quoted name
Value literal or block
Can be child of Document, Element, Alias Definition, Namespace Scope, Alias, Argument, Parameter
Can be parent of Element, Attribute, Alias, Namespace Scope, Parameter

Description

An element is the most used type of name/value pair. It corresponds to an XML element and to a name/value pair in a JSON object. An element doesn't have a prefix in its name. If the name of the element is a compound identifier, then the first part of the name represents a namespace prefix. The namespace prefix has to be declared in the module, document or alias definition.

ipo.name = Robert Smith

An element can be defined with a quoted name. It is useful in S4J-module because JSON doesn't restrict format of names. A quoted name can also be used in S4X-modules to define names with a dot(s). Dot defines a namespace prefix in an open name, but in a quoted name dot is just part of the name.

''' "ipo" is a namespace prefix
ipo.name = John Smith
''' "first" is not a namespace prefix but part of the name
"first.name" = John Smith

If assignment and value are omitted, then the element has the empty object value.

Attribute

Overview

Syntax feature Options
Prefix @
Identifier simple or compound
Value literal
Can be child of Element, Alias Definition, Namespace Scope, Alias, Argument, Parameter

Description

An attribute corresponds to an attribute in XML and to a name/value pair in a JSON object. If an s4j-document has an attribute, the attribute is treated as an element with the same name. An attribute starts with the "at" sign @ followed by an identifier. An attribute can have only a literal value. If the assignment or value is omitted, then the attribute has the empty string value.

@orderDate = 2016-12-01
@ipo.export-code = 1
@emptyString = 
@empty

Alias

Overview

Syntax feature Options
Prefix $
Identifier simple or compound
Value literal or block
Can be child of Document, Element, Alias Definition, Namespace Scope, Alias, Argument, Parameter
Can be parent of Element, Attribute, Alias, Namespace Scope, Argument, Parameter

Description

An alias is a code reuse feature of the Syntactik language: a short name for a fragment of code. An alias starts with a dollar sign $ followed by a simple identifier or compound identifier. In some cases, an alias doesn't have a value or assignment. In other cases, an alias has a block or literal value defining a block of arguments or a default argument.

An alias has to be defined by an alias definition. It is recommended to use compound identifiers to organize aliases in a tree structure. An alias can be either a literal alias or an object alias.

Literal alias

A literal alias represents a literal. A literal alias must be assigned with the pair assignment :=.

''' Alias CurrentDate has the simple identifier and the literal value
@orderDate := $CurrentDate

A literal alias can also be used in the string interpolation:

''' The alias $customer.firstName has the compound identifier. 
''' The alias is used in the string interpolation.
customerGreating == "Hello, \$(customer.firstName)."

Object alias

An object alias has a block value. Block of an object alias can include elements, namespace scopes, attributes or aliases. An object alias can also represent an empty object. Example:

ipo.purchaseOrder:
    shipTo:
        $Address.US.AK '''This is alias for US address

Alias definition

Overview

Syntax feature Options
Prefix !$
Identifier simple or compound
Value literal or block
Can be child of Module
Can be parent of Element, Attribute, Alias, Namespace Scope, Parameter

Description

Aliases are defined in a module. An alias definition starts with the exclamation and dollar signs !$ followed by an Identifier.

Object alias definition

Literal alias definition

Alias definition declares either a literal alias or an object alias. In the example below, the alias definition declares the object alias $Address.US and the literal alias $Pi:

'''Object Alias Definition
!$Address.US:
    @xsi.type = ipo:US-Address
    ipo.name = Robert Smith
    ipo.street = 8 Oak Avenue
    ipo.city = Old Town
    ipo.state = AK
    ipo.zip = 95819

'''Literal Alias Definition
!$Pi = 3.14159265359

Parameter

Overview

Syntax feature Options
Prefix !%
Identifier simple or compound
Value literal or block
Can be child of Element, Alias Definition, Namespace Scope, Attribute, Alias, Argument, Parameter
Can be parent of Element, Attribute, Alias, Namespace Scope, Parameter

Description

A parameter can be declared only in alias definition.

Parameterized alias

Parameterized alias definition

An alias definition with parameters is called a parameterized alias definition. An alias that is defined by a parameterized alias definition is called a parameterized alias. There are two types of parameters: object parameter and literal parameter. Two or more parameters with the same name are allowed if they have the same type.

Object parameter

An object parameter represents an object. In the example below the alias definition !$Templates.PurchaseOrder.With.Necklace.Lapis has 3 object parameters: !%shipTo, !%billTo and !%items.

!$Templates.PurchaseOrder.With.Necklace.Lapis:
    purchaseOrder:
        shipTo:
            !%shipTo
        billTo:
            !%billTo
        Items:
            $Items.Jewelry.Necklaces.Lapis
            !%items

Parameters !%shipTo and !%billTo represent the whole object. The parameter !%items represents the trailing part of the object where the leading part of the object is defined by the alias $Items.Jewelry.Necklaces.Lapis.

Literal parameter

A literal parameter represents a literal. A literal parameter must be assigned with the pair assignment :=. In the example below, the alias definition !$Address.US.NJ has two literal parameters: !%name and !%street.

!$Address.US.NJ:
    @xsi.type = ipo:US-Address
    ipo.name := !%name
    ipo.street := !%street
    ipo.city = Fort Lee
    ipo.state = NJ
    ipo.zip = 07024

A literal parameter can also be used in string interpolation. In the following example, the literal parameter !%customerName is defined in the literal alias definition !$CustomerGreating inside the string interpolation "Hello \!%customerName".

!$CustomerGreating == "Hello \!%customerName"

Default value of parameter

Parameters can have a default value. In the example below, the parameter !%shipTo has the default block value assigned by the alias $Address.UK.Cambridge. The parameter !%billTo has the default block value which consists of 5 elements. The parameter !%items has the empty object as a default value.

!$Templates.PurchaseOrder.International:
    purchaseOrder:
        shipTo:
            !%shipTo:
                $Address.UK.Cambridge
        billTo:
            !%billTo:
                name = Robert Smith
                street = 8 Oak Avenue
                city = Old Town
                state = AK
                zip = 95819
        Items:
            !%items:

The literal parameter can have a default value too. In the following example, the parameter !%name has the default string value John Smith. The parameter !%street has the default value assigned by the alias $DefaultStreetAddress.

!$Address.US.NJ:
    @xsi.type = ipo:US-Address
    ipo.name := !%name = John Smith
    ipo.street := !%street := $DefaultStreetAddress
    ipo.city = Fort Lee
    ipo.state = NJ
    ipo.zip = 07024

Argument

Overview

Syntax feature Options
Prefix %
Identifier simple or compound
Value literal or block
Can be child of Alias
Can be parent of Element, Attribute, Alias, Namespace Scope, Parameter

Description

An argument starts with % followed by an identifier. If an alias definition has parameters, then the corresponding Alias can have arguments. Each argument corresponds to the parameter with the same name. There are two types of arguments: object argument and literal argument. The following rules are applied to all arguments:

  1. The argument must have the same value type (block or literal) as the corresponding parameter.
  2. There can't be two or more arguments with the same name.
  3. If the parameter doesn't have a default value, then the corresponding argument must be specified in the alias.
  4. If the parameter does have a default value, then the corresponding argument can be omitted.

All arguments are passed to the alias in the Block of Arguments.
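The four rules above can be read as a small validation routine. The following Python sketch is only an illustration of those rules, not the Syntactik compiler's code; the `Parameter` and `Argument` classes and the `check_arguments` function are hypothetical names introduced here. Arguments without a matching parameter are silently ignored, which is an assumption, since the rules above don't cover that case.

```python
from dataclasses import dataclass


@dataclass
class Parameter:
    name: str
    kind: str                 # "literal" or "block"
    has_default: bool = False


@dataclass
class Argument:
    name: str
    kind: str                 # "literal" or "block"


def check_arguments(params, args):
    """Return a list of rule violations (empty when the arguments are valid)."""
    errors = []
    by_name = {p.name: p for p in params}
    seen = set()
    for arg in args:
        if arg.name in seen:
            errors.append(f"duplicate argument %{arg.name}")        # rule 2
            continue
        seen.add(arg.name)
        param = by_name.get(arg.name)
        if param is None:
            continue  # assumption: unknown arguments are ignored in this sketch
        if param.kind != arg.kind:
            errors.append(f"%{arg.name} must be a {param.kind}")    # rule 1
    for param in params:
        # Rule 3: required unless the parameter has a default value (rule 4).
        if not param.has_default and param.name not in seen:
            errors.append(f"missing argument %{param.name}")
    return errors
```

With a parameter list modelled on !$Address.US.NJ (`name` with a default, `street` without), passing only a `street` argument yields no errors, while passing no arguments reports the missing `street`.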

Block of arguments

A block of arguments is a block where each pair is an argument.

$Templates.PurchaseOrder:
    ''' Block of Arguments starts here
    %orderDate = 2017-02-13
    %shipTo: $Address.UK.Cambridge
    %billTo: $Address.US.AK
    %items:
        $Items.Jewelry.Necklaces.Lapis: %quantity = 1

Object argument

An object argument is used to specify the value of an object parameter. In the example below, the alias $Templates.PurchaseOrder has a block of arguments which consists of 3 object arguments: %shipTo, %billTo and %items.

$Templates.PurchaseOrder:
    %shipTo:
        $Address.UK.Cambridge
    %billTo:
        name = Robert Smith
        street = 8 Oak Avenue
        city = Old Town
        state = NJ
        zip = 95819
    %items:
        $Items.Jewelry.Necklaces.Lapis
        $Items.Jewelry.Diamonds.Heart

Literal argument

The literal argument is used to specify the value of the literal parameter. In the following example, the alias $Items.Jewelry.Necklaces.Lapis has the block of arguments which consists of one literal argument: %quantity with the assigned literal value 2.

Items:
    $Items.Jewelry.Necklaces.Lapis: 
        %quantity = 2

Default parameter

Default argument

If an alias definition has only one parameter and that parameter is named !%_, then this parameter is called a default parameter. In the corresponding alias, you don't need to specify an argument for the default parameter. Instead, the value of the argument has to be assigned directly to the alias. In the example below:

  • the alias definition !$Greating has the default literal parameter !%_
  • the alias definition !$Bold has the default object parameter !%_
  • in the document !htmlDocument, the alias $Bold has the value of default object parameter defined as span := $Greating = Hello, World!
  • the alias $Greating, in the alias $Bold, has the value of default literal parameter defined as Hello, World!

!$Greating := !%_

!$Bold:
    b:
        !%_

!htmlDocument:
    html:
        body:
            $Bold:
                span := $Greating = Hello, World!

Namespace scope

Overview

Syntax feature Options
Prefix #
Identifier simple, compound or omitted
Value literal or block
Can be child of Document, Element, Alias Definition, Namespace Scope, Alias, Argument, Parameter
Can be parent of Element, Attribute, Alias, Namespace Scope, Parameter

Description

A namespace scope is an analog of the default namespace in XML. It defines the default namespace for the elements that have no namespace prefix. A namespace scope starts with the hash symbol # followed by a simple or compound identifier. If the namespace scope has a simple identifier, then the simple identifier represents the namespace prefix and is always followed by a block. All the elements inside this block that don't have a namespace prefix will get the default namespace prefix defined by this simple identifier. In the following example, the scope #ipo defines the default namespace for all elements in its block.

#ipo:
    purchaseOrder:
        shipTo:
            name = Helen Zoe
            street = 47 Eden Street
            city = Cambridge
            postcode = 126

A scope with a compound identifier is used to declare a default namespace prefix and an element at the same time. The first part of the compound identifier represents the default namespace prefix, and the rest represents the name of the element. The example below uses a namespace scope with a compound identifier. The scope #ipo.purchaseOrder declares the default namespace prefix ipo and the element purchaseOrder. This example will generate the same XML as the example above.

#ipo.purchaseOrder:
    shipTo:
        name = Helen Zoe
        street = 47 Eden Street
        city = Cambridge
        postcode = 126

The namespace scope impacts only elements that are defined directly inside its block. It doesn't affect elements and attributes defined in the alias definitions of aliases referenced in the block. Namespace scopes can be nested. An inner scope overrides the outer scope.

Empty namespace scope

To clear the default namespace prefix, the name of the scope has to be empty. For example:

#:
    purchaseOrder:

or

#.purchaseOrder:
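The scoping rules described in this section (a scope sets the default prefix, an inner scope overrides the outer one, and an empty scope clears the prefix) can be sketched in a few lines of Python. This is a simplified model, not the compiler's implementation: the dictionary-based tree and the `qualify` function are hypothetical, compound scope identifiers and aliases are left out, and an element is assumed to carry its own prefix whenever its name contains a dot.

```python
def qualify(nodes, prefix=""):
    """Return (qualified_name, children) tuples for a list of nodes.

    A node is either {"scope": prefix, "children": [...]} or
    {"element": name, "children": [...]}; "children" is optional.
    """
    result = []
    for node in nodes:
        children = node.get("children", [])
        if "scope" in node:
            # A scope emits no element itself; it only replaces the default
            # prefix for its block. An empty prefix clears the default.
            result.extend(qualify(children, node["scope"]))
        else:
            name = node["element"]
            if prefix and "." not in name:
                name = f"{prefix}.{name}"   # element had no prefix of its own
            result.append((name, qualify(children, prefix)))
    return result
```

For a tree that mirrors the #ipo example, `purchaseOrder` and `shipTo` come out as `ipo.purchaseOrder` and `ipo.shipTo`, while an element nested under an empty scope keeps its bare name.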

Array

An array is an object where each pair represents an array item. Like objects, arrays are defined by blocks. All root pairs of an array have no name. A pair without a name is called an array item. An array can start either with an object assignment : or an array assignment :::.
When the parser meets an object assignment : it can't tell if the block represents an array or a regular object until it processes the first pair in the block. If the first pair of the block is an array item, then the whole block is considered to be an array. The array assignment ::: explicitly tells the parser that the block is an array.
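The decision described above can be summarized as a small function. This is only a sketch of the rule as stated, with a hypothetical `block_kind` function and pairs modelled as `(name, value)` tuples where an array item has the name `None`; choice, concatenation and literal choice assignments are not covered.

```python
def block_kind(assignment, pairs):
    """Classify a block as "array" or "object" from its assignment and pairs."""
    if assignment == ":::":
        return "array"              # explicit array, even when the block is empty
    if assignment == ":":
        # The first pair decides: an unnamed first pair makes the block an array.
        if pairs and pairs[0][0] is None:
            return "array"
        return "object"
    raise ValueError(f"unsupported assignment in this sketch: {assignment}")
```

An empty block with the object assignment : is classified as an object here, which matches the empty object default value shown in the parameter section.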

Empty array

The empty block which starts with array assignment ::: declares an empty array.

Array item

Overview

Syntax feature Options
Prefix none
Identifier omitted
Value literal or object
Can be child of Document, Element, Alias Definition, Namespace Scope, Alias, Argument, Parameter
Can be parent of Element, Attribute, Alias, Namespace Scope, Parameter

Description

An array item is a name/value pair without a name. There are two types of array items: Literal Array Item and Object Array Item.

Literal array item

A literal array item has no name, so it starts with a literal assignment followed by an optional literal value. If the literal value is omitted, then the item represents the empty string. In the example below, the block of the element colors defines an array of 7 literal array items.

colors:
    == red ''' primary color
    = orange
    = yellow
    == green  ''' primary color
    == blue  ''' primary color
    = indigo
    = violet

The value of a literal array item can also be represented by an alias or a parameter. In this case, the item has to start with the pair assignment := followed by an alias or parameter.

color_codes:
    := $Colors.Red
    := $Colors.Green
    := $Colors.Blue

A literal array item can also be implemented as a concatenation or a literal choice.

Object array item

An object array item has no name. It can be implemented as a block with the object assignment :, the choice :: or the array assignment :::. In the example below, the element Items has a value defined as an array of 3 object array items.

Items:
    :
        productName = Lapis necklace
        quantity = 1
        price = 99.95
        ipo.comment = Need this for the holidays!
        shipDate = 1999-12-05
    :
        productName = Diamond heart
        quantity = 1
        price = 248.90
        ipo.comment = Valentine's day packaging.
        shipDate = 2000-02-14
    :
        productName = Uncut diamond
        quantity = 1
        price = 79.90
        shipDate = 2000-01-07

An object array item starting with the array assignment ::: represents an array. If an object array item is a choice :: then this choice must produce an array.

Array block

An array block is a block where each pair has no name. An array block doesn't always represent an array. If an array block starts with the choice :: then it can represent either an object or an array. If an array block starts with the concatenation =: or the literal choice =:: then it represents a literal.

Array assignment

The array assignment ::: explicitly tells the parser that the block is an array. An empty block that starts with the array assignment ::: declares an empty array.

colors:::
    == red ''' primary color
    = orange
    = yellow
    == green  ''' primary color
    == blue  ''' primary color
    = indigo
    = violet
empty_array:::

XML array

Unlike JSON, XML doesn't support arrays but is used to serialize arrays anyway. For example, the array with color names can be represented like this:

<colors>
	<color>red</color>
	<color>orange</color>
	<color>yellow</color>
	<color>green</color>
	<color>blue</color>
	<color>indigo</color>
	<color>violet</color>
</colors>

In the example above, each element color represents an array item. When the Syntactik compiler finds an array that is led by the array assignment :::, it adds the name of the pair of the array block to each array item. So the previous XML can be represented in an s4x-module like this:

colors:
    color:::
        == red ''' primary color
        = orange
        = yellow
        == green  ''' primary color
        == blue  ''' primary color
        = indigo
        = violet

With the inline syntax, XML arrays can be defined in a more compact form:

primary_colors: color:::
    = red
    = green
    = blue
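To make the naming rule concrete, the following Python sketch builds the XML that the text above describes: the name of the array pair (`color`) is repeated for every item. It uses the standard `xml.etree.ElementTree` module and a hypothetical `xml_array` helper; it illustrates the expected shape of the output and is not how the Syntactik compiler is implemented.

```python
import xml.etree.ElementTree as ET


def xml_array(parent_name, item_name, values):
    """Serialize a list of literals as repeated child elements of one parent."""
    parent = ET.Element(parent_name)
    for value in values:
        # Each array item becomes an element named after the array pair.
        ET.SubElement(parent, item_name).text = value
    return ET.tostring(parent, encoding="unicode")


print(xml_array("primary_colors", "color", ["red", "green", "blue"]))
```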

Concatenation

A pair with the concatenation assignment =: always has an array block where each array item represents a string literal. All the string literals from the array block are concatenated, and the resulting value is assigned to the pair.

''' abc = abc
abc =:
    = a
    = b
    = c

Concatenation can be used in the alias definition:

!$message =:
	= Dear Mr.
	:= !%name
	= ". Your order "
	:= !%orderId
	== " will be shipped on "
	:= !%date
	= .

Concatenation is the only way to build a string literal that includes the value of a parameterized alias, because a parameterized alias can't be used in string interpolation. Only an alias without arguments can be used in string interpolation.

bold_message =:
    = <b>
    := $message:
        %name = John Smith
        %orderId = 123
        %date = 2017-07-13
    = </b>

Choice

A choice assignment :: allows the creation of object alias definitions that can be used with different sets of arguments. A choice assignment can be used in the alias definition pair and/or anywhere in its block. There can be as many choices as needed in one alias definition. A pair with the choice assignment :: always has an array block. Each array item in this block represents a case. A case is a block with or without parameters.

When the Syntactik compiler processes an alias and there is a choice in the corresponding alias definition, the compiler tries to resolve the cases in the choice one after another. The first case that is successfully resolved represents the value of the choice, and the processing of the choice stops. A case is considered resolved if all of its parameters without a default value have corresponding arguments in the alias. A case without parameters is always resolved, unless a sibling case is resolved before it.

!$coffee_drink::
    :
        capuchino:
            foamed_milk := !%foamed_milk
            steamed_milk := !%steamed_milk
            espresso := !%espresso
    :
        mocha:
            steamed_milk := !%steamed_milk
            chocolate := !%chocolate
            espresso := !%espresso
    :
        americano:
            hot_water := !%hot_water
            espresso := !%espresso
    :
        espresso:
            espresso = 30

coffee_drinks:
    ''' capuchino
    $coffee_drink: 
        %foamed_milk = 60
        %steamed_milk = 60
        %espresso = 60
    ''' mocha
    $coffee_drink: 
        %steamed_milk = 30
        %chocolate = 60
        %espresso = 60
    ''' americano
    $coffee_drink: 
        %hot_water = 90
        %espresso = 60
    ''' espresso is a coffee_drink by default
    $coffee_drink

The example above generates the following JSON fragment. Notice that alias $coffee_drink is replaced with the correct name of the drink based on the list of ingredients specified as arguments.

{
  "coffee_drinks": {
    "capuchino": {
      "foamed_milk": 60,
      "steamed_milk": 60,
      "espresso": 60
    },
    "mocha": {
      "steamed_milk": 30,
      "chocolate": 60,
      "espresso": 60
    },
    "americano": {
      "hot_water": 90,
      "espresso": 60
    },
    "espresso": {
      "espresso": 30
    }
  }
}
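The resolution order described in this section can be modelled in Python as "take the first case whose required parameters all have arguments". The sketch below reproduces the coffee example under that reading. The `resolve_choice` and `drink` functions and the `COFFEE_DRINK` list are hypothetical names, and raising `LookupError` when no case matches is an assumption, because the text doesn't say what the compiler does in that situation.

```python
def resolve_choice(cases, args):
    """Return the value of the first case whose required parameters have arguments."""
    for required, build in cases:
        if all(name in args for name in required):
            return build(args)
    raise LookupError("no case could be resolved")  # assumption, see lead-in


def drink(name, ingredients):
    """A case that needs every ingredient as an argument."""
    return (ingredients, lambda args: {name: {i: args[i] for i in ingredients}})


COFFEE_DRINK = [
    drink("capuchino", ["foamed_milk", "steamed_milk", "espresso"]),
    drink("mocha", ["steamed_milk", "chocolate", "espresso"]),
    drink("americano", ["hot_water", "espresso"]),
    ([], lambda args: {"espresso": {"espresso": 30}}),  # no parameters: always resolves
]

print(resolve_choice(COFFEE_DRINK, {"hot_water": 90, "espresso": 60}))
```

Because the cases are tried in order, the parameterless espresso case has to come last; placed first, it would win every time.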

Literal choice

A literal choice assignment =:: allows the creation of literal alias definitions that can be used with different sets of arguments. A literal choice works exactly like the choice with the exception that each case represents not an object but a literal.

!$url =::
	= "www.\!%(domain)/\!%(path)?\!%params"
	= "www.\!%(domain)/\!%(path)"
	=:	
		= www.
		:= !%domain = example.com

root:
	url1 := $url: 
		%domain = google.com
		%path = search
		%params = q = foo
	url2 := $url: 
		%domain = google.com
		%path = search
	url3 := $url: 
		%domain = google.com
	url4 := $url

The example above generates the following JSON fragment:

{
  "root": {
    "url1": "www.google.com/search?q = foo",
    "url2": "www.google.com/search",
    "url3": "www.google.com",
    "url4": "www.example.com"
  }
}
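A literal choice can be modelled the same way as a choice, with string templates as cases. The Python sketch below is an illustration under assumptions: it uses `{name}` placeholders in place of Syntactik's `\!%(name)` interpolation, keeps default values per case, and raises `LookupError` when nothing matches. `resolve_literal_choice` and `URL_CASES` are hypothetical names.

```python
import re

PLACEHOLDER = re.compile(r"\{(\w+)\}")


def resolve_literal_choice(cases, args):
    """Return the first template whose placeholders can all be filled."""
    for template, defaults in cases:
        values = {**defaults, **args}           # arguments override defaults
        names = PLACEHOLDER.findall(template)
        if all(name in values for name in names):
            return PLACEHOLDER.sub(lambda m: str(values[m.group(1)]), template)
    raise LookupError("no case could be resolved")  # assumption, see lead-in


URL_CASES = [
    ("www.{domain}/{path}?{params}", {}),
    ("www.{domain}/{path}", {}),
    ("www.{domain}", {"domain": "example.com"}),  # default value for domain
]

print(resolve_literal_choice(URL_CASES, {"domain": "google.com", "path": "search"}))
```

Calling it with the four argument sets from the example above gives the same four strings as the JSON fragment.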

Inline syntax

The inline syntax allows defining several language constructs in one line of code.

Inline definitions

Two or more name/value pairs can be defined on the same line. In this case, they are treated like they are defined in separate lines with the same indent. The first example below shows regular, "one definition per line" way to define a block of elements. The second example shows the example of the inline syntax used to define the same block of elements.

name = Helen Zoe
street = 47 Eden Street
city = Cambridge
postcode = 126

name == Helen Zoe, street == 47 Eden Street, city == Cambridge, postcode = 126

In the second example, open strings are used to define the string literals. Using open strings instead of free open strings is essential in this case because it makes it possible for the parser to end a string at a comma ,. Note that the last pair in the second example still uses a free open string. This works because it is the last pair on the line. In the previous example, the literals were ended by commas. Quoted literals end with the closing quote, but the Syntactik parser still requires commas between inline pairs:

name == "Helen Zoe", street == "47 Eden Street", city == "Cambridge", postcode == "126"

A comma is required before any new pair that is defined on the same line as its previous sibling pair.

Inline block

An inline block is a block that is defined on the same line as the name and the block assignment. The first example below shows the element shipTo with the regular block definition. The second example shows the same data structure defined using the inline block syntax.

shipTo:
    name = Helen Zoe
    street = 47 Eden Street
    city = Cambridge
    postcode = 126

shipTo: name == Helen Zoe, street == 47 Eden Street, city == Cambridge, postcode == 126

Inline name/value pair

A pair defined in an inline block is called an inline name/value pair or inline pair.

Closed inline pair

Unclosed inline pair

An inline pair followed by a comma , is called a closed inline pair. If it is not followed by a comma, then it is called an unclosed inline pair. A pair is also called closed if it is the last pair of a closed wsa-region. In other words, a closing parenthesis ) closes an inline pair.

Nested inline blocks

Nested inline blocks are nested blocks defined in the same line. Any pair in the inline block can also have its inline block. A pair in the inline block belongs to the last declared unclosed block. In the following example, elements name, street, city and postcode are defined in the inline block of the element shipTo which itself is in the inline block of the element purchaseOrder.

purchaseOrder: shipTo: name == Helen Zoe, street == 47 Eden Street, city == Cambridge, postcode == 126        

With the comma , it is possible to create nested inline blocks of any topology. A comma , terminates the last unclosed pair in the inline block. Just one comma , is needed to close an empty inline block. If the inline block is not empty, then two commas ,, are needed (the first comma closes the current pair and the second comma closes the current inline block). In the first example below, the nested blocks are defined by indents. The second example defines the same data structure using the nested inline block syntax with commas ,. The block of el0 is empty, so it is closed with one comma ,. The block of el1 is closed with two commas ,,. The first comma ends the pair el1_2. The second comma closes the inline block of el1.

root:
    el0:
    el1:
        el1_1 = text1_1
        el1_2 = text1_2
    el2:
        el2_1 = text2_1
        el2_2 = text2_2

root: el0:, el1: el1_1 == text1_1, el1_2 == text1_2,, el2: el2_1 == text2_1, el2_2 == text2_2
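The comma rules can be traced with a toy model in Python. The sketch below works on a hand-written token list instead of real Syntactik text, and `build_inline` is a hypothetical function, so it should be read as one interpretation of the rules above and not as the parser's algorithm. A comma closes the current pair if one is still open; otherwise it closes the current inline block.

```python
def build_inline(tokens):
    """Build nested dicts from ("block", name), ("pair", name, value) and "," tokens."""
    root = {}
    stack = [root]       # open blocks, innermost last
    pair_open = False    # True while the last literal pair is still unclosed
    for token in tokens:
        if token == ",":
            if pair_open:
                pair_open = False   # first comma closes the current pair
            else:
                stack.pop()         # otherwise it closes the current block
        elif token[0] == "block":
            block = {}
            stack[-1][token[1]] = block   # belongs to the last unclosed block
            stack.append(block)
            pair_open = False
        else:
            stack[-1][token[1]] = token[2]
            pair_open = True
    return root


tokens = [
    ("block", "root"), ("block", "el0"), ",",
    ("block", "el1"), ("pair", "el1_1", "text1_1"), ",",
    ("pair", "el1_2", "text1_2"), ",", ",",
    ("block", "el2"), ("pair", "el2_1", "text2_1"), ",",
    ("pair", "el2_2", "text2_2"),
]
print(build_inline(tokens))
```

The token list corresponds to the inline example above: one comma after the empty el0 block, two commas after el1_2.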

The readability of the inline block can be improved with parentheses ():

root: (el0:), (el1: el1_1 == text1_1, el1_2 == text1_2), (el2: el2_1 == text2_1, el2_2 == text2_2)

Parentheses () are used to create a wsa-region. The comma , is used as a separator between wsa-regions.

Hybrid block

A hybrid block is a block that starts as an inline block and continues on the next lines like a regular block with indented pairs. In the following example, the block of the element shipTo is a hybrid block. The attribute @export-code starts on the same line as its parent element shipTo. Then the block continues as a set of indented pairs.

shipTo: @export-code = 1 
	name = Helen Zoe
	street = 47 Eden Street
	city = Cambridge
	postcode = 126

There are many ways to represent the same data structure using a hybrid block:

shipTo: @export-code == 1, name = Helen Zoe
	street == 47 Eden Street
	city = Cambridge
	postcode = 126

or

shipTo: @export-code == 1
    name == Helen Zoe, street == 47 Eden Street
    city == Cambridge, postcode == 126

Inline syntax and multiline literals

Multiline string literals can't be defined in the inline block.

WSA mode

WSA region

WSA is an acronym for whitespace agnostic mode. It is a mode in which the Syntactik parser ignores indents and dedents. Parentheses () define a wsa-region. Parentheses can be nested. The parser quits the whitespace agnostic mode when the number of closing parentheses ) equals the number of opening parentheses (.

In a wsa-region, the free open string assignment = works like the open string assignment ==. This means that it is not possible to define a free open string inside a wsa-region. It is also not possible to define a multiline string literal in a wsa-region, because the parser ignores indents and dedents. In the first example below, the blocks are defined by indents. The second example defines the same data structure using the whitespace agnostic mode:

root:
    el1:
        el1_1 = text1_1
        el1_2 = text1_2
        el1_3 = text1_3
        el1_4 = text1_4
    el2:
        el2_1 = text2_1
        el2_2 = text2_2

root:
    el1:(
                el1_1 = text1_1,
    el1_2 = text1_2,
            el1_3 = "text1_3", el1_4 = text1_4
    )
    el2: (
        el2_1 = text2_1,
            el2_2 = text2_2
        )
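The entry and exit condition for the whitespace agnostic mode amounts to tracking parenthesis depth. The short Python sketch below marks, for each line, whether it starts inside a wsa-region. The `wsa_flags` function is hypothetical, and it counts every parenthesis it sees, so parentheses inside quoted strings would confuse it; a real parser would have to skip those.

```python
def wsa_flags(text):
    """For each line, report whether the line starts inside a wsa-region."""
    depth = 0
    flags = []
    for line in text.splitlines():
        flags.append(depth > 0)
        # Nested regions raise the depth; the mode ends when it returns to 0.
        depth += line.count("(") - line.count(")")
    return flags


sample = "root:\n    el1:(\n        a = 1,\n    b = 2\n    )\n    el2 = 3"
print(wsa_flags(sample))
```

Lines for which the flag is true are the ones whose indentation the parser would ignore.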

You can think of pairs defined in a wsa-region as pairs defined in an inline block, because indents are ignored and a newline just ends the current pair (but does not close it; see Closed inline pair).

Inline WSA region

A wsa-region defined in an inline block is called an inline wsa-region.

Closed WSA Region

Unclosed WSA Region

An inline wsa-region that is followed by a comma , is called a closed wsa-region. If it is not followed by a comma, then it is called an unclosed wsa-region.

Comma

The comma , is required between:

XML mixed content

Syntactik allows you to create XML elements with mixed content. Mixed content means that the XML element's content has at least one text node and at least one element or attribute. To create mixed content, text nodes have to be included in the element's block in the form of literal array items. Example XML:

<message>
Dear Mr.<name>John Smith</name>. Your order <orderid>1032</orderid>
will be shipped on <shipdate>2001-07-13</shipdate>.
</message>

In Syntactik, the previous XML fragment can be represented like this:

message:
    = Dear Mr.
    name = John Smith
    == ". Your order "
    orderid = 1032
    == " will be shipped on "
    shipdate = 2001-07-13
    = .

The same data structure defined using the inline syntax:

message: == Dear Mr., name == John Smith, == ". Your order ", orderid == 1032, == " will be shipped on ", shipdate == 2001-07-13, == .

The same data structure defined using WSA mode:

message: ( 
    = "Dear Mr.", name = John Smith,
    = ". Your order ", orderid = 1032,
    = " will be shipped on ", shipdate = "2001-07-13", = "." )
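For readers who want to see how the literal array items map to XML text nodes, the following Python sketch assembles the same mixed content with the standard `xml.etree.ElementTree` module, where text before the first child goes into `text` and text after a child goes into that child's `tail`. The `mixed_element` helper is hypothetical and only mirrors the values of the Syntactik example above; it is not part of Syntactik.

```python
import xml.etree.ElementTree as ET


def mixed_element(name, items):
    """Build an element from items: str is a text node, (tag, text) a child element."""
    element = ET.Element(name)
    last_child = None
    for item in items:
        if isinstance(item, str):
            if last_child is None:
                # Text before the first child belongs to the parent element.
                element.text = (element.text or "") + item
            else:
                # Text after a child is stored as that child's tail.
                last_child.tail = (last_child.tail or "") + item
        else:
            tag, text = item
            last_child = ET.SubElement(element, tag)
            last_child.text = text
    return ET.tostring(element, encoding="unicode")


print(mixed_element("message", [
    "Dear Mr.", ("name", "John Smith"),
    ". Your order ", ("orderid", "1032"),
    " will be shipped on ", ("shipdate", "2001-07-13"), ".",
]))
```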

Name literal

In s4j-modules, if the assignment and the value of a name/value pair are omitted, then the name represents a literal and is called a name literal. Name literals can be used to make arrays more compact:

colors: red, orange, yellow, green, blue, indigo, violet

The same data structure in JSON:

{
  "colors": [
    "red",
    "orange",
    "yellow",
    "green",
    "blue",
    "indigo",
    "violet"
  ]
}
