Wirth syntax notation (WSN) is a metasyntax, that is, a formal way to describe formal languages. It was originally proposed by Niklaus Wirth in 1977 as an alternative to Backus–Naur form (BNF). Its advantages over BNF include an explicit iteration construct and the absence of an explicit symbol for the empty string (such as <empty> or ε).[1]

WSN has been used in several international standards, starting with ISO 10303-21.[2] It was also used to define the syntax of EXPRESS, the data modelling language of STEP.

WSN defined in itself

 SYNTAX     = { PRODUCTION } .
 PRODUCTION = IDENTIFIER "=" EXPRESSION "." .
 EXPRESSION = TERM { "|" TERM } .
 TERM       = FACTOR { FACTOR } .
 FACTOR     = IDENTIFIER
            | LITERAL
            | "[" EXPRESSION "]"
            | "(" EXPRESSION ")"
            | "{" EXPRESSION "}" .
 IDENTIFIER = letter { letter } .
 LITERAL    = """" character { character } """" .

The equals sign indicates a production. The element on the left is defined to be the combination of elements on the right. A production is terminated by a full stop (period).

  • Repetition is denoted by curly brackets, e.g., {a} stands for ε | a | aa | aaa | ....
  • Optionality is expressed by square brackets, e.g., [a]b stands for ab | b.
  • Parentheses serve for groupings, e.g., (a|b)c stands for ac | bc.
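
The three constructs have rough counterparts in regular-expression syntax, which gives a quick way to check the expansions listed above. The following is a minimal sketch using Python's standard re module; the EQUIVALENTS table and the accepts helper are names introduced here for illustration, and the mapping is only an analogy, since WSN is not a regular-expression dialect.

```python
import re

# Rough regular-expression counterparts of the three WSN constructs.
# Illustration only: the table and helper below are hypothetical names.
EQUIVALENTS = {
    "{a}":    r"(?:a)*",    # repetition: zero or more occurrences
    "[a]b":   r"(?:a)?b",   # optionality: zero or one occurrence
    "(a|b)c": r"(?:a|b)c",  # grouping
}

def accepts(wsn, s):
    """Return True if the string s is matched in full by the counterpart of wsn."""
    return re.fullmatch(EQUIVALENTS[wsn], s) is not None

# {a} stands for the empty string, a, aa, aaa, and so on, but never b.
print([s for s in ["", "a", "aa", "aaa", "b"] if accepts("{a}", s)])
# [a]b stands for ab | b.
print([s for s in ["ab", "b", "a", "aab"] if accepts("[a]b", s)])
# (a|b)c stands for ac | bc.
print([s for s in ["ac", "bc", "c", "abc"] if accepts("(a|b)c", s)])
```

The three print statements list only the strings named in the corresponding bullet points.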

We take these concepts for granted today, but they were novel and even controversial in 1977. Wirth later incorporated some of the concepts (with a different syntax and notation) into extended Backus–Naur form.
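
Because each production in the self-definition maps onto one parsing routine, the grammar can be turned into a small recursive-descent recognizer. The following Python sketch is one possible arrangement, not a reference implementation: the names TOKEN, tokenize, Parser and CLOSERS are introduced here, identifiers are assumed to consist of ASCII letters and hyphens, and a doubled quotation mark inside a literal is assumed to stand for one quotation mark, as the """" in the LITERAL production suggests.

```python
import re

# One token: a quoted literal ("" inside it is an escaped quote), an identifier,
# or a punctuation symbol. Allowing "-" in identifiers is an assumption that
# goes beyond IDENTIFIER = letter { letter }.
TOKEN = re.compile(r'\s*(?:("(?:[^"]|"")+")|([A-Za-z][A-Za-z-]*)|([=.|\[\](){}]))')
CLOSERS = {"[": "]", "(": ")", "{": "}"}

def tokenize(text):
    """Split WSN source text into (kind, value) pairs."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at offset {pos}")
        literal, identifier, punct = m.groups()
        if literal is not None:
            tokens.append(("LITERAL", literal))
        elif identifier is not None:
            tokens.append(("IDENTIFIER", identifier))
        else:
            tokens.append((punct, punct))
        pos = m.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def peek(self):
        return self.tokens[self.i][0] if self.i < len(self.tokens) else None

    def expect(self, kind):
        if self.peek() != kind:
            raise SyntaxError(f"expected {kind!r}, found {self.peek()!r}")
        self.i += 1
        return self.tokens[self.i - 1][1]

    def syntax(self):       # SYNTAX = { PRODUCTION } .
        names = []
        while self.peek() is not None:
            names.append(self.production())
        return names

    def production(self):   # PRODUCTION = IDENTIFIER "=" EXPRESSION "." .
        name = self.expect("IDENTIFIER")
        self.expect("=")
        self.expression()
        self.expect(".")
        return name

    def expression(self):   # EXPRESSION = TERM { "|" TERM } .
        self.term()
        while self.peek() == "|":
            self.i += 1
            self.term()

    def term(self):         # TERM = FACTOR { FACTOR } .
        self.factor()
        while self.peek() in ("IDENTIFIER", "LITERAL", "[", "(", "{"):
            self.factor()

    def factor(self):       # FACTOR = IDENTIFIER | LITERAL | bracketed EXPRESSION .
        kind = self.peek()
        if kind in ("IDENTIFIER", "LITERAL"):
            self.i += 1
        elif kind in CLOSERS:
            self.i += 1
            self.expression()
            self.expect(CLOSERS[kind])
        else:
            raise SyntaxError(f"unexpected token {kind!r}")

WSN = '''
SYNTAX     = { PRODUCTION } .
PRODUCTION = IDENTIFIER "=" EXPRESSION "." .
EXPRESSION = TERM { "|" TERM } .
TERM       = FACTOR { FACTOR } .
FACTOR     = IDENTIFIER | LITERAL | "[" EXPRESSION "]"
           | "(" EXPRESSION ")" | "{" EXPRESSION "}" .
IDENTIFIER = letter { letter } .
LITERAL    = """" character { character } """" .
'''

names = Parser(tokenize(WSN)).syntax()
print(names)  # names of the productions that were recognized, in order
```

Run on the self-definition, the recognizer accepts the text and returns the seven production names. It only checks syntax; it does not verify that every identifier used on a right-hand side is defined somewhere.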

Notice that letter and character are left undefined. This is because numeric characters (digits 0 to 9) may be included in both definitions or excluded from one, depending on the language being defined, e.g.:

 digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" .
 upper-case = "A" | "B" | … | "Y" | "Z" .
 lower-case = "a" | "b" | … | "y" | "z" .
 letter = upper-case | lower-case .

If character goes on to include digit and other printable ASCII characters, then it diverges even more from letter, which one can assume does not include the digit characters or any of the special (non-alphanumeric) characters.
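
The effect of these choices can be made concrete with plain sets of characters. The sketch below is written under stated assumptions: letter is taken to be the 52 ASCII letters, character is assumed to be every printable ASCII character including the space, and is_identifier is a hypothetical helper that checks the production IDENTIFIER = letter { letter } against whichever letter set is supplied.

```python
import string

DIGIT = set(string.digits)
UPPER_CASE = set(string.ascii_uppercase)
LOWER_CASE = set(string.ascii_lowercase)
LETTER = UPPER_CASE | LOWER_CASE
# Assumption: "character" covers all printable ASCII characters, space included.
CHARACTER = {chr(code) for code in range(0x20, 0x7F)}

def is_identifier(s, letter=LETTER):
    """IDENTIFIER = letter { letter } : one or more symbols from the letter set."""
    return len(s) > 0 and all(ch in letter for ch in s)

print(is_identifier("TERM"))                # only letters
print(is_identifier("x1"))                  # digit rejected by the default letter set
print(is_identifier("x1", LETTER | DIGIT))  # accepted once digits count as letters
print(LETTER < CHARACTER)                   # letter is a proper subset of character
```

Passing a different set as the second argument models a language whose definition of letter includes the digits.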

Another example


The syntax of BNF can be represented with WSN as follows, obtained by translating BNF's definition of itself:

 syntax         = rule [ syntax ] .
 rule           = opt-whitespace "<" rule-name ">" opt-whitespace "::=" 
                  opt-whitespace expression line-end .
 opt-whitespace = { " " } .
 expression     = list [ "|" expression ] .
 line-end       = opt-whitespace EOL | line-end line-end .
 list           = term [ opt-whitespace list ] .
 term           = literal | "<" rule-name ">" .
 literal        = """" text """" | "'" text "'" .

This definition appears overcomplicated because the concept of "optional whitespace" must be explicitly defined in BNF, but it is implicit in WSN. Even in this example, text is left undefined, but it is assumed to mean "ASCII-character { ASCII-character }". (EOL is also left undefined.) Notice how the kludge "<" rule-name ">" has been used twice because text was not explicitly defined.

One of the problems with BNF that this example illustrates is that allowing both single-quote and double-quote characters to delimit a literal adds potential for human error when creating a machine-readable syntax. One idea that carried over to later metasyntaxes is that giving the user multiple choices makes it harder to write parsers for grammars defined by the syntax, so computer languages in general have become more restrictive in how a quoted literal is defined.

Syntax diagram


Syntax diagram:

References

  1. Wirth, Niklaus (November 1977). "What Can We Do about the Unnecessary Diversity of Notation for Syntactic Definitions?". Communications of the ACM. 20 (11): 822–823. doi:10.1145/359863.359883. S2CID 35182224.
  2. "ISO 10303-21, Industrial automation systems and integration — Product data representation and exchange — Part 21: Implementation methods: Clear text encoding of the exchange structure". International Organization for Standardization. 24 January 2002.
