Semantics

The semantics of a programming language describe the relationship between the syntax and the model of computation.

Keywords and phrases: Algebraic semantics, axiomatic semantics, denotational semantics, operational semantics, semantic algebra, semantic axiom, semantic domain, semantic equation, semantic function, loop variant, loop invariant, valuation function, sort, signature, many-sorted algebra


Semantics is concerned with the interpretation or understanding of programs and how to predict the outcome of program execution. The semantics of a programming language describe the relation between the syntax and the model of computation. Semantics can be thought of as a function which maps syntactical constructs to the computational model.

semantics: syntax --> computational model

This approach is called syntax-directed semantics.

There are several widely used techniques ( algebraic, axiomatic, denotational, operational, and translation) for the description of the semantics of programming languages.

Much of the work in the semantics of programming languages is motivated by the problems encountered in trying to construct and understand imperative programs---programs with assignment commands. Since the assignment command reassigns values to variables, the assignment can have unexpected effects in distant portions of the program.

Algebraic Semantics

An algebraic definition of a language is a definition of an algebra. An algebra consists of a domain of values and a set of operations (functions) defined on the domain.
Algebra = < set of values; operations >

Figure N.1 contains an example of an algebraic definition. It is an algebraic definition of a fragment of Peano arithmetic.


Figure N.1: Algebraic Definition of Peano Arithmetic
Domains:

Bool = {true, false} (Boolean values)
N in Nat (the natural numbers)
N ::= 0 | S(N)

Functions:

= : (Nat, Nat) -> Bool
+ : (Nat, Nat) -> Nat
× : (Nat, Nat) -> Nat

Axioms and equations:

not S(N) = 0
if S(M) = S(N) then M = N
( n + 0 ) = n
( m + S(n) ) = S( m + n )
( n × 0 ) = 0
( m × S(n)) = (( m × n) + m)

where m,n in Nat


The equations define equivalences between syntactic elements; they specify the transformations that are used to translate from one syntactic form to another. The domain is often called a sort and the domain and the function sections constitute the signature of the algebra. Functions with zero, one, and two operands are referred to as nullary, unary, and binary operations. Because there is more than one domain, the algebra is called a many sorted algebra. As in this example, abstract data types may require values from several different sorts. The signature of the algebra is a set of sorts and a set of functions taking arguments and returning values of different sorts. A stack of natural numbers may be modeled as a many-sorted algebra with three sorts (natural numbers, stacks and booleans) and four operations (newStack, push, pop, top, empty). Figure N.2 contains an algebraic definition of a stack.


 
Figure N.2: Algebraic definition of an Integer Stack ADT
Domains: 

Nat (the natural numbers 
Stack ( of natural numbers) 
Bool (boolean values) 

Functions: 

newStack: () -> Stack 
push : (Nat, Stack) -> Stack 
pop: Stack -> Stack 
top: Stack -> Nat 
empty : Stack -> Bool 

Axioms:   or 

pop(push(N,S)) = S 
top(push(N,S)) = N 
empty(push(N,S)) = false 
empty(newStack()) = true 

Errors: 

pop(newStack()) 
top(newStack()) where N in Nat and S in Stack.


 
 
 
 
 
 
 
 
 
 
 
 

Defining Equations: 

newStack() = [] 
push(N,S) = [N|S] 
pop([N|S]) = S 
top([N|S]) = N

 

In Figure N.1, the structure of the numbers is described. In Figure N.2 the structure of a stack is not defined. This means that we cannot use equations to describe syntactic transformations. Instead, we use axioms that describe the relationships between the operations. The axioms are more abstract than equations because the results of the operations are not described. To be more specific would require decisions to be made concerning the implementation of the stack data structure. Decisions which would tend to obscure the algebraic properties of stacks. The axioms impose constraints on the stack operations that are sound in the sense that they are consistent with the actual behavior of stacks reguardless of the implementation. Finding axioms that are complete, in the sense that they completely specify the behavior of the operations of an ADT, is more difficult.

The goal of algebraic semantics is to capture the semantics of behavior by a set of axioms with purely syntactic properties. Algebraic definitions (semantic algebras) are the favored method for defining the properties of abstract data types.

Axiomatic Semantics

The axiomatic semantics of a programming language are the assertions about relationships that remain the same each time the program executes. Axiomatic semantics are defined for each control structure and command. The axiomatic semantics of a programming language define a mathematical theory of programs written in the language. A mathematical theory has three components. The semantic formulas are triples of the form:
{P} c {Q}

where c is a command or control structure in the programming language, P and Q are assertions or statements concerning the properties of program objects (often program variables) which may be true or false. P is called a pre-condition and Q is called a post-condition. The pre- and post-conditions are formulas in some arbitrary logic and summarize the progress of the computation.

The meaning of

{P} c {Q}

is that if c is executed in a state in which assertion P is satisfied and c terminates, then c terminates in a state in which assertion Q is satisfied. We illustrate axiomatic semantics with a program to compute the sum of the elements of an array (see Figure N.3).


Figure N.3: Program to compute S = sumi=1nA[i]
S,I := 0,0
while I < n do
   S,I := S+A[I+1],I+1
end

The assignment statements are simultaneous assignment statements. The expressions on the righthand side are evaluated simultaneously and assigned to the variables on the lefthand side in the order they appear.

Figure N.4 illustrates the use of axiomatic semantics to verify the program of Figure N.3.


Figure N.4: Verification of S = sumi=1nA[i]
 
Pre/Post-conditions Code
1. { 0 = Sumi=10A[i], 0 < |A| = n }
2. S,I := 0,0
3. {S = Sumi=1IA[i], I <= n }
4. while I < n do
5. {S = Sumi=1IA[i], I < n }
6. {S+A[I+1] = Sumi=1I+1A[i], I+1 <= n }
7. S,I := S+A[I+1],I+1
8. { S = Sumi=1IA[i], I <= n }
9. end
10. {S = Sumi=1IA[i], I <= n, I >= n }
11. {S = Sumi=1nA[i] }
 

The program sums the values stored in an array and the program is decorated with the assertions which help to verify the correctness of the code. The pre-condition in line 1 and the post-condition in line 11 are the pre- and post-conditions respectively for the program. The pre-condition asserts that the array contains at least one element zero and that the sum of the first zero elements of an array is zero. The post-condition asserts that S is sum of the values stored in the array. After the first assignment we know that the partial sum is the sum of the first I elements of the array and that I is less than or equal to the number of elements in the array.

The only way into the body of the while command is if the number of elements summed is less than the number of elements in the array. When this is the case, The sum of the first I+1 elements of the array is equal to the sum of the first I elements plus the I+1st element and I+1 is less than or equal to n. After the assignment in the body of the loop, the loop entry assertion holds once more. Upon termination of the loop, the loop index is equal to n. To show that the program is correct, we must show that the assertions satisfy some verification scheme. To verify the assignment commands, we use the Assignment Axiom:

Assignment Axiom
{P[x:E]} x:= E {P}

This axiom asserts that:

If after the execution of the assignment command the environment satisfies the condition P, then the environment prior to the execution of the assignment command also satisfies the condition P but with E substituted for x (In this and the following axioms we assume that the evaluation of expressions does not produce side effects.).
An examination of the respective pre- and post-conditions for the asssignment statements shows that the axiom is satisfied.

To verify the while command of lines 4. 7 and 9, we use the Loop Axiom:

Loop Axiom:
{I /\ B /\ V > 0 } C {I /\ V > V' >= 0}

{I} while B do C end {I /\ ¬ B}

The assertion above the bar is the condition that must be met before the axiom (below the bar) can hold. In this rule, {I} is called the loop invariant. This axiom asserts that:

To verify a loop, there must be a loop invariant I which is part of both the pre- and post-conditions of the body of the loop and the conditional expression of the loop must be true to execute the body of the loop and false upon exit from the loop.
The invariant for the loop is: S = sumi=1IA[i], I <= n. Lines 6, 7, and 8 satisfy the condition for the application of the Loop Axiom. To prove termination requires the existence of a loop variant. The loop variant is an expression whose value is a natural number and whose value is decreased on each iteration of the loop. The loop variant provides an upper bound on the number of iterations of the loop.
A variant for a loop is a natural number valued expression V whose run-time values satisfy the following two conditions:
The loop variant for this example is the expression n - I. That it is non-negative is guaranteed by the loop continuation condition and its value is decreased by one in the assignment command found on line 7. More general loop variants may be used; loop variants may be expressions in any well-founded set (every decreasing sequence is finite). However, there is no loss in generality in requiring the variant expression to be an integer. Recursion is handled much like loops in that there must be an invariant and a variant. The correctness requirement for loops is stated in the following:
Loop Correctness Principle: Each loop must have both an invariant and a variant.
Lines 5 and 6 and lines 10 and 11 are justified by the Rule of Consequence.
Rule of Consequence:
P -> Q, {Q} C {R}, R -> S 
{P} C {S}

The justification for the composition the assignment command in line 2 and the while command requires the following the Sequential Composition Axiom.

Sequential Composition Axiom:
{P} C0 {Q}, {Q} C1 {R} 
{P} C0; C1 {R}

This axiom is read as follows:

The sequential composition of two commands is permitted when the post-condition of the first command is the pre-condition of the second command.
The following rules are required to complete the deductive system.
Selection Axiom:
{P /\ B} C0 {Q}, {P /\ ¬ B } C1 {Q} 
{P} if B then C0 else C1 fi {Q}
Conjunction Axiom:
{P} C {Q}, {P'} C {Q'} 
{P /\ P' } C {Q /\ Q'}
Disjunction Axiom:
{P} C {Q}, {P'} C {Q'} 
{P \/ P' } C {Q \/ Q'}

The axiomatic method is the most abstract of the semantic methods and yet, from the programmer's point of view, the most practical method. It is most abstract in that it does not try to determine the meaning of a program, but only what may be proved about the program. This makes it the most practical since the programmer is concerned with things like, whether the program will terminate and what kind of values will be computed.

Axiomatics semantics are appropiate for program verification and program derivation.

Assertions for program construction

The axiomatic techniques may be applied to the construction of software. Rather than proving the correctness of an existing program, the proof is integrated with the program construction process to insure correctness from the start. As the program and proof are developed together, the assertions themselves may provide suggestions which facilitate program construction.

Loops and recursion are two constructs that require invention on the part of the programmer. The loop correctness principle requires the programmer to come up with both a variant and an invariant. Recursion is a generalization of loops so proofs of correctness for recursive programs also require a loop variant and a loop invariant. In the summation example, a loop variant is readily appearent from an examination of the post-condition. Simply replace the summation upper limit, which is a constant, with a variable. Initializing the sum and index to zero establishes the invariant. Once the invariant is established, either the index is equal to the upper limit in which case there sum has been computed or the next value must be added to the sum and the index incremented reestablishing the loop invariant. The position of the loop invariants define a loop body and the second occurrence suggests a recursive call. A recursive version of the summation program is given in Figure N.5.


Figure N.5: Recursive version of summation
S,I := 0,0
loop: if I < n then S,I := S+A[I+1],I+1; loop
      else skip fi

The advantage of using recursion is that the loop variant and invariant may be developed separately. First develop the invariant then the variant.

The summation program is developed from the post-condition by replacing a constant by a variable. The initialization assigns some trivial value to the variable to establish the invariant and each iteration of the loop moves the variable's value closer to the constant.

A program to perform integer division by repeated subtraction can be developed from the post-condition { 0 <= r < d, (a = q× d + r) } by deleting a conjunct. In this case the invariant is { 0 <= r, (a = q× d + r) } and is established by setting the the quotient to zero and the remainder to a.

Another technique is called for in the construction of programs with multiple loops. For example, the post condition of a sorting program might be specified as:

{ forall i.(0 < i < n -> A[i] <= A[i+1]), s = perm(A)}

or the post condition of an array search routine might be specifies as:

{ if exists i.(0 < i <= n and t = A[i]) then location = i else location = 0}

To develop an invariant in these cases requires that the assertion be strengthened by adding additional constraints. The additional constraints make assertions about different parts of the array.

Denotational Semantics

A denotational definition of a language consists of three parts: the abstract syntax of the language, a semantic algebra defining a computational model, and valuation functions. The valuation functions map the syntactic constructs of the language to the semantic algebra. Recursion and iteration are defined using the notion of a limit. the programming language constructs are in the syntactic domain while the mathematical entity is in the semantic domain and the mapping between the various domains is provided by valuation functions. Denotational semantics relies on defining an object in terms of its constituent parts. The Figure N.6 is an example of a denotational definition.


 
Figure N.6: Denotational definition of Peano Arithmetic
Abstract Syntax: 

N in Nat (the Natural Numbers) 
N ::= 0 | S(N) | (N + N) | (N × N) 

Semantic Algebra: 

Nat (the natural numbers (0, 1, ...) 
+ : Nat -> Nat -> Nat 

Valuation Function: 

D : Nat -> Nat 

D[( n + 0 )] = D[n] 
D[( m + S(n) )] = D[(m+n)] + 1 
D[( n × 0 )] = 0 
D[( m × S(n))] = D[ (( m × n) + m) ] 

where m,n in Nat

 

It is is a denotational definition of a fragment of Peano arithmetic. Notice the subtle distinction between the syntactic and semantic domains. The syntactic expressions are mapped into an algebra of the natural numbers by the valuation function. The denotational definition almost seems to be unnecessary. Since the syntax so closely resembles that of the semantic algebra.

Programming languages are not as close to their computational model. Figure N.7 is a denotational definition of the small imperative programming language Simple encountered in the previous chapter.



 
Figure N.7: Denotational semantics for Simple
Abstract Syntax:

C in Command
E in Expression
O in Operator
N in Numeral
V in Variable

C ::= V := E | if E then C1 else C2 end |
      while E do C3 end | C1;C2 | skip
E ::= V | N | E1 O  E2  | (E)
O ::= + | - | * | / | = | < | > | <>

Semantic Algebra:
Domains:

tau in  T = {true, false}; the  boolean values
zeta in  Z = {...-1,0,1,...}; the integers
+ : Z -> Z -> Z
...
= : Z -> Z -> T
...
sigma in  S = Variable -> Numeral; the state

Valuation Functions:

C in C -> (S -> S)
E in E -> E -> (N union T)

C[ skip ] sigma = sigma
C[ V := E ] sigma = sigma [ V:E[ E ] sigma
C[ C1; C2 ] = C[ C2 ] · C[ C1]
C[ if E then C1 else C2 end ] sigma
          = C[ C1 ] sigma  if E[ E ]sigma = true
          = C[ C2 ] sigma  if E[ E ]sigma = false
C[ while E do C end}]sigma
          = limn -> infty C[ (if E then C else skip end)n ] sigma
E[ V ] sigma = sigma(V)
E[ N ] = zeta
E[ E1+E2 ] = E[ E ] sigma + E[ E ] sigma
...
E[ E1=E2 ] sigma = E[ E ] sigma = E[ E ] sigma

Denotational definitions are favored for theoretical and comparative programming language studies. Denotational definitions have been used for the automatic construction of compilers for the programming language. Denotations other than mathematical objects are possible. For example, a compiler writer would prefer that the object denoted would be appropriate object code. Systems have been developed for the automatic construction of compilers from the denotation specification of a programming language.

Operational Semantics

An operational definition of a language consists of two parts: an abstract syntax and an interpreter. An interpreter defines how to perform a computation. When the interpreter evaluates a program, it generates a sequence of machine configurations that define the program's operational semantics. The interpreter is an evaluation relation that is defined by rewriting rules. The interpreter may be an abstract machine or recursive functions. Figure N.8 is an example of an operational definition.


Figure N.8: Operational semantics for Peano arithmetic
Abstract Syntax:

N in Nat (the natural numbers)

N ::= 0 | S(N) | (N + N) | (N × N)

Interpreter:

I: N -> N

I[ ( n + 0 ) ]     ==> n
I[ ( m + S(n) ) ]  ==> S( I[ (m+n ) ] )
I[ ( n × 0 ) ]     ==> 0
I[ ( m × S(n)) ]   ==> I[ (( m × n) + m) ]


where m,n in Nat

It is is an operational definition of a fragment of Peano arithmetic.

The interpreter is used to rewrite natural number expressions to a standard form (a form involving only S and 0 ) and the rewriting rules show how move the + and × operators inward toward the base cases. Operational definitions are favored by language implementors for the construction of compilers and by language tutorials because operational definitions describe how the actions take place.

The operational semantics of Simp is found in Figure N.9.


Figure N.9: Operational semantics for Simple
Interpreter:

I: C × Sigma -> Sigma
{nu} in E × Sigma} -> T union Z

Semantic Equations:

I(skip,sigma) = sigma
I(V := E,sigma) = sigma[V:nu(E,sigma)]
I(C1 ;C2,sigma) = E(C2,E(C1,sigma))
I(if E then C1 else C2 end,sigma) = I(C1,sigma)&if nu(E,sigma) = true}
                                                   I(C2,sigma)&if nu(E,sigma) = false}
while E do C end = if E then (C;while E do C end) else skip
nu(V,sigma) = sigma(V)
nu(N,sigma) = N
nu(E1+E2,sigma) = nu(E1,sigma) + nu(E2,sigma)
...
nu(E1=E2,sigma) = true if nu(E,sigma) = nu(E,sigma)}
             false if nu(E,sigma) != nu(E,sigma)}
otherwise
...

The operational semantics are defined by using two semantic functions, I which interprets commands and nu which evaluates expressions. The interpreter is more complex since there is an environment associated with the program with does not appear as a syntactic element and the environment is the result of the computation. The environment (variously called the store or referencing environment}) is an association between variables and the values to which they are assigned. Initially the environment is empty since no variable has been assigned to a value. During program execution each assignment updates the environment. The interpreter has an auxiliary function which is used to evaluate expressions. The while command is given a recursive definition but may be defined using the interpreter instead. Operational semantics are particularly useful in constructing an implementation of a programming language.

Pragmatics

The use of formal semantic description techniques is playing an increasing role in software engineering. Algebraic semantics are useful for the specification of abstract data types. However, the lack of robust theorem provers has limited the effective use axiomatic semantics for program varification. Denotational semantics are beginning to play a role in compiler construction and a prescriptive rather than a descriptive role in the design of programming languages. Operational semantics have always proved helpful in the design of compilers.

Historical Perspectives and Further Reading

An excellent short introduction to operational, denotational and axiomatic semantics is found in Other references include:

Exercises

  1. (axiomatic) Give axiomatic semantics for the following:
    1. Multiple assignment command:  x0,...,xn := e0,...,en
    2. The following commands are a nondeterministic if and a nondeterministic loop.  The IF command allows for a choice between alternatives while the DO command provides for iteration.  In their simplest forms, an IF statement corresponds to an If condition then command and a LOOP statement corresponds to a While condition Do command.

    3.  
      IF guard --> command FI = if guard then command
      LOOP guard --> command POOL = while guard do command 
      A command proceded by a guard can only be executed if the guard is true.  In the general case, the semantics of the IF - FI and LOOP - POOL commands requires that only one command corresponding to a guard that is true be selected for execution.  The selection is nondeterministic..  Define the axiomatic semantics for the IF and LOOP commands:
      1. if c0 -> s0
           ...
          cn -> sn 
        fi
      2. do c0 -> s0 
           ...
           cn -> sn 
        od
    4. A for statement
    5. A repeat-until statement
  2. (axiomatic) Use assertions to guide the construction of the following programs.
    1. Linear search
    2. Integer division implemented by repeated subtraction.
    3. Factorial function
    4. Fn the n-th Fibonacci number where F0 = 0, F1 = 1, and Fi+2 = Fi+1 + Fi for i >= 0.
    5. Binary search
    6. Quick sort
  3. (algebraic) Construct algebraic semantics for the following:
    1. Stack
    2. List
    3. Queue
    4. Binary search tree
    5. Graph
    6. Grade book
    7. Complex numbers
    8. Rational numbers
    9. Floating point numbers
    10. Simple
  4. (denotational) Construct denotational semantics for the following:
    1. Stack
    2. List
    3. Queue
    4. Binary search tree
    5. Graph
    6. Grade book
    7. Complex numbers
    8. Rational numbers
    9. Floating point numbers
    10. Simple
    11. Show that the following code denotes the same function.
    12. int f (int n)
      { if n > 1 then n*f(n-1)
                 else 1
      }
      
      int f (int n)
      { int t = 1;
        while n > 1 do {
          t := t*n;
          n := n-1}
      }
  5. (operational) Construct operational semantics for the following:
    1. Stack
    2. List
    3. Queue
    4. Binary search tree
    5. Graph
    6. Grade book
    7. Complex numbers
    8. Rational numbers
    9. Floating point numbers
    10. Simple
  6. (correctness) Construct an implementation of the following and show that your implementation is correct by showing that it satisfies a semantics.
    1. Stack
    2. List
    3. Queue
    4. Binary search tree
    5. Graph
    6. Grade book
    7. Complex numbers
    8. Rational numbers
    9. Floating point numbers
    10. Simple

Permission to make digital/hard copy of part or all of this work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication, and its date appear, and notice is given that copying is by permission of Anthony A. Aaby. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or fee.
© 1998 Anthony A. Aaby.  Send comments to aabyan@wwc.edu