Imperative programming languages
reflect the architecture of the underlying von Neumann stored
program computer: Programs update memory locations under the
control of instructions. Execution is (for the most part)
sequential. Sequential execution is governed by a program counter.
Imperative programs are prescriptive. They dictate precisely how a
result is to be computed by means of a sequence of statements to be
performed by the computer. Consider this program using the Small
language developed in Chap. 6.
What do we want to know about the
program in Fig. 7.1? Are we concerned with a detailed
description of what happens when the computer runs this? Do we want
to know what the PC is set to when the program finishes? Are we
interested in what is in memory location 13 after the second
iteration of the loop? These questions are not ones that need to be
answered. They don’t tell us anything about what the program
does.
Instead, if we want to understand the
program we want to be able to describe the relationship between the
input and the output. The output is the remainder after dividing
the first input value by the second input. If this is what we are
really concerned about then why not program by describing
relationships rather than prescribing a set of steps. In Logic
Programming the programmer describes the logical structure of a
problem rather than prescribing how a computer is to go about
solving it. Languages for Logic Programming are called:

Fig.
7.1
A small sample
-
Descriptive languages: Programs are expressed as known facts and logical relationships about a problem. Programmers assert the existence of the des ired result and a logic interpreter then uses the computer to find the desired result by making inferences to prove its existence.
-
Nonprocedural languages: The programmer states only what is to be accomplished and leaves it to the interpreter to determine how it is to be accomplished.
-
Relational languages: Desired results are expressed as relations or predicates instead of as functions. Rather than define a function for calculating a square root, the programmer defines a relation, say
, that is true exactly when
.
While there are many application
specific logic programming languages, there is one language that
stands out as a general purpose logic programming language. Prolog
is the language that is most commonly associated with logic
programming. The model of computation for Prolog is not based on
the Von Neumann architecture. It’s based on the mechanism in logic
called unification. Unification is the process where variables are
unified to terms.
This text has explored a variety of
languages from the JCoCo assembly language, to Java and C++, to
Standard ML, and now Prolog. These languages reflect a continuum
from prescriptive languages to descriptive languages.
-
Assembly language is a very prescriptive language, meaning that you must think in terms of the particular machine and solve problems accordingly. Programmers must think in terms of the von Neumann machine stored program computer model.
-
C++ is a high-level language and hence allows you to think in a more descriptive way about a problem. However, the underlying computational model is still the von Neumann machine.
-
Standard ML is a high-level language too, but allows the programmer to think in a mathematical way about a problem. This language gets away from the traditional von Neumann model in some ways.
-
Prolog takes the descriptive component of languages further and lets programmers write programs based solely on describing relationships.
Prolog was developed in 1972. Alain
Colmerauer, Phillipe Roussel, and Robert Kowalski were key players
in the development of the Prolog language. It is a surprisingly
small language with a lot of power. The Prolog interpreter operates
by doing a depth first search of the search space while unifying
terms to try to come to a conclusion about a question that the
programmer poses to the interpreter. The programmer describes facts
and relationships and then asks questions.
This simple model of programming has
been used in a wide variety of applications including automated
writing of real estate advertisements, an application that writes
legal documents in multiple languages, another that analyzes social
networks, and a landfill management expert system. This is only a
sampling of the many, many applications that have been written
using this simple but powerful programming model.
7.1 Getting Started with Prolog
If you don’t already have a Prolog
interpreter, you will want to download one and install it. There
are many versions of Prolog available. Some are free and some are
not. The standard free implementation is available at http://www.swi-prolog.org. There
are binary distributions available for Microsoft Windows, Mac OS X,
and Linux, so there should be something to suit your needs.
Unlike SML, there is no way to write a
program interactively with Prolog. Instead, you write a text file,
sometimes called a database, containing a list of facts and
predicates. Then you start the Prolog interpreter, consult the
file, and ask yes or no questions that the Prolog interpreter tries
to prove are true.
To start the Prolog interpreter you
type either pl or
swipl depending on your
installation of SWI Prolog. To exit the interpreter type a
ctl-d. A Prolog program is
a database of facts and predicates that can be used to establish
further relationships among those facts. A predicate is a function
that returns true or false. Prolog programs describe relationships.
A simple example is a database of facts about several people in an
extended family and the relationships between them as shown in
Fig. 7.2.
Questions we might ask:
- 1.
Is Gary’s father Sophus?
- 2.
Who are Kent’s fathers?
- 3.
For who is Lars a father?
These questions can all be answered by
Prolog given the database in Fig. 7.2.

Fig.
7.2
The family tree
7.2 Fundamentals
Prolog programs
(databases) are composed of facts. Facts describe relationships
between terms. Simple terms include numbers and atoms. Atoms are
symbols like sophus that
represent an object in our universe of discourse. Atoms MUST start
with a small letter. Numbers start with a digit and include both
integers and real numbers. Real numbers are written in scientific
notation. For instance, 3.14159e0 or just 3.14159 when the exponent
is zero.
A predicate is a function that returns
true or false. Predicates are defined in Prolog by recording a fact
or facts about them. For instance, Fig. 7.2 establishes the fact
that Johan was the parent of Sophus. parent is a predicate representing a
true fact about the relationship of johan and sophus.
Frequently terms include variables in
predicate definitions to establish relationships between groups of
objects. A variable starts with a capital letter. Variables are
used to establish relationships between classes of objects. For
instance, to be a father means that you must be a parent of someone
and be male. In Fig. 7.2 the father predicate is defined by writing
which means X is the
father of Y if X is the parent of Y and X is male. The symbol:− is read as
if and the comma in the
predicate definition is read as and. So X is a father of Y if X is a parent of Y and X is male.

Practice 7.1
What are the terms in
Fig. 7.2?
What is the difference between an atom and a variable? Give
examples of terms, atoms, and variables from Fig. 7.2.
You
can check your answer(s) in Section 7.17.1.
To program in Prolog the programmer
first writes a database like the one in Fig. 7.2. Then the programmer
consults the database so the Prolog interpreter can internally
record the facts that are written there. Once the database has been
consulted, questions can be asked about the database. Questions
asked of Prolog are limited to yes or no questions that are posed
in terms of the predicates in the database. A question posed to
Prolog is sometimes called a query. To discover if Johan is the
father of Sophus you start Prolog using pl or swipl, then consult the database, and
pose the query.
Queries may also contain variables. If we want to find out who the
father of sophus is we can ask that of Prolog by replacing the
father position in the predicate with a variable. When using a
variable in a query Prolog will answer yes or no. If the answer is
yes, Prolog will tell us what the value of the variable was when
the answer was yes. If there is more than one way for the answer to
be yes then typing a semicolon will tell Prolog to look for other
values where the query is true.
The final No is Prolog
telling us there are no other ways for parent (X, kent) to be true.


7.3 The Prolog Program
Prolog performs
unification to search for a
solution. Unification is simply a list of substitutions of terms
for variables. A query of the database is matched with its
predicate definition in the database. Terms in the query are
matched when a suitable pattern is found among the parameters of a
predicate in the database. If the matched predicate is dependent on
other predicates being true, then those queries are posed to the
Prolog interpreter. This process continues until either Prolog
finds that no substitution will satisfy the query or it finds a
suitable substitution.
Prolog uses depth first search with
backtracking to search for a valid substitution. In its search for
truth it will unify variables to terms. Once a valid substitution
is found it will report the substitution and wait for input. In
Sect. 7.2
the interpreter reports that
is a substitution that makes parent(X, kent) true. Prolog waits
until either return is
pressed or a semicolon is entered. When the semicolon is entered,
Prolog undoes the last successful substitution it made and
continues searching for another substitution that will satisfy the
query. In Sect. 7.2 Prolog reports that
will
satisfy the query as well. Pressing semicolon one more time undoes
the
substitution, Prolog continues its depth first search looking for
another substitution, finds none, and reports No indicating that the search has
exhausted all possible substitutions.



Unification finds a substitution of
terms for variables or variables for terms. Unification is a
symmetric operation. It doesn’t work in only one direction. This
means (among other things) that Prolog predicates can run backwards
and forwards. For instance, if you want to know who Kent’s dad is
you can ask that as easily as who is Gary the father of. In the
following example we find out that gary is the father of kent. We also find out who gary is the father of.

Practice 7.2
Write predicates that define the
following relationships.
- 1.
brother
- 2.
sister
- 3.
grandparent
- 4.
grandchild
Depending on how you wrote grandparent
and grandchild there might be something to note about these two
predicates. Do you see a pattern? Why?
You
can check your answer(s) in Section 7.17.2.
7.4 Lists
Prolog supports lists as a data
structure. A list is constructed the same as in ML. A list may be
empty which is written as [] in Prolog. A non-empty list is
constructed from an element and a list. The construction of a list
with head, H, and tail, T, is written as[H | T]. So, [1,2,3] can
also be written as [1 | [2 | [3 | []]]]. The list [a | []] is
equivalent to writing [a]. Unlike ML, lists in Prolog do not need
to be homogeneous. So [1, hi, 4.3] is a valid Prolog list.
By virtue of the fact that Prolog’s
algorithm is depth first search combined with unification, Prolog
naturally does pattern matching. Not only does [H | T] work to
construct a list, it also works to match a list with a variable.
Append can be written as a relationship between three lists. The
result of appending the first two lists is the third argument to
the append predicate. The first fact below says appending the empty
list to the front of Y is
just Y. The second fact
says that appending a list whose first element is H to the front of L2 results in [H|T3] when appending T1 and L2 results in T3.
Try out append both backwards and forwards! The definition of
append can be used to
define a predicate called sublist as follows:
Stated in English this says that X is a sublist of Y if you can append something on the
front of X to get
L and something else on the
end of L to get
Y. The underscore is used
in predicate definitions for values we don’t care about.


To prove that sublist([1],[1,2]) is true we can use
the definition of sublist
and append to find a
substitution for which the predicate holds.
Figure 7.3 provides a proof that [1] is a sublist of [1,2].

Fig.
7.3
A unification tree
Practice 7.3
What is the complexity of the append
predicate? How many steps does it take to append two lists?
You
can check your answer(s) in Section 7.17.3.
Practice 7.4
Write the reverse predicate for lists
in Prolog using the append predicate. What is the complexity of
this reverse predicate?
You
can check your answer(s) in Section 7.17.4.
7.5 The Accumulator Pattern
The slow version of reverse from
practice problem 7.4 can be improved upon. The accumulator
pattern can be applied to Prolog as it was in SML. Looking back at
the solution to practice
problem 5.17, the ML solution can be
rewritten to apply to Prolog as well. In the ML version an
accumulator argument was added to the function that allowed the
helprev helper function to
accumulate the reversed list without the use of append.
Unlike SML, Prolog does not have any facility for defining local
functions with limited scope. If using helper predicates in a
Prolog program the user and/or programmer must be trusted to invoke
the correct predicates in the correct way.

Applying what we learned from the ML
version of reverse to Prolog results in a helprev predicate with an
extra argument as well. In many ways this is the same function
rewritten in Prolog syntax. The only trick is to remember that you
don’t write functions in Prolog. Instead, you write predicates.
Predicates are just like functions with an extra parameter. The
extra parameter establishes the relationship between the input and
the output.
Sometimes in Prolog it is useful to
think of input and output parameters. For instance, with append
defined as a predicate it might be useful to think of the first two
parameters as input values and the third as the return value. While
as a programmer it might sometimes be useful to think this way,
this is not how Prolog works. As was shown in
Sect. 7.4, append works both backwards and forwards.
But, thinking about the problem in this way may help identifying a
base case or cases. When the base cases are identified, the problem
may be easier to solve.
Practice 7.5
Write the reverse predicate using a
helper predicate to make a linear time reverse using the
accumulator pattern.
You
can check your answer(s) in Section 7.17.5.
7.6 Built-In Predicates
Prolog offers a few built in
predicates. The relational operators (<, >,
,
, and
) all work on
numbers and are written in infix form. Notice that not equals is
written as \
in
Prolog.




To check that a predicate doesn’t
hold, the not predicate is
provided. Preceding any predicate with not insists the predicate returns
false. For instance, not(5 >
6) returns true because 5
> 6 returns false.
The atom predicate returns true if the
argument is an atom. So atom(sophus) is true but atom(5) is not. The number predicate returns true if the
argument is a number. So number(5) is true but number(sophus) is not.
7.7 Unification and Arithmetic
The Prolog interpreter does a depth
first search of the search space while unifying variables to terms.
The primary operation that Prolog carries out is unification.
Unification can be represented explicitly in a Prolog program by
using the equals (i.e.
) operator. When equals is used, Prolog attempts to
unify the terms that appear on each side of the operator. If they
can be unified, Prolog reports yes and continues unifying other
terms to try to find a substitution that satisfies the query. If no
substitution is possible, Prolog will report no.

You might have caught yourself wanting
to write something like
in some of the practice problem s. This is
normal, but is the sign of a novice Prolog programmer. Writing
in a
predicate definition is never necessary. Instead, everywhere
Y appears in the predicate,
write X instead.


Unification has one other little
nuance that most new Prolog programmers miss. There is no point in
unifying a variable to a term if that variable is used only once in
a predicate definition. Unification is all about describing
relationships. Unification doesn’t mean much when a variable is not
used in more than one place in a definition. In terms of imperative
programming it’s kind of like storing a value in a variable and
then never using the variable. What’s the point? Prolog warns us
when we do this by saying

If this happens, look for a variable
called X (or whatever the
variable name is) that is used only once in a predicate definition
and replace it with an underscore (i.e. _). An underscore indicates the result
of unification in that position of a predicate isn’t needed by the
current computation. Prolog warns you of singleton variables
because they are a sign that there may be an error in a predicate
definition. If an extra variable exists in a predicate definition
it may never be instantiated. If that is the case, the predicate
will always fail to find a valid substitution. While singleton
variables should be removed from predicate definitions, the message
is only a warning and does not mean that the predicate is
wrong.
The use of equality for unification
and not for assignment statements probably seems a little odd to
most imperative programmers. The equals operator is not the
assignment operator in Prolog. It is unification. Assignment and
unification are different concepts. Writing X
in Prolog
means that the variable X must be equal to the term 6
5, not 30.
The equals operator doesn’t do arithmetic in Prolog. Instead, a
special Prolog operator called is is used. To compute 6
5 and assign
the result to the variable X the Prolog programmer writes
X is 6
5 as part of
a predicate. Using the is
operator succeeds when the variable on the left is unbound and the
expression on the right doesn’t cause an exception when computed.
All values on the right side of the is predicate must be known for the
operation to complete successfully. Arithmetic can only be
satisfied in one direction, from left to right. This means that
predicates involving arithmetic can only be used in one direction,
unlike the append predicate and other predicates that don’t involve
arithmetic.




Practice 7.6
Write a length predicate that computes
the length of a list.
You
can check your answer(s) in Section 7.17.6.
7.8 Input and Output
Prolog programs can read from standard
input and write to standard output. Reading input is a side-effect
so it can only be satisfied once. Once read, it is impossible to
unread something. The most basic predicates for getting input are
get_char(X) which
instantiates X to the next
character in the input (whatever it is) and get(X) which instantiates X to the next non-whitespace character.
The get_char predicate
instantiates X to the
character that was read. The get predicate instantiates X to the ASCII code of the next
character.
There is also a predicate called
read(X) which reads the
next term from the input. When X is uninstantiated, the next term is
read from the input and X
is instantiated with its value. If X is already instantiated, the next
term is read from the input and Prolog attempts to unify the two
terms.
As a convenience, there are certain
libraries that also may be provided with Prolog. The readln predicate may be used to read an
entire line of terms from the keyboard, instantiating a variable to
the list that was read. The readln predicate has several arguments
to control how the terms are read, but typically it can be used by
writing readln(L, _, _, _,
lowercase).
Reading input from the keyboard, no matter which predicate is used,
causes Prolog to prompt for the input by printing a
to the screen. If the readln predicate is invoked as shown
above, entering the text below will instantiate L to the list as shown.
The print(X) predicate will
print a term to the screen in Prolog. The value of its argument
must be instantiated to print it. Print always succeeds even if the
argument is an uninstantiated variable. However, printing an
uninstantiated variable results in the name of the variable being
printed which is probably not what the programmer wants. When a
query is made in Prolog, each variable is given a unique name to
avoid name collisions with other predicates the query may be
dependent on. Prolog assigns these unique names and they start with
an underscore character. If an uninstantiated variable is printed,
you will see it’s Prolog assigned unique name.
The print predicate is
satisfied by unifying the variable with the name of Prolog’s
internal unique variable name which is almost certainly not what
was intended. The print
predicate should never be invoked with an uninstantiated
variable.




7.9 Structures
Prolog terms include numbers, atoms,
variables and one other important type of term called a structure.
A structure in Prolog is like a datatype in SML. Structures are
recursive data structures that are used to model structured data.
Computer scientists typically call this kind of structured data a
tree because they model recursive, hierarchical data. A structure
is written by writing a string of characters preceding a tuple of
some number of elements. Consider implementing a lookup predicate
for a binary search tree in Prolog. A tree may be defined
recursively as either nil
or a btnode(Val, Left,
Right) where Val is
the value stored at the node and Left and Right represent the left and right
binary search trees. The recursive definition of a binary search
tree says that all values in the left subtree must be less than
Val and all values in the
right subtree must be greater than Val. For this example, let’s assume
that binary search trees don’t have duplicate values stored in
them.
A typical binary search tree structure
might look something like the term below and corresponds to the
tree shown graphically in Fig. 7.4.


Fig.
7.4
Search tree
Items may be inserted into and deleted
from a binary search tree. Since Prolog programmers write
predicates, the code to insert into and delete from a binary search
tree must reflect the before and after picture. Because a binary
search tree is recursively defined, each part of the definition
will be part of a corresponding case for the insert and delete
predicates. So, inserting into a search tree involves the value to
insert, the tree before it was inserted, and the tree after it was
inserted. Similarly, a delete predicate involves the same three
arguments.
Looking up a value in a binary search
tree results in a true or false response, which is the definition
of a predicate. Writing a lookup predicate requires the value and
the search tree in which to look for the value.
Practice 7.7
Write a lookup predicate that looks up
a value in a binary search tree like the kind defined in this
section.
You
can check your answer(s) in Section 7.17.7.
7.10 Parsing in Prolog
As mentioned earlier in the text,
Prolog originated out of Colmerauer’s interest in using logic to
express grammar rules and to formalize the parsing of natural
language sentences. Kowalski and Comerauer solved this problem
together and Colmerauer figured out how to encode the grammar as
predicates so sentences could be parsed efficiently. The next
sections describe the implementation of parsing Colmerauer devised
in 1972. Consider the following context-free grammar for English
sentences.

Given a sequence of tokens like “the
professor discovered a group.”, Chap. 2 showed that a parse tree can be
used to demonstrate that a string is a sentence in the language and
at the same time displays its syntactic structure.
Practice 7.8
Construct the parse tree for “the
professor discovered a group.” using the grammar in this
section.
You
can check your answer(s) in Section 7.17.8.
Prolog is especially well suited to
parse sentences like the one in practice problem 6.8. The language has built in
support for writing grammars and will automatically generate a
parser given the grammar of a language. How Prolog does this is not
intuitively obvious. The grammar is taken through a series of
transformations that produce the parser. The next few pages present
these transformations to provide insight into how Prolog generates
parsers.
Parsing in Prolog requires the source
program, or sentence, be scanned as in the parser implementations
presented in Chaps. 2 and 3. The readln predicate will suffice to read a
sentence from the keyboard and scan the tokens in it. Using the
readln predicate to read
the sentence, “the professor discovered a group.”, produces the
list [the, professor, discovered, a, group,’.’].
A Prolog parser is a top-down or
recursive-descent parser. Because the constructed parser is
top-down, the grammar must be LL(1). There cannot be any
left-recursive productions in the grammar. Also, because Prolog
uses backtracking, there cannot be any productions in the grammar
with common prefixes. If there are any common prefixes, left
factorization must be performed. Fortunately, the grammar presented
in this section is already LL(1).

Fig.
7.5
Sentence structure
The Prolog parser will take the list
of tokens and produce a Prolog structure. The structure is the
Prolog representation of the abstract syntax tree of the sentence.
For instance, the sentence, “the professor discovered a group.”,
when parsed by Prolog, yields the term sen(sub(det(the),
noun(professor)), pred(verb(discovered), sub(det(a),
noun(group)))).
The logic programming approach to
analyzing a sentence in a grammar can be viewed in terms of a graph
whose edges are labeled by the tokens or terminals in the language.
Figure 7.6 contains a graph representation of a
sentence. Two terminals are contiguous in the original string if
they share a common node in the graph.

Fig.
7.6
A sentence graph
A sequence of contiguous labels
constitutes a nonterminal if the sequence corresponds to the
right-hand side of a production rule in the grammar. The contiguous
sequence may then be labeled with the nonterminal. In
Fig. 7.5
three nonterminals are identified. To facilitate the representation
of graphs like Fig. 7.6 in Prolog the nodes of the graph are given
labels. Positive integers are convenient labels to use as shown in
Fig. 7.7.

Fig.
7.7
A labeled sentence graph
The graph for the sentence can be
represented in Prolog by entering the following facts. These
predicates reflect the end points of their corresponding labeled
edge in the graph.
Using the labeled graph in Fig. 7.7, nonterminals in the
grammar can be represented by predicates. For instance, the subject
of a sentence can be represented by a subject predicate. The subject(K, L) predicate means that the
path from node K to node L can be interpreted as an instance of the
subject nonterminal.

For example, subject(4, 6) should
return true because edge (4, 5) is labeled by a determiner “a” and
edge (5, 6) is labeled by the noun “group’‘. To define a sentence
predicate there must exist a determiner and a noun. The rule for
the sentence predicate is
The common variable M
insure the determiner immediately precedes the noun.

Practice 7.9
Construct the predicates for the rest
of the grammar.
You
can check your answer(s) in Section 7.17.9.
The syntactic correctness of the
sentence, “the professor discovered a group.” can be determined by
either of the following queries
The sentence is recognized by the parser when the paths in the
graph corresponding to the nonterminals in the grammar are
verified. If eventually a path for the sentence nonterminal is
found then the sentence is valid. The paths in the graph of the
sentence are shown in Fig. 7.8. Note the similarity of the structure
exhibited by the paths in the graph with the tree of the sentence.
If you use your imagination a bit you can see the parse tree upside
down (or right-side up for your non-programming friends).


Fig.
7.8
An upside down parse tree
7.10.1 Difference Lists
There are a couple of problems with
the development of the parser above. First, entering the sentence
as facts like the(1,2) and
professor(2,3) is
impractical and awkward. There would have to be some preprocessing
on the list to get it in the correct format to be parsed. While
this could be done, a better solution exists. The other problem
concerns what the parser does. So far the parser only recognizes a
syntactically valid sentence and does not produce a representation
of the abstract syntax tree for the sentence.
Labeling the nodes of the graph above
with integers was an arbitrary decision. The only requirement of
labeling nodes in the graph requires that it be obvious when two
nodes in the graph are connected. Both problems above can be solved
by letting sublists of the sentence label the graph instead of
labeling the nodes with integers. These sublists are called
difference lists. A difference list represents the part of the
sentence that is left to be parsed. The difference between two
adjacent nodes is the term which labels the intervening edge. The
difference list representation of the graph is shown in
Fig. 7.9.
Using difference lists, two nodes are connected if their difference
lists differ by only one element. This connection relationship can
be expressed as a Prolog predicate.

Fig.
7.9
Difference lists
This is the connect predicate and the
grammar rewritten to use the connect predicate.
The c (i.e. connect)
predicate says that the node labeled [H|T] is connected to the node labeled
T and the edge connecting
the two nodes is labeled H.
This predicate can be used for the terminals in the grammar in
place of the facts given above.
The graph need not be explicitly created when this representation
is employed. The syntactic correctness of the sentence, “the
professor discovered a group.” can be recognized by the following
query.
The parsing succeeds because the node labeled with [the, professor,
discovered, a, group, ‘.’] can be joined to the node labeled with
[] via the intermediate nodes involved in the recursive descent
parse of the sentence. Because Prolog predicates work backwards as
well as forward, it is just as easy to explore all the sentences of
this grammar by posing this query to the Prolog interpreter.
This reveals that there are 126 different sentences defined by the
grammar. Some of the sentences are pretty non-sensical like “the
group discovered a group.”. Some of the sentences like “the group
jailed the professor.” have some truth to them. Sophus Lie used to
walk to many of the places he visited partly because he liked to
walk and partly because he had little money at the time. He also
liked to draw sketches of the countryside when hiking. He was
jailed in France when France and Germany were at war because the
French thought he was a German spy. It was understandable since he
was walking through the countryside talking to himself in Norwegian
(which the French thought might be German). When they stopped to
question him, they found his notebook full of Mathematical formulas
and sketchings of the French countryside. He spent a month in
prison until they let him go. While in prison he read and worked on
his research in Geometry. Of his prison stay he later commented, “I
think that a Mathematician is comparatively well suited to be in
Prison.”[20]. Other mathematicians may not agree with his
assessment of the mathematical personality.




Some care must be taken when asking
for all sentences of a grammar. If the grammar contained a
recursive rule, say
then the language would allow infinitely many sentences, and the
sentence generator will get stuck with ever lengthening subject
phrases.

7.11 Prolog Grammar Rules
Most implementations of Prolog have a
preprocessor which translates grammar rules into Prolog predicates
that implement a parser of the language defined by the grammar. The
grammar of the English language example takes the following form as
a logic grammar in Prolog.
Note that terminal symbols appear inside brackets exactly as they
look in the source text. Since they are Prolog atoms, tokens
starting with characters other than lower case letters must be
placed within apostrophes. The Prolog interpreter automatically
translates these grammar rules into normal Prolog predicates
identical to those defining the grammar presented in the previous
section.

7.12 Building an AST
The grammar given above is transformed
by a preprocessor to generate a Prolog parser. However, in its
given form the parser will only answer yes or no, indicating the
sentence is valid or invalid. Programmers also want an abstract
syntax tree if the sentence is valid. The problem of producing an
abstract syntax tree as a sentence is parsed can be handled by
using parameters in the logic grammar rules.
Predicates defined using Prolog
grammar rules may have arguments in addition to the implicit ones
created by the preprocessor. These additional arguments are
inserted by the translator to precede the implicit arguments. The
grammar rule
will be translated into the Prolog rule
A query with a variable representing a tree produces that tree as
its answer.



Practice 7.10
Write a grammar for the subset of
English sentences presented in this text to parse sentences like
the one above. Include parameters to build abstract syntax trees
like the one above.
You
can check your answer(s) in Section 7.17.10.
Writing an interpreter or compiler in
Prolog is relatively simple given the grammar for the language.
Once the AST has been generated for an expression in the language
the back end of the interpreter or compiler proceeds much like it
does in other languages.
7.13 Attribute Grammars
Programming language syntax is
specified by formal methods like grammars. Semantics, or the
meaning of a computer program, are much harder to define. The study
of formal methods of specifying the meaning, or semantics, of a program is a difficult
but rewarding area of Computer Science. In Chap. 6 a compiler for the Small language
was developed. Mapping the Small language into the language of
JCoCo is a way of defining the semantics of Small. Mappings like
this are sometimes called Small
Step Operational Semantics meaning that the Small language
was defined in terms of the smaller steps in the JCoCo language. Of
course, the JCoCo language’s semantics should also be formally
defined in that case.
Another form of
semantic definition is an Attribute Grammar. Attribute grammars
are not ideal for larger languages, even languages as big as the
Small language would be difficult and tedious to describe with an
attribute grammar. But, a language like the prefix calculator
language is perfect for an attribute grammar definition.
The prefix calculator expression
language was first presented in Chap. 5. The contents of the memory
location after evaluating an expression is not specified by the
grammar of the language. In fact, the purpose of any of the
operators is not made explicit in the grammar. Even though we know
that
stands for
multiplication, there is nothing in the grammar itself that insists
this be the case. Other means are necessary to convey that meaning.
One such method of conveying the semantics of a language is called
an attribute grammar. An
attribute grammar adds attributes to each node of an abstract
syntax tree for sentences in the language.

The attributes tell us how a program
would be evaluated in terms of its abstract syntax tree. In other
words, an attribute grammar provides a mapping of the syntax of a
program into a set of attributes that describe the semantics of the
program. Consider the prefix calculator grammar

Recall the grammar represents prefix
expressions because the operation is written before its arguments.
So,
5
64 results
in 29 when evaluated. Notice that when written in prefix notation,
the expression S 5 stores
5 in the memory location. S is a prefix operator.



Fig.
7.10
AST definition

Fig.
7.11
Annotated AST for
S 4 R


Fig.
7.12
Attribute grammar
The prog node in the abstract syntax
definition in Fig. 7.10 was added to assist in the definition of
the attribute grammar. This abstract syntax can be used in Prolog
but does not need to be defined as a datatype as it would in
Standard ML.
An attribute grammar attaches
assignment statements for the attributes to each node in the
abstract syntax tree. To distinguish between parts of the abstract
syntax tree, let AST0
denote the AST on the left hand side of a production and
ASTi where i > 0 represent an AST on the right
hand side of the production. The attribute grammar for the
calculator language is given in Fig. 7.12. Semantics rules are
attached to each of the nodes in the AST definition. These rules
govern the assignment of the attributes in the AST. The numbers to
the left of each rule are there simply to number the rules and are
not part of the attribute grammar. By deriving an AST for a
sentence and then applying the semantic rules the tree is decorated
with attributes that describe the meaning of the sentence, or
program, in the language.
The attribute grammar given in
Fig. 7.12 can be used to convey the meaning of
evaluating an expression like
S 4 R. Figure 7.11 depicts the
annotated AST according to the attribute grammar given in
Fig. 7.12.

Practice 7.11
Justify the annotation of the tree
given in Fig. 7.11 by stating which rule was used in
assigning each of the attributes annotating the tree.
You
can check your answer(s) in Section 7.17.11.
7.13.1 Synthesized Versus Inherited
Attributes in an attribute grammar
come in two flavors. Some attributes are inherited which means they are derived
from values that are above or to the left in the AST. Some
attributes are synthesized
meaning they are derived from values that are below or to the right
in the tree. The val
attribute is a synthesized attribute in the attribute grammar
presented in Fig. 7.12.
Practice 7.12
Is the min attribute synthesized or inherited?
Is the mout attribute
synthesized or inherited?
You
can check your answer(s) in Section 7.17.12.
Attribute grammars work great for
small languages. When a language is larger, the number of
attributes can grow exponentially, resulting in a very large
annotated tree. In addition, attribute grammars don’t deal well
with things like control flow and values that aren’t determined
until run-time. There are many aspects of programming languages
that are difficult to assign as attributes in an AST. Typically,
attribute grammars work well for small interpreted languages with
little or no unknown information.
7.14 Chapter Summary
This chapter provided an introduction
to programming in Prolog. List manipulation and building and
traversing complex recursive terms are important skills in becoming
an experienced Prolog programmer. Grammars and recursive-descent
parsing are natural topics relating to Prolog. Building top-down
parsers in Prolog is easy with the grammar extension provided in
the Prolog language.
In addition, the chapter introduced a
couple of formal semantic methods for describing programming
languages. Small step operational semantics is one method where a
language is defined in terms of smaller steps in a simpler
well-defined language. Attribute grammars is another method of
assigning meaning to programs.
There are several good books on
Prolog programming. The Prolog presented in this chapter is enough
to get a flavor of the language and a good start programming in the
language. Things left out of the discussion include the
cut operator and some
nuances of how unification is done (i.e. the difference between
and
). Reading
from and writing to files was also left out. The definitive book
for more information is Clocksin and Mellish [5]. This book lacks
exercises but contains many examples and is a good reference once
you understand something about how to program in Prolog (which I
hope you do once you’ve read the chapter and worked through the
problems).


7.15 Review Questions
- 1.
What is a term made up of in Prolog? Give examples of both simple and complex terms.
- 2.
What is a predicate in Prolog?
- 3.
In Standard ML you can pattern match a list using (h::t). How do you pattern match a list in Prolog?
- 4.
According to the definition of append, which are the input and the output parameters to the predicate?
- 5.
How do you get more possible answers for a question posed to Prolog?
- 6.
In the expression X
6
5
4 why doesn’t X equal 34 when evaluated in Prolog? What does X equal? What would you write to get X equal to 34?
- 7.
- 8.
What symbol is used in place of the:- when writing a grammar in Prolog?
- 9.
What is a synthesized atrribute?
- 10.
What is an inherited attribute?
7.16 Exercises
In these early exercises you should
work with the relative database presented at the beginning of this
chapter.
- 1.
Write a rule (i.e. predicate) that describes the relationship of a sibling. Then write a query to find out if Anne and Stephen are siblings. Then ask if Stephen and Michael are siblings. What is Prolog’s response?
- 2.
Write a rule that describes the relationship of a brother. Then write a query to find the brothers of sophusw. What is Prolog’s response?
- 3.
Write a rule that describes the relationship of a niece. Then write a query to find all nieces in the database. What is Prolog’s response?
- 4.
Write a predicate that describes the relationship of cousins.
- 5.
Write a predicate that describes the ancestor relationship.
- 6.
Write a predicate called odd that returns true if a list has an odd number of elements.
- 7.
Write a predicate that checks to see if a list is a palindrome.
- 8.
Show the substitution required to prove that sublist([a, b], [c, a, b]) is true. Use the definition in Fig. 7.3 and use the same method of proving it’s true.
- 9.
Write a predicate that computes the factorial of a number.
- 10.
Write a predicate that computes the nth fibonacci number in exponential time complexity.
- 11.
Write a predicate that computes the nth fibonacci number in linear time complexity.
- 12.
Write a predicate that returns true if a third list is the result of zipping two others together. For instance,
- 13.
Write a predicate that counts the number of times a specific atom appears in a list.
- 14.
Write a predicate that returns true if a list is three copies of the same sublist. For instance, the predicate should return true if called as
- 15.
Implement insert, lookup, and delete on a binary search tree. The structure of a binary search tree was discussed in this chapter. Your main run predicate should be this:The lookup predicate was a practice problem and the solution is provided if you need it. The insert predicate is somewhat like the lookup predicate except that a new node is constructed when you reach a leaf. Deleting a node is similiar to looking it up except that if it is found, the tree is altered to delete the node. Deleting a node from a binary search tree has three cases.
- (a)
The node to delete is a leaf node. If this is the case, then deleting it is simple because you just return an empty tree. In Fig. 7.4 this occurs when 2, 4, 7, or 10 is deleted.
- (b)
The node to delete has one child. If this is the case, then the result of deleting the node is the subtree under the deleted node. In Fig. 7.4, if the 9 is deleted, then the 10 is just moved up to replace the 9 in the tree.
- (c)
The node to delete has two children. If this is the case, then you have to do two things. First, find the left-most value from the right subtree. Then, delete the left-most value from the right subtree and return a new tree with the left-most value of the right subtree at its root. Consider deleting 5 from the tree in Fig. 7.4. The left-most value of the right subtree is 7. To delete 5 we put the 7 at the root of the tree and then delete 7 from the right subtree.
To make this project easy, write it incrementally. Print the results as you go so you can see what works and what doesn’t. The print predicate will print its argument while the nl predicate will print a newline. Don’t start by writing the entire run predicate right away. Write one piece at a time, test it, and then move on to the next piece. - (a)
- 16.
Implement a calculator prefix expression interpreter in Prolog as described in the section on attribute grammars in this chapter. The interpreter will read an expression from the keyboard and print its result. The interpreter should start with a calc predicate. Here is the calc predicate to get you started.To make this project easy, write it incrementally. Print the results as you go so you can see what works and what doesn’t. The print predicate will print its argument while the nl predicate will print a newline. Don’t write the entire calc predicate right away. Write one piece, test it, and then move on to the next piece.
7.17 Solutions to Practice Problems
These are solutions to the practice
problem s. You should only consult these answers after you have
tried each of them for yourself first. Practice problems are meant
to help reinforce the material you have just read so make use of
them.
7.17.1 Solution to Practice Problem 7.1
Terms include atoms and variables.
Atoms include sophus, fred, sophusw, kent, johan, mads, etc. Atoms
start with a lowercase letter. Variables start with a capital
letter and include X and Y from the example.
7.17.2 Solution to Practice Problem 7.2
- 1.
brother(X,Y):- father(Z,X), father(Z, Y), male(X).
- 2.
sister(X,Y):- father(Z,X), father(Z, Y), female(X).
- 3.
grandparent(X,Y):- parent(X,Z), parent(Z, Y).
- 4.
grandchild(X,Y):- grandparent(Y, X).
Grandparent and grandchild
relationships are just the inverse of each other.
7.17.3 Solution to Practice Problem 7.3
The complexity of append is O(n) in
the length of the first list.
7.17.4 Solution to Practice Problem 7.4



7.17.5 Solution to Practice Problem 7.5

7.17.6 Solution to Practice Problem 7.6

7.17.7 Solution to Practice Problem 7.7

7.17.8 Solution to Practice Problem 7.8
7.17.9 Solution to Practice Problem 7.9

7.17.10 Solution to Practice Problem 7.10
