© Springer International Publishing AG 2017
Kent D. LeeFoundations of Programming LanguagesUndergraduate Topics in Computer Sciencehttps://doi.org/10.1007/978-3-319-70790-7_7

7. Logic Programming

Kent D. Lee 
(1)
Luther College, Decorah, IA, USA
 
 
Kent D. Lee
Imperative programming languages reflect the architecture of the underlying von Neumann stored program computer: Programs update memory locations under the control of instructions. Execution is (for the most part) sequential. Sequential execution is governed by a program counter. Imperative programs are prescriptive. They dictate precisely how a result is to be computed by means of a sequence of statements to be performed by the computer. Consider this program using the Small language developed in Chap. 6.
What do we want to know about the program in Fig. 7.1? Are we concerned with a detailed description of what happens when the computer runs this? Do we want to know what the PC is set to when the program finishes? Are we interested in what is in memory location 13 after the second iteration of the loop? These questions are not ones that need to be answered. They don’t tell us anything about what the program does.
Instead, if we want to understand the program we want to be able to describe the relationship between the input and the output. The output is the remainder after dividing the first input value by the second input. If this is what we are really concerned about then why not program by describing relationships rather than prescribing a set of steps. In Logic Programming the programmer describes the logical structure of a problem rather than prescribing how a computer is to go about solving it. Languages for Logic Programming are called:
A330638_2_En_7_Fig1_HTML.gif
Fig. 7.1
A small sample
  • Descriptive languages: Programs are expressed as known facts and logical relationships about a problem. Programmers assert the existence of the des ired result and a logic interpreter then uses the computer to find the desired result by making inferences to prove its existence.
  • Nonprocedural languages: The programmer states only what is to be accomplished and leaves it to the interpreter to determine how it is to be accomplished.
  • Relational languages: Desired results are expressed as relations or predicates instead of as functions. Rather than define a function for calculating a square root, the programmer defines a relation, say $$sqrt(x, y)$$, that is true exactly when $$y^2=x$$.
While there are many application specific logic programming languages, there is one language that stands out as a general purpose logic programming language. Prolog is the language that is most commonly associated with logic programming. The model of computation for Prolog is not based on the Von Neumann architecture. It’s based on the mechanism in logic called unification. Unification is the process where variables are unified to terms.
This text has explored a variety of languages from the JCoCo assembly language, to Java and C++, to Standard ML, and now Prolog. These languages reflect a continuum from prescriptive languages to descriptive languages.
  • Assembly language is a very prescriptive language, meaning that you must think in terms of the particular machine and solve problems accordingly. Programmers must think in terms of the von Neumann machine stored program computer model.
  • C++ is a high-level language and hence allows you to think in a more descriptive way about a problem. However, the underlying computational model is still the von Neumann machine.
  • Standard ML is a high-level language too, but allows the programmer to think in a mathematical way about a problem. This language gets away from the traditional von Neumann model in some ways.
  • Prolog takes the descriptive component of languages further and lets programmers write programs based solely on describing relationships.
Prolog was developed in 1972. Alain Colmerauer, Phillipe Roussel, and Robert Kowalski were key players in the development of the Prolog language. It is a surprisingly small language with a lot of power. The Prolog interpreter operates by doing a depth first search of the search space while unifying terms to try to come to a conclusion about a question that the programmer poses to the interpreter. The programmer describes facts and relationships and then asks questions.
This simple model of programming has been used in a wide variety of applications including automated writing of real estate advertisements, an application that writes legal documents in multiple languages, another that analyzes social networks, and a landfill management expert system. This is only a sampling of the many, many applications that have been written using this simple but powerful programming model.

7.1 Getting Started with Prolog

If you don’t already have a Prolog interpreter, you will want to download one and install it. There are many versions of Prolog available. Some are free and some are not. The standard free implementation is available at http://​www.​swi-prolog.​org. There are binary distributions available for Microsoft Windows, Mac OS X, and Linux, so there should be something to suit your needs.
Unlike SML, there is no way to write a program interactively with Prolog. Instead, you write a text file, sometimes called a database, containing a list of facts and predicates. Then you start the Prolog interpreter, consult the file, and ask yes or no questions that the Prolog interpreter tries to prove are true.
To start the Prolog interpreter you type either pl or swipl depending on your installation of SWI Prolog. To exit the interpreter type a ctl-d. A Prolog program is a database of facts and predicates that can be used to establish further relationships among those facts. A predicate is a function that returns true or false. Prolog programs describe relationships. A simple example is a database of facts about several people in an extended family and the relationships between them as shown in Fig. 7.2. Questions we might ask:
  1. 1.
    Is Gary’s father Sophus?
     
  2. 2.
    Who are Kent’s fathers?
     
  3. 3.
    For who is Lars a father?
     
These questions can all be answered by Prolog given the database in Fig. 7.2.
A330638_2_En_7_Fig2_HTML.gif
Fig. 7.2
The family tree

7.2 Fundamentals

Prolog programs (databases) are composed of facts. Facts describe relationships between terms. Simple terms include numbers and atoms. Atoms are symbols like sophus that represent an object in our universe of discourse. Atoms MUST start with a small letter. Numbers start with a digit and include both integers and real numbers. Real numbers are written in scientific notation. For instance, 3.14159e0 or just 3.14159 when the exponent is zero.
A predicate is a function that returns true or false. Predicates are defined in Prolog by recording a fact or facts about them. For instance, Fig. 7.2 establishes the fact that Johan was the parent of Sophus. parent is a predicate representing a true fact about the relationship of johan and sophus.
Frequently terms include variables in predicate definitions to establish relationships between groups of objects. A variable starts with a capital letter. Variables are used to establish relationships between classes of objects. For instance, to be a father means that you must be a parent of someone and be male. In Fig. 7.2 the father predicate is defined by writing
A330638_2_En_7_Figa_HTML.gif
which means X is the father of Y if X is the parent of Y and X is male. The symbol:− is read as if and the comma in the predicate definition is read as and. So X is a father of Y if X is a parent of Y and X is male.
Practice 7.1
What are the terms in Fig. 7.2? What is the difference between an atom and a variable? Give examples of terms, atoms, and variables from Fig. 7.2.
You can check your answer(s) in Section 7.17.1.
To program in Prolog the programmer first writes a database like the one in Fig. 7.2. Then the programmer consults the database so the Prolog interpreter can internally record the facts that are written there. Once the database has been consulted, questions can be asked about the database. Questions asked of Prolog are limited to yes or no questions that are posed in terms of the predicates in the database. A question posed to Prolog is sometimes called a query. To discover if Johan is the father of Sophus you start Prolog using pl or swipl, then consult the database, and pose the query.
A330638_2_En_7_Figb_HTML.gif
Queries may also contain variables. If we want to find out who the father of sophus is we can ask that of Prolog by replacing the father position in the predicate with a variable. When using a variable in a query Prolog will answer yes or no. If the answer is yes, Prolog will tell us what the value of the variable was when the answer was yes. If there is more than one way for the answer to be yes then typing a semicolon will tell Prolog to look for other values where the query is true.
A330638_2_En_7_Figc_HTML.gif
The final No is Prolog telling us there are no other ways for parent (X, kent) to be true.

7.3 The Prolog Program

Prolog performs unification to search for a solution. Unification is simply a list of substitutions of terms for variables. A query of the database is matched with its predicate definition in the database. Terms in the query are matched when a suitable pattern is found among the parameters of a predicate in the database. If the matched predicate is dependent on other predicates being true, then those queries are posed to the Prolog interpreter. This process continues until either Prolog finds that no substitution will satisfy the query or it finds a suitable substitution.
Prolog uses depth first search with backtracking to search for a valid substitution. In its search for truth it will unify variables to terms. Once a valid substitution is found it will report the substitution and wait for input. In Sect. 7.2 the interpreter reports that $$X=gary$$ is a substitution that makes parent(X, kent) true. Prolog waits until either return is pressed or a semicolon is entered. When the semicolon is entered, Prolog undoes the last successful substitution it made and continues searching for another substitution that will satisfy the query. In Sect. 7.2 Prolog reports that $$X=gerry$$ will satisfy the query as well. Pressing semicolon one more time undoes the $$X=gerry$$ substitution, Prolog continues its depth first search looking for another substitution, finds none, and reports No indicating that the search has exhausted all possible substitutions.
Unification finds a substitution of terms for variables or variables for terms. Unification is a symmetric operation. It doesn’t work in only one direction. This means (among other things) that Prolog predicates can run backwards and forwards. For instance, if you want to know who Kent’s dad is you can ask that as easily as who is Gary the father of. In the following example we find out that gary is the father of kent. We also find out who gary is the father of.
A330638_2_En_7_Figd_HTML.gif
Practice 7.2
Write predicates that define the following relationships.
  1. 1.
    brother
     
  2. 2.
    sister
     
  3. 3.
    grandparent
     
  4. 4.
    grandchild
     
Depending on how you wrote grandparent and grandchild there might be something to note about these two predicates. Do you see a pattern? Why?
You can check your answer(s) in Section 7.17.2.

7.4 Lists

Prolog supports lists as a data structure. A list is constructed the same as in ML. A list may be empty which is written as [] in Prolog. A non-empty list is constructed from an element and a list. The construction of a list with head, H, and tail, T, is written as[H | T]. So, [1,2,3] can also be written as [1 | [2 | [3 | []]]]. The list [a | []] is equivalent to writing [a]. Unlike ML, lists in Prolog do not need to be homogeneous. So [1, hi, 4.3] is a valid Prolog list.
By virtue of the fact that Prolog’s algorithm is depth first search combined with unification, Prolog naturally does pattern matching. Not only does [H | T] work to construct a list, it also works to match a list with a variable. Append can be written as a relationship between three lists. The result of appending the first two lists is the third argument to the append predicate. The first fact below says appending the empty list to the front of Y is just Y. The second fact says that appending a list whose first element is H to the front of L2 results in [H|T3] when appending T1 and L2 results in T3.
A330638_2_En_7_Fige_HTML.gif
Try out append both backwards and forwards! The definition of append can be used to define a predicate called sublist as follows:
A330638_2_En_7_Figf_HTML.gif
Stated in English this says that X is a sublist of Y if you can append something on the front of X to get L and something else on the end of L to get Y. The underscore is used in predicate definitions for values we don’t care about.
To prove that sublist([1],[1,2]) is true we can use the definition of sublist and append to find a substitution for which the predicate holds. Figure 7.3 provides a proof that [1] is a sublist of [1,2].
A330638_2_En_7_Fig3_HTML.gif
Fig. 7.3
A unification tree
Practice 7.3
What is the complexity of the append predicate? How many steps does it take to append two lists?
You can check your answer(s) in Section 7.17.3.
Practice 7.4
Write the reverse predicate for lists in Prolog using the append predicate. What is the complexity of this reverse predicate?
You can check your answer(s) in Section 7.17.4.

7.5 The Accumulator Pattern

The slow version of reverse from practice problem 7.4 can be improved upon. The accumulator pattern can be applied to Prolog as it was in SML. Looking back at the solution to practice problem 5.​17, the ML solution can be rewritten to apply to Prolog as well. In the ML version an accumulator argument was added to the function that allowed the helprev helper function to accumulate the reversed list without the use of append.
A330638_2_En_7_Figg_HTML.gif
Unlike SML, Prolog does not have any facility for defining local functions with limited scope. If using helper predicates in a Prolog program the user and/or programmer must be trusted to invoke the correct predicates in the correct way.
Applying what we learned from the ML version of reverse to Prolog results in a helprev predicate with an extra argument as well. In many ways this is the same function rewritten in Prolog syntax. The only trick is to remember that you don’t write functions in Prolog. Instead, you write predicates. Predicates are just like functions with an extra parameter. The extra parameter establishes the relationship between the input and the output.
Sometimes in Prolog it is useful to think of input and output parameters. For instance, with append defined as a predicate it might be useful to think of the first two parameters as input values and the third as the return value. While as a programmer it might sometimes be useful to think this way, this is not how Prolog works. As was shown in Sect. 7.4, append works both backwards and forwards. But, thinking about the problem in this way may help identifying a base case or cases. When the base cases are identified, the problem may be easier to solve.
Practice 7.5
Write the reverse predicate using a helper predicate to make a linear time reverse using the accumulator pattern.
You can check your answer(s) in Section 7.17.5.

7.6 Built-In Predicates

Prolog offers a few built in predicates. The relational operators (<, >, $$<=$$, $$>=$$, and $$=$$) all work on numbers and are written in infix form. Notice that not equals is written as \$$=$$ in Prolog.
To check that a predicate doesn’t hold, the not predicate is provided. Preceding any predicate with not insists the predicate returns false. For instance, not(5 > 6) returns true because 5 > 6 returns false.
The atom predicate returns true if the argument is an atom. So atom(sophus) is true but atom(5) is not. The number predicate returns true if the argument is a number. So number(5) is true but number(sophus) is not.

7.7 Unification and Arithmetic

The Prolog interpreter does a depth first search of the search space while unifying variables to terms. The primary operation that Prolog carries out is unification. Unification can be represented explicitly in a Prolog program by using the equals (i.e. $$=$$) operator. When equals is used, Prolog attempts to unify the terms that appear on each side of the operator. If they can be unified, Prolog reports yes and continues unifying other terms to try to find a substitution that satisfies the query. If no substitution is possible, Prolog will report no.
You might have caught yourself wanting to write something like $$X=Y$$ in some of the practice problem s. This is normal, but is the sign of a novice Prolog programmer. Writing $$X=Y$$ in a predicate definition is never necessary. Instead, everywhere Y appears in the predicate, write X instead.
Unification has one other little nuance that most new Prolog programmers miss. There is no point in unifying a variable to a term if that variable is used only once in a predicate definition. Unification is all about describing relationships. Unification doesn’t mean much when a variable is not used in more than one place in a definition. In terms of imperative programming it’s kind of like storing a value in a variable and then never using the variable. What’s the point? Prolog warns us when we do this by saying
A330638_2_En_7_Figh_HTML.gif
If this happens, look for a variable called X (or whatever the variable name is) that is used only once in a predicate definition and replace it with an underscore (i.e. _). An underscore indicates the result of unification in that position of a predicate isn’t needed by the current computation. Prolog warns you of singleton variables because they are a sign that there may be an error in a predicate definition. If an extra variable exists in a predicate definition it may never be instantiated. If that is the case, the predicate will always fail to find a valid substitution. While singleton variables should be removed from predicate definitions, the message is only a warning and does not mean that the predicate is wrong.
The use of equality for unification and not for assignment statements probably seems a little odd to most imperative programmers. The equals operator is not the assignment operator in Prolog. It is unification. Assignment and unification are different concepts. Writing X $$=6*5$$ in Prolog means that the variable X must be equal to the term 6 $$*$$ 5, not 30. The equals operator doesn’t do arithmetic in Prolog. Instead, a special Prolog operator called is is used. To compute 6 $$*$$ 5 and assign the result to the variable X the Prolog programmer writes X is 6 $$*$$ 5 as part of a predicate. Using the is operator succeeds when the variable on the left is unbound and the expression on the right doesn’t cause an exception when computed. All values on the right side of the is predicate must be known for the operation to complete successfully. Arithmetic can only be satisfied in one direction, from left to right. This means that predicates involving arithmetic can only be used in one direction, unlike the append predicate and other predicates that don’t involve arithmetic.
Practice 7.6
Write a length predicate that computes the length of a list.
You can check your answer(s) in Section 7.17.6.

7.8 Input and Output

Prolog programs can read from standard input and write to standard output. Reading input is a side-effect so it can only be satisfied once. Once read, it is impossible to unread something. The most basic predicates for getting input are get_char(X) which instantiates X to the next character in the input (whatever it is) and get(X) which instantiates X to the next non-whitespace character. The get_char predicate instantiates X to the character that was read. The get predicate instantiates X to the ASCII code of the next character.
There is also a predicate called read(X) which reads the next term from the input. When X is uninstantiated, the next term is read from the input and X is instantiated with its value. If X is already instantiated, the next term is read from the input and Prolog attempts to unify the two terms.
As a convenience, there are certain libraries that also may be provided with Prolog. The readln predicate may be used to read an entire line of terms from the keyboard, instantiating a variable to the list that was read. The readln predicate has several arguments to control how the terms are read, but typically it can be used by writing readln(L, _, _, _, lowercase).
A330638_2_En_7_Figi_HTML.gif
Reading input from the keyboard, no matter which predicate is used, causes Prolog to prompt for the input by printing a $$\texttt {|:}$$ to the screen. If the readln predicate is invoked as shown above, entering the text below will instantiate L to the list as shown.
A330638_2_En_7_Figj_HTML.gif
The print(X) predicate will print a term to the screen in Prolog. The value of its argument must be instantiated to print it. Print always succeeds even if the argument is an uninstantiated variable. However, printing an uninstantiated variable results in the name of the variable being printed which is probably not what the programmer wants. When a query is made in Prolog, each variable is given a unique name to avoid name collisions with other predicates the query may be dependent on. Prolog assigns these unique names and they start with an underscore character. If an uninstantiated variable is printed, you will see it’s Prolog assigned unique name.
A330638_2_En_7_Figk_HTML.gif
The print predicate is satisfied by unifying the variable with the name of Prolog’s internal unique variable name which is almost certainly not what was intended. The print predicate should never be invoked with an uninstantiated variable.

7.9 Structures

Prolog terms include numbers, atoms, variables and one other important type of term called a structure. A structure in Prolog is like a datatype in SML. Structures are recursive data structures that are used to model structured data. Computer scientists typically call this kind of structured data a tree because they model recursive, hierarchical data. A structure is written by writing a string of characters preceding a tuple of some number of elements. Consider implementing a lookup predicate for a binary search tree in Prolog. A tree may be defined recursively as either nil or a btnode(Val, Left, Right) where Val is the value stored at the node and Left and Right represent the left and right binary search trees. The recursive definition of a binary search tree says that all values in the left subtree must be less than Val and all values in the right subtree must be greater than Val. For this example, let’s assume that binary search trees don’t have duplicate values stored in them.
A typical binary search tree structure might look something like the term below and corresponds to the tree shown graphically in Fig. 7.4.
A330638_2_En_7_Figl_HTML.gif
A330638_2_En_7_Fig4_HTML.gif
Fig. 7.4
Search tree
Items may be inserted into and deleted from a binary search tree. Since Prolog programmers write predicates, the code to insert into and delete from a binary search tree must reflect the before and after picture. Because a binary search tree is recursively defined, each part of the definition will be part of a corresponding case for the insert and delete predicates. So, inserting into a search tree involves the value to insert, the tree before it was inserted, and the tree after it was inserted. Similarly, a delete predicate involves the same three arguments.
Looking up a value in a binary search tree results in a true or false response, which is the definition of a predicate. Writing a lookup predicate requires the value and the search tree in which to look for the value.
Practice 7.7
Write a lookup predicate that looks up a value in a binary search tree like the kind defined in this section.
You can check your answer(s) in Section 7.17.7.

7.10 Parsing in Prolog

As mentioned earlier in the text, Prolog originated out of Colmerauer’s interest in using logic to express grammar rules and to formalize the parsing of natural language sentences. Kowalski and Comerauer solved this problem together and Colmerauer figured out how to encode the grammar as predicates so sentences could be parsed efficiently. The next sections describe the implementation of parsing Colmerauer devised in 1972. Consider the following context-free grammar for English sentences.
A330638_2_En_7_Figm_HTML.gif
Given a sequence of tokens like “the professor discovered a group.”, Chap. 2 showed that a parse tree can be used to demonstrate that a string is a sentence in the language and at the same time displays its syntactic structure.
Practice 7.8
Construct the parse tree for “the professor discovered a group.” using the grammar in this section.
You can check your answer(s) in Section 7.17.8.
Prolog is especially well suited to parse sentences like the one in practice problem 6.​8. The language has built in support for writing grammars and will automatically generate a parser given the grammar of a language. How Prolog does this is not intuitively obvious. The grammar is taken through a series of transformations that produce the parser. The next few pages present these transformations to provide insight into how Prolog generates parsers.
Parsing in Prolog requires the source program, or sentence, be scanned as in the parser implementations presented in Chaps. 2 and 3. The readln predicate will suffice to read a sentence from the keyboard and scan the tokens in it. Using the readln predicate to read the sentence, “the professor discovered a group.”, produces the list [the, professor, discovered, a, group,’.’].
A Prolog parser is a top-down or recursive-descent parser. Because the constructed parser is top-down, the grammar must be LL(1). There cannot be any left-recursive productions in the grammar. Also, because Prolog uses backtracking, there cannot be any productions in the grammar with common prefixes. If there are any common prefixes, left factorization must be performed. Fortunately, the grammar presented in this section is already LL(1).
A330638_2_En_7_Fig5_HTML.gif
Fig. 7.5
Sentence structure
The Prolog parser will take the list of tokens and produce a Prolog structure. The structure is the Prolog representation of the abstract syntax tree of the sentence. For instance, the sentence, “the professor discovered a group.”, when parsed by Prolog, yields the term sen(sub(det(the), noun(professor)), pred(verb(discovered), sub(det(a), noun(group)))).
The logic programming approach to analyzing a sentence in a grammar can be viewed in terms of a graph whose edges are labeled by the tokens or terminals in the language. Figure 7.6 contains a graph representation of a sentence. Two terminals are contiguous in the original string if they share a common node in the graph.
A330638_2_En_7_Fig6_HTML.gif
Fig. 7.6
A sentence graph
A sequence of contiguous labels constitutes a nonterminal if the sequence corresponds to the right-hand side of a production rule in the grammar. The contiguous sequence may then be labeled with the nonterminal. In Fig. 7.5 three nonterminals are identified. To facilitate the representation of graphs like Fig. 7.6 in Prolog the nodes of the graph are given labels. Positive integers are convenient labels to use as shown in Fig. 7.7.
A330638_2_En_7_Fig7_HTML.gif
Fig. 7.7
A labeled sentence graph
The graph for the sentence can be represented in Prolog by entering the following facts. These predicates reflect the end points of their corresponding labeled edge in the graph.
A330638_2_En_7_Fign_HTML.gif
Using the labeled graph in Fig. 7.7, nonterminals in the grammar can be represented by predicates. For instance, the subject of a sentence can be represented by a subject predicate. The subject(K, L) predicate means that the path from node K to node L can be interpreted as an instance of the subject nonterminal.
For example, subject(4, 6) should return true because edge (4, 5) is labeled by a determiner “a” and edge (5, 6) is labeled by the noun “group’‘. To define a sentence predicate there must exist a determiner and a noun. The rule for the sentence predicate is
A330638_2_En_7_Figo_HTML.gif
The common variable M insure the determiner immediately precedes the noun.
Practice 7.9
Construct the predicates for the rest of the grammar.
You can check your answer(s) in Section 7.17.9.
The syntactic correctness of the sentence, “the professor discovered a group.” can be determined by either of the following queries
A330638_2_En_7_Figp_HTML.gif
The sentence is recognized by the parser when the paths in the graph corresponding to the nonterminals in the grammar are verified. If eventually a path for the sentence nonterminal is found then the sentence is valid. The paths in the graph of the sentence are shown in Fig. 7.8. Note the similarity of the structure exhibited by the paths in the graph with the tree of the sentence. If you use your imagination a bit you can see the parse tree upside down (or right-side up for your non-programming friends).
A330638_2_En_7_Fig8_HTML.gif
Fig. 7.8
An upside down parse tree

7.10.1 Difference Lists

There are a couple of problems with the development of the parser above. First, entering the sentence as facts like the(1,2) and professor(2,3) is impractical and awkward. There would have to be some preprocessing on the list to get it in the correct format to be parsed. While this could be done, a better solution exists. The other problem concerns what the parser does. So far the parser only recognizes a syntactically valid sentence and does not produce a representation of the abstract syntax tree for the sentence.
Labeling the nodes of the graph above with integers was an arbitrary decision. The only requirement of labeling nodes in the graph requires that it be obvious when two nodes in the graph are connected. Both problems above can be solved by letting sublists of the sentence label the graph instead of labeling the nodes with integers. These sublists are called difference lists. A difference list represents the part of the sentence that is left to be parsed. The difference between two adjacent nodes is the term which labels the intervening edge. The difference list representation of the graph is shown in Fig. 7.9. Using difference lists, two nodes are connected if their difference lists differ by only one element. This connection relationship can be expressed as a Prolog predicate.
A330638_2_En_7_Fig9_HTML.gif
Fig. 7.9
Difference lists
This is the connect predicate and the grammar rewritten to use the connect predicate.
A330638_2_En_7_Figq_HTML.gif
The c (i.e. connect) predicate says that the node labeled [H|T] is connected to the node labeled T and the edge connecting the two nodes is labeled H. This predicate can be used for the terminals in the grammar in place of the facts given above.
A330638_2_En_7_Figr_HTML.gif
The graph need not be explicitly created when this representation is employed. The syntactic correctness of the sentence, “the professor discovered a group.” can be recognized by the following query.
A330638_2_En_7_Figs_HTML.gif
The parsing succeeds because the node labeled with [the, professor, discovered, a, group, ‘.’] can be joined to the node labeled with [] via the intermediate nodes involved in the recursive descent parse of the sentence. Because Prolog predicates work backwards as well as forward, it is just as easy to explore all the sentences of this grammar by posing this query to the Prolog interpreter.
A330638_2_En_7_Figt_HTML.gif
This reveals that there are 126 different sentences defined by the grammar. Some of the sentences are pretty non-sensical like “the group discovered a group.”. Some of the sentences like “the group jailed the professor.” have some truth to them. Sophus Lie used to walk to many of the places he visited partly because he liked to walk and partly because he had little money at the time. He also liked to draw sketches of the countryside when hiking. He was jailed in France when France and Germany were at war because the French thought he was a German spy. It was understandable since he was walking through the countryside talking to himself in Norwegian (which the French thought might be German). When they stopped to question him, they found his notebook full of Mathematical formulas and sketchings of the French countryside. He spent a month in prison until they let him go. While in prison he read and worked on his research in Geometry. Of his prison stay he later commented, “I think that a Mathematician is comparatively well suited to be in Prison.”[20]. Other mathematicians may not agree with his assessment of the mathematical personality.
Some care must be taken when asking for all sentences of a grammar. If the grammar contained a recursive rule, say
A330638_2_En_7_Figu_HTML.gif
then the language would allow infinitely many sentences, and the sentence generator will get stuck with ever lengthening subject phrases.

7.11 Prolog Grammar Rules

Most implementations of Prolog have a preprocessor which translates grammar rules into Prolog predicates that implement a parser of the language defined by the grammar. The grammar of the English language example takes the following form as a logic grammar in Prolog.
A330638_2_En_7_Figv_HTML.gif
Note that terminal symbols appear inside brackets exactly as they look in the source text. Since they are Prolog atoms, tokens starting with characters other than lower case letters must be placed within apostrophes. The Prolog interpreter automatically translates these grammar rules into normal Prolog predicates identical to those defining the grammar presented in the previous section.

7.12 Building an AST

The grammar given above is transformed by a preprocessor to generate a Prolog parser. However, in its given form the parser will only answer yes or no, indicating the sentence is valid or invalid. Programmers also want an abstract syntax tree if the sentence is valid. The problem of producing an abstract syntax tree as a sentence is parsed can be handled by using parameters in the logic grammar rules.
Predicates defined using Prolog grammar rules may have arguments in addition to the implicit ones created by the preprocessor. These additional arguments are inserted by the translator to precede the implicit arguments. The grammar rule
A330638_2_En_7_Figw_HTML.gif
will be translated into the Prolog rule
A330638_2_En_7_Figx_HTML.gif
A query with a variable representing a tree produces that tree as its answer.
A330638_2_En_7_Figy_HTML.gif
Practice 7.10
Write a grammar for the subset of English sentences presented in this text to parse sentences like the one above. Include parameters to build abstract syntax trees like the one above.
You can check your answer(s) in Section 7.17.10.
Writing an interpreter or compiler in Prolog is relatively simple given the grammar for the language. Once the AST has been generated for an expression in the language the back end of the interpreter or compiler proceeds much like it does in other languages.

7.13 Attribute Grammars

Programming language syntax is specified by formal methods like grammars. Semantics, or the meaning of a computer program, are much harder to define. The study of formal methods of specifying the meaning, or semantics, of a program is a difficult but rewarding area of Computer Science. In Chap. 6 a compiler for the Small language was developed. Mapping the Small language into the language of JCoCo is a way of defining the semantics of Small. Mappings like this are sometimes called Small Step Operational Semantics meaning that the Small language was defined in terms of the smaller steps in the JCoCo language. Of course, the JCoCo language’s semantics should also be formally defined in that case.
Another form of semantic definition is an Attribute Grammar. Attribute grammars are not ideal for larger languages, even languages as big as the Small language would be difficult and tedious to describe with an attribute grammar. But, a language like the prefix calculator language is perfect for an attribute grammar definition.
The prefix calculator expression language was first presented in Chap. 5. The contents of the memory location after evaluating an expression is not specified by the grammar of the language. In fact, the purpose of any of the operators is not made explicit in the grammar. Even though we know that $$*$$ stands for multiplication, there is nothing in the grammar itself that insists this be the case. Other means are necessary to convey that meaning. One such method of conveying the semantics of a language is called an attribute grammar. An attribute grammar adds attributes to each node of an abstract syntax tree for sentences in the language.
The attributes tell us how a program would be evaluated in terms of its abstract syntax tree. In other words, an attribute grammar provides a mapping of the syntax of a program into a set of attributes that describe the semantics of the program. Consider the prefix calculator grammar
A330638_2_En_7_Figz_HTML.gif
Recall the grammar represents prefix expressions because the operation is written before its arguments. So, $$+$$5 $$*$$ 64 results in 29 when evaluated. Notice that when written in prefix notation, the expression S 5 stores 5 in the memory location. S is a prefix operator.
A330638_2_En_7_Fig10_HTML.gif
Fig. 7.10
AST definition
A330638_2_En_7_Fig11_HTML.gif
Fig. 7.11
Annotated AST for $$+$$ S 4 R
A330638_2_En_7_Fig12_HTML.gif
Fig. 7.12
Attribute grammar
The prog node in the abstract syntax definition in Fig. 7.10 was added to assist in the definition of the attribute grammar. This abstract syntax can be used in Prolog but does not need to be defined as a datatype as it would in Standard ML.
An attribute grammar attaches assignment statements for the attributes to each node in the abstract syntax tree. To distinguish between parts of the abstract syntax tree, let AST0 denote the AST on the left hand side of a production and ASTi where i > 0 represent an AST on the right hand side of the production. The attribute grammar for the calculator language is given in Fig. 7.12. Semantics rules are attached to each of the nodes in the AST definition. These rules govern the assignment of the attributes in the AST. The numbers to the left of each rule are there simply to number the rules and are not part of the attribute grammar. By deriving an AST for a sentence and then applying the semantic rules the tree is decorated with attributes that describe the meaning of the sentence, or program, in the language.
The attribute grammar given in Fig. 7.12 can be used to convey the meaning of evaluating an expression like $$+$$ S 4 R. Figure 7.11 depicts the annotated AST according to the attribute grammar given in Fig. 7.12.
Practice 7.11
Justify the annotation of the tree given in Fig. 7.11 by stating which rule was used in assigning each of the attributes annotating the tree.
You can check your answer(s) in Section 7.17.11.

7.13.1 Synthesized Versus Inherited

Attributes in an attribute grammar come in two flavors. Some attributes are inherited which means they are derived from values that are above or to the left in the AST. Some attributes are synthesized meaning they are derived from values that are below or to the right in the tree. The val attribute is a synthesized attribute in the attribute grammar presented in Fig. 7.12.
Practice 7.12
Is the min attribute synthesized or inherited? Is the mout attribute synthesized or inherited?
You can check your answer(s) in Section 7.17.12.
Attribute grammars work great for small languages. When a language is larger, the number of attributes can grow exponentially, resulting in a very large annotated tree. In addition, attribute grammars don’t deal well with things like control flow and values that aren’t determined until run-time. There are many aspects of programming languages that are difficult to assign as attributes in an AST. Typically, attribute grammars work well for small interpreted languages with little or no unknown information.

7.14 Chapter Summary

This chapter provided an introduction to programming in Prolog. List manipulation and building and traversing complex recursive terms are important skills in becoming an experienced Prolog programmer. Grammars and recursive-descent parsing are natural topics relating to Prolog. Building top-down parsers in Prolog is easy with the grammar extension provided in the Prolog language.
In addition, the chapter introduced a couple of formal semantic methods for describing programming languages. Small step operational semantics is one method where a language is defined in terms of smaller steps in a simpler well-defined language. Attribute grammars is another method of assigning meaning to programs.
There are several good books on Prolog programming. The Prolog presented in this chapter is enough to get a flavor of the language and a good start programming in the language. Things left out of the discussion include the cut operator and some nuances of how unification is done (i.e. the difference between $$=$$ and $$==$$). Reading from and writing to files was also left out. The definitive book for more information is Clocksin and Mellish [5]. This book lacks exercises but contains many examples and is a good reference once you understand something about how to program in Prolog (which I hope you do once you’ve read the chapter and worked through the problems).

7.15 Review Questions

  1. 1.
    What is a term made up of in Prolog? Give examples of both simple and complex terms.
     
  2. 2.
    What is a predicate in Prolog?
     
  3. 3.
    In Standard ML you can pattern match a list using (h::t). How do you pattern match a list in Prolog?
     
  4. 4.
    According to the definition of append, which are the input and the output parameters to the predicate?
     
  5. 5.
    How do you get more possible answers for a question posed to Prolog?
     
  6. 6.
    In the expression X $$=$$ 6 $$*$$ 5 $$+$$ 4 why doesn’t X equal 34 when evaluated in Prolog? What does X equal? What would you write to get X equal to 34?
     
  7. 7.
    Provide the calls to lookup to look up 7 in the binary tree in Fig. 7.4. Be sure to write down the whole term that is passed to lookup each time. You can consult the answer to practice problem 7.5 to see the definition of the lookup predicate.
     
  8. 8.
    What symbol is used in place of the:- when writing a grammar in Prolog?
     
  9. 9.
    What is a synthesized atrribute?
     
  10. 10.
    What is an inherited attribute?
     

7.16 Exercises

In these early exercises you should work with the relative database presented at the beginning of this chapter.
  1. 1.
    Write a rule (i.e. predicate) that describes the relationship of a sibling. Then write a query to find out if Anne and Stephen are siblings. Then ask if Stephen and Michael are siblings. What is Prolog’s response?
     
  2. 2.
    Write a rule that describes the relationship of a brother. Then write a query to find the brothers of sophusw. What is Prolog’s response?
     
  3. 3.
    Write a rule that describes the relationship of a niece. Then write a query to find all nieces in the database. What is Prolog’s response?
     
  4. 4.
    Write a predicate that describes the relationship of cousins.
     
  5. 5.
    Write a predicate that describes the ancestor relationship.
     
  6. 6.
    Write a predicate called odd that returns true if a list has an odd number of elements.
     
  7. 7.
    Write a predicate that checks to see if a list is a palindrome.
     
  8. 8.
    Show the substitution required to prove that sublist([a, b], [c, a, b]) is true. Use the definition in Fig. 7.3 and use the same method of proving it’s true.
     
  9. 9.
    Write a predicate that computes the factorial of a number.
     
  10. 10.
    Write a predicate that computes the nth fibonacci number in exponential time complexity.
     
  11. 11.
    Write a predicate that computes the nth fibonacci number in linear time complexity.
     
  12. 12.
    Write a predicate that returns true if a third list is the result of zipping two others together. For instance,
    A330638_2_En_7_Figaa_HTML.gif
    should return true since zipping [1, 2, 3] and [a, b, c] would yield the list of pairs given above.
     
  13. 13.
    Write a predicate that counts the number of times a specific atom appears in a list.
     
  14. 14.
    Write a predicate that returns true if a list is three copies of the same sublist. For instance, the predicate should return true if called as
    A330638_2_En_7_Figab_HTML.gif
    It should also return true if it were called like
    A330638_2_En_7_Figac_HTML.gif
     
  15. 15.
    Implement insert, lookup, and delete on a binary search tree. The structure of a binary search tree was discussed in this chapter. Your main run predicate should be this:
    A330638_2_En_7_Figad_HTML.gif
    The run predicate calls the buildTree predicate to build the binary search tree from the list read by the readline. If 5 8 2 10 is entered at the keyboard, L would be the list containing those numbers. To complete this project there should be at least three predicates: insert, lookup, and delFromTree.
    The lookup predicate was a practice problem and the solution is provided if you need it. The insert predicate is somewhat like the lookup predicate except that a new node is constructed when you reach a leaf. Deleting a node is similiar to looking it up except that if it is found, the tree is altered to delete the node. Deleting a node from a binary search tree has three cases.
    1. (a)
      The node to delete is a leaf node. If this is the case, then deleting it is simple because you just return an empty tree. In Fig. 7.4 this occurs when 2, 4, 7, or 10 is deleted.
       
    2. (b)
      The node to delete has one child. If this is the case, then the result of deleting the node is the subtree under the deleted node. In Fig. 7.4, if the 9 is deleted, then the 10 is just moved up to replace the 9 in the tree.
       
    3. (c)
      The node to delete has two children. If this is the case, then you have to do two things. First, find the left-most value from the right subtree. Then, delete the left-most value from the right subtree and return a new tree with the left-most value of the right subtree at its root. Consider deleting 5 from the tree in Fig. 7.4. The left-most value of the right subtree is 7. To delete 5 we put the 7 at the root of the tree and then delete 7 from the right subtree.
       
    To make this project easy, write it incrementally. Print the results as you go so you can see what works and what doesn’t. The print predicate will print its argument while the nl predicate will print a newline. Don’t start by writing the entire run predicate right away. Write one piece at a time, test it, and then move on to the next piece.
     
  16. 16.
    Implement a calculator prefix expression interpreter in Prolog as described in the section on attribute grammars in this chapter. The interpreter will read an expression from the keyboard and print its result. The interpreter should start with a calc predicate. Here is the calc predicate to get you started.
    A330638_2_En_7_Figae_HTML.gif
    The program reads a list of tokens from the keyboard. The preprocess predicate should take the list of values and add num tags to any number it finds in the list. This makes writing the grammar a lot easier. Any number like 6 in L should be replaced by num((6) in the list PreL. The expr predicate represents the start symbol of your grammar. Finally, the interpret predicate is the attribute grammar evaluation of the AST represented by Tree.
    To make this project easy, write it incrementally. Print the results as you go so you can see what works and what doesn’t. The print predicate will print its argument while the nl predicate will print a newline. Don’t write the entire calc predicate right away. Write one piece, test it, and then move on to the next piece.
     

7.17 Solutions to Practice Problems

These are solutions to the practice problem s. You should only consult these answers after you have tried each of them for yourself first. Practice problems are meant to help reinforce the material you have just read so make use of them.

7.17.1 Solution to Practice Problem 7.1

Terms include atoms and variables. Atoms include sophus, fred, sophusw, kent, johan, mads, etc. Atoms start with a lowercase letter. Variables start with a capital letter and include X and Y from the example.

7.17.2 Solution to Practice Problem 7.2

  1. 1.
    brother(X,Y):- father(Z,X), father(Z, Y), male(X).
     
  2. 2.
    sister(X,Y):- father(Z,X), father(Z, Y), female(X).
     
  3. 3.
    grandparent(X,Y):- parent(X,Z), parent(Z, Y).
     
  4. 4.
    grandchild(X,Y):- grandparent(Y, X).
     
Grandparent and grandchild relationships are just the inverse of each other.

7.17.3 Solution to Practice Problem 7.3

The complexity of append is O(n) in the length of the first list.

7.17.4 Solution to Practice Problem 7.4

A330638_2_En_7_Figaf_HTML.gif
This predicate has O($$n^2$$) complexity since append is called n times and append is O($$n$$) complexity.

7.17.5 Solution to Practice Problem 7.5

A330638_2_En_7_Figag_HTML.gif

7.17.6 Solution to Practice Problem 7.6

A330638_2_En_7_Figah_HTML.gif

7.17.7 Solution to Practice Problem 7.7

A330638_2_En_7_Figai_HTML.gif

7.17.8 Solution to Practice Problem 7.8

7.17.9 Solution to Practice Problem 7.9

A330638_2_En_7_Figaj_HTML.gif

7.17.10 Solution to Practice Problem 7.10

A330638_2_En_7_Figak_HTML.gif

7.17.11 Solution to Practice Problem 7.11 (See Fig. 7.13)

7.17.12 Solution to Practice Problem 7.12

The val attribute is synthesized. The min value is inherited. The mout value is synthesized (See Fig. 7.14).
A330638_2_En_7_Fig13_HTML.gif
Fig. 7.13
The sentence structure for “The professor discovered a group”
A330638_2_En_7_Fig14_HTML.gif
Fig. 7.14
Decorated tree for the prefix expression $$+$$ S 4 R