Archive for the ‘artificial intelligence’ Category

Automated Reasoning in Uncertainty Using Fuzzy Logic Sets

Automated reasoning in artificial intelligence tries to emulate human reasoning through automated deduction. The four color theorem is an example of this, and was the first major theorem to be proved using a computer (Appel & Haken, 1977). However, a large part of human reasoning involves decisions with uncertain, incomplete, ambiguous, or inaccurate input. A good primer for how fuzzy logic is used today in cancer research can be found here.

We demand rigidly defined areas of doubt and uncertainty!

Douglas Adams, The Hitchhiker’s Guide to the Galaxy

Fuzzy set theory was introduced by Lotfi Zadeh (1965); it extends classical set theory and its operations to classes of objects whose members have varying degrees of membership.

Fuzzy Logic Intro

Using Aristotelian, two-valued logic in expert systems involving automated reasoning is impractical when dealing with uncertainty. Fuzzy logic (Zadeh, 1965) provides the flexibility needed to reason under uncertainty in such systems.

In classical set theory, a set is a collection of crisp values. Crisp values are distinct and precise. These sets deal with strict boundaries. In a Boolean system truth values are also crisp, that is either 1 as true, or 0 as false. For example, we could have a set of geochemical samples with copper parts per million (Cu ppm) values over 100.

In this example we have two groups: members and non-members. Members are any values over 100 and non-members are 100 or less. There is no such thing as partial membership. We can define a membership function μ(x) that returns 1 when x > 100 and 0 otherwise.

Fuzzy logic provides intermediate values between true and false and extends classic set theory. We can now have partial membership in a set A, as seen in Figure 1.

Figure 1: Classical Set Theory versus Fuzzy Set Theory.

In our fuzzy set we can represent our set A as a set of ordered pairs in the universe X: A = {(x, μ_A(x)) | x ∈ X}, where μ_A(x) is the degree of membership of x in A.

In our case the universe X is finite and discrete, and our set can be described in Zadeh's discrete notation as A = μ_A(x1)/x1 + μ_A(x2)/x2 + … + μ_A(xn)/xn.

Suppose we have a crisp set of elements, each with a respective partial membership value. We visualize this in Figure 2, and can represent the fuzzy set in its discrete form.

Figure 2: Fuzzy Set vs. Crisp Set example.
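To make the contrast concrete, here is a small Python sketch; the crisp threshold at 100 ppm comes from the example above, while the fuzzy ramp from 50 to 150 ppm is an assumed membership function for illustration, not taken from the figures.

```python
def crisp_member(cu_ppm):
    # Classical set: full membership above 100 ppm, none otherwise.
    return 1.0 if cu_ppm > 100 else 0.0

def fuzzy_member(cu_ppm):
    # Fuzzy set (assumed shape): membership ramps linearly from 0 at 50 ppm
    # to 1 at 150 ppm, allowing partial membership in between.
    return min(1.0, max(0.0, (cu_ppm - 50) / 100))

for sample in [40, 75, 100, 125, 160]:
    print(sample, crisp_member(sample), fuzzy_member(sample))
```

A sample at 125 ppm is a full member of the crisp set but only a 0.75 member of the fuzzy set.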

In my next post, I will talk about how to do operations on these sets, and how to apply them to geological data.


Appel, K., & Haken, W. (1977). The Solution of the Four-Color-Map Problem. Scientific American, 237(4), 108–121.

Zadeh, L. A. (1965). Fuzzy sets. Information and Control, 8(3), 338–353.


Heuristics involve the study of methods and rules for discovery and invention[efn_note]Polya, G. (1945). How to Solve It. Princeton, NJ: Princeton University Press.[/efn_note]. In AI, heuristics define rules for choosing branches in a state space search that are most likely to lead to acceptable solutions. This is particularly useful in two situations: problems that don't require an exact solution due to ambiguities in the problem statement or available data, and problems that do have an exact solution but whose state space is so large that an exhaustive search is computationally impractical.

Have no fear of perfection – you’ll never reach it.

Salvador Dali

There are a few well-developed heuristics I'd like to talk about today, along with some examples of each.

Hill Climbing Heuristics

One of the simplest heuristics we can use is the hill climbing heuristic. This strategy involves picking the best child and searching that, without retaining any information on parents or siblings. This heuristic tends to search deep into the state space.

An obvious problem with this strategy is the tendency to get stuck at a local maximum (or minimum), which can often result in failing to find an optimal solution.

Hill climbing heuristics example.
Figure 1: In this example, searching for a maximum number, a hill climbing heuristic would choose the 10 child and stop, missing the solutions on the right side of the tree. This is the major drawback of hill climbing.
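A minimal sketch of this greedy descent in Python, on a made-up tree shaped like the one in Figure 1 (nodes are (value, children) pairs; the numbers are assumptions for illustration):

```python
# Each node is a (value, children) tuple; the values are made up.
tree = (5, [(10, []),                      # locally best child: a dead end
            (8,  [(25, []), (30, [])])])   # better values hidden deeper

def hill_climb(node):
    value, children = node
    while children:
        # Pick the best-looking child and forget everything else.
        value, children = max(children, key=lambda c: c[0])
    return value

print(hill_climb(tree))  # stops at 10, never seeing 30
```

Because nothing about parents or siblings is retained, there is no way to back out of the dead end once the 10 child is chosen.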

This method is often called greedy, as it simply takes the best option at the time, ignoring possible future gains. However, this technique can be optimal in some situations, typically when moves in the state space can't be undone. Think tic-tac-toe, not chess. We can add a heuristic to a tic-tac-toe game that picks the child leading to the most possible wins. For example, the first move X should make is in the center. The center leads to 4 possible wins (2 diagonals, the middle row, and the middle column), and no other move leads to 4 or more wins.

Tic-tac-toe state space representation. First tree and elimination of nodes by heuristics.
Figure 2: Showing the first two levels of the state space for tic-tac-toe. The “O” player can eliminate half the possible moves by using the heuristic.

Dynamic Programming

Another creative way to reduce the state space is to break the problem down into subproblems and reuse the subproblems that have already been solved. This technique is often called memoization (as in memo, not memorize!).

Dynamic programming is used to solve a large number of problems. Let's consider the problem of finding the longest common subsequence (LCS). For example, given the arrays (1,0,0,1,0,1,0,1) and (0,1,0,1,1,0,1,1,0), we want to find the longest sequence of common bits. An example answer would be (1,0,1,0) from (1,0,0,1,0,1,0,1) and (0,1,0,1,1,0,1,1,0). But is it the longest?

We will first build a table.

Start of dynamic programming matrix.
Figure 3: Start of LCS table.

This table is calculated row-wise from top to bottom. Row i = 0 and column j = 0 are trivial and filled with 0 (if either array is empty, the LCS is obviously length 0).

The first comparison is between X1 = 1 and Y1 = 0, and these don't match. Both the cell above and the cell to the left are 0, so the symbols (←↑) are inserted, indicating either path can be taken, and the current LCS length remains the same.

Partially finished dynamic programming matrix.
Figure 4: Filling out the table for LCS.

The second comparison is between X1 = 1, and Y2 = 1, and these do match. We add the (↖) symbol, to indicate we transition both row and column. The cell the arrow points to is 0, so we add 1 to that and our new LCS is length 1, with the element (1).

We continue this process to fill out the table. Once the table is filled, we can trace back to find the LCS. In this table the LCS has length 6, with the elements (1,0,0,1,1,0). Note this is just one possibility; (1,0,1,0,1,1) is another valid longest common subsequence, but the length is always 6 for any LCS.

The matrix showing dynamic programing and the backtracking algorithm.
Figure 5: Completed LCS table.
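The table construction and trace-back described above can be sketched in Python, using the two bit arrays from the text:

```python
def lcs(X, Y):
    m, n = len(X), len(Y)
    # table[i][j] = LCS length of X[:i] and Y[:j]; row 0 and column 0 stay 0.
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1            # the ↖ move
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])  # ← or ↑
    # Trace back from the bottom-right corner to recover one LCS.
    out, i, j = [], m, n
    while i and j:
        if X[i - 1] == Y[j - 1]:
            out.append(X[i - 1]); i -= 1; j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]

result = lcs((1, 0, 0, 1, 0, 1, 0, 1), (0, 1, 0, 1, 1, 0, 1, 1, 0))
print(len(result))  # 6, matching the completed table in Figure 5
```

The trace-back may recover a different length-6 sequence than the one shown above, since several valid answers exist.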

Best-First Search

Best-first search allows for recovery from dead ends and local maxima and minima through the use of a priority queue. This priority queue is an ordered list that places the most promising paths at the top of the list. Dijkstra's algorithm, or the SPF (Shortest Path First) algorithm, is the textbook example.

The algorithm works by maintaining a list of explored nodes. The starting node's children are examined, and the algorithm chooses to visit the "closest" child and explore that node's children. As it does this, the list is maintained, and the algorithm always extends the shortest path found so far. It continues iteratively until the goal node is found. Because this algorithm always chooses the best option first, it is also considered a greedy algorithm.

A* Heuristics

This approach is guaranteed to find the shortest path. However, some search problems are NP-hard, and a heuristic helps. Imagine you want to develop an algorithm that finds the best route from one location in a city to another. In a large city with many gridded roads, the algorithm would waste effort exploring the many possible paths. The A* heuristic informs the algorithm with information like the Euclidean or Manhattan distance to the goal node. By including this information, the algorithm can prioritize paths that are shorter and paths that decrease the distance to the goal node.
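A sketch of A* on a small, made-up city grid, with the Manhattan distance as the heuristic (the grid layout and coordinates are assumptions for illustration):

```python
import heapq

# A tiny hypothetical city grid: 0 = open block, 1 = blocked.
GRID = [[0, 0, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 0, 0],
        [0, 1, 1, 0]]

def a_star(start, goal):
    def h(p):  # Manhattan distance to the goal: an admissible heuristic
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    # Each frontier entry: (cost so far + heuristic, cost so far, position, path).
    frontier = [(h(start), 0, start, [start])]
    seen = set()
    while frontier:
        _, cost, pos, path = heapq.heappop(frontier)
        if pos == goal:
            return path
        if pos in seen:
            continue
        seen.add(pos)
        r, c = pos
        for nr, nc in [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]:
            if 0 <= nr < len(GRID) and 0 <= nc < len(GRID[0]) and GRID[nr][nc] == 0:
                heapq.heappush(frontier, (cost + 1 + h((nr, nc)), cost + 1,
                                          (nr, nc), path + [(nr, nc)]))
    return None

path = a_star((0, 0), (3, 3))
print(len(path) - 1)  # number of moves in the shortest route
```

The priority queue keeps paths that look cheapest overall (cost so far plus estimated distance remaining) at the top, which is exactly the best-first idea with extra guidance toward the goal.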

A Common Lisp implementation of this can be seen here.

Thus ends my discussions specific to heuristics and search algorithms. I truly hope that you consider the abstract representations of these problems when solving real-world examples. Abstraction is the core of computer science.


How to Design an Effective State Space Search

After determining the state space for a problem, we need to design a state space search algorithm to find solutions. The importance of these decisions for AI problem solvers can't be overstated. Most problems worth solving with AI tools are very complex, and their state spaces are correspondingly enormous. Two approaches could be used: we could start from our data and work towards our goal, or we could define our goal and find out what states lead to it. We also need to decide in what order to search the graph. This leads us to our discussion of data-driven search, goal-driven search, breadth-first search, and depth-first search[efn_note](Luger, George F.; Stubblefield, William A, Artificial intelligence: structures and strategies for complex problem solving 1998)[/efn_note].

Two roads diverged in a yellow wood,
And sorry I could not travel both
And be one traveler, long I stood
And looked down one as far as I could
To where it bent in the undergrowth;
Then took the other…

Robert Frost, “The Road Not Taken”

Data-driven search

In data-driven search, also called forward chaining, we start with the data and a set of logical rules for changing states. We apply these rules to create new facts, and continue iteratively until we find a solution or fail.

Data-driven search is most appropriate when:

  1. Most of the data is given in the statement of the problem. This is usually the case when the problem involves high-level interpretation of data.
  2. If there are many potential goals, but only a few ways to use the data.
  3. If it is difficult to define the goal. It may not be possible to start at a goal if you aren’t able to clearly define it.
Graph showing a data-driven state space search.
Figure 1: Data-Driven Search.

An example could be a person claiming to be a distant relative, with a common ancestor of unknown name at most eight generations back. A program that reads sonar readings and interprets them as either small submarines or marine wildlife is another example. A final example is an expert system that classifies plants by family, genus, species, etc.

Goal-driven search

Conversely, we could start with our goal and determine which rules need to be true in order to arrive at the goal, creating subgoals and working backwards until we arrive at the data we have defined for our problem. Goal-driven search is most appropriate when:

  1. If the goal is given or easily defined. Theorem provers tend to have a structure like this: the problem solver proposes a solution, and then tries to prove it.
  2. If a large number of rules match the problem, leading to a large number of potential goals. If we choose a goal first, the portion of the state space trimmed off is much larger, making this the more efficient strategy.
  3. If the problem data isn’t given, we have no choice but to use a goal-driven approach. This approach is great for providing a guide to effective data gathering.
Graph showing a goal-driven state space search.
Figure 2: Goal-Driven Search.

For a goal-driven example: if the same person from the data-driven example instead gives you the name of the common ancestor, then verifying the relationship is goal-driven. Diagnosing automobile problems and proving theorems in plane geometry are two more examples.

Depth-first search

Search direction is only one way we can design the state space search algorithm. Another way to enhance the performance is through the order in which child nodes are examined.

Depth-first searches go deep into the search space whenever possible. It ignores siblings of child nodes and “pushes forward”. This can be very good when the solution path is long. However, it can get lost deep in the graph, and it can also find very long paths when shorter paths may exist. It works well when there are many branches.

graph showing depth-first state space search
Figure 3: Depth-first search.

Using our above examples, both the ancestor searches, the theorem prover, and the expert system for identifying plants are good examples where depth-first searches will be appropriate.


Breadth-first search

A breadth-first search travels level by level in a graph. Because it examines all children of a parent as it visits them, it is guaranteed to find the shortest path.

These searches are best when the solution is simple or known. However, they start to perform poorly with a high branching factor and long solution paths.

graph showing breadth-first state space search.
Figure 4: Breadth-first search.

Our automobile problem checker and our sonar reader from above are good examples where breadth-first search is best.

State Space Search of the Farmer, Wolf, Cabbage, Goat Problem

Finally I'll present a real solution to a common problem used in graph search. A farmer is traveling with his wolf, cabbage, and goat. He gets to a river that he needs to cross, but runs into a problem. The boat only carries two things (one being the farmer). If the wolf is left alone with the goat, the wolf will eat the goat. If the goat is left alone with the cabbage, the goat will eat the cabbage.

The solution involves bringing back “something” on a return trip, however I intend just to show the solution(s), in our state graph.

graph representation of Farmer, Wolf, Goat, Cabbage problem.
Figure 5: Graph representation of Farmer, Wolf, Goat, Cabbage problem.

In this problem, a breadth-first search would go level by level, and both paths from A to J would be discovered. If one of these paths were shorter than the other, we would find that path and therefore an optimal solution. Because this problem has a known solution, few branches, and a short solution path, breadth-first search is a reasonable approach. If each state has an average of B children, there will be about B^n states at level n.

However, a depth-first search ends up being the more efficient approach in this case. As both solutions require 7 edges, finding only one of them is necessary. The longer the solution path becomes, the more appropriate a depth-first search is. This approach can result in getting lost in deep paths, and won't ensure the shortest path, but it is useful when the graph has many branches, since the space used is linear in the length of the path: if each state has B children, only B × n states are stored to go n deep.
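A breadth-first solution in Python, representing each state as the set of items on the left bank (a sketch; the state encoding is my own choice):

```python
from collections import deque

ITEMS = {"farmer", "wolf", "goat", "cabbage"}

def safe(bank):
    # A bank without the farmer is unsafe if wolf+goat or goat+cabbage
    # are left together.
    if "farmer" in bank:
        return True
    return not ({"wolf", "goat"} <= bank or {"goat", "cabbage"} <= bank)

def neighbors(left):
    # The farmer crosses alone or with one item from his current side.
    side = left if "farmer" in left else ITEMS - left
    for cargo in [None] + sorted(side - {"farmer"}):
        moving = {"farmer"} | ({cargo} if cargo else set())
        new_left = left - moving if "farmer" in left else left | moving
        if safe(new_left) and safe(ITEMS - new_left):
            yield frozenset(new_left)

def solve():
    # Breadth-first search over states = set of items on the left bank.
    start, goal = frozenset(ITEMS), frozenset()
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in neighbors(path[-1]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])

path = solve()
print(len(path) - 1)  # 7 crossings, matching the solution paths in Figure 5
```

Because breadth-first search explores level by level, the first complete path it returns is guaranteed to use the minimum number of crossings.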

This concludes my discussion on graph theory and state space search. Next on the docket is heuristics and how to make our search algorithms even more efficient! If you’re curious about how I made the graphs, check out this link!


State Space Search and a Simple Introduction to Graph Theory

Previously I've talked about predicate calculus and inference rules for finding solutions to problems in artificial intelligence. These rules define the space that is searched when trying to solve a problem with AI. This space can quickly become very large in complex problems, and we need efficient ways to search it. We may find that no solution exists, or the problem solver may end up in an infinite loop. Even if we find a solution, are we sure it's the optimal one? How long will it take the solver to find a solution? State space search theory is what we use to tackle these questions. We will use a state space graph and graph theory to analyze the complexity of the problem and design the best way to solve it.

In order to cope, an organism must either armor itself (like a tree or a clam) and "hope for the best", or else develop methods for getting out of harm's way and into the better neighborhoods of the vicinity. If you follow this latter course, you are confronted with the primordial problem that every agent must continually solve: now what do I do?

Daniel C. Dennett, "Consciousness Explained"

Graph Theory

Firstly, we need to define what a graph is. Simply put, a graph is a set of nodes. Between these nodes we have a set of edges, or links. We can define this in mathematical notation in the following way[efn_note](Luger, George F.; Stubblefield, William A, Artificial intelligence: structures and strategies for complex problem solving 1998)[/efn_note].

Set of nodes : N1 … Nn
Set of arcs : (N2,N3), (N3, N4), … (Nm, Nn)

In our use of graph theory, each node in our graph represents a state of the world in our problem, and the edges represent the transition between these states.

A graph can also be directed. In a directed graph the edges have directions assigned to them. For example, an edge (A,B) describes a state transition from A to B; however, unless the edge (B,A) also exists, the problem solver cannot go from state B to A.

A directed graph showing nodes A, B, and C with edges (A,B),(B,C), and (C,B).
Figure 1: Directed graph.

A path is a sequence of nodes through successive edges. In Figure 1, we could represent a path from A to C as [A,B,C].

A rooted graph has one node which all other nodes are connected to. The root node has no parent, and all other nodes are its children.

A rooted graph with A as the root, and B to F as children.
Figure 2: Rooted graph.

A tree is a graph with unique paths between every pair of nodes. That is to say, there is at most one path between two nodes. Figure 1 cannot be a tree because two paths occur between B and C. However, Figure 2 is a tree, and most trees also contain a root. The requirement of at most one path between nodes also leads to trees having no loops.

Using Graph Theory to Build Finite State Machines

Our discussion on graphs allows us to build abstract models of computational devices. These models represent automata for traversing paths in a graph. We call this abstraction a finite state machine (FSM). These are special graphs that are directed, finite, and connected.

A FSM can be described as a finite set of states (S), a finite set of input values (I), and a function (F) that defines the transition between states.

Finite State Machine for flipping a states.
Figure 3: Finite State Machine (FSM)
Table 1: Transition table for the FSM in Figure 3. Input is on the top row, and states are the first column. The transition states are at the intersection.

The example above shows a very simple FSM. The transition function works as a flip. If the input is 0, then no change in state is made. But if the input is 1, the state flips between the states.
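The flip machine can be written as a transition table directly (the state names s0/s1 are my own labels, assumed for illustration):

```python
# Transition table: (state, input) -> next state. Input 0 keeps the
# current state; input 1 flips it.
TRANSITIONS = {
    ("s0", 0): "s0", ("s0", 1): "s1",
    ("s1", 0): "s1", ("s1", 1): "s0",
}

def run_fsm(start, inputs):
    state = start
    for symbol in inputs:
        state = TRANSITIONS[(state, symbol)]
    return state

print(run_fsm("s0", [1, 0, 1, 1]))  # three flips from s0 ends in s1
```

This is exactly the (S, I, F) description above: the dictionary keys enumerate S × I, and the lookup is the transition function F.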

Finite State Acceptor Machines (Moore Machines)

A special type of FSM is presented below. In this example, we have a starting state, S0, and an accepting state Sn. This machine will recognize a specific input stream. Here are a few examples.

Moore machines for binary digits.
Figure 4: The first finite state acceptor identifies a stream of binary digits “111”. S3 is only reached with that input stream. The second finite state acceptor accepts only three consecutive “1”‘s.

State Space Representation

Ultimately these graphs help us map out the state space of a problem. Each node can represent a partial solution state of the problem, and the edges represent the actions used to solve it. Let's look at this using the Traveling Salesman Problem (TSP).

In this problem, a salesman must travel between 4 cities and return home.

Graph of the TSP problem.
Figure 5: TSP problem.

In this problem we would like to find a path starting from A, visiting each node, and returning to A. An example solution would be [A, B, D, C, A] = 14 + 29 + 28 + 20 = 91. But is this the shortest path? State space search is all about finding the best solution (in this case the minimum-length path). We can represent the state search as a tree.

Tree graph of the TSP problem
Figure 6: TSP Tree, partially filled, searching all solutions to the TSP problem
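An exhaustive search over this tree can be sketched in Python. Only the four distances used by the tour [A, B, D, C, A] come from the text; the other two edge weights are made-up values for illustration.

```python
from itertools import permutations

DIST = {
    frozenset("AB"): 14, frozenset("BD"): 29,
    frozenset("DC"): 28, frozenset("CA"): 20,
    frozenset("AD"): 35, frozenset("BC"): 22,  # made-up weights
}

def tour_cost(tour):
    # Sum edge weights along consecutive city pairs, e.g. A->B->D->C->A.
    return sum(DIST[frozenset(a + b)] for a, b in zip(tour, tour[1:]))

def best_tour(start="A", others="BCD"):
    # Exhaustive search: every ordering of the remaining cities is a
    # leaf of the tree in Figure 6.
    tours = [[start, *perm, start] for perm in permutations(others)]
    return min(tours, key=tour_cost)

print(tour_cost(list("ABDCA")))  # 91, the tour from the text
print(best_tour())
```

With n cities there are (n − 1)! orderings to enumerate, which is why the tree grows so quickly.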

As we can see from Figure 6, the state space for this problem can grow very quickly with a large number of nodes. Graph theory helps us visualize this and, even more importantly, come up with more efficient ways to search state spaces for potential solutions. Those will be addressed in my next blog. Until then, stay tuned!


How to use Logical Inference and Unification in AI

The previous posts on propositional calculus (or zeroth-order logic) and predicate calculus (first-order logic) build towards the practical application of these logical systems in artificial intelligence automated reasoning systems. Through logical inference and unification we will work a real-world example of a financial advisor decision-making system.

We come to the full possession of our power of drawing inferences, the last of all our faculties; for it is not so much a natural gift as a long and difficult art.

C. S. Peirce

Logical Inference

Logical inference is a formal process that derives new correct expressions from a set of true assertions. These new assertions will be consistent with all previous statements and interpretations. Inference rules provide the computational mechanism by which a conclusion "logically follows" from its premises.

A basic example is modus ponens. The inference rule for modus ponens is

If P and P → Q are true, then modus ponens infers Q is also true.

Examples of more inference rules, like modus tollens, can be found on the Wikipedia page here.


Unification

In order to apply these rules of inference we need to be able to determine when two expressions match. This is trivial in propositional calculus, but the presence of variables in predicate calculus makes it more complicated. Unification is the algorithm that determines what substitutions we can make to predicate calculus expressions in order to make them match.

First we check the initial predicate symbol, then the number of arguments the predicate has, and then the actual arguments for a match. These arguments are either variables or constants, for example,

love(X, Y) and love(eleanor, chidi)
We can list substitutions and unify as {X/eleanor, Y/chidi}.

When unifying we have three rules:

  • a variable can be unified with an atom, variable, or term
  • two atoms can be unified if they are identical
  • two terms can be unified if their identifiers are the same and their arguments unify

Here are some more examples:

p(X,Y) and p(a,Z)
'X' is a variable, and 'a' is an atom.
'X' and 'Z' are variables.
Therefore it unifies with {X/a, Z/Y}.
p(X,X) and p(a,b)
'X' is a variable, but it cannot be substituted by both 'a' and 'b' atoms, therefore it does not unify.
ancestor(X,Y) and ancestor(bill,father(bill))
'ancestor' and 'ancestor' unify and return the empty set {}
'X' and 'bill' unify as {bill/X}
'Y' and father(bill) unifies as {father(bill)/Y}
Therefore it unifies to {bill/X,father(bill)/Y}
ancestor(X,father(X)) and ancestor(david,george)
'ancestor' and 'ancestor' match
'X' and 'david' unify as {david/X}
Applying that substitution, father(david) must unify with 'george'.
A compound term cannot unify with a constant, so this pair does not unify,
even if george really is david's father in the world being modeled.
Unification is purely syntactic; it knows nothing about the meaning of 'father'.
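A minimal Python sketch of syntactic unification (no occurs-check; representing variables as capitalized strings and compound terms as tuples is my own encoding, not from the text):

```python
def is_var(t):
    # Variables are capitalized strings; constants are lowercase strings.
    return isinstance(t, str) and t[0].isupper()

def walk(t, subst):
    # Follow substitutions until we hit an unbound variable or a non-variable.
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def unify(a, b, subst=None):
    subst = dict(subst or {})
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        subst[a] = b
        return subst
    if is_var(b):
        subst[b] = a
        return subst
    # Compound terms: same functor, same arity, arguments unify pairwise.
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b) and a[0] == b[0]:
        for x, y in zip(a[1:], b[1:]):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None  # e.g. two different constants, or a constant vs a compound term

print(unify(("love", "X", "Y"), ("love", "eleanor", "chidi")))
# {'X': 'eleanor', 'Y': 'chidi'}
print(unify(("p", "X", "X"), ("p", "a", "b")))  # None: X can't be both a and b
```

The `walk` helper is what makes p(X,X) fail against p(a,b): once X is bound to a, the second argument position sees a, which clashes with b.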

A logic-based financial advisor

This example is part of the exercises found in Luger's text[efn_note](Luger, George F.; Stubblefield, William A, Artificial intelligence: structures and strategies for complex problem solving 1998)[/efn_note].

We are going to build a financial advisor that decides how someone should split their money between investments and savings. Specifically, the advisor should make its recommendation according to the following criteria.

  • Individuals with inadequate savings should make increasing their savings the top priority, regardless of income.
  • Individuals with adequate savings and an adequate income should consider riskier investments in the stock market.
  • Individuals with lower income but adequate savings may consider splitting their surplus between investments and savings.

From this we can create the first three predicates for our logical system:

1.  savings_account(inadequate) → investment(savings)
2.  savings_account(adequate)⋀income(adequate) → investment(stocks)
3.  savings_account(adequate)⋀income(inadequate) → investment(combination)

We now need to determine when savings and income are considered adequate. To do this we create the function minsavings, which takes one argument, the number of dependents, and returns 5000 times that argument: minsavings(X) ≡ 5000 * X. Now we have two more predicates for our logical system.

4.  ∀ X amount_saved(X) ⋀ ∃ Y (dependents(Y) ⋀ greater(X, minsavings(Y))) → savings_account(adequate)
5.  ∀ X amount_saved(X) ⋀ ∃ Y (dependents(Y) ⋀ ¬ greater(X, minsavings(Y))) → savings_account(inadequate)

Similarly, we need a minincome function. We will define this function as minincome(X) ≡ 15000 + (4000 * X). For an investor's income to be considered adequate, it must be above this minimum income and also steady. This adds three more predicates to our logical system.

6.  ∀ X earnings(X, steady) ⋀ ∃ Y (dependents(Y) ⋀ greater(X, minincome(Y))) → income(adequate)
7.  ∀ X earnings(X, steady) ⋀ ∃ Y (dependents(Y) ⋀ ¬ greater(X, minincome(Y))) → income(inadequate)
8.  ∀ X earnings(X, unsteady) → income(inadequate)

For our example, Jane Doe has four dependents, a steady income of $30,000 and $15,000 in her savings account. This adds our final three predicate calculus sentences.

9.  amount_saved(15000)
10. earnings(30000,steady)
11. dependents(4)

Now for the fun part! Let's unify! First we can unify the conjunction of 10 and 11 with the first two components of 7.

earnings(30000, steady) ⋀ dependents(4) unifies with:
earnings(X, steady) ⋀ dependents(Y) under substitution {30000/X, 4/Y} yielding:

earnings(30000, steady) ⋀ dependents(4) ⋀ ¬ greater(30000, minincome(4)) → income(inadequate)

earnings(30000, steady) ⋀ dependents(4) ⋀ ¬ greater(30000, 31000) → income(inadequate)

All 3 components of the premise are true, by 10, 11, and the mathematical definition of greater, therefore their conjunction is true and the entire premise is true. Thus, we can add another assertion

12. income(inadequate)

Next, amount_saved(15000) ⋀ dependents(4) unifies with the first two elements of assertion 5 under the substitution {15000/X, 4/Y}, yielding:

amount_saved(15000) ⋀ dependents(4) ⋀ ¬ greater(15000, minsavings(4)) → savings_account(inadequate)

amount_saved(15000) ⋀ dependents(4) ⋀ ¬ greater(15000, 20000) → savings_account(inadequate)

Given that all 3 components of the premise are true, we have another premise

13. savings_account(inadequate)

Finally, as a consequence of expressions 13 and 1 and modus ponens, we conclude

14. investment(savings)

Our automated logical reasoning system suggests that Jane Doe’s surplus income goes into her savings account. This example shows how we can automatically reason and draw conclusions by applying these inference rules in a programmatic fashion.
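The whole rule base collapses to a few lines of ordinary Python; this is a sketch using the thresholds defined above (minsavings = 5000 × dependents, minincome = 15000 + 4000 × dependents), not a general inference engine:

```python
def advise(savings, income, dependents, steady=True):
    # Rules 4-8 reduce to two threshold checks.
    savings_ok = savings > 5000 * dependents                    # minsavings
    income_ok = steady and income > 15000 + 4000 * dependents   # minincome
    if not savings_ok:
        return "savings"                                  # rule 1
    return "stocks" if income_ok else "combination"       # rules 2 and 3

print(advise(savings=15000, income=30000, dependents=4))  # Jane Doe -> savings
```

The point of the logic-based version is that the same conclusion is reached mechanically, by unification and modus ponens, without hand-coding the control flow.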


Predicate Calculus: Its Simple Syntax, and Semantics

In my previous post I talked about propositional calculus. Today we will talk about predicate calculus and its syntax and semantics. Predicate calculus builds on propositional calculus by allowing us to describe the relationships in our logical assertions. We could have a statement like "it snowed on Saturday" (it did… it sucks), and represent it as

weather(saturday, snow)

We can now add variables and start to generalize. For example, if we use X as a variable for the day of the week, we can say

weather(X, snow) is true

to say that it snows every day of the week (I hope not!). As usual, we first need to define the words in our language, that is the syntax.

The essential quality of a proof is to compel belief.


Predicate Calculus Syntax and Semantics

In predicate calculus we use improper symbols to structure our descriptions of objects and their relationships in the world. These are parentheses "( )", commas ",", and periods ".".

We define symbols as strings of characters and digits. No spaces or special characters can be used, although the underscore "_" can be used for readability. The valid characters are

  1. All letters, upper- and lowercase.
  2. All digits from 0 to 9.
  3. The underscore, “_”.

And all symbols must start with a letter. Some correct examples are

Name   coffee3   stan_and_kyle   emma   XYZ   likes   partner_of

And some invalid examples are

7John   stan and kyle   ta%ble   ###123   lookout!!!

These symbols can represent variables, constants, functions, or predicates.

Variables are used for the abstract concept of classes. Any symbol beginning with a capital is a variable.

X   Name   Location

Constants are specific objects in the world, and start with a lowercase.

emma   small   pink     
note: true and false are reserved as truth symbols

Functions also start with a lowercase letter, and are used to map elements of one set to another. In this writer's world, a function such as partner_of(emma) could evaluate to nancy. Functions also have an arity associated with them: the number of arguments they take. In the previous example the arity would be 1. An example with an arity of 2 could be

loves(nancy, emma)

Another simple example would be

subtract(nine, two)

which evaluates to seven.

A term is a constant, variable, or function expression. We use these terms to describe the problem domain. This leads to another definition: the atomic sentence. It is simply a predicate of arity N followed by N terms in parentheses, separated by commas, and it is the most primitive unit of predicate calculus. An example of an atomic sentence is

loves(aunt(emma), partner_of(nancy))

Now that we have our basic unit, we can build predicate sentences. First some definitions of symbols

  • ∧ : logical and
  • ∨ : logical or
  • ¬ : logical not
  • → : logical implies
  • ≡ : logical equivalent to
  • ∀ : universal quantifier (reads “sentence is true for all values”)
  • ∃ : existential quantifier (reads “sentence is true for at least one value”)

To end our discussion on syntax I’ll show another example[efn_note](Luger, George F.; Stubblefield, William A, Artificial intelligence: structures and strategies for complex problem solving 1998)[/efn_note].

mother(eve, abel)
mother(eve, cain)
father(adam, abel)
father(adam, cain)

∀ X ∀ Y father(X, Y) ∨ mother(X, Y) → parent(X, Y)
∀ X ∀ Y ∀ Z parent(X, Y) ∧ parent(X, Z) → sibling(Y, Z)

The first 4 lines describe the world. The next two sentences define rules of the world. The first states that for all X's and Y's, if X is the father of Y or the mother of Y, then X is implied to be a parent of Y. The second states that for all X's, Y's, and Z's, if X is a parent of Y and X is a parent of Z, then Y is implied to be a sibling of Z. If this still confuses you, I encourage you to plug in values for the variables and see how that looks on paper.
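A small forward-chaining sketch of this example in Python: apply the two rules to the four facts until no new facts appear (representing each fact as a tuple is my own encoding):

```python
facts = {("mother", "eve", "abel"), ("mother", "eve", "cain"),
         ("father", "adam", "abel"), ("father", "adam", "cain")}

def step(facts):
    new = set(facts)
    # Rule 1: father(X, Y) or mother(X, Y) implies parent(X, Y)
    for rel, x, y in facts:
        if rel in ("father", "mother"):
            new.add(("parent", x, y))
    # Rule 2: parent(X, Y) and parent(X, Z) implies sibling(Y, Z)
    for rel1, x1, y in list(new):
        for rel2, x2, z in list(new):
            if rel1 == rel2 == "parent" and x1 == x2:
                new.add(("sibling", y, z))
    return new

# Apply the rules until no new facts appear (a fixed point).
while True:
    nxt = step(facts)
    if nxt == facts:
        break
    facts = nxt

print(("sibling", "abel", "cain") in facts)  # True
```

Notice that the second rule, as written, also derives sibling(abel, abel): nothing requires Y and Z to be distinct, which is exactly the kind of detail that plugging in values reveals.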

If you want more check out the MIT OpenCourseWare video here!


Propositional Calculus – A Quick Artificial Intelligence primer

Artificial Intelligence includes much more than the machine learning I've talked about before here, and here. I thought I'd start talking about this by introducing propositional calculus. Propositional calculus, also known as zeroth-order logic, is the formal system used in computer science to reason about whole statements (propositions) joined by logical connectives. It breaks artificial intelligence systems down to their most atomic statements.

Logic is the anatomy of thought.

John Locke

Logic is the beginning of wisdom, not the end.


A fact can be broken down into propositional expressions. The most traditional example is,

  • Socrates is a human (P)
  • All humans are mortal (P → Q)
  • Therefore Socrates is mortal (Q)

This reads: P is true, and P implies Q, therefore Q is true.

Some examples of propositional operators that build our propositional expressions are,

  • ∧ : AND
  • ∨ : OR
  • ¬ : NOT

Let’s do an example. How would we express something like the exclusive-or operator ⊗, using only those operators? We can start by making a truth table of what this should look like.

P  Q  P ⊗ Q
T  T    F
T  F    T
F  T    T
F  F    F
Truth table for the exclusive-or

Now, an exclusive-or sentence in English should say P or Q, but not both P and Q. That expression in our formal propositional style is (P ⋁ Q) ⋀ ¬(P ⋀ Q). But let's check our work with truth tables.

P  Q  P ⋁ Q  ¬(P ⋀ Q)  (P ⋁ Q) ⋀ ¬(P ⋀ Q)
T  T    T        F              F
T  F    T        T              T
F  T    T        T              T
F  F    F        T              F
Truth table for (P ⋁ Q) ⋀ ¬(P ⋀ Q)

This truth table evaluates to the same truth values and shows P⊕Q ≡ (P ⋁ Q) ⋀ ¬(P ⋀ Q).
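The same equivalence can be checked mechanically, grinding through all four truth assignments:

```python
from itertools import product

# Check P xor Q == (P or Q) and not (P and Q) over all four assignments.
for p, q in product([True, False], repeat=2):
    xor = (p != q)                           # exclusive-or
    formula = (p or q) and not (p and q)     # our candidate expression
    assert xor == formula
print("equivalent on all assignments")
```

This is the same work as building the truth table by hand, just automated.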

Modus Ponens

We can use truth tables to prove some classic logical arguments. Let's start with modus ponens. Modus ponens states that if P is true and P → Q is true, then Q is true. The "Socrates is a human" argument above is an example of modus ponens. The truth table for P → Q is:

P  Q  P → Q
T  T    T
T  F    F
F  T    T
F  F    T

Because we state P to be true, we only need to look at those rows. Interestingly, in the rows where P is false, P → Q evaluates to true, showing that false statements imply anything!

The truth table shows the truth values of the if… then statement P → Q. If the statement was instead,

  • Shakespeare is a human.
  • All humans are men.
  • Therefore Shakespeare is a man.

Note that although Shakespeare could be a man, the logical argument is not sound. Because the premise "All humans are men" (our P → Q) is false, we cannot use it to conclude Q.


Abduction

Abduction is an inference rule that infers P from Q and P → Q. Let's look again at the P → Q truth table.

This time we consider the rows where P → Q is true. We can see that these include rows where P is true and rows where P is false, so we cannot soundly infer that P is true from Q and P → Q.

Abductive reasoning is common in science. Sherlock Holmes, often praised for his deductive skills, actually tends to use abductive reasoning to reach his conclusions. As an example, look at these two arguments.

  • Organs have been designed to be similar to machines.
  • Machines are a product of intelligent design.
  • Therefore, humans are the product of intelligent design.


  • Test mice given water from a new industrial plant get sick.
  • People got sick after the plant was opened.
  • Therefore, the current sickness is caused by the new plant.

Modus tollens

Finally, let's look at modus tollens. Modus tollens, or denying the consequent, is the contrapositive form of modus ponens. It states that Q is false and P → Q is true, and concludes that P is false. The propositional expression is ((P → Q) ⋀ ¬ Q) → ¬ P. Its truth table is,

P  Q  P → Q  ¬ P  ¬ Q  (P → Q) ⋀ ¬ Q  ((P → Q) ⋀ ¬ Q) → ¬ P
T  T    T     F    F         F                  T
T  F    F     F    T         F                  T
F  T    T     T    F         F                  T
F  F    T     T    T         T                  T
The whole expression is true in every row, i.e. a tautology: whenever Q is false and P → Q is true, P must be false.
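The tautology check can also be done mechanically:

```python
from itertools import product

def implies(a, b):
    # Material implication: P -> Q is false only when P is true and Q is false.
    return (not a) or b

# ((P -> Q) and not Q) -> not P holds for every assignment: a tautology.
assert all(implies(implies(p, q) and not q, not p)
           for p, q in product([True, False], repeat=2))
print("modus tollens is a tautology")
```

Encoding implication as `(not a) or b` is the same convention the truth tables above use.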

Artificial Intelligence starts with these basic concepts. I encourage you to learn more about them and how to build and use truth tables in all your work.
