    Leadership Experience: the Canadian Armed Forces Way

    Okay, so it’s been quite a while since my last blog. One of the reasons for this is the time I spent in February and March on an Advanced Geochronology course taught by my supervisor, Professor Bruce Eglington, and Professor Camille Partin from the University of Saskatchewan. It was an absolutely great course, and well taught. It’s also something I’ll have to blog about later! The second reason was my adventure in learning leadership.

    “A leader is best when people barely know he exists…when his work is done, his aim fulfilled, they will all say: We did it ourselves.”

    Lao-Tzu, an ancient philosopher and founder of Taoism

    This second reason for my complete absence in April and half of May is my training and completion of the basic military officer qualification (BMOQ) course for the Naval Reserve. Having worked half my career as a quasi-solo consultant, I was looking for more opportunities to diversify my leadership skills and to develop my data analysis skills. This led me to join the Naval Reserve as an Intelligence Officer, and ultimately to my decision in March to attend basic training.

    This certainly wasn’t a decision that I made half-heartedly. Forty-one years old and going to boot camp!? I must be crazy. But I was quite surprised when I arrived: I was probably only slightly older than the average age, and there were quite a few people in their late 40s and early 50s. That said, the course was very difficult. If you are ever thinking about this, prepare yourself physically and mentally. It’s not for the faint of heart.

    BMOQ first teaches you how to follow. Nobody can be a leader without first knowing how to be a follower. The first 3 weeks of the course are primarily focused on following. With this post, I’ll be focusing on the last 3 weeks, where the focus becomes leadership, and I’ll discuss the 10 critical requirements for leadership.

    Responsibility and Accountability

    Part of being a leader is showing others that you want to be a leader. Some personalities have a natural talent for showing this enthusiasm. I’ll focus on my personal experience as someone who has had problems showing this level of enthusiasm for a leadership role. My advice is to look for opportunities to be a leader and jump on them. Expand your comfort zone, and push yourself to be in the spotlight. Know you’re going to make mistakes, and of utmost importance, own your mistakes.

    Perform Effectively Under Stress

    Well, if there was one thing that was abundantly clear during basic training, it was the instructors’ intent to put the candidates under stress and then analyze their decision making. Sleep deprivation was one of the tools used to induce stress. First step, make the candidates do 20 hours of missions, a few night watches, and 43 minutes of sleep (an accurate measurement based on the tears falling on my watch blinking 1:43 late in the course). Second step, have them perform a test, or clear their rifle, and watch many of them take one too many shortcuts and fail. This course is sometimes facetiously called “Failure Camp”, and that has some merit. How do you react to failure?

    This particular skill requires a lot of practice and emotional intelligence. Remain calm, and control your reactions. It certainly has applications to civilian life. Is a client screaming at you because something out of your control went wrong? Or was it something in your control? We are all human, but simple excuses are rarely acceptable. Offer solutions.

    From my experience, emotions are contagious. Remaining calm and composed until others catch your emotions is a much better option than matching others’ anger or frustration.

    Skills and Knowledge

    There is no doubt that to be successful in leadership, you need the overall skills and knowledge required to accomplish the task. Now, I didn’t know a thing about military tactics before this course. A lack of knowledge certainly cripples you, but being a leader is also about identifying the people in your team who have the skills and knowledge required. I was fortunate enough to be in a section with a lot of those skills.

    Give the people with the skills clear objectives. Explain why and how the task will be achieved. By providing overall command and control, you can leverage the skills of your team to its fullest.

    Initiative and Decisiveness

    A big part of decision making involves having a strong knowledge base and knowing the time frame required for your decisions. You are typically never going to have the full picture before you make your decision, but subjectively speaking, about 70% is reasonable. Once you make that decision, have confidence in it. If the decision doesn’t work out, you can briefly and internally analyze what went wrong with your decision making process, but it’s much more important to focus on what your next decision is.

    Seek Advice and Constructive Criticism

    Growth comes with being able to identify where you are going wrong. This can be a very difficult pill to swallow sometimes, but if you don’t remember the last time you learned a fundamental truth about yourself or the world around you, then you don’t remember the last time you’ve grown as a person. Leadership requires swallowing this pill sometimes.

    One other tip I have in this area is to turn negative criticism into constructive criticism. I would love to say that the world is filled with people who are trying to make you better, but more often than not people are a little more self-centered. You can choose to ignore negative, unhelpful criticism, or you can try to reframe that criticism in a positive way and grow from it.

    Inspire Performance and Cooperation

    The best way to inspire is through example. If your team isn’t communicating with you or each other, then communicate with them. Make them feel comfortable by taking the first steps. Another pitfall in inspiration is trying to get off to a good start, but forgetting to continue on. Inspiration must be continuous, throughout the entire project.

    Plan Effectively

    Ah, organization. An Achilles heel for me if I ever had one (actually I’m pretty sure I have about 4 Achilles heels… or maybe 2 Achilles heels and 2 Achilles elbows???). There are a number of LinkedIn courses that can help with planning, along with a number of tools like Gantt charts that can turn planning into a more logical process. I encourage anyone to look online for these free courses. The time invested in them will be returned.

    Communicate Effectively

    Communication needs to be clear and concise. Confirmation questions are great ways to make sure your instructions have been understood. I also suggest following up verbal conversations with emails to ensure everyone is on the same page, especially if those instructions are slightly complex or high importance. It allows people to return to the email to reread the directions if their memory lapses over time.

    Supervise Effectively

    Watching new supervisors, you often see a lot of “firefighting”. By this I mean they tackle each task one by one, directing their team members with one task at a time, and then refocusing on other tasks. Good supervision requires the ability to juggle multiple tasks simultaneously. Give your team members multiple instructions, and leave them to do the job. After this, you should be monitoring and ensuring standards, and making sure everyone has at least one, preferably more, tasks on their plate.

    Delegate Effectively

    Know your team’s strengths and weaknesses so you can delegate the right tasks to the right people. It’s also important to remember the overall goal of the project. While the team works on their individual tasks, you have to keep an eye on the bigger picture.

    If you are interested in taking this journey yourself, I highly suggest you take that challenge. The HMCS Unicorn’s contact information can be found here.

    Automated Reasoning in Uncertainty Using Fuzzy Logic Sets

    Automated reasoning in artificial intelligence tries to emulate human reasoning through automated deduction. The four color theorem is an example of this, and was the first major theorem to be proved using a computer (Appel & Haken, 1977). However, a large part of human reasoning involves decisions with uncertain, incomplete, ambiguous, or inaccurate input. A good primer for how fuzzy logic is used today in cancer research can be found here.

    We demand rigidly defined areas of doubt and uncertainty!

    Douglas Adams, The Hitchhiker’s Guide to the Galaxy

    Fuzzy set theory was introduced by Lotfi Zadeh (1965), extending classical set theory and its operations to classes of objects whose members have varying degrees of membership.

    Fuzzy Logic Intro

    Using Aristotelian logic in expert systems involving automated reasoning is impractical when dealing with uncertainty. Fuzzy logic (Zadeh, 1965) provides flexibility in reasoning for use in expert systems with uncertainty.

    In classical set theory, a set is a collection of crisp values. Crisp values are distinct and precise. These sets deal with strict boundaries. In a Boolean system truth values are also crisp, that is either 1 as true, or 0 as false. For example, we could have a set of geochemical samples with copper parts per million (Cu ppm) values over 100.

    In this example we have two groups, members and non-members. Members are any values over 100 and non-members are 100 or less. There is no such thing as partial membership. We can define the membership function as μ(x) = 1 if x > 100, and μ(x) = 0 otherwise.

    Fuzzy logic provides intermediate values between true and false, extending classic set theory. We can now have partial membership in the set A, as seen in Figure 1.

    Figure 1: Classical Set Theory versus Fuzzy Set Theory.

    In our fuzzy set we can represent our set A as a set of ordered pairs (x, μA(x)) in the universe X.

    In our case the universe X is finite and discrete, and our set can be described in the form A = μA(x1)/x1 + μA(x2)/x2 + … + μA(xn)/xn, where the “/” denotes pairing of a membership value with an element, not division.

    Suppose we have a crisp set of elements, each with a respective partial membership value. We visualize this in Figure 2, and can represent the fuzzy set in its discrete form.
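The crisp and fuzzy membership functions above can be sketched in a few lines of Python. The crisp cutoff at 100 Cu ppm comes from the example; the fuzzy ramp between 50 and 150 ppm is an illustrative assumption, not a value from the original figures.

```python
# Crisp vs. fuzzy membership for the Cu ppm example.
# The 50/150 ppm fuzzy breakpoints are assumed for illustration.

def crisp_member(cu_ppm):
    """Classical set: full member (1) if Cu > 100 ppm, else non-member (0)."""
    return 1 if cu_ppm > 100 else 0

def fuzzy_member(cu_ppm, low=50.0, high=150.0):
    """Fuzzy set: membership ramps linearly from 0 (at low) to 1 (at high)."""
    if cu_ppm <= low:
        return 0.0
    if cu_ppm >= high:
        return 1.0
    return (cu_ppm - low) / (high - low)

samples = [40, 100, 101, 125]
print([crisp_member(s) for s in samples])   # [0, 0, 1, 1]
print([fuzzy_member(s) for s in samples])   # [0.0, 0.5, 0.51, 0.75]
```

Note how the crisp function jumps abruptly at the boundary, while the fuzzy function assigns a sample at 101 ppm nearly the same membership as one at 100 ppm.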

    Figure 2: Fuzzy Set vs. Crisp Set example.

    In my next post, I will talk about how to do operations on these sets, and how to apply them to geological data.

    Citations

    Appel, K., & Haken, W. (1977). The Solution of the Four-Color-Map Problem. Scientific American, 237(4), 108–121. https://doi.org/10.1038/scientificamerican1077-108

    Zadeh, L. A. (1965). Fuzzy sets. Information and Control, 8(3), 338–353. https://doi.org/10.1016/S0019-9958(65)90241-X

    What Are Heuristics in AI, and How Can They Help Me Write Better Code?

    Heuristics involve the study of methods and rules for discovery and invention (Polya, 1945). In AI, heuristics are rules for choosing branches in a state space search that are likely to lead to acceptable solutions. This can be particularly useful in two situations: problems that don’t require an exact solution due to ambiguities in the problem statement or the available data, and problems that do have an exact solution but whose state space is exceptionally large and therefore computationally expensive to search.

    Have no fear of perfection – you’ll never reach it.

    Salvador Dali

    There are a few well-developed heuristics I’d like to talk about today, along with some examples of each.

    Hill Climbing Heuristics

    One of the simplest heuristics we can use is the hill climbing heuristic. This strategy involves picking the best child and searching that, without retaining any information on parents or siblings. This heuristic tends to search deep into the state space.

    An obvious problem with this strategy is its tendency to get stuck at a local maximum (or minimum). This can often result in failing to find an optimal solution.

    Hill climbing heuristics example.
    Figure 1: In this example, searching for a maximum number, a hill climbing heuristic would choose the 10 child and stop, missing the solutions on the right side of the tree. This is the major drawback of hill climbing.

    This method is often called greedy, as it simply takes the best option at the time, ignoring possible future gains. However, this technique can be optimal in some situations, typically when moves in the state space can’t be undone. Think tic-tac-toe, not chess. We can add a heuristic to a tic-tac-toe game that picks the child that leads to the most possible wins. For example, the first move X would make is in the center: the center leads to 4 possible wins (2 diagonals, the middle row, and the middle column), and no other move leads to 4 or more wins.
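The local-maximum trap from Figure 1 can be sketched as a few lines of Python. The tree and node values here are hypothetical, chosen so that the greedy choice hides a better solution one level deeper.

```python
# Hill climbing on a small tree: always move to the best child,
# keeping no memory of parents or siblings.

tree = {'root': ['a', 'b'], 'a': [], 'b': ['c'], 'c': []}
value = {'root': 5, 'a': 10, 'b': 8, 'c': 20}

def hill_climb(node):
    """Repeatedly move to the best child; stop when no child improves."""
    while True:
        children = tree[node]
        if not children:
            return node
        best = max(children, key=lambda c: value[c])
        if value[best] <= value[node]:
            return node          # local maximum: no child is better
        node = best

print(hill_climb('root'))  # 'a' -- greedily takes 10 and misses the 20 under 'b'
```

The search commits to the 10 child and never sees the 20 hiding under the weaker-looking sibling, which is exactly the failure mode described above.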

    Tic-tac-toe state space representation. First tree and elimination of nodes by heuristics.
    Figure 2: Showing the first two levels of the state space for tic-tac-toe. The “O” player can eliminate half the possible moves by using the heuristic.

    Dynamic Programming

    Another creative way to reduce the state space is to break the problem down into subproblems and reuse the solutions of subproblems that have already been solved. This technique is often called memoization (as in memo, not memorize!).

    Dynamic programming is used to solve a large number of problems. Let’s think of a problem where we need to find the longest common subsequence (LCS). For example, we have the arrays (1,0,0,1,0,1,0,1) and (0,1,0,1,1,0,1,1,0), and we want to find the longest sequence of common bits. An example answer would be (1,0,1,0) from (1,0,0,1,0,1,0,1) and (0,1,0,1,1,0,1,1,0). But is it the longest?

    We will first build a table.

    Start of dynamic programming matrix.
    Figure 3: Start of LCS table.

    This table was calculated row-wise from top to bottom. Row i = 0 and column j = 0 are trivial and filled with 0 (if either array is empty, the LCS is obviously 0).

    The first comparison is between X1 = 1 and Y1 = 0, and these don’t match. Both the cell above and the cell to the left are 0, so the symbols (←↑) are inserted, indicating either path can be taken, and the current LCS length remains the same.

    Partially finished dynamic programming matrix.
    Figure 4: Filling out the table for LCS.

    The second comparison is between X1 = 1, and Y2 = 1, and these do match. We add the (↖) symbol, to indicate we transition both row and column. The cell the arrow points to is 0, so we add 1 to that and our new LCS is length 1, with the element (1).

    We continue this to fill out the table. Once the table is filled we can trace back to find the LCS. In the table we found the LCS to be length 6, with the elements (1,0,0,1,1,0). Note this is just one example of a possible sequence; (1,0,1,0,1,1) is another valid longest common subsequence, but the length always remains 6 for any LCS.

    The matrix showing dynamic programing and the backtracking algorithm.
    Figure 5: Completed LCS table.
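The table-filling and backtracking procedure above can be sketched directly in Python, using the two bit arrays from the text. This follows the textbook DP recurrence; it recovers one valid LCS, which may differ from the one traced in the figures.

```python
# Longest common subsequence via dynamic programming.

def lcs(x, y):
    """Build the DP table bottom-up, then backtrack for one LCS."""
    m, n = len(x), len(y)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1    # the diagonal case
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])  # up or left
    # Backtrack from the bottom-right corner to recover one subsequence.
    seq, i, j = [], m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            seq.append(x[i - 1]); i -= 1; j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return list(reversed(seq))

result = lcs((1, 0, 0, 1, 0, 1, 0, 1), (0, 1, 0, 1, 1, 0, 1, 1, 0))
print(len(result))  # 6
```

Running this confirms the length-6 answer found by hand in the table.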

    Best-First Search

    Best-first search allows for recovery from dead ends and local maxima and minima through the use of a priority queue. This priority queue is an ordered list that places the most promising paths at the top of the list. Dijkstra’s algorithm, also known as the SPF (Shortest Path First) algorithm, is the textbook example.

    The algorithm works by maintaining a list of explored nodes. The starting node’s children are examined, and the algorithm simply chooses to go to the “closest” child and explore that node’s children. As it does this a list is maintained, and the algorithm always chooses the shortest path so far. It continues iteratively until the goal node is found. Because this algorithm always chooses the best option first, it is also considered a greedy algorithm.
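The priority-queue idea can be sketched with Python’s heapq. The small weighted graph here is hypothetical; the point is that the queue always surfaces the shortest known path first.

```python
# Dijkstra's algorithm with a priority queue (heapq).
import heapq

graph = {
    'A': {'B': 2, 'C': 5},
    'B': {'C': 1, 'D': 4},
    'C': {'D': 1},
    'D': {},
}

def dijkstra(start, goal):
    """Always expand the node with the shortest known distance first."""
    queue = [(0, start)]          # (distance so far, node)
    visited = set()
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            return dist
        if node in visited:
            continue
        visited.add(node)
        for child, weight in graph[node].items():
            heapq.heappush(queue, (dist + weight, child))
    return None

print(dijkstra('A', 'D'))  # 4 (A -> B -> C -> D: 2 + 1 + 1)
```

Even though the direct edge B→D exists with cost 4, the queue lets the search back out of that path once the cheaper route through C is discovered.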

    A* Heuristics

    This approach is guaranteed to find the shortest path. However, some problems in this domain are NP-hard, and a heuristic would help. Imagine you want to develop an algorithm that finds the best route from one location in a city to another. In a large city with multiple gridded roads, the algorithm would easily get lost in the multiple paths. The A* heuristic informs the algorithm by including information like the Euclidean or Manhattan distance to the goal node. By including this information, the algorithm can prioritize paths that are shorter and paths that decrease the distance to the goal node.
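The gridded-city idea can be sketched as A* on a small grid, with the Manhattan distance to the goal as the heuristic. The grid and obstacle layout are hypothetical.

```python
# A* search on a grid: f(n) = g(n) + h(n), where g is the path cost so
# far and h is the Manhattan distance to the goal.
import heapq

def a_star(grid, start, goal):
    def h(pos):
        return abs(pos[0] - goal[0]) + abs(pos[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    queue = [(h(start), 0, start)]      # (f, g, position)
    seen = set()
    while queue:
        f, g, pos = heapq.heappop(queue)
        if pos == goal:
            return g
        if pos in seen:
            continue
        seen.add(pos)
        r, c = pos
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                heapq.heappush(queue, (g + 1 + h((nr, nc)), g + 1, (nr, nc)))
    return None

grid = [[0, 0, 0],
        [1, 1, 0],    # 1 = blocked road
        [0, 0, 0]]
print(a_star(grid, (0, 0), (2, 0)))  # 6: the route around the obstacle
```

Because the Manhattan heuristic never overestimates the remaining distance, the search still finds the shortest route, but it expands far fewer states than a blind search would.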

    A Common Lisp implementation of this can be seen here.

    Thus ends my discussions specific to heuristics and search algorithms. I truly hope that you consider the abstract representations of these problems when solving real-world examples. Abstraction is the core of computer science.

    How to Design an Effective State Space Search

    After determining the state space for a problem, we need to design a state space search algorithm to find solutions. The importance of decisions regarding how to search the state space for AI problem solvers can’t be overstated. Most problems that AI tools are useful in solving are very complex, and their state spaces are therefore exponentially large. Two approaches could be used: we could start with our data and work towards our goal, or we could define our goal and find out what states lead to that goal. We also need to decide in what order to search the graph. This leads us to our discussion of data-driven search, goal-driven search, breadth-first search, and depth-first search (Luger & Stubblefield, 1998).

    Two roads diverged in a yellow wood,
    And sorry I could not travel both
    And be one traveler, long I stood
    And looked down one as far as I could
    To where it bent in the undergrowth;
    Then took the other…

    Robert Frost, “The Road Not Taken”

    Data-driven search

    In data-driven search, also called forward chaining, we start with the data and a set of logical rules for changing states. We apply these rules to create new facts, and iteratively continue until we find a solution or fail.

    Data-driven search is most appropriate when:

    1. Most of the data is given in the statement of the problem. This is usual when the problem involves high level interpretations.
    2. If there are many potential goals, but only a few ways to use the data.
    3. If it is difficult to define the goal. It may not be possible to start at a goal if you aren’t able to clearly define it.
    Graph showing a data-driven state space search.
    Figure 1: Data-Driven Search.

    An example could be a person claiming to be a distant relative, with a common ancestor of unknown name at most eight generations back. A program that reads sonar readings and interprets them as either small submarines or marine wildlife is another example. A final example is an expert system that classifies plants by family, genus, species, etc.
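The forward-chaining loop described above can be sketched in a few lines. The facts and rules here are hypothetical, loosely modeled on the sonar-interpretation example.

```python
# Data-driven (forward-chaining) search: start from known facts and
# repeatedly apply rules until nothing new can be inferred.

facts = {'sonar_contact', 'slow_moving', 'shallow_depth'}
rules = [
    ({'sonar_contact', 'slow_moving'}, 'biological'),
    ({'biological', 'shallow_depth'}, 'marine_wildlife'),
]

changed = True
while changed:
    changed = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)       # new fact derived from the data
            changed = True

print('marine_wildlife' in facts)  # True
```

Notice the direction of the search: the rules fire from the data forward, and the conclusion emerges at the end rather than being assumed at the start.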

    Goal-driven search

    Conversely, we could start with our goal and determine which rules we need to be true in order to arrive at the goal. We create subgoals this way, working backwards until we arrive at the data we have defined for our problem. Goal-driven search is most appropriate when:

    1. If the goal is given or easily defined. Theorem provers tend to have a structure like this. The problem solver gives a solution, and then tries to prove it.
    2. If a large number of rules match the problem, leading to a large number of potential goals. If we choose a goal first, the state space trimmed away is much larger, making this a more efficient strategy.
    3. If the problem data isn’t given, we have no choice but to use a goal-driven approach. This approach is great for providing a guide to effective data gathering.
    Graph showing a goal-driven state space search.
    Figure 2: Goal-Driven Search.

    As a goal-driven example: if the same person from the data-driven example instead gives you the name of the common ancestor, then verifying the relationship is goal-driven. Diagnosing automobile problems and theorem provers for plane geometry are two more examples.

    Depth-first search

    Search direction is only one way we can design the state space search algorithm. Another way to enhance the performance is through the order in which child nodes are examined.

    Depth-first search goes deep into the search space whenever possible. It ignores siblings of child nodes and “pushes forward”. This can be very good when the solution path is long. However, it can get lost deep in the graph, and it can also find very long paths when shorter paths exist. It works well when there are many branches.

    graph showing depth-first state space search
    Figure 3: Depth-first search.

    Using our above examples, both the ancestor searches, the theorem prover, and the expert system for identifying plants are good examples where depth-first searches will be appropriate.

    Breadth-first

    A breadth-first search travels level by level in a graph. Because it examines all children of a parent as it visits them, it is guaranteed to find the shortest path.

    These searches are best when the solution path is short or known. However, they start to perform poorly with a high branching factor and long solution paths.

    graph showing breadth-first state space search.
    Figure 4: Breadth-first search.

    Our automobile problem checker and our sonar reader from above are good examples where breadth-first search is best.

    State Space Search of the Farmer, Wolf, Cabbage, Goat Problem

    Finally, I’ll present a real solution to a common problem used in graph search. A farmer is traveling with his wolf, cabbage, and goat. He gets to a river that he needs to cross, but there’s a problem: the boat only carries two things (one being the farmer). If the wolf is left alone with the goat, the wolf will eat the goat. If the goat is left alone with the cabbage, the goat will eat the cabbage.

    The solution involves bringing back “something” on a return trip, however I intend just to show the solution(s), in our state graph.

    graph representation of Farmer, Wolf, Goat, Cabbage problem.
    Figure 5: Graph representation of Farmer, Wolf, Goat, Cabbage problem.

    In this problem, with a breadth-first search we would go level by level, and both paths from A to J would be discovered. If one of these paths was shorter than the other, we would find that path and therefore find an optimal solution. Because this problem has a known solution, few branches, and a short solution path, a breadth-first search is a reasonable approach. If each state has an average of B children, B^n states will be on each level n.

    However, a depth-first search would end up being a more efficient solution in this case. As both solutions require 7 edges, finding only one solution is necessary. The longer the solution path becomes, the more appropriate a depth-first approach is. However, this approach can result in getting lost in deep paths, and won’t ensure the shortest path. It is useful when the graph has many branches, with the space used being linear with the length of the path: if there are B children, then B × n states are examined to go n levels deep.
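The breadth-first approach can be sketched concretely. Each state records which bank (0 or 1) the farmer, wolf, goat, and cabbage are on; the search confirms the 7-crossing solution from the state graph.

```python
# Breadth-first search of the farmer/wolf/goat/cabbage state space.
from collections import deque

def safe(state):
    farmer, wolf, goat, cabbage = state
    if wolf == goat and farmer != goat:        # wolf eats goat
        return False
    if goat == cabbage and farmer != goat:     # goat eats cabbage
        return False
    return True

def solve():
    start, goal = (0, 0, 0, 0), (1, 1, 1, 1)
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        state, moves = queue.popleft()
        if state == goal:
            return moves
        farmer = state[0]
        # The farmer crosses alone (i is None) or with one passenger.
        for i in (None, 1, 2, 3):
            if i is not None and state[i] != farmer:
                continue                        # passenger must be with the farmer
            nxt = list(state)
            nxt[0] = 1 - farmer
            if i is not None:
                nxt[i] = 1 - state[i]
            nxt = tuple(nxt)
            if safe(nxt) and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, moves + 1))
    return None

print(solve())  # 7 crossings
```

Because BFS explores level by level, the first time the goal state is dequeued we know no shorter sequence of crossings exists.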

    This concludes my discussion on graph theory and state space search. Next on the docket is heuristics and how to make our search algorithms even more efficient! If you’re curious about how I made the graphs, check out this link!

    State Space Search and a Simple Introduction to Graph Theory

    Previously I’ve talked about predicate calculus and inference rules to find solutions to problems in artificial intelligence. These rules define the space that is searched when trying to solve a problem through AI. This space can quickly become very large in complex problems, and we need efficient ways to search it. We may find that no solution exists, or the problem solver can end up in an infinite loop. Even if we find a solution, are we sure it’s the optimal one? How long will it take for the solver to find a solution? State-space search theory is what we use to tackle this problem. We will use a state-space graph and graph theory to analyze the complexity of the problem and design the best way to solve it.

    In order to cope, an organism must either armor itself (like a tree or a clam) and “hope for the best”, or else develop methods for getting out of harm’s way and into the better neighborhoods of the vicinity. If you follow this latter course, you are confronted with the primordial problem that every agent must continually solve: now what do I do?

    Daniel C. Dennett, “Consciousness Explained”

    Graph Theory

    Firstly, we need to define what a graph is. Simply put, a graph is a set of nodes connected by a set of edges or links. We can define this in mathematical notation the following way (Luger & Stubblefield, 1998).

    Set of nodes : N1 … Nn
    Set of arcs : (N2,N3), (N3, N4), … (Nm, Nn)

    In our use of graph theory, each node in our graph represents a state of the world in our problem, and the edges represent the transition between these states.

    A graph can also be directed. In a directed graph, the edges have directions assigned to them. For example, edge(A,B) describes a state transition from A to B. However, unless edge(B,A) exists, the problem solver cannot go from state B to A.

    A directed graph showing nodes A, B, and C with edges (A,B),(B,C), and (C,B).
    Figure 1: Directed graph.

    A path is a sequence of nodes through successive edges. In Figure 1, we could represent a path from A to C as [A,B,C].

    A rooted graph has one node which all other nodes are connected to. The root node has no parent, and all other nodes are its children.

    A rooted graph with A as the root, and B to F as children.
    Figure 2: Rooted graph.

    A tree is a graph with unique paths between every pair of nodes. That is to say, there is at most one path between two nodes. Figure 1 cannot be a tree because two paths occur between B and C. However, Figure 2 is a tree, and most trees also contain a root. The requirement of at most one path between nodes also leads to trees having no loops.

    Using Graph Theory to Build Finite State Machines

    Our discussion on graphs allows us to build abstract models for computational devices. These models represent automata for traversing paths in a graph. We call this abstraction a finite state machine (FSM). These are special graphs that are directed, finite, and connected.

    A FSM can be described as a finite set of states (S), a finite set of input values (I), and a function (F) that defines the transition between states.

    Finite State Machine for flipping states.
    Figure 3: Finite State Machine (FSM).

          0    1
    S0    S0   S1
    S1    S1   S0

    Table 1: Transition table for the FSM in Figure 3. Input is on the top row, and states are in the first column. The transition states are at the intersections.

    The example above shows a very simple FSM. The transition function works as a flip: if the input is 0, no change in state is made, but if the input is 1, the state flips.
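The flip FSM from Figure 3 and Table 1 can be sketched as a transition dictionary, which mirrors the table row-for-row:

```python
# The flip FSM: input 0 keeps the state, input 1 flips it.

transitions = {
    ('S0', 0): 'S0', ('S0', 1): 'S1',
    ('S1', 0): 'S1', ('S1', 1): 'S0',
}

def run(inputs, state='S0'):
    """Feed an input stream through the machine and return the final state."""
    for bit in inputs:
        state = transitions[(state, bit)]
    return state

print(run([1, 0, 1, 1]))  # 'S1': three 1s means three flips from S0
```

Encoding the transition function as a lookup table keeps the machine definition and the traversal logic cleanly separated, just as in the formal (S, I, F) description.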

    Finite State Acceptor Machines (Moore Machines)

    A special type of FSM is presented below. In this example, we have a starting state, S0, and an accepting state Sn. This machine will recognize a specific input stream. Here are a few examples.

    Moore machines for binary digits.
    Figure 4: The first finite state acceptor identifies a stream of binary digits “111”; S3 is only reached with that input stream. The second finite state acceptor accepts only three consecutive “1”s.
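A sketch of one plausible reading of the first acceptor: the state counts consecutive 1s, and the accepting state S3 is reached (and held) once three consecutive 1s appear in the stream.

```python
# A finite state acceptor that reaches the accepting state S3 after
# three consecutive 1s and stays there.

def accepts(bits):
    state = 0                      # state n = "n consecutive 1s seen"
    for b in bits:
        if state == 3:
            break                  # S3 is absorbing once reached
        state = state + 1 if b == 1 else 0
    return state == 3

print(accepts([1, 1, 1]))        # True
print(accepts([1, 0, 1, 1, 1]))  # True
print(accepts([1, 1, 0, 1]))     # False
```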

    State Space Representation

    Ultimately these graphs help us map out the state space of a problem. Each node can represent a partial solution state to the problem. Edges in the graph represent the actions used to solve this problem. Let’s look at this using the Traveling Salesman Problem (TSP).

    In this problem, a salesman must travel between 4 cities and return home.

    Graph of the TSP problem.
    Figure 5: TSP problem.

    In this problem we would like to find a path starting at A, visiting each node, and returning to A. An example solution would be [A, B, D, C, A] = 14 + 29 + 28 + 20 = 91. But is this the shortest path? State space search is all about trying to find the best solution (in this case, the minimum-length path). We can represent the state search as a tree.
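With only 4 cities, the full state space can be searched by brute force. The edge weights AB = 14, BD = 29, CD = 28, and CA = 20 come from the worked example above; AD = 15 and BC = 22 are assumed values, since those two edges aren’t given in the text.

```python
# Brute-force search of the TSP state space: try every ordering of the
# intermediate cities. Weights AD and BC are assumed for illustration.
from itertools import permutations

weights = {
    frozenset('AB'): 14, frozenset('BD'): 29, frozenset('CD'): 28,
    frozenset('CA'): 20, frozenset('AD'): 15, frozenset('BC'): 22,
}

def tour_length(tour):
    return sum(weights[frozenset(a + b)] for a, b in zip(tour, tour[1:]))

best = min((('A',) + p + ('A',) for p in permutations('BCD')),
           key=tour_length)
print(best, tour_length(best))
```

With the assumed weights, the search finds a tour of length 79, beating the length-91 example tour. With n cities there are (n−1)! tours to check, which is exactly why the state space tree in Figure 6 grows so quickly.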

    Tree graph of the TSP problem
    Figure 6: TSP Tree, partially filled, searching all solutions to the TSP problem

    As we can see from Figure 6, the state space for this problem can grow very quickly with a large number of nodes. Graph theory helps us visualize this and, even more importantly, come up with more efficient ways to search state spaces for potential solutions. Those will be addressed in my next blog. Until then, stay tuned!

    How to use Logical Inference and Unification in AI

    The previous posts on propositional calculus (or zeroth-order logic) and predicate calculus (first-order logic) build towards the final and practical application of these logical systems in automated reasoning for artificial intelligence. Through logical inference and unification, we will work through a real-world example: a financial advisor decision-making system.

    We come to the full possession of our power of drawing inferences, the last of all our faculties; for it is not so much a natural gift as a long and difficult art.

    C. S. Peirce

    Logical Inference

    Logical inference is a formal theory that infers new correct expressions from a set of true assertions. These new assertions will be consistent with all previous statements and interpretations. Inference rules provide the computational mechanism through which we can apply the concept that a conclusion “logically follows” from a premise.

    A basic example is modus ponens. The inference rule for modus ponens is

    P → Q
    P
    ∴ Q

    That is, if P and P → Q are true, then modus ponens infers Q is also true.

    More inference rules, like modus tollens, along with examples, can be found at the Wikipedia page here.

    Unification

    In order to apply these rules of inference we will need to be able to determine when two expressions match. This is trivial in propositional calculus, but the presence of variables in predicate calculus makes it more complicated. Unification is the algorithm that determines what substitutions we can make to predicate calculus expressions in order to make them match.

    First we check that the predicate symbols match, then the number of arguments the predicates have, and then the actual arguments. These arguments are either variables or constants. For example,

    love(X, Y) and love(eleanor, chidi)
    We can list substitutions and unify as {eleanor/X, chidi/Y}.

    When unifying we have three rules:

    • a variable can be unified with an atom, a variable, or a term
    • two atoms can be unified if they are identical
    • two terms can be unified if their identifiers and number of arguments are the same and their arguments unify

    Here are some more examples:

    p(X,Y) and p(a,Z)
    'X' is a variable, and 'a' is an atom.
    'Y' and 'Z' are variables.
    Therefore it unifies with {a/X, Y/Z}.
    p(X,X) and p(a,b)
    'X' is a variable, but it cannot be substituted by both 'a' and 'b' atoms, therefore it does not unify.
    ancestor(X,Y) and ancestor(bill,father(bill))
    'ancestor' and 'ancestor' unify and return the empty set {}
    'X' and 'bill' unify as {bill/X}
    'Y' and father(bill) unifies as {father(bill)/Y}
    Therefore it unifies to {bill/X,father(bill)/Y}
    ancestor(X,father(X)) and ancestor(david,george)
    'ancestor' and 'ancestor' match.
    'X' and 'david' unify as {david/X}.
    Applying that substitution leaves father(david) and 'george': a term and an atom with different identifiers, which cannot be unified, even if george really is the father of david.
    Therefore these expressions do not unify.
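    These steps can be sketched in code. Below is a minimal Python version, where I represent variables as capitalized strings and compound terms as tuples; a real implementation would also need an occurs check and more careful substitution composition:

    ```python
    # Minimal unification sketch. Variables are strings starting with an
    # uppercase letter; compound terms are tuples like ("father", "bill");
    # everything else is a constant. Illustrative only: no occurs check.

    def is_variable(t):
        return isinstance(t, str) and t[:1].isupper()

    def substitute(t, subs):
        """Apply a substitution map to a term."""
        if is_variable(t):
            return subs.get(t, t)
        if isinstance(t, tuple):
            return tuple(substitute(a, subs) for a in t)
        return t

    def unify(t1, t2, subs=None):
        """Return a substitution unifying t1 and t2, or None on failure."""
        if subs is None:
            subs = {}
        t1, t2 = substitute(t1, subs), substitute(t2, subs)
        if t1 == t2:
            return subs
        if is_variable(t1):
            return {**subs, t1: t2}
        if is_variable(t2):
            return {**subs, t2: t1}
        if isinstance(t1, tuple) and isinstance(t2, tuple) and len(t1) == len(t2):
            for a, b in zip(t1, t2):
                subs = unify(a, b, subs)
                if subs is None:
                    return None
            return subs
        return None  # mismatched constants or identifiers

    print(unify(("love", "X", "Y"), ("love", "eleanor", "chidi")))
    # {'X': 'eleanor', 'Y': 'chidi'}
    print(unify(("p", "X", "X"), ("p", "a", "b")))  # None: X can't be both a and b
    print(unify(("ancestor", "X", "Y"), ("ancestor", "bill", ("father", "bill"))))
    # {'X': 'bill', 'Y': ('father', 'bill')}
    print(unify(("ancestor", "X", ("father", "X")), ("ancestor", "david", "george")))
    # None: father(david) cannot unify with the atom george
    ```
    
    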

    A logic-based financial advisor

    This example is part of the exercises found in Lugers text[efn_note](Luger, George F.; Stubblefield, William A, Artificial intelligence: structures and strategies for complex problem solving 1998)[/efn_note].

    We are going to build a financial advisor that decides how someone should split their money between investments and savings. Specifically, the advisor should make its recommendation according to the following criteria.

    • Individuals with inadequate savings should make increasing their savings the top priority, regardless of their income.
    • Individuals with adequate savings and an adequate income should consider riskier investments in the stock market.
    • Individuals with adequate savings but a lower income may consider splitting their surplus between investments and savings.

    From this we can create the first three predicates for our logical system:

    1.  savings_account(inadequate) → investment(savings)
    2.  savings_account(adequate)⋀income(adequate) → investment(stocks)
    3.  savings_account(adequate)⋀income(inadequate) → investment(combination)

    We now need to determine when savings and income are considered adequate. To do this we create the function minsavings which takes one argument, the number of dependents, and returns 5000 times that argument. minsavings(X) ≡ 5000 * X. Now we have two more predicates for our logical system.

    4.  ∀ X amount_saved(X) ⋀ ∃ Y (dependents(Y) ⋀ greater(X, minsavings(Y))) → savings_account(adequate)
    5.  ∀ X amount_saved(X) ⋀ ∃ Y (dependents(Y) ⋀ ¬ greater(X, minsavings(Y))) → savings_account(inadequate)

    Similarly, we need a minincome function. We will define this function as minincome(X) ≡ 15000 + (4000 * X). For an investor's income to be considered adequate it must be above this minimum income and also steady. This adds three more sentences to our logical system.

    6.  ∀ X earnings(X, steady) ⋀ ∃ Y (dependents(Y) ⋀ greater(X, minincome(Y))) → income(adequate)
    7.  ∀ X earnings(X, steady) ⋀ ∃ Y (dependents(Y) ⋀ ¬ greater(X, minincome(Y))) → income(inadequate)
    8.  ∀ X earnings(X, unsteady) → income(inadequate)

    For our example, Jane Doe has four dependents, a steady income of $30,000 and $15,000 in her savings account. This adds our final three predicate calculus sentences.

    9.  amount_saved(15000)
    10. earnings(30000,steady)
    11. dependents(4)

    Now for the fun part! Let’s unify! First we can unify the conjunction of 10 and 11 with the first two components of the premise of 7.

    earnings(30000, steady) ⋀ dependents(4) unifies with:
    earnings(X, steady) ⋀ dependents(Y) under substitution {30000/X, 4/Y} yielding:
    
    earnings(30000, steady) ⋀ dependents(4) ⋀ ¬ greater(30000, minincome(4)) → income(inadequate)
    
    earnings(30000, steady) ⋀ dependents(4) ⋀ ¬ greater(30000, 31000) → income(inadequate)

    All three components of the premise are true: by 10, by 11, and by the mathematical definition of greater (30000 is not greater than 31000). Their conjunction is therefore true, and modus ponens lets us add another assertion

    12. income(inadequate)
    
    Similarly, amount_saved(15000) ⋀ dependents(4) unifies with the first two elements of assertion 5 under the substitution {15000/X, 4/Y} yielding:
    
    amount_saved(15000) ⋀ dependents(4) ⋀ ¬ greater(15000, minsavings(4)) → savings_account(inadequate)
    
    amount_saved(15000) ⋀ dependents(4) ⋀ ¬ greater(15000, 20000) → savings_account(inadequate)

    Given that all three components of the premise are true (15000 is not greater than 20000), we have another assertion

    13. savings_account(inadequate)

    As a consequence of assertion 13, implication 1, and modus ponens,

    14. investment(savings)

    Our automated logical reasoning system suggests that Jane Doe’s surplus income should go into her savings account. This example shows how we can automatically reason and draw conclusions by applying these inference rules in a programmatic fashion.
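    To see the whole derivation end-to-end, we can encode the advisor directly in Python. This is only a sketch, not a real inference engine; the structure mirrors the eleven predicate calculus sentences above:

    ```python
    # A sketch of the financial advisor's reasoning in plain Python.
    # The tests mirror sentences 1-8; the arguments are the facts 9-11.

    def minsavings(dependents):
        return 5000 * dependents

    def minincome(dependents):
        return 15000 + 4000 * dependents

    def advise(amount_saved, earnings, steady, dependents):
        # Sentences 4-5: is the savings account adequate?
        savings_adequate = amount_saved > minsavings(dependents)
        # Sentences 6-8: is the income adequate? Unsteady income never is.
        income_adequate = steady and earnings > minincome(dependents)
        # Sentences 1-3: the investment recommendation.
        if not savings_adequate:
            return "savings"
        if income_adequate:
            return "stocks"
        return "combination"

    # Jane Doe: $15,000 saved, steady $30,000 income, four dependents.
    print(advise(15000, 30000, True, 4))  # savings
    ```

    A real logic-based system would derive this by unification and modus ponens, as in the hand derivation; collapsing the rules into if-statements is just the most direct way to check the conclusion.
    
    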

    Predicate Calculus: Its Simple Syntax, and Semantics

    In my previous post I talked about propositional calculus. Today we will talk about predicate calculus and its syntax and semantics. Predicate calculus builds on this concept by allowing us to describe the relationships in our logical assertions. We could have a statement like “it snowed on Saturday” (it did… it sucks), and represent that as

    weather(saturday, snow)

    We can now add variables and start to generalize. For example if we use X as a variable for the day of the week, we can say

    weather(X, snow) is true

    to say that it snows every day of the week (I hope not!). As usual, we first need to define the words in our language, that is the syntax.

    The essential quality of a proof is to compel belief.

    Fermat

    Predicate Calculus Syntax and Semantics

    In predicate calculus we use improper symbols to describe objects and their relationship in the world. These are parentheses “( )”, commas “,”, and periods “.”.

    We define the symbols as strings of letters and digits. No spaces or special characters can be used, although the underscore “_” is allowed for readability. That is, the valid characters are

    1. All letters, upper- and lowercase.
    2. All digits from 0 to 9.
    3. The underscore, “_”.

    And all symbols must start with a letter. Some correct examples are

    Name   coffee3   stan_and_kyle   emma   XYZ   likes   partner_of

    And some invalid examples are

    7John   stan and kyle   ta%ble   ###123   lookout!!!

    These symbols can represent variables, constants, functions, or predicates.

    Variables represent general classes of objects. Any symbol beginning with a capital letter is a variable.

    X   Name   Location

    Constants are specific objects in the world, and start with a lowercase.

    emma   small   pink     
    note: true and false are reserved as truth symbols

    Functions also start with a lowercase letter, and map elements of one set onto elements of another. In this writer’s world, the function

    aunt(emma)

    evaluates to nancy. Functions also have an arity: the number of arguments they take. In the previous example the arity is 1. An example with an arity of 2 is

    subtract(nine, two)

    which evaluates to seven.

    A predicate term is a constant, variable, or function expression of a specific arity. We use these terms to describe the problem domain. This leads to another definition: the atomic sentence, a predicate of arity N followed by N terms in parentheses, separated by commas. It is the most primitive unit of predicate calculus. An example of an atomic sentence is

    loves(aunt(emma), partner_of(nancy))

    Now that we have our basic unit, we can build predicate sentences. First some definitions of symbols

    • ∧ : logical and
    • ∨ : logical or
    • ¬ : logical not
    • → : logical implies
    • ≡ : logical equivalent to
    • ∀ : universal quantifier (reads “sentence is true for all values”)
    • ∃ : existential quantifier (reads “sentence is true for at least one value”)

    To end our discussion on syntax I’ll show another example[efn_note](Luger, George F.; Stubblefield, William A, Artificial intelligence: structures and strategies for complex problem solving 1998)[/efn_note].

    mother(eve, abel)
    mother(eve, cain)
    father(adam, abel)
    father(adam, cain)
    
    ∀ X ∀ Y father(X, Y) ∨ mother(X, Y) → parent(X, Y)
    ∀ X ∀ Y ∀ Z parent(X, Y) ∧ parent(X, Z) → sibling(Y, Z)

    The first 4 lines describe the world. The last two sentences define rules of the world. The first states that for all X and Y, if X is the father of Y or the mother of Y, then X is a parent of Y. The second states that for all X, Y, and Z, if X is a parent of Y and X is a parent of Z, then Y is a sibling of Z. (Note that, as written, nothing requires Y and Z to be different, so this rule also makes everyone their own sibling.) If this still confuses you, I encourage you to plug in the variables and see how that looks on paper.
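    Plugging in the variables is exactly the kind of mechanical work a computer is good at. Here is a small Python sketch that applies the two rules to the four facts by brute-force enumeration (a toy, not a real theorem prover):

    ```python
    # Facts describing the world, as (relation, arg1, arg2) tuples.
    facts = {
        ("mother", "eve", "abel"), ("mother", "eve", "cain"),
        ("father", "adam", "abel"), ("father", "adam", "cain"),
    }

    # Rule 1: father(X, Y) or mother(X, Y) implies parent(X, Y).
    parents = {("parent", x, y)
               for (rel, x, y) in facts if rel in ("father", "mother")}

    # Rule 2: parent(X, Y) and parent(X, Z) implies sibling(Y, Z).
    # Note this also derives sibling(abel, abel): the rule as written
    # permits Y = Z, so everyone is their own sibling.
    siblings = {("sibling", y, z)
                for (_, x1, y) in parents
                for (_, x2, z) in parents
                if x1 == x2}

    print(sorted(siblings))
    ```
    
    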

    If you want more check out the MIT OpenCourseWare video here!

    Propositional Calculus – A Quick Artificial Intelligence primer

    Artificial Intelligence includes much more than the machine learning I’ve talked about before here, and here. I thought I’d start talking about this by introducing propositional calculus. Propositional calculus, also known as propositional or sentential logic, is the formal system used in computer science to reason about statements that are simply true or false. It breaks logical arguments down to their most atomic pieces.

    Logic is the anatomy of thought.

    John Locke

    Logic is the beginning of wisdom, not the end.

    Spock

    A logical argument can be broken down into propositional expressions. The most traditional example is,

    • Socrates is a human (P)
    • All humans are mortal (P → Q)
    • Therefore Socrates is mortal (Q)

    This reads: P is true, and P implies Q, therefore Q is true.

    Some examples of propositional operators that build our propositional expressions are,

    • ∧ : AND
    • ∨ : OR
    • ¬ : NOT

    Let’s do an example. How would we express something like the exclusive-or operator ⊗, using only those operators? We can start by making a truth table of what this should look like.

    P | Q | P ⊗ Q
    T | T | F
    T | F | T
    F | T | T
    F | F | F
    Truth table for the exclusive-or

    Now an exclusive-or sentence in English should say P or Q, but not both P and Q. That expression in our formal propositional style is (P ⋁ Q) ⋀ ¬(P ⋀ Q). But let’s check our work with truth tables.

    P | Q | P ⋁ Q | ¬(P ⋀ Q) | (P ⋁ Q) ⋀ ¬(P ⋀ Q)
    T | T | T | F | F
    T | F | T | T | T
    F | T | T | T | T
    F | F | F | T | F
    Truth table for (P ⋁ Q) ⋀ ¬(P ⋀ Q)

    This truth table evaluates to the same truth values and shows P⊕Q ≡ (P ⋁ Q) ⋀ ¬(P ⋀ Q).
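    We can also let Python enumerate the truth table for us. A minimal sketch using Python’s boolean operators:

    ```python
    # Verify P xor Q == (P or Q) and not (P and Q) over all four
    # truth assignments, just as the truth table does by hand.
    from itertools import product

    for p, q in product([True, False], repeat=2):
        lhs = p != q                         # exclusive-or
        rhs = (p or q) and not (p and q)     # our candidate expression
        print(p, q, lhs, rhs)
        assert lhs == rhs                    # the two columns must agree
    ```
    
    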

    Modus Ponens

    We can use truth tables to prove some classic logical arguments. Let’s start with modus ponens. Modus ponens states that if P is true and P → Q is true, then Q is true. The “Socrates is a human” argument above is an example of modus ponens. A truth table for modus ponens would be:

    P | Q | P → Q
    T | T | T
    T | F | F

    Because we state P to be true, we only need to look at these two cases. Interestingly, when P is false, P → Q evaluates to true no matter what Q is, showing that false statements imply anything!

    The truth table shows the truth values of the if… then statement P → Q. If the argument was instead,

    • Shakespeare is a human.
    • All humans are men.
    • Therefore Shakespeare is a man.

    Note that although Shakespeare was in fact a man, the argument is not sound. The premise “all humans are men” (our P → Q) is false, so the conclusion does not follow.

    Abduction

    Abduction is an inference rule that infers P from Q and P → Q. Let’s look again at the P → Q truth table.

    P | Q | P → Q
    T | T | T
    F | T | T

    This time we consider the cases where Q and P → Q are both true. We can see that there are rows where P is true and rows where P is false, so we cannot soundly infer that P is true.

    Abductive reasoning is common in science. Sherlock Holmes, though praised for his deductive skills, actually often uses abductive reasoning to reach his conclusions. As an example, look at these two arguments.

    • Organs are similar to machines in their design.
    • Machines are a product of intelligent design.
    • Therefore, organs are the product of intelligent design.

    or,

    • Test mice given water from a new industrial plant get sick.
    • People got sick after the plant was opened.
    • Therefore, the current sickness was caused by the new plant.

    Modus tollens

    Finally let’s look at modus tollens. Modus tollens, or denying the consequent, is the contrapositive form of modus ponens. It states that if Q is false and P → Q is true, then P is false. The propositional expression is ((P → Q) ⋀ ¬ Q) → ¬ P. Its truth table, for the rows where Q is false, is:

    P | Q | P → Q | ¬ P | ¬ Q | (P → Q) ⋀ ¬ Q | ((P → Q) ⋀ ¬ Q) → ¬ P
    T | F | F | F | T | F | T
    F | F | T | T | T | T | T

    Assuming Q is false, the only row in which the antecedent (P → Q) ⋀ ¬ Q holds is the one where P is also false, which is exactly what modus tollens concludes.
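    We can check by machine that the modus tollens expression holds under every truth assignment, i.e. that it is a tautology. A short Python sketch:

    ```python
    # Check ((P -> Q) and not Q) -> not P over all truth assignments.
    from itertools import product

    def implies(a, b):
        # Material implication: false only when a is true and b is false.
        return (not a) or b

    for p, q in product([True, False], repeat=2):
        assert implies(implies(p, q) and not q, not p)

    # By contrast, "affirming the consequent" ((P -> Q) and Q) -> P is NOT
    # a tautology: it fails when P is false and Q is true.
    assert not implies(implies(False, True) and True, False)

    print("modus tollens is a tautology")
    ```
    
    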

    Artificial Intelligence starts with these basic concepts. I encourage you to learn more about them and how to build and use truth tables in all your work.

    Programming Environments and an intro to Anaconda

    Most of my experience in programming so far has been on my own, or at least as the lone programmer on site. Now I am about to start working more with people that are at (or more likely beyond) my capabilities. Part of the start of this journey is going to require everyone working in the same programming environments.

    Messy desks are not programming environments
    No, this is not what I mean when I say programming environments.

    Programming, in some ways, is just like building scientific knowledge. Everything makes sense when hypotheses are built on theories, which are built on laws. But everyone has to work from and agree on the same basic concepts. Programming environments are similar. If you want to collaborate with someone who is using Python 3.6, it is best you also work in 3.6, at least for that project. If you instead contribute Python 3.8 code, some libraries may not work correctly or may stop functioning entirely.

    So I made a video, my first, to show people how to set up Python programming environments on their computer using the Anaconda distribution. It is a very popular repository where Python libraries/packages are maintained. The GUI used, Anaconda Navigator, is easy and intuitive. And best of all, the individual distribution is open sourced and free.

    Here is a link to the video! I hope you use it and get programming!

    Pause the video if you need time to catch up!

    This project was also my first introduction into making a video. I’d like to try and make more, but as this is my first, my editing is abysmal. There’s about 20 things I would have done differently. I definitely see the advantage of good video editing software, and good audio recording gear. But, every journey always begins with the first step!

    The Wilcoxon Sign Rank Test: A Geological Example

    Geological data is often not normally distributed, and parametric methods should not be used on data that is not normally distributed. In this case, I did a paired experiment, which is a simple form of blocking, where comparisons are made between similar experimental units. This blocking needs to be done prior to performing the experiment.

    The Wilcoxon signed-rank test can be used when we can’t assume a normal distribution in a paired experiment.

    We will use the fictional data below to determine if Arsenic (As) is increasing in soil samples over a 1-year period.

    Location ID | Feb 2018 (As ppm) | Feb 2019 (As ppm)
    1 | 18 | 11
    2 | 12 | 7
    3 | 17 | 16
    4 | 10 | 10
    5 | 14 | 17
    6 | 8 | 10
    7 | 9 | 15
    8 | 13 | 17
    9 | 12 | 20
    10 | 9 | 35
    Figure 1: Fictional As soil sample data.

    In this sample data set, I used a Shapiro-Wilk test for normality (α=0.05) (“Shapiro-Wilk Test Calculator”, 2020). For February 2018 the p-value is 0.457, so we fail to reject normality and treat the data as normal. For February 2019 the p-value is 0.047; this is statistically significant, so we assume the data is not normally distributed.

    As usual, we state our hypotheses: the null hypothesis is that there is no change in As levels between the two years (the median difference is zero), and the alternative is that As levels have increased.

    We set our level of significance at α=0.05, and our number of observations is n=9 (we ignore pairs with no difference, such as location 4). First, we need to calculate some new information for our test: the difference, the absolute difference, and the rank.

    Location ID | Feb 2018 (As ppm) | Feb 2019 (As ppm) | Difference | Absolute Difference | Rank
    1 | 18 | 11 | -7 | 7 | 7
    2 | 12 | 7 | -5 | 5 | 5
    3 | 17 | 16 | -1 | 1 | 1
    4 | 10 | 10 | 0 | 0 | –
    5 | 14 | 17 | 3 | 3 | 3
    6 | 8 | 10 | 2 | 2 | 2
    7 | 9 | 15 | 6 | 6 | 6
    8 | 13 | 17 | 4 | 4 | 4
    9 | 12 | 20 | 8 | 8 | 8
    10 | 9 | 35 | 26 | 26 | 9
    Figure 2: Calculations for the Wilcoxon signed-rank test. Location 4 has no difference and is excluded from ranking.

    As this is a Wilcoxon signed-rank test, we need the rank sums of the negative and positive differences, ignoring sample pairs with no difference: W₋ = 7 + 5 + 1 = 13 and W₊ = 3 + 2 + 6 + 4 + 8 + 9 = 32.

    Our Wilcoxon test statistic is the smaller of these two sums, so wstat=13. We use a table to find that the one-tailed critical value for n=9 and α=0.05 is wcrit=8 (“Wilcoxon Signed-Ranks Table”, 2020). Therefore wstat>wcrit, and we cannot reject the null hypothesis. It is possible there has been no change in As levels between the two years. If our test statistic had been less than or equal to the critical value, we would have rejected the null and concluded a statistically significant change in As values.
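    If you’d like to check the rank sums by machine, here is a small pure-Python sketch of the calculation. This fictional data has no ties among the absolute differences, so simple integer ranks suffice; real data with ties needs averaged ranks, or a library routine such as scipy.stats.wilcoxon.

    ```python
    # Recompute the Wilcoxon signed-rank sums for the data in Figure 1.
    feb_2018 = [18, 12, 17, 10, 14, 8, 9, 13, 12, 9]
    feb_2019 = [11, 7, 16, 10, 17, 10, 15, 17, 20, 35]

    # Pair up, drop zero differences (location 4), and rank by |difference|.
    diffs = [b - a for a, b in zip(feb_2018, feb_2019) if b != a]
    ranked = sorted(diffs, key=abs)
    ranks = {d: i + 1 for i, d in enumerate(ranked)}  # ok: no ties here

    w_plus = sum(ranks[d] for d in diffs if d > 0)
    w_minus = sum(ranks[d] for d in diffs if d < 0)
    w_stat = min(w_plus, w_minus)

    print(w_minus, w_plus, w_stat)  # 13 32 13
    ```
    
    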