Previous Chapter: Structured Knowledge

Programming with Atomese

In the previous chapter, we introduced Atomspace queries and a little bit about the execution model with cog-execute!. Now we’ll go deeper into some more of the program-flow constructs that allow Atomese to behave like a complete programming language.

In this upcoming chapter, we’ll deal with Atomese approaches to conditionals, code factoring (i.e. functions), and looping.

If you are steeped in procedural programming, like I am, there are things about the Atomspace execution model that will require you to turn your brain inside-out. On the other hand, if you come from a Lambda Calculus or Functional Programming background then, lucky you! This next parts will be a lot easier to wrap your head around.

The Atomspace is different from procedural programming languages insofar as there isn’t a program counter as I typically understand it. You can’t “follow” the execution in a sequential fashion the way you might be able to in other languages. Under the hood, obviously, it’s software running on a microprocessor so there has to be sequential instruction-flow at some level, but it’s abstracted away from you and trying to follow it is counter-productive to understanding how to effectively use the Atomspace.

Conditional Expressions

In the previous chapter, we used a MeetLink and QueryLink to query the “weight_in_kg” of “Fido the Dog”, and used a different query to find all dogs (actually all atoms) heavier than 10kg. But how can we cause a different action to be taken, depending on whether or not Fido is heavier than 10kg?

More generally, how do we compose a conditional expression in Atomese?

Let’s start with a simpler conditional. We’ll use a CondLink like this:

(cog-execute!
    (CondLink
        (GreaterThan
            (Number 2)
            (Number 1)
        )
        (Concept "Yes")
        (Concept "No")
    )
)

When you execute the Scheme snippet above, you will see (ConceptNode “Yes”) because 2 is indeed greater than 1. Simple enough. CondLink takes either 2 or 3 atoms as arguments.

The first is the conditional predicate atom. Something that will evaluate to true or false. There is a lot more to say about this later. For now, just remember that CondLink predicates must be 100% true or they will be considered false.

Argument 2 is the consequent atom, or as I like to think of it, the expression after the then keyword in other languages. Optionally, for argument 3, you can supply a default atom, which is basically the else expression, to be executed if the conditional evaluates to false.

Building on that, let’s compare Fido’s weight rather than just comparing some constants. First we need to bring Fido back into the Atomspace (assuming you’ve cleared things out since last chapter’s exercises).

(define fidos_weight_link
    (ListLink
        (Concept "Fido the Dog")
        (Predicate "weight_in_kg")
    )
)

(StateLink
    fidos_weight_link
    (NumberNode 12.5)
)

Now we need a predicate that will query for Fido’s weight, and evaluate to true if he’s heavier than 10kg.

(define fido_is_big?
    (SatisfactionLink
        (AndLink
            (StateLink
                fidos_weight_link
                (VariableNode "dogs_weight_node")
            )
            (GreaterThan
                (VariableNode "dogs_weight_node")
                (Number 10)
            )
        )
    )
)

Earlier I promised I wouldn’t drop a new atom or other construct on you without at least attempting to demystify it. SatisfactionLink is yet another query link type. Fundamentally it’s just like MeetLink, GetLink, QueryLink, and BindLink.

The main feature that sets SatisfactionLink apart is that it evaluates to a TruthValue. True, aka stv(1, 1), if the expression could be matched in the Atomspace, and false, aka stv(0, 1), if not. There is a lot to say about TruthValues, and we’ll get there soon. For now you can think of them as Boolean True/False or Yes/No values, just know that there is a lot more to them.

Note

SatisfactionLink is actually the basic building-block from which all of the other query link types are constructed.

Finally, let’s use our new fido_is_big? predicate in a CondLink atom.

(cog-execute!
    (CondLink
        fido_is_big?
        (Concept "Yes")
        (Concept "No")
    )
)

Executing that should get you a resounding (ConceptNode “Yes”)!

Defines, Schemas, Lambdas and Functions in Atomese

Up until now, we’ve been using Scheme’s (define) mechanism to as a way to get a symbolic reference to an atom we intend to use later. But this mechanism has some limitations that we’re about to cover, not to mention its reliance on Scheme. Remember the Atomspace can theoretically be accessed from other non-Scheme environments.

In this section we’re going to introduce some code segmentation and organization primitives that are native to Atomese.

Let’s being with Atomese’s own version of define, the DefineLink. Here’s an example:

(Define (DefinedSchema "five") (NumberNode 5))

We just introduced two new atom types: DefineLink and DefinedSchemaNode. DefineLink gives a name to something else. That’s it.

The first argument to a DefineLink needs to be a “naming” node. There are 3 special “naming” node types: DefinedSchemaNode, DefinedPredicateNode, and DefinedTypeNode. We’ll cover the latter two in due course, but here we’ll focus on DefinedSchemaNode. The second atom provided to DefineLink is the definition body. In our case, it is just a simple NumberNode.

DefinedSchemaNode is the most general of the “naming” node types. A DefinedSchemaNode can give a name to any other atom.

We can now use our DefinedSchemaNode as we would use the atom it represents… Almost. This example below does exactly what you think it should do.

(cog-execute!
    (State
        (Concept "counter")
        (DefinedSchema "five")
    )
)

But without the cog-execute!, the StateLink connects (Concept “counter”) to (DefinedSchemaNode “five”), and not to (NumberNode 5). The DefinedSchemaNode will be replaced by the node it represents, but only when it is reduced by executing it. Think of it like a const in C, not like a pre-processor #define.

Basic Subroutines

DefinedSchemaNode can also be used to define subroutines. By subroutine I mean a block of code you can call, but where there are no function arguments. The expectation is that all communication between the subroutine and the caller happens by way of global program state. Subroutines are just a way to dispatch one chunk of code from a different place in the program. Most modern programming languages don’t even have subroutines because they can lead to hideous spaghetti code and functions with arguments reduce to subroutines in the case where no arguments are needed. If you’ve written assembly code by hand or worked with a very old language like Integer BASIC, you’ll certainly appreciate why subroutines are insufficient to architect a complex piece of software.

All that said, subroutines are simpler than functions so let’s start there. Here is an example:

(DefineLink
    (DefinedSchemaNode "turn_on_switch")
    (PutLink
        (State
            (Variable "switch_placeholder")
            (Concept "On")
        )
        (Concept "Global Switch")
    )
)

We can call it like this:

(cog-execute! (DefinedSchemaNode "turn_on_switch"))

That’s it. The (DefinedSchemaNode “turn_on_switch”) takes the place of the whole PutLink and all its dependent atoms.

Typed Variables and VariableLists as Arguments

LambdaLink takes a single atom to specify its argument, but what if we want to pass multiple arguments to a function? Enter the VariableList atom. A VariableList, as the name would suggest, is an ordered list to define multiple VariableNode atoms, and thus multiple function arguments.

You can use it with a LambdaLink, as in the example below, where I’ve reimplemented simple multiplication using TimesLink.

(DefineLink
    (DefinedSchema "multiply")
    (LambdaLink
        (VariableList
            (VariableNode "x")
            (VariableNode "y")
        )
        (TimesLink
            (VariableNode "x")
            (VariableNode "y")
        )
    )
)

And we use a ListLink to pass the arguments, when we want to call it.

(cog-execute!
    (ExecutionOutputLink
        (DefinedSchema "multiply")
        (List
            (Number 2)
            (Number 3)
        )
    )
)

If your list doesn’t have enough arguments to match the VariableList of the LambdaLink that you are calling, then it will cause an error. On the other hand, if you pass additional extraneous arguments they will be silently ignored.

Note

VariableList is one type of Connector. Connector atoms are used to declare which atoms may be linked with which other atoms. They are key to specifying custom grammars, which we will cover later on.

We have been using “naked” unadorned VariableNode atoms, which can map onto any other atom whatsoever, irrespective of the atom’s type. In some situations we may want to constrain the types of atoms that our function accept as arguments. This is done with a TypedVariableLink.

Here is the “multiply” function from above, but using TypedVariableLink atoms to declare that the parameters should be NumberNode atoms.

(DefineLink
    (DefinedSchema "multiply")
    (LambdaLink
        (VariableList
            (TypedVariableLink
                (VariableNode "x")
                (TypeNode "NumberNode")
            )
            (TypedVariableLink
                (VariableNode "y")
                (TypeNode "NumberNode")
            )
        )
        (TimesLink
            (VariableNode "x")
            (VariableNode "y")
        )
    )
)

TypeNode is an atom type that represents an atom type. Whoa… that’s meta. It can refer to any of the built-in atom types we’ve used so far; in addition, new types can be defined. In a later chapter we’ll get deeper into Signatures and creating new atom types.

Note

QUESTION FOR SOMEBODY SMARTER THAN ME. What is TypedVariableLink actually useful for when used in a LambdaLink? It seems to be ignored when executing lambdas. I see how it gives extra criteria when matching. Am I missing something?

Finally, I should mention that the variable declaration features of Atomese aren’t just for LambdaLink and functions.

VariableList and TypedVariableLink can be used with any of the query link types, such as MeetLink, QueryLink, SatisfactionLink, etc. Often including a VariableList in a match expression is optional, because the variables can be inferred from the expression itself. When there is any ambiguity, however, including a VariableList is required.

In addition, including a TypedVariableLink around a VariableNode in a query can help narrow down the possible matches that can satisfy the query.

Looping with Tail Recursion

Atomese doesn’t have loops, in the way most iterative programming languages do. Like most pure functional languages, looping behavior is done through recursion. To utilize recursion, you make the loop body into a function, and then you have the function call itself. Every recursive function fundamentally has two paths through it, one path that returns without further iterations, and one path that reduces the problem into a smaller problem and then calls itself to solve that new reduced problem.

It’s mathematically proven (By Alan Turing himself) than any algorithm that can be expressed with iteration can also be expressed with recursion. However, this says nothing about the gymnastics that are needed for the implementation of a particular algorithm. In any case, this is not the point of the guide and the topic is thoroughly covered elsewhere.

The “Tail” in Tail Recursion means that self-calling is the very last thing the function does before exiting. Therefore, there is no need to retain the execution state for the “parent” function. Traditional recursion has a bad reputation for consumption of stack memory, because each iteration allocates a new stack frame. Tail recursion makes this a non-issue. In fact, in many programming languages, tail recursion has been shown to be as fast as iterative looping.

Note

Atomese is not a language designed for execution speed, so don’t expect your Atomese code to be fast compared to anything written in a more traditional compiled language, or even an interpreted scripting language.

Below is an example that uses tail recursion to increment a counter repeatedly. This could stand-in for a for-loop.

(DefineLink
    (DefinedSchema "loop_body")
    (LambdaLink
        (VariableNode "iterator")
        (CondLink
            ; Check to see if the "iterator" parameter is > 4
            (GreaterThan
                (VariableNode "iterator")
                (Number 4)
            )

            ; If so, we are done,
            (VariableNode "iterator")

            ; Here is the recursive call.  Call ourselves with iterator+1
            (ExecutionOutputLink
                (DefinedSchema "loop_body")
                (Plus (Variable "iterator") (Number 1))
            )
        )
    )
)

So the above function checks to see if the parameter is > 4, and if not, calls itself with its parameter + 1. This means it will be called 5 times in total. Of course, it doesn’t actually do anything, so there is no point to the function. It’s just a pure illustration of a recursive function implementing a loop.

Here is a slightly more useful example, a function to calculate the nth term of the Fibonacci sequence. Of course this is far from optimal, given the order-n^2 execution difficulty for a sequence that should be order-n or better.

(DefineLink
    (DefinedSchema "fibonacci")
    (LambdaLink
        (VariableNode "n")
        (CondLink
            ; The 1st term is 0, so check to see if n is smaller than 2
            (GreaterThan
                (Number 2)
                (VariableNode "n")
            )
            (Number 0)

            ; The 2nd term is 1, so check to see if n is smaller than 3
            (CondLink
                (GreaterThan
                    (Number 3)
                    (VariableNode "n")
                )
                (Number 1)

                ; Otherwise, the nth term is fibbonacci(n-1) + fibbonacci(n-2)
                ; Here are the recursive calls.
                (PlusLink
                    (ExecutionOutputLink
                        (DefinedSchema "fibonacci")
                        (MinusLink (VariableNode "n") (Number 1))
                    )
                    (ExecutionOutputLink
                        (DefinedSchema "fibonacci")
                        (MinusLink (VariableNode "n") (Number 2))
                    )
                )
            )
        )
    )
)

So here’s one way to improve our solution to only require n iterations as opposed to n^2.

(DefineLink
    (DefinedSchema "fibonacci_iterative")
    (LambdaLink
        (VariableList
            (Variable "n")
            (Variable "current_iter")
            (Variable "current_term")
            (Variable "previous_term")
        )
        (CondLink
            ; If we're reached the end of the sequence, return "current_term"
            (Not (GreaterThan ; not-greater-than is equivalent to less-than-or-equal-to
                (Variable "n")
                (Variable "current_iter")
            ))
            (Variable "current_term")

            ; Call ourselves recursively, to get the next term
            (ExecutionOutputLink
                (DefinedSchema "fibonacci_iterative")
                (List
                    (Variable "n")
                    (Plus (Variable "current_iter") (Number 1))
                    (Plus (Variable "current_term") (Variable "previous_term"))
                    (Variable "current_term")
                )
            )
        )
    )
)

This appraoch requires us to “seed” our sequence with some starting terms, but it is considerably more efficient than the earlier solution. Here is an example of how we might call it:

(cog-execute!
    (ExecutionOutputLink
        (DefinedSchema "fibonacci_iterative")
        (List
            (Number 9) ; We want the 9th term
            (Number 2) ; We're passing in the 2nd term
            (Number 1) ; The 2nd term is 1
            (Number 0) ; The 1st term is 0
        )
    )
)

Note

QUESTION FOR SOMEONE SMARTER THAN ME. This is where I’d like to say “LambdaLink has this feature for default arguments”, but I couldn’t find anything like that… Does it exist?