Unstructured Text: 2018

Sunday, December 2, 2018

Matching brackets

For my work as a developer, I sometimes get the chance to interview new developers for the company. I check their resumes and review some of the code they send me. We thought it would also be a good idea to give them a little programming task to do in a limited amount of time. I had no idea what kind of task to choose, so I checked Google. The following problem showed up several times and I liked it because it requires some thinking to do it right and, at the same time, it doesn't require a lot of typing.

Write a function that checks if the parentheses of an expression match.

The task involves several types of brackets: [], () and {}. For example, the brackets in this expression match:

({[12][34]}{(56)})

While in this expression they do not:

{[909)

I wrote a test function (in PHP):

function test(string $exp, bool $expected)
{
    if (matchBrackets($exp) !== $expected) {
        echo "Error matching " . $exp . "\n";
    }
}

and some tests:

test("()", true);
test("()()", true);
test("noot()noot", true);
test("(aap)", true);
test("aap", true);
test("(", false);
test(")", false);
test("{[]}", true);
test("{[][]}", true);
test("{[]()}", true);
test("", true);
test("{[aap](({banaan}))}", true);
test("{[{[](({}){a[](({}))}){b[](({})){[](({c}))}}{[](({}))}}]({[]d(({}){[](({123}))})}({{[](({}))}}))}", true);
test("{[aap](({banaa))}", false);
test("{[aap]((anaan}))}", false);

Solving the problem in an elegant way requires you to use the recursive nature of the domain.

While the developer was writing the program, I tried to figure it out for myself. And it occurred to me that next to some algorithmic options, there would also be a possibility to use a regular expression. But this could only be done by applying the advanced recursive patterns feature. I had heard of this feature, but never used it.
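To give an idea of what one of those algorithmic options could look like, here is a minimal sketch that uses an explicit stack in place of recursion. The function name matchBracketsWithStack is made up for this illustration, and this is not necessarily the variant I compared against later on.

```php
<?php
// Hypothetical helper: a stack-based sketch, not the variant timed in this post.
function matchBracketsWithStack(string $exp): bool
{
    // Maps every closing bracket to the opening bracket it must pair with.
    $pairs = [')' => '(', ']' => '[', '}' => '{'];
    $stack = [];

    foreach (str_split($exp) as $char) {
        if (in_array($char, $pairs, true)) {
            // Opening bracket: remember it until its partner shows up.
            $stack[] = $char;
        } elseif (isset($pairs[$char])) {
            // Closing bracket: it must close the most recently opened one.
            // array_pop() returns null on an empty stack, which never matches.
            if (array_pop($stack) !== $pairs[$char]) {
                return false;
            }
        }
        // Any other character is ignored.
    }

    // Anything left on the stack was opened but never closed.
    return $stack === [];
}
```

Every opening bracket is pushed, and every closing bracket has to agree with whatever was pushed last, so the stack ends up playing the role that nesting plays in the recursive view of the problem.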

I would certainly not expect the developer to come up with such an answer, but it intrigued me too much to let it go.

It is indeed possible, and so, without further ado, I present the answer here:

function matchBrackets(string $exp): bool
{
    return preg_match('/^
        # start of group 1
        (
            (
                (
                    [^\(\{\[\)\}\]]+
                ) |
                (
                    \( (?1) \)
                ) |
                (
                    \[ (?1) \]
                ) |
                (
                    \{ (?1) \}
                )
            )*
        )
    $/x', $exp) === 1; // preg_match returns 1, 0 or false; compare to get a real bool
}

I will try to explain by building the expression from the ground up.

//x

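The "x" modifier makes the regex engine ignore whitespace inside the pattern (except in character classes or when escaped) and treat everything from an unescaped # to the end of the line as a comment. This allows you to lay the pattern out over several lines and make it more readable.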
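A small sketch of the effect, using a toy pattern that has nothing to do with brackets (the variable names are made up for this example). Both patterns describe the same thing, and the subject string itself is never altered by the modifier.

```php
<?php
// Both patterns describe "digits, a dash, digits"; only the layout differs.
$compact = '/^\d+-\d+$/';
$readable = '/^
    \d+   # digits before the dash
    -
    \d+   # digits after the dash
$/x';

$compactResult = preg_match($compact, '12-34');
$readableResult = preg_match($readable, '12-34');

// Whitespace in the subject is not ignored: the spaces here have no
// counterpart in the pattern, so this one does not match.
$spacedSubjectResult = preg_match($readable, '12 - 34');

var_dump($compactResult, $readableResult, $spacedSubjectResult);
// prints int(1), int(1), int(0)
```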

/^$/x

Match the beginning and end of $exp.

/^(recursion-group)$/x

Create the group that will be used for recursion.

/^((recursion-group)*)$/x

Each item in this group can be repeated any number of times. This also allows for the completely empty $exp.

/^(((other-chars)|(round-brackets)|(squared-brackets)|(curly-braces))*)$/x

Any nested group consists of a sequence of non-brackets, or a set of round brackets (), or a set of square brackets [], or a set of curly braces {}. Finally, a "bracket" group looks like this:

\{ (?1) \}

It starts with a bracket (here "{"), followed by a recursive group, and ends with the matching bracket "}".

A recursive pattern can also repeat the entire expression; for that you would use (?R) instead of (?1). Here the recursion only repeats a single group (group 1) from the expression. Group 1 is the group that starts with the first "(" in the expression. I do not use (?R) here because that would place the ^ and $ in the middle of the expression, and then the contents of a bracket pair could never match.
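The difference can be seen in a reduced sketch that only knows round brackets. The two patterns and the variable names below are made up for this comparison; they are not part of the solution above.

```php
<?php
// (?R) recurses into the whole pattern, so ^ and $ are dragged along.
$withR = '/^ \( (?R)? \) $/x';
// (?1) recurses into group 1 only, so the anchors stay on the outside.
$with1 = '/^ ( \( (?1)? \) ) $/x';

// A flat pair needs no recursion, so both patterns accept it.
$flatWithR = preg_match($withR, '()');
$flatWith1 = preg_match($with1, '()');

// A nested pair needs one level of recursion. With (?R) the inner ^ is
// tested in the middle of the subject and fails; with (?1) it is not there.
$nestedWithR = preg_match($withR, '(())');
$nestedWith1 = preg_match($with1, '(())');

var_dump($flatWithR, $flatWith1, $nestedWithR, $nestedWith1);
// prints int(1), int(1), int(0), int(1)
```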

End Note

It took me hours to get this expression just right, and as simple as possible. I expected the recursive expression to be complicated, but it turned out that it could be made pretty simple. And if you're interested: yes, it is fast. It was about 8 times as fast as the algorithmic variant I came up with. For a non-compiled language like PHP, regular expressions open the door to the speed of a compiled language.

Sunday, February 18, 2018

DAYDREAMER - A Landmark Cognitive Architecture

I posted this review before at Amazon
Source code of DAYDREAMER can be found on Github

This book is a revised version of Erik T. Mueller's 1987 dissertation. It was written in 1990 and republished in 2012, by the author himself.

It is a very important book in the discipline of Cognitive Science, as I will try to explain.

Daydreaming, the subject of the book, is the process of "the spontaneous activity - carried out in a stream of thought - of recalling past experiences, imagining alternative courses that a past experience might have taken, and imagining possible future experiences."

Mueller has gone way beyond that single subject, and worked out an impressive cognitive architecture that includes many mental faculties. He supports it with a massive base of research literature. Based on this architecture is the DAYDREAMER Lisp program.

Architecture

At the core of the architecture are personal goals and needs. These include achievement goals: to maintain self esteem, to have friends, and so on; and cyclic goals: food, money, entertainment.

In order to achieve these personal goals, DAYDREAMER has knowledge about "persons, physical objects, locations, possession of physical objects, ... mental states, interpersonal relationships, actions, and activities" (the "interpersonal domain"). This knowledge is implemented as planning rules and inference rules.

The program also applies these rules to other entities than itself in order to represent and make inferences on the mental states of others. This is called "other planning".

"Most AI programs take a request from a human user, handle the request, and terminate...In contrast, DAYDREAMER has a collection of goals which are instantiated and processed as the program sees fit."

A task or activity on behalf of an active top-level goal is called a "concern". Several concerns may be active at any time. "Each subsequence of contiguous representations produced on behalf of a given concern may be called a 'daydream'".

The program uses emotions to determine which concern to choose at any time ("emotion-driven planning"). Several emotions are created in different phases of a concern. For example, when a concern terminates a new positive or negative emotion is created. These emotions may trigger new concerns.

The architecture allows for two types of creative mechanisms: Serendipity and Action Mutation. Serendipity-based planning uses new input, as a fortunate coincidence, to help progress concerns that are not currently active. Action Mutation modifies events and goals in arbitrary ways (mutation) in order to find novel solutions.

Events are stored in episodic memory.

Emotions, Daydreaming goals, learning, and creativity

Chapter 3 shows how to represent emotions in a clear way. It continues with the description of the personal goals that are specific to daydreaming: rationalization, roving, revenge, reversal, recovery, rehearsal, and repercussions. These goals are activated when a person is idle. "Daydreaming goals are heuristics for how to exploit surplus processing time in a useful way." They are activated by emotions and directed, among other things, towards the reduction of negative emotional states, and the exploration of consequences of possible actions.
Chapter 4 shows how a person learns from daydreaming, and the effect on his or her future behavior.
Chapter 5 dives into the creative processes serendipity and action mutation.

Implementation

Chapter 6 describes the implementation of mental states and attitudes using Conceptual Dependency.
Chapter 7 describes the implementation of DAYDREAMER rules and procedures in the GATE language, an AI program for the T language, a version of Scheme (Lisp).
Chapter 8 describes possible episodic memory schemes, and Intersection Indexing, the technique used in DAYDREAMER.

Literature review, philosophical underpinnings, and conclusions

Chapter 9 is a review of the literature on daydreaming.
Chapter 10 deals with philosophical questions.
Chapter 11 describes possible applications of the architecture.

Conclusion

Mueller seems to have read every single book and paper ever written on the subject, and all adjoining fields as well. Whenever I thought he had missed a book reference, and I checked, it turned out that that book hadn't been written at the time. Note that Mueller's dissertation was finished as early as 1987!

And even though it is thirty years later now, as far as I know (and I am happy to be proven wrong), this book can still be seen as the most complete description of a cognitive architecture to date.

Saturday, January 13, 2018

My little NLI-GO DBPedia demo

With some pride (but not very much ;) I present to you my natural language interaction demo:

http://patrickvanbergen.com/dbpedia/app/

This is the first presentable result of many years of work trying to create a library that allows a user (you) to interact with a database in natural language.

The database in this example is DBPedia, which is based on Wikipedia. It is a triple store that contains information about the people in Wikipedia, among other things.

My library tokenizes and parses the sentence and converts the parse tree into a relational representation. A solution is looked up and executed. This involves a series of simple database requests (in this case SPARQL queries). The results are integrated into more relations. From these relations a response in natural language is generated.

The point of my library is not to create a little interface to DBPedia. I am looking to make the interface to databases in general as simple and powerful as possible. This means that more databases, types of databases, domains, natural languages, etc, should be addable with few configuration files. No coding should be necessary eventually. It's a long way to go.

Nevertheless, I invite you to try these sentences with the demo, and check out the productions that will be shown. These are some of the steps the library takes to find the answer.

How many children had Lord Byron?
How many children had Michael Jackson?
Who married Elvis Presley?
Who married Donald Trump?

I hope you find this interesting.
