Demos explained: Scanner

October 1, 2009

Note: Deprecated!

Intro

Time for the second demo overview: the Scanner.  It builds on most of the ideas found in the first demo but it goes way further, and actually does something very useful (although you wouldn’t say it at first).  It’s probably going to be a lengthy piece so I’m thinking of cutting it in 2 or maybe even 3 parts. Anyway, lets first start it up, either through the start menu shortcut (in the Demo’s sub folder, conveniently called Scanner demo), or by opening it in NND (File/Open, select the ‘My documents/NND/Demos/Scanner’ folder). Once the project is loaded, you should see a single text communication channel open (called Text sin), if this is not the case, go to View/Communication channels/Text sin and make certain that it is is selected.

Overview

image

Lets get a taste of what it does, so enter some text (or leave the one that’s already there) and press the ‘send’ button (or enter).

You’ll notice that it basically does the same thing as the echo demo: the input text is echoed back, except that it’s a bit slower. If you had the debugger tab open, you probably also noticed a lot more activity, so something more must be going on.

And indeed, if you take a closer look to the text sin channel, in the upper section, you can see the neurons that were send to the network (on the left) and those that were returned (on the right), which are different. This was not the case with the echo demo, it simply sent all the incoming neurons back out, as they were. In this demo though, we get back something completely different: TextNeurons, that represent the same thing as the int neurons that were sent as input (if you regard them as ASCII characters). Hence the name of the demo, it’s a scanner.

Converting a stream of ASCII chars into words, integers, doubles and signs is a pretty useful feature and that’s all this demo is capable of doing, but that’s only because I stopped there. You see, in the background, is a general purpose algorithm that converts an input stream of neurons into a single result cluster, using any and all of the flows that are defined in the network. This means that you can use the same algorithm with many different flow definitions. I have simply defined some to recognize words, integers and doubles. You could go further and add flows to find verbs, sentence subjects,.. (in fact, that’s what the AICI 1 demo does). You could even go further still and create flows for visual objects or audio fragments, the same algorithm can be used. Unfortunately though, the editor doesn’t yet support such types of displays for flows (will probably be added somewhere in the future though).

Details

So how is the translation actually performed? To explain this, let me first recap some of the basic concepts of neurons and flows:

  • imageThe different types of  data available to a neuron are: incoming and outgoing links, possibly one or more parent clusters, for clusters possibly 1 or more children and a meaning. And finally value neurons also have their value of course. This is important, cause when the translation process starts, this is all the available information.
  • Flows are nothing more than clusters that contain flow items, which can be statics or conditionals (loops and options). These in turn can only contain conditional parts. They represent a single branch of the decision tree. Parts can again have the same data as flows: statics or conditionals. So if you are a neuron (a static, part, conditional or flow), you can always look up into your list of parents to see in which flows and parts it is used.

imageNow, if you recall from the first demo, an input starts by creating an IntNeuron for each ASCII value, which is linked to another neuron using the ‘Letter’ neuron (ID 109). These are all put on the execution stack and the processor starts (the Rules code on the Letter neuron is executed).

So both neurons are new and only have each other as links. This means that image we can’t use links or parent/child relationships to resolve the first step, but instead must use something different. The only other thing that remains is the value of the int neurons, so this is compared against some constants to see if they are digits (0..9) alpha numeric (a..z+A..Z), spaces | returns, or something else.  This comparison results in the creation of 1 new neuron per input neuron: the result cluster, in which we store the integer. One of these clusters (or it’s duplicate, due to a split) will eventually store the end result.  This cluster is linked to one of 3 static neurons: Digit, Alpha or Space (signs like . or , are handled a bit differently, this will be explained later). Naturally, if the integers would represent color values, we would use other starting points than digit or alpha. In other words, this first part is variable according to the type of input and the required accuracy of the algorithm. As meaning, we use  the start of the ‘flow recognition’ algorithm, called ‘Stage 1.1’ and put the result cluster back on the stack.  It’s important to put this one back on the stack, and not the item we are looking for. That’s because the algorithm can perform numerous splits and we want the result to be duplicated not the searchable, cause the contents of the list are continuously modified and we don’t want the result of one processor to be modified by  another one (I had to learn this the hard way).

After this initial step, the actual recognition algorithm kicks in. This consists out of 4 stages, grouped by 2. Meaning that  stage 1.2 is executed immediately after stage 1.1 for each neuron (this is the same for stage 2,1 and 2.2), but at the end of stage 1.2 and 2.2 all the result neurons are collected into a single global. Only after the last link of the last item on the stack has  been processed, are all the result clusters put back on the stack, with links for the next stage.  This is done for allowing to group items together.

The different stages are:

  • Stage 1.1 (search parts/flows): Find conditional parts or flows in the list of parents of the searchable. If there are multiple results, perform a split for each, after the result list has been filtered (these are the shortcuts). If there are no results and this is the only and last item still on the stack, the end result has been found.
  • Stage 1.2 (sequence-combine and filter): Check if items are sequential (2 flow items declared after each other in the same parent list, which is a part or flow) and handle floating flows, which are allowed to appear anywhere in the input stream, but which break up the sequence of other items.  Different actions can be performed if the order of the items is not ok: try to solve further or exit without result. Results of sequential items are grouped together.
  • Stage 2.1 (search conditionals): Find conditionals in the list of parents of the searchable if this is a conditional part, otherwise the stage is simply skipped. If there are multiple results, perform a split for each after the result list has been filtered (not yet completely implemented at this stage). If there are no results, there is an error in the flow definition.
  • Stage 2.2 (process loops and sync-points): If there was a conditional found in the previous stage, check if this is a loop. If so, and the previous item is of the same loop, combine the results. Also start a sync-point (will be explained later) if this was defined on the conditional.

These 4 stages are repeated until there is only 1 result cluster on the stack that represents the end result of a flow which is no longer used in any other flows. This cluster is made the result of the  split for the processor it ran on.  Off course, because there were possibly many splits, there could be many results. These will all be presented in the split-callback cluster (you need to provide a code cluster to the Split instruction, which will be called when all sub processors are done). In this demo, the result is sent back to the sin that caused the input, in a normal situation thought, this will simply start another process, as is done in the AICI 1 demo.

During this whole process, the algorithm is capable of executing callback code (attached to the statics, conditionals, parts and flows) at certain specific moments in the code. This is where the magic happens. The following types of callbacks are possible (together with  their execution time):

  • Flow code: this code cluster is executed when a flow has been recognized in the stream. The ‘Result’ variable (ID 1822) contains all the neurons that match the flow. It’s here for instance that the int neurons are converted to a single word using the CiToS (Cluster with ints to string) instruction.
  • Filter flow code: this code cluster is called from stage 1.1  (or 2.1, but this is not yet completely implemented) just after all the next items were retrieved from the list of parents of the searchable (stored in the CurrentTo variable). It allows the flow item to determine if it is a valid result, given the current state of the network. This is done by checking the contents of a number of globals, like ‘Prev stage item’ (ID 1956), which contains the previously processed neuron.

In the next post, I’ll go deeper into the specifics of the algorithm itself, for as you’ve guessed by know, it’s a bit funky, and I don’t want to forget all the subtleties, since it’s definitely still a work in progress (there are many improvements still possible).

tags: , , , , ,
posted in N²D by admin

Follow comments via the RSS Feed | Leave a comment | Trackback URL

Leave Your Comment

 
Powered by Wordpress and MySQL. Theme by Shlomi Noach, openark.org