The new thread locking algorithm is beginning to work and it looks fast, real fast. Yes 
Time for the second demo overview: the Scanner. It builds on most of the ideas found in the first demo but it goes way further, and actually does something very useful (although you wouldn’t say so at first). It’s probably going to be a lengthy piece so I’m thinking of cutting it into 2 or maybe even 3 parts. Anyway, let’s first start it up, either through the start menu shortcut (in the Demos sub folder, conveniently called Scanner demo), or by opening it in NND (File/Open, select the ‘My documents/NND/Demos/Scanner’ folder). Once the project is loaded, you should see a single text communication channel open (called Text sin). If this is not the case, go to View/Communication channels/Text sin and make certain that it is selected.
Let’s get a taste of what it does, so enter some text (or leave the one that’s already there) and press the ‘send’ button (or enter).
You’ll notice that it basically does the same thing as the echo demo: the input text is echoed back, except that it’s a bit slower. If you had the debugger tab open, you probably also noticed a lot more activity, so something more must be going on.
And indeed, if you take a closer look at the Text sin channel, in the upper section, you can see the neurons that were sent to the network (on the left) and those that were returned (on the right), which are different. This was not the case with the echo demo: it simply sent all the incoming neurons back out, as they were. In this demo though, we get back something completely different: TextNeurons that represent the same thing as the int neurons that were sent as input (if you regard them as ASCII characters). Hence the name of the demo: it’s a scanner.
Converting a stream of ASCII chars into words, integers, doubles and signs is a pretty useful feature and that’s all this demo is capable of doing, but that’s only because I stopped there. You see, in the background there is a general purpose algorithm that converts an input stream of neurons into a single result cluster, using any and all of the flows that are defined in the network. This means that you can use the same algorithm with many different flow definitions. I have simply defined some to recognize words, integers and doubles. You could go further and add flows to find verbs, sentence subjects, ... (in fact, that’s what the AICI 1 demo does). You could go further still and create flows for visual objects or audio fragments: the same algorithm can be used. Unfortunately though, the editor doesn’t yet support such types of displays for flows (this will probably be added somewhere in the future).
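To make the input and output of the demo concrete, here is a small Python sketch of the same conversion: a list of ASCII codes goes in, and words, ints, doubles and signs come out. Note that this is a plain hand-written scanner, not the flow-based algorithm the demo actually runs; the function name `scan` and the token labels are made up for the illustration.

```python
from typing import List, Tuple


def scan(codes: List[int]) -> List[Tuple[str, object]]:
    """Turn a list of ASCII codes into (kind, value) tokens.

    Hand-written stand-in for the demo's behaviour; the real demo
    gets the same kind of result from flow definitions instead.
    """
    text = "".join(chr(c) for c in codes)
    tokens: List[Tuple[str, object]] = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1  # spaces and returns only separate tokens
        elif ch.isalpha():
            j = i
            while j < len(text) and text[j].isalpha():
                j += 1
            tokens.append(("word", text[i:j]))
            i = j
        elif ch.isdigit():
            j = i
            while j < len(text) and text[j].isdigit():
                j += 1
            # A double needs a '.' that is followed by at least one digit.
            if j + 1 < len(text) and text[j] == "." and text[j + 1].isdigit():
                j += 1
                while j < len(text) and text[j].isdigit():
                    j += 1
                tokens.append(("double", float(text[i:j])))
            else:
                tokens.append(("int", int(text[i:j])))
            i = j
        else:
            tokens.append(("sign", ch))
            i += 1
    return tokens
```

Calling `scan([ord(c) for c in "abc 12 3.5"])` mimics what the Text sin does when it turns typed text into int neurons before handing them to the network.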
So how is the translation actually performed? To explain this, let me first recap some of the basic concepts of neurons and flows:
Now, if you recall from the first demo, an input starts by creating an IntNeuron for each ASCII value, which is linked to another neuron using the ‘Letter’ neuron (ID 109). These are all put on the execution stack and the processor starts (the Rules code on the Letter neuron is executed).
So both neurons are new and only have each other as links. This means that we can’t use links or parent/child relationships to resolve the first step, but instead must use something different. The only other thing that remains is the value of the int neurons, so this is compared against some constants to see if they are digits (0..9), alphabetic characters (a..z and A..Z), spaces or returns, or something else. This comparison results in the creation of 1 new neuron per input neuron: the result cluster, in which we store the integer. One of these clusters (or its duplicate, due to a split) will eventually store the end result. This cluster is linked to one of 3 static neurons: Digit, Alpha or Space (signs like . or , are handled a bit differently; this will be explained later). Naturally, if the integers represented color values, we would use other starting points than Digit or Alpha. In other words, this first part varies according to the type of input and the required accuracy of the algorithm. As meaning, we use the start of the ‘flow recognition’ algorithm, called ‘Stage 1.1’, and put the result cluster back on the stack. It’s important to put this one back on the stack, and not the item we are looking for. That’s because the algorithm can perform numerous splits and we want the result to be duplicated, not the searchable, because the contents of the list are continuously modified and we don’t want the result of one processor to be modified by another one (I had to learn this the hard way).
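As a rough illustration of this first step, the Python sketch below compares each integer against a few constants and builds one result cluster per input value. The dictionary layout, the `classify` and `make_result_clusters` names and the `Other` label are assumptions made for the example; in particular, signs are handled differently in the real network, so `Other` is only a placeholder.

```python
DIGIT, ALPHA, SPACE, OTHER = "Digit", "Alpha", "Space", "Other"


def classify(code: int) -> str:
    """Pick the static starting point for one ASCII value."""
    if ord("0") <= code <= ord("9"):
        return DIGIT
    if ord("a") <= code <= ord("z") or ord("A") <= code <= ord("Z"):
        return ALPHA
    if code in (ord(" "), ord("\n"), ord("\r")):
        return SPACE
    return OTHER  # placeholder: signs take a different route in the demo


def make_result_clusters(codes):
    """Create one result cluster per input value.

    Each cluster stores the integer, is linked to its static neuron,
    and uses 'Stage 1.1' as the meaning of that link (hypothetical layout).
    """
    return [
        {"items": [code], "linked_to": classify(code), "meaning": "Stage 1.1"}
        for code in codes
    ]
```

The list returned by `make_result_clusters` plays the role of the clusters that are put back on the stack: the result cluster goes back, not the value that was looked up.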
After this initial step, the actual recognition algorithm kicks in. This consists of 4 stages, grouped by 2. This means that stage 1.2 is executed immediately after stage 1.1 for each neuron (the same goes for stages 2.1 and 2.2), but at the end of stages 1.2 and 2.2 all the result neurons are collected into a single cluster. Only after the last link of the last item on the stack has been processed are all the result clusters put back on the stack, with links for the next stage. This is done so that items can be grouped together.
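The grouping of stages can be pictured with a small Python sketch of the control flow only. `run_stage_pair` and the two stage callables are hypothetical stand-ins: the real stages work on neurons and links, and what they do internally is not shown here.

```python
def run_stage_pair(stack, first_stage, second_stage):
    """Run two stages back to back for every item on the stack.

    The results are collected on the side and only pushed back once
    the last item has been processed, so the next pair of stages
    sees all of them together.
    """
    collected = []
    while stack:
        item = stack.pop()
        intermediate = first_stage(item)  # e.g. stage 1.1
        collected.extend(second_stage(intermediate))  # e.g. stage 1.2, run immediately
    stack.extend(collected)  # results go back only after the stack is empty
    return stack
```

Nothing is pushed back while the loop is still running, which is the point of the design: a later stage never sees a half-finished set of results.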
The different stages are:
These 4 stages are repeated until there is only 1 result cluster on the stack that represents the end result of a flow which is no longer used in any other flows. This cluster is made the result of the split for the processor it ran on. Of course, because there were possibly many splits, there could be many results. These will all be presented in the split-callback cluster (you need to provide a code cluster to the Split instruction, which will be called when all sub processors are done). In this demo, the result is sent back to the sin that caused the input. In a normal situation though, this will simply start another process, as is done in the AICI 1 demo.
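The split-callback idea can be sketched in Python as follows: every branch runs on its own worker, and a single callback receives all the results once the last branch has finished. The `split` function and its parameters are invented for the illustration and are not the signature of NND's Split instruction.

```python
from concurrent.futures import ThreadPoolExecutor


def split(branches, on_all_done):
    """Run every branch, then hand all results to one callback.

    `branches` is a list of zero-argument callables (the sub processors);
    `on_all_done` stands in for the code cluster given to the Split
    instruction and is called exactly once, after all branches are done.
    """
    with ThreadPoolExecutor() as pool:
        # map() keeps the results in the same order as the branches
        results = list(pool.map(lambda branch: branch(), branches))
    on_all_done(results)
```

In the demo the callback would send the results back to the sin; elsewhere it could just as well start another process.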
During this whole process, the algorithm is capable of executing callback code (attached to the statics, conditionals, parts and flows) at certain specific moments in the code. This is where the magic happens. The following types of callbacks are possible (together with their execution time):
In the next post, I’ll go deeper into the specifics of the algorithm itself, for as you’ve guessed by now, it’s a bit funky, and I don’t want to forget all the subtleties, since it’s definitely still a work in progress (there are many improvements still possible).
I thought I’d write something down on how I see things progress from here on. No dates and times, just a general idea of what I have planned for the next release (version 0.3), so here goes.
There is still a major hiccup in the execution core: some instructions aren’t guaranteed to be uninterruptible. That is to say, some of them, like the instructions that change links, can be interrupted by other processors before finishing. This could result in data corruption.
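To show why a link change has to be uninterruptible, here is a minimal Python sketch in which a link is stored on both of its ends, so both updates must happen as one step. The `Neuron` class, the function names and the single global lock are assumptions for the example; a coarse lock like this is the simplest way to get the guarantee, not the design used in the engine.

```python
import threading


class Neuron:
    """Toy neuron that records its links on both ends."""

    def __init__(self, name):
        self.name = name
        self.links_out = []  # (meaning, target) pairs
        self.links_in = []   # (meaning, source) pairs


# One coarse lock for every link change (hypothetical, for illustration only).
_link_lock = threading.Lock()


def add_link(source, meaning, target):
    # Both lists change under the lock, so no other thread can observe
    # a link that exists on one end but not on the other.
    with _link_lock:
        source.links_out.append((meaning, target))
        target.links_in.append((meaning, source))


def remove_link(source, meaning, target):
    with _link_lock:
        source.links_out.remove((meaning, target))
        target.links_in.remove((meaning, source))
```

Without the lock, a second processor could run between the two list updates and see a half-made link, which is the kind of corruption described above.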
The current protection mechanism against data corruption also appears to slow down the core way too much (most time is actually spent waiting on a sync object here or there). While trying to fix this, I pushed a bit too far, which resulted in the occasional loss of recycled IDs. In other words, deleted neurons are dropped incorrectly (you can usually see this in version 0.2 after a large input stream has been processed: the explorer might show some red slots).
The fix is a new thread syncing system on top of the one provided by the OS. Instead of trying to rely on the slower, clumsier locks provided by the system, I need to create something new. The design is mostly done; it just needs implementing, which requires some work.
I would really like to get this demo working as soon as possible, so I will probably do a lot of work here. There are some updates required to the system though, before this is possible:
I guess by the time I have gotten this far, a lot more, currently unforeseen stuff will have been added, so for now, let’s make this all for version 0.3. After that, I plan to do some more work on the visual side of things (not how the UI looks, but the image sensory interfaces and their UIs).
I sort of blew out all the cylinders while attempting a first run directly from numbers to the English grammar: memory usage went up to over 1.5 GB, the thread count passed 800 and a gazillion temporary neurons had been created before it all came to a grinding halt. Some redesign was required. So I beefed up the execution engine so that you are now able to throttle the maximum number of simultaneous system threads that are used, which saves an incredible amount of resources and prevents the app from grinding to a halt (well, that and the removal of a small army of bugs).
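The throttling idea itself is simple to sketch in Python: queue all the work, but let only a fixed number of worker threads run at once. `run_throttled` and `measure_peak` are names invented for this example, and a standard thread pool is used as a stand-in for the engine's own mechanism.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_throttled(tasks, max_threads):
    """Run all tasks, never using more than `max_threads` threads at once."""
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [future.result() for future in futures]


def measure_peak(task_count, max_threads):
    """Return the highest number of tasks that ran at the same time."""
    state = {"active": 0, "peak": 0}
    lock = threading.Lock()

    def task():
        with lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        time.sleep(0.01)  # pretend to do some work
        with lock:
            state["active"] -= 1

    run_throttled([task] * task_count, max_threads)
    return state["peak"]
```

With 800 queued tasks and `max_threads=4`, `measure_peak` never reports more than 4: the remaining work simply waits its turn instead of each split getting its own system thread.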
I also realized that the flow recognition algorithm simply was not yet mature enough to be used in even the most basic situations. Not just because it was not yet able to process the more complex flow structures, but also because I had made a basic design error that I did not yet know of. The rule I was breaking: always put the result cluster on the stack (‘from’ part of the link) and the item-to-search in the ‘to’ part. Getting this algorithm ready definitely took the most effort, but it was a great catalyst to improve the debugger and the execution core. Here’s a (non exhaustive) list of new/changed things:
Perhaps a final note: it is best to uninstall any previous versions. I haven’t yet tested how the installer works when it overwrites a previous installation, so it might screw things up. And of course, you can get the new installer from here.
Man, this was a tough nut to crack, but it’s done, it’s finally done. The flow recognition algorithm is working.
To find some of the more tedious bugs, I had to create 2 new debugging techniques: attached neurons and split paths, which I will explain shortly. The whole algorithm eventually became seriously elaborate to deal with some of the more complicated situations. I need to document this very soon before I forget it myself (I guess the radio silence is out of the window: dev time simply took too long for 1 algorithm, and I need to do at least 3 more of these, which would take far too long). I’m pretty convinced there are still some caveats to work out, but, as far as I have been able to test, all the situations using a scanner flow definition seem to work. The core still shows some hiccups at times, which can still result in bogus results, but this has also been improving considerably. Expect a new update very shortly (like in a week or so).
My network is starting to get a mind of its own:
I am trying to get the flow recognition algorithm working so it can handle mixed content (words, numbers and signs mixed), which has been one of the more difficult algorithms I have worked on, to date. Clearly, I still haven’t got the order quite right. This has been the main issue holding back a new release by the way.
I hate storms, especially when they happen around the same time that there is a solar eclipse on the other side of the world. My Crohn’s did not agree with this situation one bit and made me aware of this through an unwelcome series of cramps, followed by first brown, then red stuff.
So, I’m not doing much today.
Well, that bear (you know, the deadlocks) turned out to be a formidable grizzly. Now, I don’t know about you, but me, when I see a monster like that, I turn around and run… I can assure you, there’s nothing better than a fierce predator on your tail to streamline things. First to go was .NET’s WeakReference pattern. This simply couldn’t keep up with the engine (a change that touched every part of the designer: all editors, toolbox, explorer, thesaurus, timers,…). Next was the ReaderWriterLockSlim thingy (used to protect data blocks from corruption), which has a very peculiar definition of slim: you have no idea how many times I have seen my RAM blown up because of a simple integer scan. Lots of other stuff got tuned up or hacked out as well, so the expected update has arrived.
The engine appears to be stabilizing, although it is still acting fishy on single core machines, where there are errors I don’t have on my multi core dev system, so I need to move to a different machine to test this out. The designer also still lags well behind the engine when the engine is processing at full speed, but the UI should remain responsive now.
I have also included an extra demo project called ‘Scanner’. It is able to transform an input stream containing integers, representing characters, into words and numbers (ints and doubles). This doesn’t seem much, and it isn’t, except that it is doing this using a couple of flows and a general purpose algorithm (processing is still slow, mostly because the UI is trying to catch up). This was the guts of the older ‘English language definition’ demo, which I have split into 2: the scanner and the language definition, which is no longer able to do any processing (all code removed). It’s an example of a more complex flow.
The scanner demo also has the number scanning problem fixed (numbers longer than 2 digits came out with multiple results). This appeared to be caused by deleting a couple of neurons too many (in the scanning algorithm). I had already experienced the dangers of deleting neurons I thought were no longer used, but which still were, because of the splits. I will probably have to implement some sort of garbage collection system to clean up unused neurons (but that’s for later).
Anyway, here’s the latest download (best to remove previous installation before installing this one).
It’s time to do a final hack-stretch. There’s still this annoying bug in the scanner to fix which causes numbers to be misunderstood. This shouldn’t be too bad though (I hope). It will be followed by the final push towards a responsive network that is able to record simple statements and respond to ‘what is’ questions. This is going to take some time, so there won’t be many posts coming out in the near future. Perhaps a small update here or there, but nothing serious until I have something. Here goes…
This is the very first debugger I have written, and I am pretty proud of it! It’s not a masterpiece, but it is functional. The code definitely could use some tidying up and some speed tuning wouldn’t hurt at all, but you can trace bugs, inspect values and follow the program flow, and that’s already something I guess. To explain the debugger to you, I thought it perhaps best to do it using some of the demos. Simply open the ‘Echo words’ project to get started.
Before we send some input to the network, we need to set up the designer so that it is ready to debug:
Note that it is not possible to switch between different modes while a processor is running. You can only specify the debug mode for newly created processors.
An overview of all the breakpoints in the project can be found on the right side of the debugger tab. Currently it simply lists all the breakpoints, but new features should be added to this list shortly.
To get started, we need to send some data to the network through its text-sin (sensory interface), so make certain that the ‘Echo channel’ is opened (a communication channel is the visual interface for a sin). Go to View/Communication Channels/Echo channel, and make certain it is checked (and that the tab is selected).
Once the channel is open, type some text in the input section and send it to the network by pressing enter (or the send button). You should see a neuron appear in the left upper screen, which represents the input event (the neuron that will be solved by the processor). Your text will also appear as a string in the centre dialog screen. An object will also appear in the centre screen of the debugger tab. This represents the processor that was started and which is handling the input event.
The processor overview contains 2 numbers. The first number represents the name of the processor. This value can be changed and is used to identify it between multiple processors. The second number represents the number of neurons that are left on the stack + the current neuron that is being solved.
The buttons represent, from left to right:
Once a processor is started, the middle square in the right side group on the status bar will be blue. This will remain so for as long as the network still has a processor running.
If you press the first toggle button, a detailed view of the processor should open in a new tab (using the same name as that of the processor).
This tab is divided into 3 sections (from left to right):
The detailed view is great for getting an overview of where we are in the processing stage, but it isn’t very useful for debugging a program flow. It’s better to use the code editor for this because it provides a better overview of the connection between the statements. To do this, open the editor containing the code and make certain the processor in the debugger tab’s center screen is selected. This last action is very important for the following reason: a detailed view is opened for a specific processor, so a single tab always links to a single processor. Code however is not specific to a processor: it can be called by many processors. So it needs to know the processor context for displaying debug information. It uses the processor that is selected in the debugger overview for this purpose. This has a nice side effect: when you have multiple processors running, you can quickly switch between active processors and view where each one is in the code. Note that the code editor will put a red square around the statement that will be executed next, to indicate the execution location.
We are ready now to start debugging the network. The first thing you can do is walk through the code using the following commands:
When a processor is paused, it is possible to inspect the value of any item that returns a result. This includes variables, globals, result statements, bool expressions and ByRefs. You can do this by selecting the item you want to inspect with the mouse, and pressing F7 or going through its context menu (Inspect value). This opens a dialog with the debug info of the result. This can be empty, 1 or multiple neurons.
Debug info for a neuron contains the following info:
The debug info is depicted recursively for all neurons. This type of debug visualization is used in multiple places throughout the designer. You might have noticed that the echo channel uses this UI element to depict the incoming and outgoing neurons. The detailed view of a processor also uses it to depict most of the neurons.
A final feature of the debugger is watches. These allow you to observe the content of variables and globals in a list for a single processor or across all processors for those that are paused.
This feature is best experienced using the ‘English language definition’ demo since this performs some splits while processing text input. I have put a breakpoint in the ‘Code: Stage 1.1’ page on the second if statement (first child if in the left-side path of the only root if) and ran it until ‘Found == Sentence(flow)’ to get the screenshot on the left.
To add watches, drag a variable or global and drop it in the left part of the debugger tab. This will add it at the bottom of the list. If you are dragging it from a code editor, it is best to hold the ctrl key pressed, so that the item stays at its original position. Note that it’s currently not yet possible to drag from the toolbox. This is a small bug that still needs fixing. Also note that it’s not yet possible to remove watches (except by editing the designer file). This command will be added very soon (just goes to show how fresh the debugger still is).
By default, a list of watches with their values for the selected processor is depicted. The left side is the name of the variable (or its ID if no name has yet been assigned). To the right of the name, the content of the variable is displayed. This can be empty, 1 or more neurons, all of which are depicted using the debug UI element for neurons.
If you switch to variable view, the left section of the debugger tab will only display a radio button for each watch. The middle section, which contains processor info, will now display the contents of the selected watch for each processor.
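The two watch views are really two ways of slicing the same data, which the Python sketch below tries to show. The `processors` dictionary layout and both function names are assumptions for the example; the designer's internal data model is not described in this post.

```python
def processor_view(processors, watches, selected_processor):
    """Default view: every watch with its content for one processor."""
    variables = processors[selected_processor]["variables"]
    # A watch without a value shows up as an empty list of neurons.
    return [(name, variables.get(name, [])) for name in watches]


def variable_view(processors, selected_watch):
    """Variable view: the content of one watch for each paused processor."""
    return {
        name: state["variables"].get(selected_watch, [])
        for name, state in processors.items()
        if state["paused"]
    }
```

The first function matches the list you get by default for the selected processor; the second one matches what the middle section shows after picking a watch with its radio button.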
Also note that, when a processor gets split up into multiple sub procs, the view will switch to a tree. The root node shows the number of processors contained by the node. If a new input event is sent to the network before the previous one has been processed, it is added as a list item, at the end of the list. If sub processors split up, more sub nodes are created as children of the nodes that triggered the split (tree structure). This allows you to see how processors are related to each other. At the moment you can only see the current state, through the tree/list structure. In the future, an extra view might be added that shows a line view over time to show when and how many processors did a split and when they died out, although that’s just an idea at the moment, so don’t get your hopes up to see it any time soon: there are still far more important things to do.
If there are any errors generated by the code, either through the Error or Warning instruction or because of an error in the code, you can view exactly where it occurred. All messages are stored in the log tab. When they are blue, you can double click on them. This will open a code editor with the statement selected that caused the log item (note: if it is somewhere in a sub section, this is not expanded automatically). Because you can use the same statement in multiple locations, only the first few will be selected. This works around some problems with WPF’s standard controls (it will be fixed in the future).
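The tree/list structure of processors can be modelled with a few lines of Python. The `ProcessorNode` class and its naming scheme are made up for the illustration, and the count shown here includes the node itself, which is an assumption: the post does not say whether the designer counts it that way.

```python
class ProcessorNode:
    """One processor in the overview; splits create child nodes."""

    def __init__(self, name):
        self.name = name
        self.children = []

    def split(self, count):
        """Create `count` sub processors as children of this node."""
        new_nodes = [
            ProcessorNode(f"{self.name}.{index}") for index in range(1, count + 1)
        ]
        self.children.extend(new_nodes)
        return new_nodes

    def processor_count(self):
        """Number of processors in this subtree (this node included)."""
        return 1 + sum(child.processor_count() for child in self.children)


# Top level: one entry per input event, new events are appended at the end.
overview = []
overview.append(ProcessorNode("1"))
```

A second input event that arrives before the first one is done would simply become another entry in `overview`, next to the tree of the first one.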
As you can see, it’s all still fresh, but functional. The debugger has already become a central point of usage for me and I suspect this will only increase, so it will get some extensive testing early on. There are plenty of things that still need adding, like conditional breakpoints, counts on breakpoints, enabling and disabling of breakpoints, more commands on the breakpoints list (like clear, enable/disable all, …), and so on. These things will probably be added as needed.
I am certain there is plenty more to say about NND’s debugger, I just can’t think of anything else right now, so I guess I’ll leave it at this for today. It has already turned out to be a long enough post.