Jan
21
2010

Roadmap revised

A few notes on what’s coming next (just to keep me focused). I’m changing direction a bit compared to the previous roadmap: I’d like to end the prototype phase and prepare for beta. This means I plan to drop the visual (graphics) and audio sensory interfaces for version 1. Instead, I will initially focus solely on linguistic skills and move the visual and audio stuff to later releases.

Storage system

First thing on the agenda is the storage system. I am currently saving everything in XML files (1 neuron = 1 XML file): great for debugging, crap for real networks. Reading, copying and moving 30,000-something files around is just too slow to work with (and that’s just a bare-bones chatbot). I initially planned on using an SQL db or something similar, but for practical reasons, this is not an option. Hence, it’s going to be binary, blocked flat files.
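To make the idea of a blocked flat file a bit more concrete, here is a minimal sketch in Python (not the actual NND storage format, which isn't designed yet). It assumes one fixed-size block per neuron, so that neuron N lives at offset N * BLOCK_SIZE. The names BlockFile, BLOCK_SIZE and the 2-byte length header are all hypothetical.

```python
import os
import struct
import tempfile

BLOCK_SIZE = 64                # assumed fixed block size in bytes (hypothetical)
HEADER = struct.Struct("<H")   # 2-byte payload length at the start of each block


class BlockFile:
    """One record per fixed-size block; record N lives at offset N * BLOCK_SIZE."""

    def __init__(self, path):
        # Open for read/write, creating the file when it does not exist yet.
        mode = "r+b" if os.path.exists(path) else "w+b"
        self._file = open(path, mode)

    def write(self, neuron_id, payload):
        if len(payload) > BLOCK_SIZE - HEADER.size:
            raise ValueError("payload does not fit in one block")
        # Pad every record to a full block so offsets stay predictable.
        block = (HEADER.pack(len(payload)) + payload).ljust(BLOCK_SIZE, b"\x00")
        self._file.seek(neuron_id * BLOCK_SIZE)
        self._file.write(block)

    def read(self, neuron_id):
        self._file.seek(neuron_id * BLOCK_SIZE)
        block = self._file.read(BLOCK_SIZE)
        if len(block) < HEADER.size:
            return None  # past the end of the file: no such block
        (length,) = HEADER.unpack_from(block)
        return block[HEADER.size:HEADER.size + length]

    def close(self):
        self._file.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "neurons.bin")
    store = BlockFile(path)
    store.write(3, b"neuron three")
    store.write(0, b"neuron zero")
    print(store.read(3))
    store.close()
```

The point of the fixed block size is that looking up a neuron is a single seek in one file, instead of opening one of 30,000 files. Records that outgrow a block would need chaining or a larger block size, which this sketch leaves out.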

AICI demo

I would also like to extend the AICI demo so that it shows how it can interact with the .NET system, perhaps by programming it to copy files or something similar. Before I do anything to this demo though, the new storage system will have to work, because debugging it is otherwise far too slow. When this is done, there will probably be a new release.

Modules

Next on the agenda will probably be the modules. These are partial networks that can be imported into or exported from an existing neural network. This way, you can share pieces of your work with others (like a set of frames or some actions, a thesaurus, …).

Designer improvements

From then on, I’m hoping to only do improvements/bug-fixes, mostly on the designer, so it can be moved to beta. These are the major areas that I will focus on first:

  • There’s still a memory leak in the designer somewhere. This needs some serious attention. I thought I knew where it was, but it isn’t there, so I have no idea at the moment what it could be. Mmm…
  • The code editor needs a custom control. I’m currently using WPF’s default listboxes for the editor (yes, it’s all listboxes). This was great to get started, but you’ve probably already experienced some of its problems: it gets slow, real fast, only to blow up with a layout recursion exception (bugger). For me, it’s also a pain to get it exactly how I want it (you constantly have to fight its default behavior). Solution: custom control.
  • The same is true for the mind maps and the flow editor (also all listboxes), which need custom controls.
  • I will have to move the search functionality into its own thread, because any small search at the moment freezes the UI. This can’t be, of course.
  • Finally there are lots of small things still doing strange things at times, but I guess that’s for the beta stage.
Jan
20
2010

NND 0.3

The new release is finally ready. The AICI demo took a bit longer than planned. Also, lots of things have been fixed and updated. Here’s a non-exhaustive list:

  • There’s a completely new lock manager running in the background. This is much more secure (thread-wise, that is) and a lot faster. It’s still a bit of a diesel though: it takes some time to get going, but once running, it should be pretty fast. The slow start is due to the storage mechanism (all XML files currently). This is the major drag on the entire system at the moment, and will be fixed next.
  • The WordNet import has been seriously updated: a lot more info is retrieved, and it’s now also possible to import the entire db in one go (although this is not yet advisable: due to a memory bug in the designer, it still takes a major bite out of the hard disk and it simply takes ridiculously long).
  • The thesaurus has been given a makeover to allow for editing and filtering. It can now also display/edit non-recursive relationships. Drag and drop is also supported.
  • The frame editor has been updated considerably: drag-and-drop support has been added and frame element filters/restrictions have also been introduced (these will probably be extended in the future).
  • I changed the function of the ‘contains’ operator a bit. It now only checks the contents of a variable. For clusters/children, there are the new instructions (IsClusteredBy, LinkExists, ContainsChildren, GetInFiltered, …).
  • I have added the ‘not contains’ operator.
  • New instructions:
    • Arithmetic group (+-/*%): I finally caved on those. My original plan was to see how far I got without using any arithmetic in the neural code. I guess this is as far as I got with that.
    • Get-at group: get child at, get cluster at, get out at, get in at, get info at.
    • Distinct
    • Get incoming, get outgoing, get info, get in filtered, get out filtered, get info filtered
    • Is clustered by, link exists, contains children
    • Remove-at group: remove child at, remove info at, remove link in at, remove link out at
  • Many, many bug fixes, updates and little improvements.

Aici 1

This demo is a small chat interface, though in its current state it’s not much more than a framework. It can initiate a conversation, close it, ask for the name of the user and store the data. It’s not yet able to fully recognize a recurring user, since I haven’t defined the neural code for this yet.

I will be explaining how it works and how you can expand on this functionality shortly. For those who can’t wait and want to get a peek under the hood, here are some pointers to get started:

  • There are 3 main stages:
    • Flow recognition, which is basically the syntactical stage: check the word types and order. Project pages are:
      • the flows (aici/flows),
      • the code that is attached to these flows (aici/code/flow code)
      • the code that recognizes the flows in the input (aici/code/flow recognition)
    • Frame recognition, or the semantics stage. This is where we try to find meaning in the words. Project pages are:
      • the frames (aici/frames)
      • the code that is attached to the frame sequences (aici/code/frame seq code). The frames don’t have code (yet).
      • the code that recognizes the frames in the flow results (aici/code/frame recognition).
    • Action execution, or the response of the network to the input. Project pages are:
      • Actions: all the different actions that the system knows (not yet a lot, should be extended).
      • Output: some common code blocks for rendering output. This is also used by the frame sequences, since they are also used to render data in a predefined format.
      • Action helpers: code that the action neurons can use to perform common tasks, like stopping a conversation or controlling the timers. Timer callbacks are also stored here.
  • The transition between the different stages can be located in aici/code/transition.
    • More specifically, the ‘Transition’ neuron is used to go from the flows to the frames and finally to the actions. 
    • An action is started using the ‘Execute action’ neuron as meaning for a link from a data cluster to the action that needs to be started.
  • The project also contains some mindmaps that describe the inner data structures and functionality.
Jan
19
2010

Aici’s first words

image

New release coming shortly.

Oct
03
2009

Lightning fast

The new thread locking algorithm is beginning to work and it looks fast, real fast. Yes

Oct
01
2009

Demos explained: Scanner

Intro

Time for the second demo overview: the Scanner. It builds on most of the ideas found in the first demo, but it goes way further and actually does something very useful (although you wouldn’t say so at first). It’s probably going to be a lengthy piece, so I’m thinking of cutting it into 2 or maybe even 3 parts. Anyway, let’s first start it up, either through the start menu shortcut (in the Demos subfolder, conveniently called Scanner demo), or by opening it in NND (File/Open, select the ‘My documents/NND/Demos/Scanner’ folder). Once the project is loaded, you should see a single text communication channel open (called Text sin). If this is not the case, go to View/Communication channels/Text sin and make certain that it is selected.

Overview

image

Let’s get a taste of what it does, so enter some text (or leave the one that’s already there) and press the ‘send’ button (or enter).

You’ll notice that it basically does the same thing as the echo demo: the input text is echoed back, except that it’s a bit slower. If you had the debugger tab open, you probably also noticed a lot more activity, so something more must be going on.

And indeed, if you take a closer look at the Text sin channel, in the upper section you can see the neurons that were sent to the network (on the left) and those that were returned (on the right), which are different. This was not the case with the echo demo: it simply sent all the incoming neurons back out as they were. In this demo though, we get back something completely different: TextNeurons that represent the same thing as the int neurons that were sent as input (if you regard them as ASCII characters). Hence the name of the demo: it’s a scanner.

Converting a stream of ASCII chars into words, integers, doubles and signs is a pretty useful feature, and that’s all this demo is capable of doing, but that’s only because I stopped there. You see, in the background there is a general-purpose algorithm that converts an input stream of neurons into a single result cluster, using any and all of the flows that are defined in the network. This means that you can use the same algorithm with many different flow definitions. I have simply defined some to recognize words, integers and doubles. You could go further and add flows to find verbs, sentence subjects, … (in fact, that’s what the AICI 1 demo does). You could go further still and create flows for visual objects or audio fragments: the same algorithm can be used. Unfortunately, the editor doesn’t yet support such types of displays for flows (this will probably be added somewhere in the future).
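For comparison, this is what the same input/output behavior looks like as a conventional, hard-coded scanner in Python. It is only a sketch of the result the demo produces (ASCII codes in, words/integers/doubles/signs out), not of the flow-based neural algorithm itself; the token names and the regular expression are my own assumptions.

```python
import re

# One alternative per token type; order matters (double before integer).
TOKEN_RE = re.compile(r"""
    (?P<double>\d+\.\d+)   |
    (?P<integer>\d+)       |
    (?P<word>[A-Za-z]+)    |
    (?P<space>\s+)         |
    (?P<sign>.)
""", re.VERBOSE)


def scan(codes):
    """Turn a list of ASCII codes into (kind, value) tokens, dropping whitespace."""
    text = "".join(chr(code) for code in codes)
    tokens = []
    for match in TOKEN_RE.finditer(text):
        kind = match.lastgroup
        if kind == "space":
            continue
        value = match.group()
        if kind == "integer":
            value = int(value)
        elif kind == "double":
            value = float(value)
        tokens.append((kind, value))
    return tokens


if __name__ == "__main__":
    print(scan([ord(ch) for ch in "pi is 3.14, not 3"]))
```

The difference with the demo is that here the token definitions are baked into the code, whereas in the network they are flows that can be swapped for something else entirely.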

Details

So how is the translation actually performed? To explain this, let me first recap some of the basic concepts of neurons and flows:

  • The different types of data available to a neuron are: incoming and outgoing links, possibly one or more parent clusters, for clusters possibly 1 or more children, and a meaning. Finally, value neurons also have their value, of course. This is important, because when the translation process starts, this is all the available information.
  • Flows are nothing more than clusters that contain flow items, which can be statics or conditionals (loops and options). Conditionals in turn can only contain conditional parts, each of which represents a single branch of the decision tree. Parts can again have the same content as flows: statics or conditionals. So if you are a neuron (a static, part, conditional or flow), you can always look up into your list of parents to see in which flows and parts you are used.
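The parent lookup described in the last point can be sketched like this in Python. It is a simplified model, assuming a neuron only needs a name, a kind and parent/child lists; the Neuron class, the kind strings and flows_using are hypothetical names, not part of NND.

```python
class Neuron:
    """Simplified stand-in for a neuron: only parent/child cluster relations."""

    def __init__(self, name, kind):
        self.name = name
        self.kind = kind          # "static", "part", "conditional" or "flow"
        self.parents = []
        self.children = []

    def add_child(self, child):
        # A cluster keeps its children; each child remembers its parents.
        self.children.append(child)
        child.parents.append(self)
        return child


def flows_using(neuron):
    """Walk up the parent lists and collect the names of all flows reached."""
    found = set()
    seen = set()
    stack = [neuron]
    while stack:
        current = stack.pop()
        for parent in current.parents:
            if id(parent) in seen:
                continue
            seen.add(id(parent))
            if parent.kind == "flow":
                found.add(parent.name)
            stack.append(parent)
    return sorted(found)


if __name__ == "__main__":
    digit = Neuron("Digit", "static")
    part = Neuron("digit part", "part")
    loop = Neuron("digit loop", "conditional")
    integer = Neuron("integer", "flow")
    double = Neuron("double", "flow")
    part.add_child(digit)
    loop.add_child(part)
    integer.add_child(loop)
    double.add_child(loop)
    print(flows_using(digit))
```

Starting from the static ‘Digit’, the walk goes static → part → conditional → flow, which is the same direction the recognition algorithm searches in.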

Now, if you recall from the first demo, an input starts by creating an IntNeuron for each ASCII value, which is linked to another neuron using the ‘Letter’ neuron (ID 109). These are all put on the execution stack and the processor starts (the Rules code on the Letter neuron is executed).

So both neurons are new and only have each other as links. This means that we can’t use links or parent/child relationships to resolve the first step, but must use something different. The only other thing that remains is the value of the int neurons, so this is compared against some constants to see if they are digits (0..9), alphabetic (a..z + A..Z), spaces or returns, or something else. This comparison results in the creation of 1 new neuron per input neuron: the result cluster, in which we store the integer. One of these clusters (or its duplicate, due to a split) will eventually store the end result. This cluster is linked to one of 3 static neurons: Digit, Alpha or Space (signs like . or , are handled a bit differently; this will be explained later). Naturally, if the integers represented color values, we would use other starting points than Digit or Alpha. In other words, this first part varies according to the type of input and the required accuracy of the algorithm. As meaning, we use the start of the ‘flow recognition’ algorithm, called ‘Stage 1.1’, and put the result cluster back on the stack. It’s important to put this one back on the stack, and not the item we are looking for. That’s because the algorithm can perform numerous splits and we want the result to be duplicated, not the searchable: the contents of the list are continuously modified and we don’t want the result of one processor to be modified by another one (I had to learn this the hard way).
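The comparison against constants in this first step boils down to something like the following Python sketch. The dictionary used for the result cluster and the category name "Other" are assumptions for illustration; in the network these are neurons and links, and signs get their own treatment.

```python
def classify(code):
    """Map one ASCII code to the static neuron it would be linked to."""
    ch = chr(code)
    if "0" <= ch <= "9":
        return "Digit"
    if "a" <= ch <= "z" or "A" <= ch <= "Z":
        return "Alpha"
    if ch in " \r\n":
        return "Space"
    return "Other"  # signs like . or , (handled differently in the demo)


def first_step(codes):
    """Create one result cluster per input value, linked to its category."""
    return [{"children": [code], "linked_to": classify(code)} for code in codes]


if __name__ == "__main__":
    for cluster in first_step([ord(ch) for ch in "a1 ."]):
        print(cluster)
```

For color values or audio samples, only classify would change; the rest of the algorithm starts from whatever static neurons this step links to.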

After this initial step, the actual recognition algorithm kicks in. This consists of 4 stages, grouped by 2, meaning that stage 1.2 is executed immediately after stage 1.1 for each neuron (the same goes for stages 2.1 and 2.2), but at the end of stages 1.2 and 2.2 all the result neurons are collected into a single cluster. Only after the last link of the last item on the stack has been processed are all the result clusters put back on the stack, with links for the next stage. This is done to allow items to be grouped together.

The different stages are:

  • Stage 1.1 (search parts/flows): Find conditional parts or flows in the list of parents of the searchable. If there are multiple results, perform a split for each, after the result list has been filtered (these are the shortcuts). If there are no results and this is the only and last item still on the stack, the end result has been found.
  • Stage 1.2 (sequence-combine and filter): Check if items are sequential (2 flow items declared after each other in the same parent list, which is a part or flow) and handle floating flows, which are allowed to appear anywhere in the input stream, but which break up the sequence of other items.  Different actions can be performed if the order of the items is not ok: try to solve further or exit without result. Results of sequential items are grouped together.
  • Stage 2.1 (search conditionals): Find conditionals in the list of parents of the searchable if this is a conditional part, otherwise the stage is simply skipped. If there are multiple results, perform a split for each after the result list has been filtered (not yet completely implemented at this stage). If there are no results, there is an error in the flow definition.
  • Stage 2.2 (process loops and sync-points): If there was a conditional found in the previous stage, check if this is a loop. If so, and the previous item is of the same loop, combine the results. Also start a sync-point (will be explained later) if this was defined on the conditional.

These 4 stages are repeated until there is only 1 result cluster on the stack that represents the end result of a flow which is no longer used in any other flows. This cluster is made the result of the split for the processor it ran on. Of course, because there were possibly many splits, there could be many results. These will all be presented in the split-callback cluster (you need to provide a code cluster to the Split instruction, which will be called when all sub processors are done). In this demo, the result is sent back to the sin that caused the input; in a normal situation though, this will simply start another process, as is done in the AICI 1 demo.

During this whole process, the algorithm is capable of executing callback code (attached to the statics, conditionals, parts and flows) at certain specific moments in the code. This is where the magic happens. The following types of callbacks are possible (together with  their execution time):

  • Flow code: this code cluster is executed when a flow has been recognized in the stream. The ‘Result’ variable (ID 1822) contains all the neurons that match the flow. It’s here for instance that the int neurons are converted to a single word using the CiToS (Cluster with ints to string) instruction.
  • Filter flow code: this code cluster is called from stage 1.1  (or 2.1, but this is not yet completely implemented) just after all the next items were retrieved from the list of parents of the searchable (stored in the CurrentTo variable). It allows the flow item to determine if it is a valid result, given the current state of the network. This is done by checking the contents of a number of globals, like ‘Prev stage item’ (ID 1956), which contains the previously processed neuron.
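As a rough Python analogy for these two callback types: flow code turns the matched ints into a word, and filter code decides whether a candidate is still valid given the previously processed item. Everything here (ci_to_s as a stand-in for CiToS, word_flow_code, apply_filters and the example filter) is a hypothetical sketch, not the neural code that ships with the demo.

```python
def ci_to_s(int_cluster):
    """Stand-in for the CiToS instruction: ints (ASCII codes) to one string."""
    return "".join(chr(code) for code in int_cluster)


def word_flow_code(result):
    # 'result' plays the role of the 'Result' variable:
    # all the neurons (here: plain ints) that matched the flow.
    return ci_to_s(result)


def apply_filters(candidates, prev_stage_item, filters):
    """Keep only the candidates whose optional filter accepts the current state."""
    kept = []
    for candidate in candidates:
        accepts = filters.get(candidate)
        if accepts is None or accepts(prev_stage_item):
            kept.append(candidate)
    return kept


if __name__ == "__main__":
    print(word_flow_code([104, 101, 108, 108, 111]))
    # Hypothetical filter: only consider 'double' when the previous item was a digit.
    filters = {"double": lambda prev: prev == "Digit"}
    print(apply_filters(["integer", "double", "word"], "Alpha", filters))
```

Filtering before the split is what keeps the number of sub processors down: every candidate that survives costs a full split.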

In the next post, I’ll go deeper into the specifics of the algorithm itself, for as you’ve guessed by now, it’s a bit funky, and I don’t want to forget all the subtleties, since it’s definitely still a work in progress (there are many improvements still possible).

Sep
14
2009

Roadmap

I thought I’d write something down on how I see things progress from here on. No dates and times, just a general idea of what I have planned for the next release (version 0.3), so here goes.

Thread sync system

There is still a major hiccup in the execution core: some instructions aren’t guaranteed to be uninterruptible. That is to say, some, like the instructions that change links, can be interrupted by other processors before finishing. This could result in data corruption.

The current protection mechanism against data corruption also appears to slow down the core way too much (most time is actually spent waiting on a sync object here or there). While trying to fix this, I pushed a bit too far, which resulted in the occasional loss of recycled IDs; in other words, deleted neurons are dropped incorrectly (you can usually see this in version 0.2 after a large input stream has been processed: the explorer might show some red slots).

The fix is a new thread syncing system on top of that provided by the OS: instead of trying to rely on the slower, clumsier locks provided by the system, I need to create something new. The design is mostly done; it just needs implementing, which requires some work.
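To illustrate the kind of problem involved (not the new design, which isn’t described here), below is a Python sketch of one common way to make a link change uninterruptible: give every neuron its own lock and always acquire the two locks in ID order, so two processors linking the same pair in opposite directions can’t deadlock. The Neuron class and add_link are hypothetical.

```python
import threading


class Neuron:
    def __init__(self, neuron_id):
        self.id = neuron_id
        self.lock = threading.RLock()  # re-entrant, so a self-link is safe
        self.links_out = []
        self.links_in = []


def add_link(source, target):
    """Update both ends of a link as one step, locking in a fixed (ID) order."""
    first, second = sorted((source, target), key=lambda neuron: neuron.id)
    with first.lock:
        with second.lock:
            source.links_out.append(target.id)
            target.links_in.append(source.id)


if __name__ == "__main__":
    a, b = Neuron(1), Neuron(2)

    def worker(src, dst):
        for _ in range(1000):
            add_link(src, dst)

    threads = [
        threading.Thread(target=worker, args=(a, b)),
        threading.Thread(target=worker, args=(b, a)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    print(len(a.links_out), len(a.links_in))
```

This is exactly the sort of OS-level locking that turned out to be too slow in the core, which is why something lighter is needed there; the sketch only shows what has to be protected.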

AICI 1

I’d really like to get this demo working as soon as possible, so I will probably do a lot of work here. There are some updates required to the system though, before this is possible:

  • The WordNet importer needs some work. It currently doesn’t do a good job with respect to verbs; that is to say, their conjugations aren’t imported and linked properly. This needs to be corrected so that it’s easier to determine the part of speech of a word in a sentence.
  • VerbNet needs some serious work. Not all the data gets imported or can be edited. I also need to write the neural network code to compare the data that was found against the known (imported) VerbNet frames. And finally, I need to attach some action code to the frames, so that they actually do something.
  • I also decided to rebuild the English grammar used by the AICI example, so this needs finishing. I did this to force myself to take a closer look at each and every flow item, so as to better understand later on where the filter code needs to be attached and where to attach the code that will build the result.

I guess by the time I have gotten this far, a lot more, currently unforeseen stuff will have been added, so for now, let’s make this all for version 0.3. After that, I plan to do some more work on the visual side of things (not how the UI looks, but the image sensory interfaces and their UIs).

Sep
12
2009

NND 0.2 released

I sort of blew out all the cylinders while attempting a first run directly from numbers to the English grammar: memory usage went up to 1.5+ GB, thread count was 800+ and a gazillion temporary neurons had been created before it all came to a grinding halt. Some redesign was required. So I beefed up the execution engine so that you are now able to throttle the maximum number of simultaneous system threads that are used, which saves an incredible amount of resources and prevents the app from grinding to a halt (well, that and the removal of a small army of bugs).
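The idea of throttling can be shown with a small Python sketch: instead of starting one thread per unit of work, all work goes through a pool with a fixed upper limit. This uses Python’s standard ThreadPoolExecutor as a stand-in; it is not how the NND execution engine does it, and run_throttled is a hypothetical helper that also records the peak number of simultaneously running workers.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_throttled(work, items, max_threads):
    """Run work(item) for every item, never using more than max_threads threads.

    Returns (results, peak), where peak is the highest number of
    work items that were observed running at the same time.
    """
    active = 0
    peak = 0
    lock = threading.Lock()

    def tracked(item):
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        try:
            return work(item)
        finally:
            with lock:
                active -= 1

    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        results = list(pool.map(tracked, items))
    return results, peak


if __name__ == "__main__":
    results, peak = run_throttled(lambda n: n * 2, range(1000), max_threads=8)
    print(len(results), "items processed, peak threads:", peak)
```

With a cap like this, 800 pending splits simply wait in a queue instead of each claiming a system thread (and its stack) at once.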

I also realized that the flow recognition algorithm simply was not yet mature enough to be used in even the most basic situations. Not just because it was not yet able to process the more complex flow structures, but also because I had made a basic design error that I did not yet know of. (The lesson: always put the result cluster on the stack, as the ‘from’ part of the link, and the item-to-search in the ‘to’ part.) Getting this algorithm ready definitely took the most effort, but it was a great catalyst to improve the debugger and the execution core. Here’s a (non-exhaustive) list of new/changed things:

  • New instructions: GetClustersFiltered, GetChildrenFiltered, Freeze
  • Added ‘Stop all processors’  and ‘kill single processor’  commands
  • Added import of VerbNet data
  • Updated the frame editor so that it is able to edit VerbNet data. This is a work in progress: not all data items found in VerbNet can be edited yet.
  • Major update to the scanner demo
  • Added the start of the AICI 1 demo, which will/should soon be able to parse the English grammar. It’s already able to parse the verb ‘to be’ (e.g. you are, I am, is he, he is, they were, …). Silly I know, but a start is a start!
  • Updated flow editor with overlay items, which display the presence of certain (useful) links, like code that is attached to the flow item.
  • Added a new dialog: ‘Overlay editor’ in the Tools menu for editing the overlays that are defined in a project. This means that overlays can eventually be used everywhere.
  • New debug feature: split paths, which allow you to track previously recorded paths of processors.
  • New debug feature: attached neurons, which allow you to track down data changes in threads that should not be allowed to change the data.
  • Extra info about the static neurons (like the display title and the description) is now stored globally, outside of the project, so that I only have to update 1 thing, and not all the projects, whenever some documentation needs to be updated. This does mean that you can no longer store any description info for these statically defined neurons.
  • Updated the WordNet sensory interface so that it generates POSGroup objects, which ‘group’ together all objects that have the same text and part of speech. This saves a great number of unneeded splits while processing the English grammar.
  • Made project saving/opening multi threaded so that the UI doesn’t appear frozen.
  • Many, many small bug fixes all over the place.
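The POSGroup idea from the list above comes down to grouping on a (text, part of speech) key. A minimal Python sketch, assuming each imported object is a (text, part of speech, id) tuple; the tuple layout, the ids and build_pos_groups are made up for illustration.

```python
from collections import defaultdict


def build_pos_groups(senses):
    """Group objects that share the same text and part of speech."""
    groups = defaultdict(list)
    for text, pos, sense_id in senses:
        groups[(text, pos)].append(sense_id)
    return dict(groups)


if __name__ == "__main__":
    senses = [
        ("run", "verb", "run-v-1"),
        ("run", "verb", "run-v-2"),
        ("run", "noun", "run-n-1"),
    ]
    for key, members in build_pos_groups(senses).items():
        print(key, members)
```

Without the grouping, the word ‘run’ above would cause 3 splits during grammar processing (one per object); with it, only 2 are needed (one per part of speech).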

Perhaps a final note: it’s best to uninstall any previous versions. I haven’t yet tested how the installer works when it overwrites a previous installation, so it might screw things up. And of course, you can get the new installer from here.

Sep
06
2009

Finally

Man, this was a tough nut to crack, but it’s done, it’s finally done. The flow recognition algorithm is working. 

image

To find some of the more tedious bugs, I had to create 2 new debugging techniques: attached neurons and split paths, which I will explain shortly. The whole algorithm eventually became seriously elaborate to deal with some of the more complicated situations. I need to document this very soon before I forget it myself (I guess the radio silence is out of the window: dev time simply took too long for 1 algorithm, and I need to do at least 3 more of these, which would take far too long). I’m pretty convinced there are still some caveats to work out but, as far as I have been able to test, all the situations using a scanner flow definition seem to work. The core still shows some hiccups at times, which can still result in bogus results, but this has also been improving considerably. Expect a new update very shortly (like in a week or so).

Aug
15
2009

A mind of its own

My network is starting to get a mind of its own:

image

I am trying to get the flow recognition algorithm working so it can handle mixed content (words, numbers and signs mixed), which has been one of the more difficult algorithms I have worked on, to date. Clearly, I still haven’t got the order quite right.  This has been the main issue holding back a new release by the way.

Jul
22
2009

Ouch

I hate storms, especially when they happen around the same time that there is a solar eclipse on the other side of the world. My Crohn’s did not agree with this situation one bit and made me aware of this through an unwelcome series of cramps, followed by first brown, then red stuff.

So, I’m not doing much today.

 