It’s possible to create your own characters for the chatbot designer app. The process consists of 2 parts: first you create a set of images; once that’s done, you make a CCS file that combines all the images.
Images
The animations and visemes are stored as a series of images which are displayed in rapid succession, much like how old school film works. Before you can display these images though, they need to be created. Now, in the olden days, they used to have big chunky cameras for that. Things have changed a bit since. We can use tools like DAZ3d, Poser, Blender, and so on.
Content
What you put in the images is entirely up to you. There aren’t many limits with respect to content in the sense that there aren’t any ‘expected’ parts. That said, since we are dealing with chatbots, I’d try to put in something resembling a mouth somewhere, just to get some lip-syncing working. Most 3D tools these days come with some way to manipulate mouth positions. The designer uses 21 different images for lip-sync (+ 1 for silence, which is actually the background image). Most 3D tools only provide morphs for 16 mouth positions. The remaining 5 visemes can be created by reusing other images or by creating your own, if the 3D app allows it. Here’s a good visual overview of all the visemes.
Animations like blinking eyes or moving hair also need to be exported from your 3D package as a series of images. Luckily, most 3D software provides a feature to do just that. Also, some animations can use the same image multiple times. It’s always best to reuse the same image in those cases, to save memory. For instance, an eye blink can be done with 2 or 3 images: eye fully open (provided by the background), half closed and fully closed.
File formats
Currently, the following file formats are supported:
| BMP | Bitmap |
| GIF | Graphics Interchange Format |
| JPEG | Joint Photographic Experts Group |
| PNG | Portable Network Graphics |
| TIFF | Tagged Image File Format |
It should also be possible to use XAML files (to create Flash-style characters), but this is not yet tested and will most likely not work yet. Give me a shout if you would like to do this.
Processing
Once you have created all the images, it’s best to process them a little further. At this stage, every image always uses up the full viewable area of the character. That’s ok, if all you want to do is lip-sync and possibly a little animation during idle times. But in this setup, your character can’t blink while speaking. To accomplish this, we need to cut out only the valid parts of each image and make the rest transparent. This way, the system can overlay multiple images and create the illusion of different parts moving at the same time.
Another added advantage of this process is that it usually (depending on the file type) shrinks the file. PNG files, when edited with the correct tools, will use less space if there is a bigger transparent area.
There are multiple ways you can crop the images. You can use the eraser tool found in most bitmap editing tools like Paint.NET. Though, I’ve noticed that at least some versions of Photoshop don’t shrink PNG files when using the eraser, so that’s maybe not the best tool for the job. Personally, I used a little home-made app to do the job. You can download it from here.
It’s very simple to use. As you can see on the screenshot, there are 2 big buttons to the left which contain the source images. Press each button to load the images. The top button should contain the root image; in the bottom one, you put the images that need to be cropped. To the right, on top, you see the result for the currently selected target image (seen in the bottom button). With the slider, you can scroll through the loaded images.
On the next line, there is a checkbox and a slider. These control the calculation process: with the checkbox you choose whether you want the parts of the image that are different or the same, and with the slider you select the tolerance level used to calculate the difference. When you put the slider fully to the left, there is 0 tolerance, meaning that there can be no difference between the pixel in the source image and that of the image that needs cropping. Put the slider fully to the right and the difference has to be very big before a pixel counts as different.
Finally, with the button labeled ‘save’, you can save the processed images to a directory that you select (the original files are kept).
Depending on the way that the images are rendered in the 3D package, the file format and the quality of the rendering engine, there can be a bigger or smaller difference between parts of the images that should be the same. That’s why you normally have to play a little with the tolerance level. The lower, the better. Sometimes it’s better to keep a lower tolerance level and manually remove the remaining dots with an eraser in a bitmap editing package.
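To make the tolerance idea a bit more concrete, here is a minimal Python sketch of that kind of comparison. It is not the code of the app mentioned above: the difference metric (largest per-channel difference), the function names and the in-memory pixel format are all assumptions, picked only to illustrate the principle.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int, int]  # (red, green, blue, alpha), each 0-255
Image = List[List[Pixel]]          # rows of pixels

TRANSPARENT: Pixel = (0, 0, 0, 0)


def pixel_difference(a: Pixel, b: Pixel) -> int:
    """Largest per-channel difference between two pixels (an assumed metric)."""
    return max(abs(x - y) for x, y in zip(a, b))


def crop_to_difference(root: Image, target: Image, tolerance: int = 0,
                       keep_different: bool = True) -> Image:
    """Keep the target pixels that differ from the root by more than
    `tolerance` (or, with keep_different=False, the ones that do not);
    every other pixel becomes fully transparent."""
    same_height = len(root) == len(target)
    if not same_height or any(len(r) != len(t) for r, t in zip(root, target)):
        raise ValueError("root and target must have the same dimensions")
    result: Image = []
    for root_row, target_row in zip(root, target):
        row = []
        for root_px, target_px in zip(root_row, target_row):
            differs = pixel_difference(root_px, target_px) > tolerance
            row.append(target_px if differs == keep_different else TRANSPARENT)
        result.append(row)
    return result
```

With a tolerance of 0, even a tiny rendering difference keeps the pixel. Raising the tolerance drops those near-identical pixels, which is the trade-off described above.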
CCS file
When you’re ready with your images, it’s time to put them all together, a bit like a collage. Unfortunately, there isn’t a ready-made tool for this yet, so you are gonna have to do a little XML writing. Fortunately, there’s a lot of copy paste involved. The outer XML tag, the start of the file, is a <Character> tag, like so:
<?xml version="1.0" encoding="utf-8"?>
<Character xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
</Character>
Now, let’s go over each section in order of the file content:
Character Info
The first section, and the easiest, is the character information. This is what it looks like:
<CharacterInfo>
<Name>Mika</Name>
<Author>Ady Di Pierro</Author>
<Copyright>Copyright 2011 Ady Di Pierro</Copyright>
<License />
<AuthorWebsite>http://www.laticisimagery.com.au/</AuthorWebsite>
<CreationDate>2011-08-07T14:43:26.602+02:00</CreationDate>
<LastUpdateDate>2011-08-07T14:43:26.602+02:00</LastUpdateDate>
<Rating>
<Rating>Unknown</Rating>
<Sexual>false</Sexual>
<Violence>false</Violence>
<Other>false</Other>
<Description />
</Rating>
<OnlineOptions>
<OnlineCharacterBaseUrl />
<PreferredWidth>150</PreferredWidth>
<PreferredHeight>150</PreferredHeight>
</OnlineOptions>
</CharacterInfo>
Just copy the above and paste it into your XML file, just under <Character>. This part only contains reference information about the character: what its name is, who designed it, what the licence is, possibly a website,… All of the info in this section is optional: you can leave the tags empty, but it’s best to include them. This information, by the way, is displayed in the popup window on the ‘chatbot window’ (the button in the lower-left corner on the images window).
Online options are currently still skipped but will probably be used in future, online versions.
Background
<Background>
<ImageResource>images\Kima00.png</ImageResource>
<ImageResource>images\KimaEars.png</ImageResource>
</Background>
The background section defines the images that should be used as background (obviously). 2 things worth mentioning: a background is defined as an ImageResource. This is used often throughout the file. Whenever you want to reference an image file, you use this element. The text part of the xml element defines the relative path to the image (that is relative to the CCS file).
Also, and that’s perhaps the weirdest part, you can declare multiple backgrounds. Every image in this list will be displayed as a background, unless an animation has turned one of the images off. And that’s the main use of having multiple backgrounds: so that animations can turn part of the background off while playing. A good example is Mika’s ears. They are drawn in a separate background image, so that the ‘flip-ears’ animation can hide the background-ears while it’s playing. We do this because some images in the animation sequence are smaller than the ears at rest, which would otherwise give ugly results with half an ear overlapped and the rest still visible. I’m certain there are plenty of other cool tricks to be done with this feature. Hiding a background image is explained in the ‘Animations’ section.
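Since every ImageResource is relative to the CCS file, it can be handy to check where the paths end up before starting the designer. The following Python sketch lists the resolved background paths. The function name and the example location `C:\Chars\Mika\Mika.ccs` are made up for illustration, and the designer itself may resolve paths differently.

```python
import xml.etree.ElementTree as ET
from pathlib import PureWindowsPath
from typing import List


def background_paths(ccs_path: str, ccs_xml: str) -> List[str]:
    """Return the background image paths, resolved against the directory
    that holds the CCS file (paths use Windows-style backslashes)."""
    root = ET.fromstring(ccs_xml)
    ccs_dir = PureWindowsPath(ccs_path).parent
    return [
        str(ccs_dir / element.text.strip())
        for element in root.findall("./Background/ImageResource")
        if element.text  # skip empty <ImageResource /> elements
    ]
```

PureWindowsPath only manipulates the text of the path, so the sketch also runs on a non-Windows machine. It does not check that the files actually exist.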
Animations
The ‘Animations’ section is one of the bigger parts of the file. This is where you declare all the animation sequences available to the character for emotional expressions and idle times. The background animations (which run all the time, not just at idle times) are declared somewhere else. Anyway, here’s what an animation definition looks like:
<Animations>
<Animation>
<AnimationFrames>
<AnimationFrame>
<Duration>5</Duration>
<ImageResource>images\other\Kima Ears (01).png</ImageResource>
<VisemeGroupName/>
</AnimationFrame>
<AnimationFrame>
<Duration>5</Duration>
<ImageResource>images\other\Kima Ears (01).png</ImageResource>
<VisemeGroupName />
</AnimationFrame>
</AnimationFrames>
<EnableFrameSpeaking>true</EnableFrameSpeaking>
<HoldLastFrameForSpeak>false</HoldLastFrameForSpeak>
<FirstFrameUnderlay>false</FirstFrameUnderlay>
<BackgroundSuppress>KimaEars.png</BackgroundSuppress>
<Name>earsmove1</Name>
<ZIndex>0</ZIndex>
</Animation>
</Animations>
The section starts with an <Animations> element, which can contain 0, 1 or more <Animation> elements. Each animation defines a series of ‘frames’, where a frame represents a single image in the animation. A frame contains a Duration section, an ImageResource and a VisemeGroupName. The last one you can forget about: it’s there for backward compatibility with the original CCS file format and should always be empty (for now). We’ve already been over the ImageResource element, and the duration simply declares how long the image should be displayed. This is expressed in milliseconds.
Underneath the frames, you need to declare some more info about the animation. Besides the name of the animation, which is used to start the animation in a rule’s output patterns, you have:
| EnableFrameSpeaking | When true, speech is allowed during the animation. When false, speech will wait until the animation is done. |
| HoldLastFrameForSpeak | Is currently not used and can be true or false. |
| FirstFrameUnderlay | When true, the complete background is hidden and the first frame of the animation is used as background |
| BackgroundSuppress | This element can be declared multiple times. Each element contains the name of a background image that needs to be hidden while the animation is running. |
| ZIndex | Optionally determines the ZIndex at which the animation is displayed. This is useful to move parts in front or behind other parts. The background always has a ZIndex of 0. |
It might seem a tremendous task at first to declare every frame in an animation like this. But it turns out that most animations can be built using between 3 and 7 images, sometimes reusing images to build sequences of about 10-20 frames. So all in all, this is still manageable.
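If typing out every frame by hand gets tedious, a small script can generate the XML for you. Below is a rough Python sketch using the standard library. The helper name `build_animation` and its parameters are my own invention; only the element names come from the format described above.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Sequence, Tuple


def build_animation(name: str, frames: Sequence[Tuple[str, int]],
                    suppress: Iterable[str] = (), z_index: int = 0,
                    enable_speaking: bool = True) -> ET.Element:
    """Build an <Animation> element from (image path, duration) pairs.
    The same image path may appear several times in `frames`."""
    animation = ET.Element("Animation")
    frames_element = ET.SubElement(animation, "AnimationFrames")
    for image, duration in frames:
        frame = ET.SubElement(frames_element, "AnimationFrame")
        ET.SubElement(frame, "Duration").text = str(duration)
        ET.SubElement(frame, "ImageResource").text = image
        ET.SubElement(frame, "VisemeGroupName")  # stays empty for now
    speaking = "true" if enable_speaking else "false"
    ET.SubElement(animation, "EnableFrameSpeaking").text = speaking
    ET.SubElement(animation, "HoldLastFrameForSpeak").text = "false"
    ET.SubElement(animation, "FirstFrameUnderlay").text = "false"
    for background in suppress:  # one element per background to hide
        ET.SubElement(animation, "BackgroundSuppress").text = background
    ET.SubElement(animation, "Name").text = name
    ET.SubElement(animation, "ZIndex").text = str(z_index)
    return animation
```

A blink that reuses the half-closed image on the way back could then be declared as `build_animation("blink", [(half, 40), (closed, 80), (half, 40)])`, where `half` and `closed` are hypothetical image paths, and turned into text with `ET.tostring(animation, encoding="unicode")`.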
VisemeGroups
The next section declares the images used for lip-syncing. The basic structure looks like the following xml snippet (note that it doesn’t contain an entry for all 22 viseme images, just the first 2).
<VisemeGroups>
<VisemeGroup>
<Name>Default</Name>
<ZIndex>1</ZIndex>
<VisemeImages>
<VisemeImage>
<VisemeIndex>0</VisemeIndex>
<ImageResource>images\Kima00.png</ImageResource>
</VisemeImage>
<VisemeImage>
<VisemeIndex>1</VisemeIndex>
<ImageResource>images\Visemes\Kima03 EH.png</ImageResource>
</VisemeImage>
</VisemeImages>
</VisemeGroup>
</VisemeGroups>
The file format already allows for multiple viseme groups to be declared, although at the time of writing, only the first one is used (and supported). Multiple viseme groups could be useful for moving heads: a viseme group for each head position. As such, the ‘name’ element for each group would be used to reference a group. But, as already mentioned, this is something for the future.
The ZIndex element defines the Z-order that should be applied to the viseme images. This allows you to manipulate the order of the images. For instance, you could use this to move a part of the background image on top of the viseme images.
Next comes the ‘VisemeImages’ group, with a ‘VisemeImage’ for each mouth position + optionally 1 extra image for the silence position. Each VisemeImage contains an ImageResource (as described above) and an index (VisemeIndex), which determines the letter that the image represents (so the order in which the VisemeImages are declared is irrelevant).
The silence image is primarily there for backward compatibility with Verbot characters (which don’t have a separate background section). If you have a ‘Background’ section, the viseme image at index 0 will be skipped; otherwise it’s used as the background. That’s why the ‘Background’ section has to be declared before the VisemeGroups.
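With 22 entries, it’s easy to forget one. The Python sketch below reports which viseme indices are missing from the first group. Note the assumption: it supposes the indices run from 0 (silence) to 21, which matches the count of 22 images but isn’t confirmed by the snippet above, which only shows indices 0 and 1. The function name is hypothetical too.

```python
import xml.etree.ElementTree as ET
from typing import List

# Assumption: index 0 is silence, 1-21 are the mouth positions.
EXPECTED_INDICES = range(22)


def missing_viseme_indices(ccs_xml: str) -> List[int]:
    """List the viseme indices that the first VisemeGroup does not declare.
    Index 0 is not reported when a Background section provides the image."""
    root = ET.fromstring(ccs_xml)
    group = root.find("./VisemeGroups/VisemeGroup")  # only the first is used
    if group is None:
        return list(EXPECTED_INDICES)
    declared = {
        int(element.text)
        for element in group.findall("./VisemeImages/VisemeImage/VisemeIndex")
        if element.text
    }
    has_background = root.find("./Background/ImageResource") is not None
    return [
        index for index in EXPECTED_INDICES
        if index not in declared and not (index == 0 and has_background)
    ]
```

Because the check works on a set of indices, the order of the VisemeImage elements in the file doesn’t matter, just as in the designer.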
IdleLevels
Idle time is the time when a bot doesn’t have anything to say or any emotion to show. In other words, nothing’s happening. In order to create the illusion of being alive during this period, you can use idle levels to start animations (that were declared in the ‘Animations’ section) when the bot is idle. Here’s the definition:
<IdleLevels>
<IdleLevel>
<MinStartDelay>5</MinStartDelay>
<MaxStartDelay>15</MaxStartDelay>
<MinDuration>5</MinDuration>
<MaxDuration>15</MaxDuration>
<MinInterval>2</MinInterval>
<MaxInterval>8</MaxInterval>
<AnimationNames>
<AnimationName>earsmove1</AnimationName>
<AnimationName>eyes squint</AnimationName>
</AnimationNames>
</IdleLevel>
</IdleLevels>
Like most other sections, you can again declare multiple IdleLevels. In this example, we only have 1 though. The idea behind multiple IdleLevels is this: when the idle time starts, the first idle level is activated, but when the duration of the level has ended, the next one is activated until the last one is reached, which remains running until some activity happens. This way, you can have a bot act differently after x amount of idle time, progressively.
Each idle level contains 4 bits of information: 3 time ranges and a list of animation-names that the idle level can use. Each range has a Min and Max component, indicating the lower and upper part of the range. The different times are used for:
| StartDelay | A value is selected at random from this range to delay the start of the animation (which otherwise starts immediately after the last output) |
| Duration | A value is selected from this range to determine the duration of the idle level. This is only used if the level isn’t the last in the list. |
| Interval | Each time an animation finishes, a value is selected from this range to determine how long the system should wait before it starts another animation from the list (this is selected at random). |
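The way the three ranges work together can be sketched in a few lines of Python. This is only my reading of the behaviour described above, not the designer’s actual code: the class and function names are invented, the time unit is left as whatever the file uses, and I assume the StartDelay applies to the first animation of a level while the Interval applies to every later one.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class IdleLevel:
    start_delay: Tuple[float, float]  # (MinStartDelay, MaxStartDelay)
    duration: Tuple[float, float]     # (MinDuration, MaxDuration)
    interval: Tuple[float, float]     # (MinInterval, MaxInterval)
    animation_names: List[str]


def plan_levels(levels: List[IdleLevel],
                rng: random.Random) -> List[Tuple[int, float, Optional[float]]]:
    """Return (level index, start time, end time) for each level.
    The last level has no end: it runs until some activity happens."""
    plan = []
    clock = 0.0
    for index, level in enumerate(levels):
        is_last = index == len(levels) - 1
        end = None if is_last else clock + rng.uniform(*level.duration)
        plan.append((index, clock, end))
        if end is not None:
            clock = end
    return plan


def pick_next_animation(level: IdleLevel, rng: random.Random,
                        first: bool) -> Tuple[float, str]:
    """Return (waiting time, animation name) for the next animation."""
    wait_range = level.start_delay if first else level.interval
    return rng.uniform(*wait_range), rng.choice(level.animation_names)
```

Passing in a seeded `random.Random` makes a run repeatable, which helps when tuning the ranges.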
BackgroundAnimations
Idle levels are very useful to create a sense of liveliness, but they can be expanded upon. Sometimes, it’s also useful to have animations run all the time in the background, even while speaking. A good example of this could be breathing or eye blinking. For this purpose, there is the final section, called ‘BackgroundAnimations’, which contains a series of special animation definitions, all of which will run all the time, in a loop. Here’s the definition:
<BackgroundAnimations>
<Animation>
<AnimationFrames>
<AnimationFrame>
<Duration>10</Duration>
<ImageResource>images\other\Kima Nose Flare (01).png</ImageResource>
<VisemeGroupName/>
</AnimationFrame>
<AnimationFrame>
<Duration>10</Duration>
<ImageResource>images\other\Kima Nose Flare (02).png</ImageResource>
<VisemeGroupName />
</AnimationFrame>
</AnimationFrames>
<Name>nosebreath</Name>
<LoopStyle>VarTimer</LoopStyle>
<MinStartDelay>1</MinStartDelay>
<MaxStartDelay>3</MaxStartDelay>
<ZIndex>2</ZIndex>
</Animation>
</BackgroundAnimations>
Unlike idle levels, the animation is declared inline this time. That’s because there is a small difference in declaration between this type and idle/emotion animations. Another reason is to make certain that background animations can’t be used as emotions (they run all the time anyway, no need to start them separately).
The way that the individual frames are declared is the same as with regular animations: a list of ‘AnimationFrame’ objects that define the image, the duration and an optional viseme group. The difference is in the extra options: there is an extra ‘LoopStyle’ and ‘StartDelay’ range. ZIndex is also the same as in animations: it determines the Z-order at which the animation is displayed. The higher the number, the closer to the top it will be.
The ‘StartDelay’ range is used to build in a small idle time between each animation loop. Each time that the animation needs to be started, a value is selected at random from within the range. That’s the delay which will be used. Not every loop-style makes use of this delay, some do, others skip it.
Finally, the LoopStyle element can have the following values:
| None | The least useful: no looping at all. |
| Jojo | At each start, the end frame is selected at random from the entire list of frames. When this end is reached, the animation is played in reverse until the first frame is reached and the loop starts again. Each start is delayed by x amount of time, where x is a random number picked from the StartDelay range. |
| FrontToBack | The images are played continuously, front to back without pauses. |
| VarTimer | The images are played front to back in a loop, but each time with a delay that is selected from the StartDelay range. |
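To compare the loop styles side by side, here is a small Python sketch that works out one pass of an animation: the delay before it starts and the order in which the frames are shown. It follows the descriptions in the table, but the function names are made up and details such as how the random values are drawn are assumptions.

```python
import random
from typing import List, Tuple


def one_cycle(style: str, frame_count: int, rng: random.Random,
              min_delay: float = 0.0,
              max_delay: float = 0.0) -> Tuple[float, List[int]]:
    """Return (delay before the pass, frame indices shown) for one pass."""
    if frame_count < 1:
        raise ValueError("an animation needs at least one frame")
    forward = list(range(frame_count))
    if style in ("None", "FrontToBack"):
        return 0.0, forward  # no StartDelay for these styles
    if style == "VarTimer":
        return rng.uniform(min_delay, max_delay), forward
    if style == "Jojo":
        end = rng.randrange(frame_count)  # random end frame
        up = list(range(end + 1))
        # play up to the end frame, then back down to the first frame
        return rng.uniform(min_delay, max_delay), up + up[-2::-1]
    raise ValueError(f"unknown LoopStyle: {style}")


def repeats(style: str) -> bool:
    """Every style except 'None' starts another pass when one ends."""
    return style != "None"
```

For a 3-frame animation, a Jojo pass whose random end frame is the last one shows the frames 0, 1, 2, 1, 0, while FrontToBack and VarTimer always show 0, 1, 2.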
And that’s it. Once you’ve got all these sections laid down, you’ve got yourself a character. All that remains is to copy the files to the {my documents}\NND\Characters dir and restart the chatbot designer. Enjoy.

