

Core meta data for your chatbot

Hi all,

While developing my bot, I came to the conclusion that some aspects of bot development get duplicated from bot to bot. To cut down on that effort, I started putting together a collection of meta data (for lack of a better word) that I believe can help everyone with their own bot development.

My AICore script is basically a CSV file that contains word lists, word mappings, substitutions, spelling mistakes, etc. The file can be read from almost any programming language and converted into whatever format your bot needs.
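For anyone wondering what reading such a file might look like, here is a minimal Python sketch. The three-column layout (`category`, `key`, `value`) and the sample rows are assumptions made up for illustration; the real AICore file may be organised differently.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; the real AICore column layout may differ.
SAMPLE = """category,key,value
substitution,dont,don't
spelling,recieve,receive
wordlist,colour,red
wordlist,colour,blue
"""

def load_core(handle):
    """Group rows by category, mapping each key to its list of values."""
    data = defaultdict(lambda: defaultdict(list))
    for row in csv.DictReader(handle):
        data[row["category"]][row["key"]].append(row["value"])
    return data

# In practice you would pass open("your_file.csv", newline="") instead.
core = load_core(io.StringIO(SAMPLE))
print(core["wordlist"]["colour"])
```

From a structure like this it is a short step to writing the same data out in another format.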

I would like to turn this into a community project.  If you have your own data file that you’d like to contribute to this file, do send it to me, and I’ll include it.  My aim here is to build a core set of data that anyone can use.

You can download the file from my website:

In addition, I now also have a RiveScript file that I generate from the same data set.

If you would like AIML, ChatScript, or any other format generated from the data set, let me know what the format should look like, and I’ll gladly build a script to produce the data set in that format.


  [ # 1 ]

Hi Phil,

Just ran into this, that’s quite an amazing file - thanks for sharing!

Allow me to chip in with similar files to yours that I made for Yokobot. They are typically formatted as json, and ‘knowledge’ is just long lists of small English phrases.


Person names:

All language/knowledge files together in a zip file:

These same files are also linked from my bot’s ‘about > data’ section:

Curious if these are of any use!


  [ # 2 ]

I thought I’d just convert the lists of names to my program’s vocabulary, but as with most databases I’ll have to go over them manually, because the listed names include ordinary words such as Bear, Bird, Angel and Candy.
It still saves me a lot of time though, so thank you for composing the lists.
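Part of that manual pass could be narrowed down automatically. A rough Python sketch, assuming you already have some list of ordinary words to compare against (the function name `split_ambiguous` and the sample data are hypothetical):

```python
def split_ambiguous(names, common_words):
    """Separate names that double as ordinary words from the rest."""
    common = {word.lower() for word in common_words}
    safe, review = [], []
    for name in names:
        # Names that also appear in the word list go to manual review.
        (review if name.lower() in common else safe).append(name)
    return safe, review

safe, review = split_ambiguous(
    ["Alice", "Bear", "Candy", "Bob"],
    ["bear", "candy", "bird"],
)
print(safe)
print(review)
```

Only the `review` list would then need checking by hand, though how well this works depends entirely on the word list used.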


  [ # 3 ]

Ah yes, the old favourite of “I am Randy” always proves difficult.


  [ # 4 ]

I even use a bunch of syntax rules that take hints from capitalisation and from typical name introductions shortly after greetings and/or when the username is unknown, but capitalisation can’t help if a name is the first word of a sentence: “Andy is short” (Loebner Prize 2014). To my program this might as well have been an adjective and its synonym. But yes, there are always still Frank, Bill, Will and Mark.
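To illustrate the general idea (this is not the actual rule set described above, just a simplified sketch), a name guesser could combine a few introduction phrases with a capitalisation check. The phrase list, the `guess_name` function and the `ambiguous` set are all assumptions for the example:

```python
import re

# Introduction phrase (matched case-insensitively) followed by a capitalised word.
INTRO = re.compile(r"\b(?i:my name is|i am|i'm|call me)\s+([A-Z][a-z]+)\b")

def guess_name(sentence, ambiguous=frozenset()):
    """Return (name, confidence) where confidence is 'likely', 'uncertain' or 'none'."""
    match = INTRO.search(sentence)
    if not match:
        return None, "none"
    name = match.group(1)
    # Names that are also ordinary words ("randy", "frank") stay uncertain.
    if name.lower() in ambiguous:
        return name, "uncertain"
    return name, "likely"

print(guess_name("Hello, my name is Andy"))
print(guess_name("I am Randy", {"randy"}))
print(guess_name("Andy is short"))
```

The last call finds nothing, which is exactly the sentence-initial case where capitalisation gives no hint.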

I’ve converted Wouter’s list now, increasing the vocabulary by 2800 names. Phil’s list was decidedly longer and contained a lot of duplicates too (surely you can filter those out). Now if my program messes up, it’ll be for better reasons.
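Filtering out duplicates is a few lines in most languages. A minimal Python sketch, assuming that entries differing only in case or surrounding whitespace count as the same name (the `dedupe` helper is hypothetical):

```python
def dedupe(names):
    """Drop repeated names, ignoring case and surrounding spaces, keeping first-seen order."""
    seen = set()
    unique = []
    for name in names:
        cleaned = name.strip()
        key = cleaned.casefold()
        # Skip blank entries and anything already seen.
        if key and key not in seen:
            seen.add(key)
            unique.append(cleaned)
    return unique

print(dedupe(["Anna", "anna ", "Bob", "", "Anna"]))
```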


  [ # 5 ]

Awesome, glad I could be of help!


  [ # 6 ]

In return, here’s my list of countries and capital cities. It’s not entirely correct though: I shortened or deleted some compound names, so “Vatican City” became “Vatican”, for example. Apparently there’s also a capital named “Male”, which had me wondering how my program mistook a city for a person.

