
The lamentable state of knowledge representation
 
 

In case you didn’t know, knowledge representation (KR) is one of the most important topics in AI. In my opinion it is *the* most critical topic. Some evidence for that opinion is that Schank lists representation first among his features of the field of AI:

  (1) For Schank (1990, p. 13) AI is a distributed phenomenon: “potentially . . . the algorithmic study of processes in every field of enquiry”—a broad description but one that is narrowed by a list of 10 features which he considers to be characteristic of AI. The features are:

a. Representation—how do we represent what we know in a machine?
b. Decoding—translation from the real world into the representation selected;
c. Inference—the process of figuring out the significance and full meaning of a collection of knowledge represented explicitly, or sensed directly;
d. Control of Combinatorial Explosion—finding ways to limit the potentially limitless inference process;
e. Indexing—organization and labeling of memory such that relevant items can be located quickly;
f. Prediction and Recovery—ability to predict from current knowledge and recover from inevitable mistakes;
g. Dynamic modification—“the quintessential AI issue,” learning: knowledge structures must change over time;
h. Generalization—the process of drawing conclusions from disparate data, the basis of creativity;
i. Curiosity—a process of probing beyond the known and understood, of constructing both questions and explanations;
j. Creativity—the process of generating new information, often viewed as generating useful relationships between known items that were previously thought to be unrelated.
(“A New Guide to Artificial Intelligence”, Derek Partridge, 1991, page 2)

In neuroscience, learning is sometimes listed as the primary unsolved mystery of the brain, but I would argue that in order to learn something, a system first has to represent the phenomenon; only then can it change that representation to reflect what changed over time. The following is an example of a list from neuroscience that includes a number of KR-related mysteries, although in different wording. Note that the KR-related topics tend to be at the top of the list.

10 Unsolved Mysteries Of The Brain
...
1. How is information coded in neural activity?
...
2. How are memories stored and retrieved?
...
7. How is time represented in the brain?
http://discovermagazine.com/2007/aug/unsolved-brain-mysteries#.UOeeV3dg-54

Knowledge representation is basically just how you represent knowledge about things or relationships between things. Some familiar examples of knowledge representation are: natural language text, databases, rules/productions, mathematical formulas, formal logic formulas, sets, Venn diagrams, tables, graphs, drawings/illustrations/sketches/photographs/videos, networks/graphs, arrays, lists, software objects. If I understand correctly, all those abstract data types you learn in computer programming classes (linked lists, trees, etc.) are merely more examples of KR methods.
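
To make that concrete, here is a minimal Python sketch (the fact and all the names are mine, purely illustrative) showing one small piece of knowledge, “a canary is a bird, and birds can fly”, captured in several of the forms listed above: a table, a network of triples, a rule, and a software object.

```python
# One fact family -- "a canary is a bird; birds can fly" -- in several
# of the KR forms listed above. Illustrative names only.

# 1. Table / database style: rows of attribute-value pairs.
animals_table = [
    {"name": "canary", "kind": "bird"},
    {"name": "penguin", "kind": "bird"},
]

# 2. Network / semantic-net style: labeled edges between nodes (triples).
triples = [
    ("canary", "is_a", "bird"),
    ("bird", "can", "fly"),
]

# 3. Rule / production style: IF something is a bird THEN assume it can fly.
def can_fly(name):
    kinds = {row["name"]: row["kind"] for row in animals_table}
    return kinds.get(name) == "bird"

# 4. Software-object style: the knowledge is baked into a class hierarchy.
class Bird:
    def can_fly(self):
        return True

class Canary(Bird):
    pass

print(can_fly("canary"))     # True, via the rule over the table
print(Canary().can_fly())    # True, via the object hierarchy
```

The point of the sketch is just that the same knowledge can wear very different clothes, and each form makes some queries easy and others awkward.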

So here was my reasoning: since the secret of AI quite likely lies in finding a suitable form of KR for representing the real world in all its variety, uncertainty, and changing patterns, all that needs to be done for a major advancement in AI is to study and think about KR until such a method becomes clear. The obvious way to start is to look at books on KR, especially KR related to AI, understand what has been done, read the authors’ hints about what KR the brain likely uses, maybe tweak or extend an existing method, and then you’re done!

I wasn’t naive enough to think this path of research would be easy, but I *was* naive enough to think that existing books on KR would be very helpful. In the university library I visited, I found two books solely about KR, which isn’t bad, since KR books aren’t very common. One of those books was “Knowledge Representation” (John F. Sowa, 2000), which had been highly recommended to me by a scientist at one startup company where I worked, a scientist who knew Sowa personally. Because that company worked me outrageous numbers of hours per week and because the book was quite expensive to buy, it took me about four years to finally track down a library copy and the free time to peruse it.

What a disappointment Sowa’s book was! To be fair, I should state that I was expecting something more like an encyclopedia that listed hundreds of KR diagrams/methods, like a reference book for people trying to push the state of the art in AI. (Why else would somebody write an entire book on KR, if not to be useful in computer science, AI, and neuroscience? Those are the main fields that need it.) I had also expected a lot of exotic KR methods I had heard about over the years, the ones that don’t normally get listed in textbooks, like neural synchronization and signature tables. There was *nothing* like that in Sowa’s book, not even close. The book mostly covered conventional techniques, and didn’t even include some of the methods used in AI.

Also to be fair, I should state that I later began to appreciate the book quite a bit for what it was. It did have some excellent parts, especially the list of six classification features for any kind of logic, which can be combined arbitrarily to produce fuzzy logic, multi-valued logic, and any other type of logic you might want to invent. There were also some great insights into the limitations of logic, the difference between logic and ontology, fields of science using different granularity, etc. But as a fundamental source of ideas for pushing the state of the art in AI, it was close to useless for me.

I then looked at another KR book at that library, one focused on AI: “Knowledge Representation: An AI Perspective” (Han Reichgelt, 1991). It sounded very promising, but it wasn’t. In fact, it was even worse, with such limited coverage of AI KR methods that there wasn’t even so much as an interesting quote I could pull from the book. Then I looked at “Representations of Commonsense Knowledge” (Ernest Davis, 1990), and I also found that to be weak.

I haven’t exhausted all the possibilities, of course. I did see the book “Handbook of Knowledge Representation (Foundations of Artificial Intelligence)” (eds. Frank van Harmelen, Vladimir Lifschitz, and Bruce Porter, 2008) for about $200 on Amazon.com
(http://www.amazon.com/Knowledge-Representation-Foundations-Artificial-Intelligence/dp/0444522115), but that’s not in any library I know of, I’m certainly not going to buy it unseen, and after all my prior disappointments, I’d almost guarantee it’s going to be more of the same.

Therefore I’ve come to the sad conclusion that the state of KR is very weak, at least as far as what has been published in books. Books won’t give AI enthusiasts a strong enough boost to push them into AGI orbit. I suppose this means there is a needed niche for an encyclopedic book on KR of the type I described, and sadly, I’m probably qualified enough to write it, but considering that I’ve spent the past year just trying to get a single article completed, my writing such a book isn’t going to happen in the foreseeable future. Sorry for all the bad news, but I think heavy AI die-hards should know my opinion of the state of the art of KR. Do many other people see the situation differently?

If I’m wrong and there does happen to be a much better book on KR, here is what I believe it should include, at an absolute bare minimum…

(1) many exotic KR methods for AI, including:
signature tables (http://webdocs.cs.ualberta.ca/~sutton/book/ebook/node109.html)
neural synchronization (http://www.scholarpedia.org/article/Binding_by_synchrony)

(2) many KR methods outside of AI, including:
adjacency matrices (http://en.wikipedia.org/wiki/Adjacency_matrix)
Feynman diagrams (http://www.quantumdiaries.org/2010/02/14/lets-draw-feynman-diagams/)
Dirac/bra-ket notation (http://en.wikipedia.org/wiki/Bra-ket_notation)
unfolded box diagrams
group theory
chaotic attractors
photographs
raster patterns

(3) all KR methods for all existing software paradigms, including:
algorithms/flowcharts
logic
software objects (of object-oriented programming)
productions (of expert systems)
Venn diagrams (of set-oriented programming)
aspects (of aspect-oriented programming)

(4) deep relationships between the above methods that allow new methods to be created and compared, as Sowa did with his six organizing attributes of logic (a tiny sketch of what I mean follows this list)

(5) suggestions and recent findings that suggest what KR method biological brains might use—the Holy Grail of KR
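
As a tiny sketch of the kind of cross-method comparison I mean in items (2) and (4), here is the same little directed graph held two ways, as an adjacency matrix and as a list of edges, with a conversion between them (Python; all names are illustrative).

```python
# The same tiny directed graph as (a) an adjacency matrix and (b) edge pairs.
nodes = ["canary", "bird", "fly"]

# (a) Adjacency matrix: matrix[i][j] == 1 means an edge from nodes[i] to nodes[j].
matrix = [
    [0, 1, 0],   # canary -> bird
    [0, 0, 1],   # bird   -> fly
    [0, 0, 0],
]

# (b) Edge-pair form (add a label per edge and you essentially have triples).
def matrix_to_edges(matrix, nodes):
    return [(nodes[i], nodes[j])
            for i, row in enumerate(matrix)
            for j, cell in enumerate(row) if cell]

print(matrix_to_edges(matrix, nodes))
# [('canary', 'bird'), ('bird', 'fly')]
```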

Now that would be a great book. If anybody wants to collaborate and knows how I could financially survive while writing part of it, be sure to let me know. Or just go ahead and steal this idea, if you want, if you think you can pull it off by yourself. grin

(Caveat: This thread of mine isn’t quite up to my normal standards of thorough research, but it’s such a large topic and the timing is so appropriate that I went ahead and posted it tonight anyway. Enjoy.)

 

 

 
  [ # 1 ]

a bit old, but still valid: about objects and assets

 

 
  [ # 2 ]

updated version: http://bragisoft.com/2013/01/objects-and-assets-abstract-and-concrete/

 

 
  [ # 3 ]

There have been many attempts at developing formal knowledge representation over the years. A number of good frame-based representation systems were developed in the 80s. A common problem was that they were basically “closed” systems where the system and its associated knowledge were hosted on a single server. In the 90s the Resource Description Framework (RDF) (http://www.w3.org/RDF/) was developed by a W3C working group. In 2001 the Defense Advanced Research Projects Agency (DARPA) began the DARPA Agent Markup Language (DAML) (http://www.daml.org/) research project. AI researchers from a large number of universities and commercial research labs formed an integrated team to build upon and advance the earlier RDF efforts. Early in the DAML project, European Union researchers involved with the Ontology Inference Layer (OIL) effort (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.22.2713) were brought in to combine the best elements of the two research efforts. The result was the development of the Web Ontology Language (OWL) (http://www.w3.org/TR/owl2-overview/). OWL is a Web-based knowledge representation language that supports distributed KR across the internet. It is based on formal description logic and therefore supports formal reasoning. A very large group of researchers continue to develop Semantic Web related technologies and applications.
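
To make that concrete, here is a minimal sketch (Python with the rdflib library; the namespace and class names are made up for illustration) of a tiny OWL class hierarchy expressed as plain RDF triples and serialized as Turtle:

```python
# Minimal sketch: declare an OWL class hierarchy and one individual as RDF
# triples, then serialize the graph as Turtle. Requires: pip install rdflib
from rdflib import Graph, Namespace, Literal
from rdflib.namespace import RDF, RDFS, OWL

EX = Namespace("http://example.org/kb#")   # made-up namespace for the sketch
g = Graph()
g.bind("ex", EX)
g.bind("owl", OWL)

g.add((EX.Bird, RDF.type, OWL.Class))          # ex:Bird is an OWL class
g.add((EX.Canary, RDF.type, OWL.Class))
g.add((EX.Canary, RDFS.subClassOf, EX.Bird))   # class hierarchy
g.add((EX.tweety, RDF.type, EX.Canary))        # an individual
g.add((EX.tweety, RDFS.label, Literal("Tweety")))

print(g.serialize(format="turtle"))
```

Because every node is a web-addressable URI, the same triples can be published, linked, and reasoned over across servers, which is the “distributed KR” point above.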

 

 
  [ # 4 ]

I agree that RDF and OWL are useful technologies to investigate for knowledge representation.  In particular, they give you ready access to structured knowledge which you can query online or store locally.  For example, Wordnet is available this way from here: http://wordnet.rkbexplorer.com/sparql/

Similarly, a lot of code is available for handling and/or reasoning with this knowledge. I am using the Jena library in my Java experiments (http://jena.apache.org/).
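
For anyone who prefers Python, here is a minimal sketch of querying that same endpoint with the SPARQLWrapper library. The endpoint URL is the one above; the query is deliberately generic, since I’m not assuming the dataset’s exact schema.

```python
# Minimal sketch of querying a public SPARQL endpoint from Python.
# Requires: pip install SPARQLWrapper
from SPARQLWrapper import SPARQLWrapper, JSON

# Endpoint taken from the post above; generic "give me any 10 triples" query.
endpoint = SPARQLWrapper("http://wordnet.rkbexplorer.com/sparql/")
endpoint.setQuery("SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10")
endpoint.setReturnFormat(JSON)

results = endpoint.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["s"]["value"], binding["p"]["value"], binding["o"]["value"])
```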

 

 
  [ # 5 ]
John Flynn - Feb 5, 2013:

OWL is a Web-based knowledge representation language that supports distributed KR across the internet. It is based on formal description logic and therefore supports formal reasoning.

A few years ago I was trying to get involved with the semantic web, I bought and read several books on it, and began hanging out at a Meetup Group for the semantic web, so I definitely know something about the semantic web, though I never got around to coding anything.

At that time, one expert on the east coast told me in somewhat cryptic wording in his e-mail that OWL is not suitable for definitions. Now that I’ve learned more about the brain and the semantic web, I know exactly what he was talking about. As you said, the knowledge represented by OWL is *structured* knowledge: knowledge that has already been abstracted from the real world and put into organized, symbolic form by humans. That means OWL cannot itself interpret the real world and extract information from it; OWL is limited to the artificial world of computers and their languages and ontologies, which are created by humans. There’s that gap again that I keep mentioning between the real world and virtual world: machines will have to be able to sense the real world directly, create their own ontologies, write their own programs, etc. in order to be considered intelligent. Computers running OWL are just manipulating symbols as they were before, without any understanding of what they’re doing. My final assessment of the semantic web was that it was a very useful thing, but that there’s no hope of it ever becoming intelligent.

More specifically, if I understand correctly, two things OWL can’t represent are complicated continuous data (such as images or sound or neurocontrol) and relationships between more than two objects (at least not easily and intuitively). For example, the sentence “Mark Atkins writes John Flynn a response” involves three objects (subject, direct object, indirect object) plus one verb, but OWL is built on triples, so it is limited to two objects plus one relationship. There are workarounds such as clustering/reification, but those are bulky and unintuitive methods.
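
To show what that reification workaround looks like, here is a minimal sketch in plain Python tuples (the identifiers are made up; the same pattern can be written as RDF). The three-way relation is broken into an explicit “event” node plus one binary triple per role.

```python
# "Mark Atkins writes John Flynn a response" has three participants,
# so it cannot be one triple. Reification: mint a node for the writing
# event and attach each participant with its own binary triple.
event = "writing_event_1"   # illustrative identifier

triples = [
    (event, "type",      "WritingEvent"),
    (event, "agent",     "Mark_Atkins"),   # subject
    (event, "recipient", "John_Flynn"),    # indirect object
    (event, "theme",     "response_42"),   # direct object
]

# Four triples and an extra node to say what one sentence says --
# that is the bulkiness I'm complaining about.
for t in triples:
    print(t)
```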

This is not too surprising, since the very function of language is to remind or recall real-world data (such as images or sound) that the communicating entities (usually humans) already know about. Language is sort of like a set of pointers to overhead slides, photographs, or objects in the real world. For example, to communicate the piece of news that “Grandpa died yesterday” would require only the static concepts of Grandpa and relative time plus the dynamic concept of dying, all of which the interpreter already knows about—images or concepts common to both people communicating. That sentence could be communicated just by pointing a stick sequentially at three existing images, two for the nouns and one for the verb. If you think about it, almost all the news we communicate is merely substitution of our existing, shared concepts into slots/variables that have grammatically specified roles. In contrast, if the speaker/communicator had to describe a plecostomus from scratch in order to convey the news that “Grandpa’s plecostomus died yesterday”, that chore would be far more difficult, especially with triples. That’s exactly where triples and OWL run into their worst problems: representing real-world data, or making definitions from scratch. Triples are almost identical to semantic nets (directed graphs with labeled nodes and arcs), which have the same limitation of representing only binary relations easily, without resorting to reification (http://www.csee.umbc.edu/courses/671/fall09/notes/12.pdf).

Language cannot be understood without considering the everyday knowledge that
all speakers have about the world.
(“Natural Language Understanding”, Second Edition, James Allen, 1995, page 465)

It is not completely clear what the Semantic Web (SW) is meant to be. Statements about it differ or, rather, the interpretations of statements about it differ. The grand view is that the SW is the core model of the way the world is, expressed in a manner that supports reasoning about this world. The modest view is that the SW is the minimal apparatus of shared generic terminology that can be used to send some carrier pigeon messages from one universe of discourse to another. The ‘authorised’ view presented in Berners-Lee, Hendler and Lassila (2001) might be regarded as intermediate. It treats the key notion of ontology as a structure of well-defined, i.e. unambiguous, concepts standing for objects, properties, relations etc, which has an accompanying logic allowing inference about the concepts. This is much like the terminological and assertional components familiar from AI knowledge representation. There is indeed no claim that everything that users may want to talk about (or all information) can be captured in this way, either descriptively or inferentially,
http://www.google.com/url?sa=t&rct=j&q=“semantic+web”+A.I.+all+over+again&source=web&cd=35&ved=0CEgQFjAEOB4&url=http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.99.7285&rep=rep1&type=pdf&ei=XnYZUdHcEKXk2QWWoYD4Aw&usg=AFQjCNHUhXxTILJHzujr-ZttVP_94YiNKg&cad=rja

Another problem with Description Logic (used for the semantic web) is that it cannot represent numerical information. Also, reasoning with logical rules becomes undecidable if the rules are unrestricted. (http://www.semantic-web-book.org/w/images/5/5e/KI09-OWL-Rules-2.pdf)

During the development of the ontology, we faced several representational problems while trying to represent the mathematical expressions inherent to neural networks in Description Logic (DL). Since this formalism is based on set theoretic semantics, it fits properly to representing domains in which relationships among sets suffice. For such domains, DL terminologies can accurately describe the domains’ classes. On the one hand, the choice of this formalism for the Semantic Web lies actually on these grounds. On the other hand, it is not endowed with any apparatus to deal with many types of mathematical operations and constraints that would turn it into a more generic formalism, like what can be represented by Prolog, for instance.
http://ceur-ws.org/Vol-938/ontobras-most2012_paper7.pdf

You have a good point in that neither triples nor OWL were mentioned in any of those knowledge representation books I browsed, and they should have been.

Sorry I haven’t been responding or hanging out here lately: I’ve been rather overloaded lately with work, and quality assessment of these posts takes time.

 

 

 
  [ # 6 ]

Those interested in new methods of knowledge representation, please look here: http://generalinformationtheory.com

The representation language is described in chapter 4: http://generalinformationtheory.com/content4.php

An introduction to the theory is published here: http://www.sciedu.ca/journal/index.php/air/article/view/2225

A detailed discussion of the language basics and the approach will follow in several months.

 

 
  [ # 7 ]

@Jan Bogaerts: thanks for that link. I haven’t read it yet, but as somebody who only recently started pondering these things, I find it fairly amazing!

I, for one, am storing my bot’s knowledge simply in a relational SQL database - and so far I think this really is how we store things. To a certain level, of course, but for each new aspect of our knowledge, after a bit (or a LOT) of pondering, I always seem to be able to come up with something that ‘captures’ how my brain stores it quite well.
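
For concreteness, here is roughly the kind of relational layout I mean, as a minimal self-contained sketch (the table and column names are illustrative, not my bot’s actual schema):

```python
# Minimal relational KR sketch: one table of things, one table of
# relations between things. Illustrative schema, in-memory SQLite.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE thing (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE relation (
        subject_id INTEGER REFERENCES thing(id),
        predicate  TEXT,
        object_id  INTEGER REFERENCES thing(id)
    );
""")

def thing_id(name):
    db.execute("INSERT OR IGNORE INTO thing (name) VALUES (?)", (name,))
    return db.execute("SELECT id FROM thing WHERE name = ?", (name,)).fetchone()[0]

def relate(subject, predicate, obj):
    db.execute("INSERT INTO relation VALUES (?, ?, ?)",
               (thing_id(subject), predicate, thing_id(obj)))

relate("canary", "is_a", "bird")
relate("bird", "can", "fly")

# What can a canary's kind do? Follow is_a, then look up the kind's relations.
rows = db.execute("""
    SELECT s.name, r2.predicate, o2.name
    FROM relation r1
    JOIN thing s  ON s.id  = r1.subject_id
    JOIN thing k  ON k.id  = r1.object_id
    JOIN relation r2 ON r2.subject_id = k.id
    JOIN thing o2 ON o2.id = r2.object_id
    WHERE s.name = 'canary' AND r1.predicate = 'is_a'
""").fetchall()
print(rows)   # [('canary', 'can', 'fly')]
```

Amusingly, this layout is basically a triple store wearing SQL clothing, which ties back nicely to the RDF/OWL discussion earlier in the thread.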

Lucky I don’t know much about the research on this problem, or I’d just be dazzled and give up I think smile

One issue I have with articles I read on this is that they almost immediately start going on about synonyms, or words with multiple meanings - to me that’s just a linguistic distraction, an afterthought, in how to represent the world. The real fun is in the structure. High-level structures first, exceptions / edge cases later please. But somehow NLP research LOVES to focus on that aspect more than anything else. It’s like talking about taxonomy and immediately worrying about how penguins are birds that cannot fly. Sure, it’s something to be dealt with at some point, but let’s first get the high-level structures figured out please.

 

 