Phrases
 
 
  [ # 16 ]

@Dave - I thought there would be something like that already available. The way I see it everything can be represented as a concept and even though the textual representation is different between languages the meaning would still be the same. Maybe it isn’t quite that way in other languages; I am only fluent in English so I wouldn’t know.

@Andrew - You have links for everything, don’t you? That’s awesome! You wouldn’t happen to know of any open-source projects using that sort of system, or something similar, would you?

 

 
  [ # 17 ]

Andrew is a veritable cornucopia of useful knowledge and wisdom, and has provided not only me but the entire community with very valuable insights and information.

As for the possibility of what you are looking for already existing, it’s entirely possible. Just because I’m not aware of such a thing doesn’t mean it’s not out there, somewhere.

 

 
  [ # 18 ]

By representing everything as a concept, do you mean something like this? http://bragisoft.com/2013/01/objects-and-assets-abstract-and-concrete/

 

 
  [ # 19 ]

Interesting read. Yes, that is what I meant. It sounds like it could work well in theory but in practice I don’t know. Must not be too bad though if it has already been implemented…

 

 
  [ # 20 ]

I’ve tested this on WordNet, which gives no problems in this area.

 

 
  [ # 21 ]

That is good to know. Since you have more experience with this idea than I, is it something that you think would work well as a general knowledge base of facts and their relationships?

 

 
  [ # 22 ]

Yes. There are some things to watch out for: it’s a bit tricky to find the right balance between clustering, grouping, and indexing the data for faster searching, memory use and disk streaming for very large data sets, and processing speed.

 

 
  [ # 23 ]

If multi-lingual representation is important to you, take a look at ConceptNet (http://conceptnet5.media.mit.edu/).  It has good multi-lingual links between concepts.

For open-source NLP, the alternative to OpenNLP is StanfordNLP.  I haven’t compared them but the Stanford parser and named-entity recognizer are fairly well developed and documented (in Java).  The parser is available online, so you can test it.  Be warned though, they are memory-hungry.  If anyone has compared these two, I would be interested to know.

 

 
  [ # 24 ]

At first it won’t matter very much to me. However, if at some point in the future I decide to implement multi-lingual capabilities, I don’t want to be hindered because I did not plan for it adequately and then have to restructure most of the code or database.
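One way to plan for this up front, as a minimal sketch only: keep facts attached to language-independent concept IDs and treat the words of each language as labels on those IDs. The names here (`Concept`, `ConceptStore`, the `c:atlantic_ocean` ID) are hypothetical and not taken from any project mentioned in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A language-independent concept with per-language surface forms."""
    concept_id: str
    labels: dict = field(default_factory=dict)  # language code -> set of labels


class ConceptStore:
    """Hypothetical store: facts would reference concept IDs, never raw words."""

    def __init__(self):
        self.concepts = {}  # concept_id -> Concept
        self.index = {}     # (language code, lowercased label) -> concept_id

    def add_label(self, concept_id, lang, label):
        concept = self.concepts.setdefault(concept_id, Concept(concept_id))
        concept.labels.setdefault(lang, set()).add(label)
        self.index[(lang, label.lower())] = concept_id

    def lookup(self, lang, text):
        """Return the concept ID for a label in the given language, or None."""
        return self.index.get((lang, text.lower()))


store = ConceptStore()
store.add_label("c:atlantic_ocean", "en", "Atlantic Ocean")
store.add_label("c:atlantic_ocean", "de", "Atlantischer Ozean")
```

With this layout, adding a language later means adding labels rather than restructuring the facts themselves. Whether concepts really line up one-to-one across languages is a separate question.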

 

 
  [ # 25 ]

Chad, I’m going through that exact challenge with Program O version 3. It seems that supporting languages other than English involves a bit more than deciding which character set to use.

 

 
  [ # 26 ]

@Chad, all the links that I post are for open-source software. One of the links that I gave you is actually for a catalog of many different language generation systems.

However, today I believe I have struck gold. Take a look at the Grammatical Framework. It is a high-level language for writing grammars and supports various operations such as parsing and generation. It is completely open source and comes preloaded with grammar libraries for no fewer than thirty different natural languages! Even if they are just basic libraries, they will serve as an excellent starting point for numerous projects.

http://www.grammaticalframework.org/

 

 
  [ # 27 ]

Neat thread! And great links, Andrew.

I don’t think phrase recognition can be achieved in a vacuum; it has to include the context of what’s being said. A lookup list works around this because the creator of the list essentially decides the most common context for, using Chad’s example, “United States” or “Atlantic Ocean”. This context is then assumed for all user input.
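To make the lookup-list idea concrete, here is a minimal sketch, assuming a small hand-built list and a greedy longest-match rule. Both are illustrative choices, and `KNOWN_PHRASES` and `group_phrases` are hypothetical names.

```python
# Hypothetical lookup list; the entries are the examples used in this thread.
KNOWN_PHRASES = {
    ("united", "states"),
    ("atlantic", "ocean"),
}
MAX_LEN = max(len(phrase) for phrase in KNOWN_PHRASES)


def group_phrases(text):
    """Merge the longest known phrase starting at each word position."""
    words = text.split()
    grouped = []
    i = 0
    while i < len(words):
        # Try the longest candidate first, down to two words.
        for n in range(min(MAX_LEN, len(words) - i), 1, -1):
            candidate = tuple(w.lower() for w in words[i:i + n])
            if candidate in KNOWN_PHRASES:
                grouped.append(" ".join(words[i:i + n]))
                i += n
                break
        else:
            # No known phrase starts here, so keep the single word.
            grouped.append(words[i])
            i += 1
    return grouped
```

Note that the list always wins: “United States” is merged whether or not that reading fits the sentence, which is exactly the fixed-context assumption described above.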

However if you want the bot to determine for itself if a phrase represents a single object, then the context must come from the bot as well. In my own project, I employ a full parsing scheme. Ambiguous noun phrases are only resolved if one grouping leads to a “more complete” parse than another grouping. Parses are also weighted according to what phrases they contain—phrases that have been encountered before are considered more likely to be correct than new constructions. In this way, the bot is effectively building its own database of trusted phrases.
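A rough sketch of that weighting idea, not the actual implementation: it assumes the parser hands back each candidate grouping together with a completeness score between 0 and 1, and that familiarity only breaks ties between equally complete parses. The class name, the score, and the tie-break rule are all assumptions.

```python
from collections import Counter


class PhraseMemory:
    """Hypothetical store of multi-word phrases from previously accepted parses."""

    def __init__(self):
        self.seen = Counter()

    def score(self, grouping, completeness):
        # Familiarity: how often the multi-word phrases were accepted before.
        familiarity = sum(self.seen[p] for p in grouping if " " in p)
        # Assumption: completeness dominates, familiarity breaks ties.
        return (completeness, familiarity)

    def choose(self, candidates):
        """candidates: list of (grouping, completeness) pairs from a parser."""
        best, _ = max(candidates, key=lambda c: self.score(c[0], c[1]))
        # Remember the phrases of the winning parse as "trusted".
        for phrase in best:
            if " " in phrase:
                self.seen[phrase] += 1
        return best
```

For example, once `["United States", "history"]` has won on completeness, a later tie between it and `["United", "States history"]` goes to the grouping whose phrases are already trusted.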

Edit: Ooo, I haven’t posted for a little while, but it seems like the site is running pretty speedily at the moment. Nice.

 

 
  [ # 28 ]
C R Hunt - Feb 22, 2013:

Edit: Ooo, I haven’t posted for a little while, but it seems like the site is running pretty speedily at the moment. Nice.

SHHHH!!! Don’t jinx it, CR!

 

 
  [ # 29 ]
Chad J - Feb 18, 2013:

@Dave - I thought there would be something like that already available. The way I see it everything can be represented as a concept and even though the textual representation is different between languages the meaning would still be the same. Maybe it isn’t quite that way in other languages; I am only fluent in English so I wouldn’t know.

Google’s translate features work from a premise similar to this. Instead of considering words and grammars, the program is trained by reading translated copies of the same material. Whatever groups of words often appear in proximity together must correlate to the same concept. The results of this approach are pretty good, far beyond what other auto-translation software has achieved at least.
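As a toy illustration of that premise only (this is not how Google’s system works internally), the sketch below counts which words co-occur across aligned sentence pairs and scores candidates with the Dice coefficient. The three sentence pairs and the function names are made up for the example.

```python
from collections import Counter
from itertools import product


def train(pairs):
    """Count word occurrences and cross-language co-occurrences per sentence pair."""
    src_count, tgt_count, both = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        s_words, t_words = set(src.split()), set(tgt.split())
        src_count.update(s_words)
        tgt_count.update(t_words)
        both.update(product(s_words, t_words))
    return src_count, tgt_count, both


def best_match(word, model):
    """Return the target word with the highest Dice score for a source word."""
    src_count, tgt_count, both = model

    def dice(t):
        return 2 * both[(word, t)] / (src_count[word] + tgt_count[t])

    return max(tgt_count, key=dice)


# Made-up parallel data: (English, German), lowercased.
pairs = [
    ("the house", "das haus"),
    ("the book", "das buch"),
    ("a house", "ein haus"),
]
model = train(pairs)
```

Here `best_match("house", model)` picks `"haus"` because the two always appear together, while `"das"` also appears without `"house"` and so scores lower.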

But the more I learn German, the more I am convinced that language also influences the way we think about events and therefore the concepts we use. For example, sentences in English seem more likely to be constructed around processes rather than objects and vice versa in German. So whereas an English speaker might think, “thing one actioned thing two” a German speaker would instead frame things as, “thing two, which has been actioned by thing one” (only, because it’s German, they would have one long word to represent this: “der Thingtwoactionedbythingone”).

Not all languages are so disparate—I originally studied French in school and I recall the grammatical constructions being generally similar to English (adjective placement notwithstanding).

So does thinking about processes rather than objects really constitute different “concepts”? A “concept” is such a nebulous idea anyway, but I would argue it does. For example, I once read an article about a study (found it!) that considered the effect of grammar on assigning guilt. Whether or not a language tended to use more passive or active forms of expression changed the way those witnessing an event interpreted it. Their ability to recall who performed what action depended on whether the viewer interpreted that action as intentional or not.

Anyone who’s actually bilingual want to weigh in?

 

Edit: Dave, I even used the “preview” feature and it worked! In good time too! Okay, I won’t say any more. But yay!

 

 
  [ # 30 ]

In my own project, I employ a full parsing scheme. Ambiguous noun phrases are only resolved if one grouping leads to a “more complete” parse than another grouping. Parses are also weighted according to what phrases they contain

That’s very similar to what I do. I first create all possible combinations that can be built from the input sentence in relation to all the known ‘phrases’ (n-grams basically). Then the parser figures out which is the best fit against the set of known patterns.
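The first step can be sketched roughly like this, assuming the known phrases are stored as lowercase strings and capped at three words. Both are assumptions, and the pattern-matching step that picks the best fit is not shown.

```python
def segmentations(words, known, max_len=3):
    """Yield every split of words into single words or known n-grams."""
    if not words:
        yield []
        return
    for n in range(1, min(max_len, len(words)) + 1):
        head = " ".join(words[:n])
        # A single word is always allowed; longer chunks must be known phrases.
        if n == 1 or head in known:
            for rest in segmentations(words[n:], known, max_len):
                yield [head] + rest


# Hypothetical known phrases that overlap on the word "ocean".
known = {"atlantic ocean", "ocean liner"}
candidates = list(segmentations("the atlantic ocean liner".split(), known))
```

For this input there are three candidates: all single words, one grouping with “ocean liner”, and one with “atlantic ocean”. The overlap on “ocean” is the kind of ambiguity the parser then has to settle against its patterns.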

But the more I learn German, the more I am convinced that language also influences the way we think about events and therefore the concepts we use.

I think perhaps it’s the other way round (or a feedback loop). A good example is the Dutch language. It’s basically spoken in 2 regions: the Netherlands and part of Belgium. Though the language is the same (vocabulary and grammar are shared), the cultures are a bit different (though not much). This often shows in the way the language is used: the types of words that are chosen and such. I’d say that Dutch-Dutch is harder, more direct (usually also louder) while Belgian-Dutch is more indirect, softer, perhaps a little more surreal.

 
