

AI Needed for a Special Project!!!
 
 
  [ # 16 ]

Nice reply, I enjoyed reading that smile

I wouldn’t be so hasty as to deem an electrical and/or mechanical system’s ability to express its “thoughts and insights” as impossible.  There IS a beautifully simple and elegant model underneath everything which allows a great deal of this “illusion”, as you put it, to function and operate.  An algorithm itself won’t solve the problem of strong AI, but a philosophy will, and that’s what we have… a philosophical view of a simple, beautiful underlying mechanism that appears (so far at least) to drive almost everything.  Can I/we achieve that? I have no damn clue… will we have a great deal of fun and produce some cool stuff as a consequence? Yes we will LOL

Anyway, enough of that for now….

I think what you need is certainly an achievable goal right now, if I understand it correctly, with moderate horsepower in terms of the hardware required.  The devil is in the detail of course, but converting sensory inputs to a textual representation, which can then be verbally produced by a computer in response to a verbal request, is very doable.

 

 
  [ # 17 ]
Dan Hughes - Mar 9, 2013:

Nice reply, I enjoyed reading that smile

I wouldn’t be so hasty as to deem an electrical and/or mechanical system’s ability to express its “thoughts and insights” as impossible.  There IS a beautifully simple and elegant model underneath everything which allows a great deal of this “illusion”, as you put it, to function and operate.  An algorithm itself won’t solve the problem of strong AI, but a philosophy will, and that’s what we have… a philosophical view of a simple, beautiful underlying mechanism that appears (so far at least) to drive almost everything.  Can I/we achieve that? I have no damn clue… will we have a great deal of fun and produce some cool stuff as a consequence? Yes we will LOL

Anyway, enough of that for now….

I think what you need is certainly an achievable goal right now, if I understand it correctly, with moderate horsepower in terms of the hardware required.  The devil is in the detail of course, but converting sensory inputs to a textual representation, which can then be verbally produced by a computer in response to a verbal request, is very doable.

I’ve come to the right place I see… where have you guys been all my life?  grin  OK, let me provide you all with two different situations the AI might find itself in, along with the expected responses.

1.  Driver gives a command during a ride…

    Driver Command: “Siri (can’t reveal the real name yet), increase speed by 15% and hold”
    Siri Possible Response 1: “Command acknowledged, increasing speed by 15%.”
    Siri Possible Response 2: “At the current velocity, increasing speed by 15% requires activating stability control. Stability control activated, increasing speed.”

    Driver Response: “Hey, who the hell told you to do that? Deactivate stability control and maintain current speed.”

    Siri Response: “This action is not recommended; nevertheless, as you request: stability control deactivated, cruise mode engaged.”

2.  Technician gives a command during diagnostics. Problem: a sensor on the CAN bus network is malfunctioning.

    Tech Command 1: “Siri, open debug portal, authorization Alpha, Bravo, Tango, Mike, 1, 8”
    Siri Response 1: “Command authorization recognized, debug portal now open.”

    Tech Command 2: “Siri, what is the current status of CAN node 8?”
    Siri Response 2: “CAN node 8 reports the following:
                        7 acknowledgement errors when transmitting MessageID 5
                        12 CRC errors during reception of MessageID 22
                        125 successful transmissions of all remaining messages
                        Current error status: active.”
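
For concreteness, here is roughly the kind of per-node bookkeeping we imagine sitting behind that second report, sketched in C. Every name and field width below is invented for this post, not our real diagnostic layout:

    #include <stdint.h>
    #include <stdio.h>

    /* Illustrative per-node diagnostic counters -- invented for this post. */
    typedef struct {
        uint8_t  node_id;
        uint16_t ack_errors;    /* acknowledgement errors on transmit */
        uint16_t crc_errors;    /* CRC errors on receive */
        uint32_t tx_success;    /* successful transmissions */
        uint8_t  error_active;  /* nonzero = error status active */
    } can_node_status;

    /* Formats the status the way Siri would speak it. */
    static void report_status(const can_node_status *s)
    {
        printf("CAN node %u reports the following:\n", (unsigned)s->node_id);
        printf("  %u acknowledgement errors on transmit\n", (unsigned)s->ack_errors);
        printf("  %u CRC errors on receive\n", (unsigned)s->crc_errors);
        printf("  %lu successful transmissions\n", (unsigned long)s->tx_success);
        printf("  Current error status: %s\n", s->error_active ? "active" : "passive");
    }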


These are two very real situations for the AI.  Your thoughts?

 

 

 
  [ # 18 ]

“I am sorry Dave…I’m afraid I can’t do that”  (ok, somebody had to post that!  You know you were all thinking that!)

I hope you know C, Dan! You’ve got this one, as I’m a Java/.NET guy.  I once automated my entire house using X10 and voice commands like “Dim lights by 20%” and “What is the weather going to be like today?”, along with about 100 other commands. I even went as far as putting mics/speakers in every room so the house could respond back with confirmation and results.  The house was always listening, so I gave the house a funny name which, when recognized, triggered the voice recognition of whatever followed. It’s about time someone came along to finally create KITT!
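
For anyone curious, the gating logic boiled down to something like this C sketch. The trigger word and phrases are made up, and the real system naturally fed from a live recognizer rather than canned strings:

    #include <stdio.h>
    #include <string.h>

    #define HOUSE_NAME "jeeves"   /* stand-in for the funny name */

    /* Fake recognizer output, purely for illustration. */
    static const char *phrases[] = {
        "what a nice day", "jeeves", "dim lights by 20 percent", NULL
    };

    int main(void)
    {
        int armed = 0;  /* set once the wake word is heard */
        for (int i = 0; phrases[i] != NULL; i++) {
            if (!armed) {
                if (strstr(phrases[i], HOUSE_NAME))
                    armed = 1;                        /* wake word heard */
            } else {
                printf("command: %s\n", phrases[i]);  /* act on what followed */
                armed = 0;
            }
        }
        return 0;
    }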

 

 
  [ # 19 ]

Daniel,

Your examples are perfectly possible; hell, I’d even say we could do that kind of thing right now with our current level of tech.  The only additions would be the controllers and interfaces to the various sensors and systems around the vehicle so that it could be controlled accordingly.

I’d define what you need as narrow AI, that is, a set environment with a limited scope of scenarios which need to be processed… which makes the work you need to do a lot easier.  Sure, you could mix in stronger stuff for a more adaptive system, but that of course depends on the project, budget, etc.

Bottom line: those examples would not be too difficult for us, or for a number of participants on this board.

———

Brian,

HAL, what a guy hehe cool project btw, I keep telling myself to set some time aside to do the same, but it never happens.  What was the name you assigned btw?

 

 
  [ # 20 ]
Dan Hughes - Mar 9, 2013:

Daniel,

Your examples are perfectly possible; hell, I’d even say we could do that kind of thing right now with our current level of tech.  The only additions would be the controllers and interfaces to the various sensors and systems around the vehicle so that it could be controlled accordingly.

I’d define what you need as narrow AI, that is, a set environment with a limited scope of scenarios which need to be processed… which makes the work you need to do a lot easier.  Sure, you could mix in stronger stuff for a more adaptive system, but that of course depends on the project, budget, etc.

Bottom line: those examples would not be too difficult for us, or for a number of participants on ....

  Great!! As far as the controllers and interfaces to the various sensors and systems around the vehicle go, we will be doing all of that in house; “that’s our specialty.”  The on-board vehicle control system will be a custom multicore SoC (System on a Chip) designed by us; that SoC will facilitate communication and control of the systems throughout the vehicle.  In addition, it will respond to commands issued by the AI on the driver’s behalf and report status with a granular level of detail about all internal systems.

  Here is a brief architecture overview.  For this project we envisioned that the audio device which speaks for the AI, and records and transmits user input, will be separate from the AI.  The AI will communicate with this device over a wireless link.  The AI, we were hoping, would run on one of the cores embedded within the SoC; this is why we need it done in C.  The available memory will be limited for each core, and the overall architecture of the embedded SoC will be asymmetrical.  This implies that each core will act independently of the others, and only high-priority tasks and those actions which directly impact the state of the vehicle (i.e. speed, direction, etc.) will be controlled by the AI via an internal command protocol.  Let’s move a bit further and let me attempt to describe how we hypothesized this AI might be built. There is a reason why I use the word built and not designed; let me explain.
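
To make the internal command protocol a little more concrete, a command frame might look something like this C sketch. Every enum value and field here is hypothetical; it is only meant to show the shape of the thing:

    #include <stdint.h>

    /* Hypothetical AI -> control-core command frame. */
    typedef enum {
        CMD_SET_SPEED,       /* payload: percent delta */
        CMD_SET_DIRECTION,   /* payload: heading value */
        CMD_QUERY_STATUS,    /* payload: subsystem id */
        CMD_EMERGENCY_STOP   /* no payload, highest priority */
    } cmd_type;

    typedef struct {
        uint8_t  priority;   /* 0 = highest; routed to the control core first */
        uint8_t  type;       /* one of cmd_type, kept to a byte on the wire */
        int32_t  payload;
        uint16_t crc;        /* integrity check over the wireless link */
    } ai_command;

    /* Example: "increase speed by 15% and hold" might become: */
    static const ai_command example = { 1, CMD_SET_SPEED, 15, 0 };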

  Let’s say, for example, you created an algorithm whose whole purpose was to listen to you talk daily, somehow identify the words you used to communicate, and also identify the context in which those words were used.  Let’s say that you created another algorithm whose sole purpose was to observe the tones and inflections in your voice while you talked, and note the context in which these different tones and inflections were used.  Recognizing the context is probably the most difficult part, but let’s assume we can do that.  If this can be accomplished, we could design the AI to respond to commands and situations it observes using “only” the data gathered by the other two algorithms; the AI would have no choice but to sound and respond the way you would.  Doing it this way, we could literally download different personalities into the system for the AI to use.  We were thinking that this approach might “size down” the AI and make it compact enough for our SoC.
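
As a rough illustration of what those two observer algorithms might hand to the AI, the downloadable “personality” could be little more than a pair of tables like these; all the field names and statistics below are pure speculation on our part:

    #include <stddef.h>

    /* What the word/context observer produces (speculative). */
    typedef struct {
        const char *word;
        const char *context_tag;  /* situation the word was observed in */
        float       frequency;    /* how often the speaker uses it there */
    } lexical_entry;

    /* What the tone/inflection observer produces (speculative). */
    typedef struct {
        const char *context_tag;
        float       pitch_mean;   /* average pitch in that context */
        float       pitch_var;
        float       rate;         /* speaking rate */
    } prosody_entry;

    typedef struct {
        lexical_entry *lexicon;
        size_t         lexicon_len;
        prosody_entry *prosody;
        size_t         prosody_len;
    } personality_profile;   /* the downloadable unit */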

  I’d like you all to critique this method. How far off are we? Can it be done this way??  I have thick skin, so don’t be afraid, and again, I’m totally ignorant of this subject, so I’m probably WAY OFF! cheese

 

 
  [ # 21 ]

OK, my questions in response to that would be: what is the CPU spec, and exactly how much memory is available?

Voice recog software is fairly CPU and memory hungry, which is why Siri on your iPhone records what you say and then sends it off to a server for processing, as opposed to doing it locally.  In addition, once you have your speech-to-text result, converting that to a bunch of POS and grammar trees is again fairly resource heavy.

I’m no expert on voice recog software development; I know the basics and the implementation theory, but I’ve never taken it from scratch to a working implementation.  I’d suggest you either buy that technology in or investigate some open source/GPL implementations and build one into the system, as developing that alone will be a rather large undertaking.

On the AI side, you could “cheat” somewhat, but you’d then lose the ability to understand any context at a deeper level.  I’m not 100% sure at this point whether that is something you actually need, or can do without.

Now a little critique smile  The architecture you propose doesn’t really make much sense.  You state that there will be a separate device that will be the “front” for the user input, recording it and sending it to the SoC device for processing. Why?  Wouldn’t it be more convenient to embed the AI into that device as well and be able to provide it more resources for processing?  I guess it depends on your price point for this, which I do not know either.

I would use a hardlink between your SoC and the above device.  I assume you will be using the standard 802.11 protocols, which will be fine until you have a vehicle full of Androids and iPhones all looking for WiFi access points; then you will have interference headaches.  Plus, as this device transmits the user input to the SoC for processing in your proposed design, if the vehicle is commanded to “Stop NOW!” in some emergency, you need that link to be reliable, and you’ll probably want some connection redundancy depending on the application parameters!

C would be a nightmare for this task; an OO language like C++ would be much better suited to voice recog and the AI processing, as this isn’t a closed, linear execution system.  As far as I know, most voice recog systems employ a neural network for the audio processing, and if anyone can code a NN in straight C and not get depression, I’d like to meet them smile
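
(To be fair, a single layer’s forward pass is the easy bit; it’s the training, debugging and memory management around it that drive you mad. Something like this, with sizes picked arbitrarily for illustration:)

    #include <math.h>

    #define N_IN  8
    #define N_OUT 4

    /* One fully connected layer with a sigmoid activation. */
    static void layer_forward(const float in[N_IN],
                              const float w[N_OUT][N_IN],
                              const float bias[N_OUT],
                              float out[N_OUT])
    {
        for (int o = 0; o < N_OUT; o++) {
            float sum = bias[o];
            for (int i = 0; i < N_IN; i++)
                sum += w[o][i] * in[i];            /* weighted input */
            out[o] = 1.0f / (1.0f + expf(-sum));   /* sigmoid */
        }
    }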

 

 
  [ # 22 ]

Daniel,
If you would like a commercial proposal, drop me an email and I will work one up for you.

In the meantime, I will give you some food for thought for your project.
2 threads you may find of interest:

Siri - ignore the STT angle, it’s a chatbot ... is it any good

The age of intelligent machines has arrived.
I’ll also add a more general overview of conversational user interfaces for intelligent devices on this thread.

There are 4 basic architectural designs when building a “conversational user interface”. These represent trade-offs in time, flexibility, capability, cost (of hardware) and maintainability.

- Embedded:
This would be on custom hardware. If you need voice recognition run locally, the memory and processing requirements are much greater. As such, it is sometimes better to use a wireless link to a different device.
Smart Refrigerator

- Local:
Use a standard platform/PC/Tablet/Cell phone to host the user interface.
The iRobot Ava platform might be an example of a local environment. The interface that runs the bot runs on an iPad.
Bots running on cellphones are also starting to become popular and may serve as examples. A mass-produced Android phone can serve as an inexpensive hardware component.
Polaris
Romo
Few of these currently have a conversational AI user interface, but there is no reason that they couldn’t.


- Cloud:
  Siri
  Chevy Spark’s Smartphone Voice Recognition Feature


- Hybrid:
You can mix where the various functionality runs.
You can get a feel for this by using my bot, Skynet-AI, on any device with a Chrome browser, or on an Apple iPad or iPhone (no need for SIRI, just voice recognition). Voice recognition is done in the cloud; the AI runs on the device.

The speed and size of the AI enable it to run on a wide range of devices. Some of the devices include:
Mobile Devices:
iPad/iPod/iPhone
Android
Blackberry
Samsung
Nokia
Windows Phone
Kindle

Game Consoles/TVs:
PlayStation Portable
PlayStation 3
Nintendo Wii
Google TV

PC Browsers:
Internet Explorer
Opera
Opera Mini
Safari
Chrome
Firefox
Mozilla
SeaMonkey
RockMelt
Lunascape

Operating Systems:
Windows
Macintosh
Linux
Android

The system runs on virtually any device that has a browser and JavaScript enabled.

 

 
  [ # 23 ]
Dan Hughes - Mar 10, 2013:

Now a little critique smile  The architecture you propose doesn’t really make much sense.  You state that there will be a separate device that will be the “front” for the user input, recording it and sending it to the SoC device for processing. Why?  Wouldn’t it be more convenient to embed the AI into that device as well and be able to provide it more resources for processing?  I guess it depends on your price point for this, which I do not know either.

I would use a hardlink between your SoC and the above device.  I assume you will be using the standard 802.11 protocols, which will be fine until you have a vehicle full of Androids and iPhones all looking for WiFi access points; then you will have interference headaches.  Plus, as this device transmits the user input to the SoC for processing in your proposed design, if the vehicle is commanded to “Stop NOW!” in some emergency, you need that link to be reliable, and you’ll probably want some connection redundancy depending on the application parameters!

Dan…

Could it be that you have problems with the proposed architecture because you are envisioning this system in a car?  If you applied your suggestion to, let’s say, a motorcycle, it wouldn’t work.  Anywhere you place the single circuit with a mic on that vehicle won’t work; it’s too loud, and the mic could never hear the commands from the driver.  Now let’s say you place the mic and audio device in the driver’s helmet, and the control circuit board somewhere else on the motorcycle; now hearing commands becomes easier, right?  We were hoping to simplify the tech needed for the mic and audio circuit board by putting the AI in the SoC, on the main motherboard, where power and memory are not so much of a concern.  If your suggestion is to place the whole AI in the helmet (in this example), that is more like it; that makes sense.  If we did that, we could run the AI on off-the-shelf parts and put the custom control circuitry on the motherboard somewhere on the motorcycle (in this example).

Also, we wouldn’t be using WiFi or any 802.11 protocols; they are too complicated and, as you suggested, too busy. A simple radio communicating on an obscure channel, with an embedded custom MAC, should do the trick.  No WiFi or Bluetooth needed, really.  The “Stop NOW” safeguard would be handled autonomously by the onboard control module communicating with external sensors, and of course the driver pressing the brake will remain the primary method for stopping the vehicle. smile

As to your other statements about memory usage and processing power: we will have megabytes of storage space, gigabytes of storage may be possible as well, but RAM will definitely be limited to megabytes.  The custom CPUs will not be GHz processors; low hundreds of MHz is more like it.  So maybe your suggestion of offloading the “work” to a more powerful device or a compute farm, i.e. the cloud, might be the best option.  Nonetheless, now you have me wondering why a processor performing over 200 million calculations per second is not fast enough???  Why would you need to be 5X faster to do this task, a GHz, over a billion calculations per second, or even a compute farm doing petaflops of calculations per second???  Why is that necessary? It does make you wonder whether the approach to this is the most optimal.  Seriously, think about it: 200 million calculations per second is not enough processing power! Why?  You don’t have to answer that, just thinking out loud.

 

 
  [ # 24 ]
Merlin - Mar 10, 2013:

Daniel,
If you would like a commercial proposal, drop me an email and I will work one up for you.

In the meantime, I will give you some food for thought for your project.
2 threads you may find of interest:

Siri - ignore the STT angle, it’s a chatbot ... is it any good

The age of intelligent machines has arrived.
I’ll also add a more general overview of conversational user interfaces for intelligent devices on this thread.

There are 4 basic architectural designs when building a “conversational user interface”. These represent trade-offs in time, flexibility, capability, cost (of hardware) and maintainability.

- Embedded:
This would be on custom hardware. If you need voice recognition run locally, the memory and processing requirements are much greater. As such, it is sometimes better to use a wireless link to a different device.
Smart Refrigerator

- Local:
Use a standard platform/PC/Tablet/Cell phone to host the user interface.
The iRobot Ava platform might be an example of a local environment. The interface that runs the bot runs on an iPad.
Bots running on cellphones are also starting to become popular and may serve as examples. A mass-produced Android phone can serve as an inexpensive hardware component.
Polaris
Romo
Few of these currently have a conversational AI user interface, but there is no reason that they…

Bruce, thanks for the input.  I’ll take some time to look over the other threads; in the meantime, I did send you an email.

 

 
  [ # 25 ]

Um… Daniel? Merlin is Ken, not Bruce. cheese

I’m sure it’s a simple mistake. smile

 

 
  [ # 26 ]
Steve Worswick - Mar 9, 2013:
Dan Hughes - Mar 9, 2013:

3.  “Rationalizer”..

I make this file available for anyone to use at http://www.mitsuku.com under the AIML files tab, as I am a firm believer that if it wasn’t for patents, copyrights, intellectual property, etc., we would probably have put a man on Mars by now instead of still being in the steam age.

I mean no offence by this and realise you are working on a commercial product. I also apologise for taking the thread off topic.

I believe you are wrong.  There are very good reasons for some intellectual property rights. Without patent protection available, we would not be as advanced as we are.

Trade secrets are evil, and the government should not prosecute for theft. 

Copyrights are fine when the work isn’t utilitarian.  Copyrighting a painting or story is OK.  Allowing someone to copyright code is wrong because of the duration of copyrights.

Patenting a program is OK if the program is new and useful.  The programmers/inventors of VisiCalc, the first spreadsheet, should have been allowed to patent it.

This is off topic, but important.  I’m in the process of writing up an essay on intellectual property.  I’ll try to get it online this coming week.

 

 
  [ # 27 ]

Daniel,

Yes, I was envisioning the system in a car, but you are correct about a motorcycle; you would require a different setup due to noise.

As for the processor: if you were to tune the AI so that it only performed the minimum of what you needed, then you may be able to get it to operate in 200 MHz or so even if you wanted to use a true NLP parser.  You could cheat and go down the road of pattern matching on sentences, looking for keywords to act upon, and that would be a lot cheaper.  It totally depends on how smart you want this thing to be when it comes to processing commands and retorting.
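
The “cheat” looks something like this in C: scan the recognized text for keywords and dispatch to a handler. The command set and handler names here are invented, just to show the pattern:

    #include <stdio.h>
    #include <string.h>

    typedef struct {
        const char *keyword;
        void (*handler)(const char *utterance);
    } keyword_rule;

    static void do_speed(const char *u) { printf("adjusting speed: %s\n", u); }
    static void do_stop(const char *u)  { printf("stopping: %s\n", u); }

    /* First match wins; order the table by priority. */
    static const keyword_rule rules[] = {
        { "stop",  do_stop  },
        { "speed", do_speed },
    };

    static int dispatch(const char *utterance)
    {
        for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++) {
            if (strstr(utterance, rules[i].keyword)) {
                rules[i].handler(utterance);
                return 1;
            }
        }
        return 0;  /* nothing matched; ask the user to rephrase */
    }

    int main(void)
    {
        dispatch("increase speed by 15 percent");
        return 0;
    }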

The voice recog is another beast entirely, though, and from what I know about that discipline, performing accurate voice recognition and speech-to-text in 200 MHz and a few megabytes of memory will be quite a task.  If anyone here knows this subject better than I do, maybe you should speak up smile
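
To put some very rough numbers on that, by the way: the audio front end itself is cheap.  At 16 kHz with a feature frame every 10 ms you get 100 frames a second, and an FFT plus filterbank per frame is on the order of tens of thousands of operations, so only a few million per second.  What eats the budget is scoring the acoustic models and searching: evaluating a couple of thousand Gaussian mixtures over ~39 features per frame is already tens of millions of operations per second, and the search over the HMM state graph for anything beyond a tiny vocabulary can easily be several times that again, before the NLP side gets a single cycle.  Ballpark figures from memory, not measurements from any particular engine, but they show how 200 million operations a second gets thin very fast.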

 

 
  [ # 28 ]

Hi Dan, Hi Daniel

As I was following this thread, paying attention to the subject while keeping the noise away, I saw an interesting project, and I want to add some thoughts on it based on my experience as an electronics engineer, NLP researcher and software developer.

A car-voice interface inside a car is a complicated issue; many companies are trying to do / doing this, including Honda, Volkswagen, Audi, and more to come…

I divide the work into building blocks for analysis.

Stage 1: Voice Recognition. This involves a complex trade-off between quality of recognition, noise, and vocabulary coverage, on top of the language model. The ‘best’ open-vocabulary, state-of-the-art ASR (Automated Speech Recognition) precision lies somewhere between 30 and 70%, so up to 3 of every 10 words will be missed or misunderstood unless you limit the vocabulary and/or put a robust language model behind it. This needs lots of memory for the storage of the modelling algorithms (whether HMM, SVM, CRF, MaxEnt, etc.). It also needs a lot of training; even if it is meant to be user-independent, training is needed to raise the user’s recognition rate and lower the error rate. Here you need a several-core, several-GHz processor (a Core i7, for example), or a DSP packed with a lot of firmware doing the CPU’s calculations to lower the CPU load, but developing such hardware requires quite a large investment (time + money).
Siri and Android’s Google ASR run on the web, backed by a powerful multi-core, petaflop cloud architecture, and you cannot pack all of that inside a black-box cube, inside a car, running off a single 60 Ah 12 V battery!

Stage 2: You will need a framework to host the AI and accept high-level language commands: interpret the text out of the ASR, classify the commands, and do the NLP processing needed to determine the message content (POS tagging, chunking, NER, parsing, etc.); then link to the actual engine status and trigger the NLG (Natural Language Generation) stage to get the response. This will require another powerful CPU, big and fast data access for the vocabulary, and good modelling behind it to perform efficiently, just in time: you cannot afford a miss. If the car is running and the driver issues the order “Turn Left”, the system must turn left on the next output; you cannot still be disambiguating the words until… crash! (end of story)

Stage 3: The output logic (actuators, measuring, etc.) will fit on a fingernail, because the abstraction is simple and a simple reasoner can be built in silicon, running on a few-MHz CPU.


Stage 2 is complex, as many on this thread have pointed out. Some issues are:

ASR is resource-hungry, even for a small number of words, and more so in the presence of noise…

Morphological annotation of the recognized words is simple in English, but very complicated in highly inflected languages like Spanish, Portuguese, French, Polish, Russian, Italian, German, etc.

POS tagging is complex, but easy in a limited context.

NER is even more complex, but simple over a limited set of entities.

Parsing is also complex and language-dependent, especially with a free-form interface like speech; many parser engines will fail due to the ambiguities of the grammar.

You will need to develop a special grammar for this element, keeping it unambiguous and free of other minor grammar issues. You will need a robust statistical parser, or a GLR parser with context-dependent parsing, which is a difficult problem, given that NLP parsing theories are generally based upon CFG grammars, and a conversation is not!

You also need to do some co-reference (anaphora) resolution; this is not simple, is highly language-dependent, and is not context-free.

NLG is complex; there are not many standards. To do it simply you can populate it with canned responses; otherwise you must do planning (another not-so-well-developed NLP + AI area), which is also not well known; there are few algorithms and fewer frameworks available.
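
To give a feel for the canned-response route: it can be as plain as fixed templates with slots filled from the engine status, as in this little C sketch (fields and wording invented for illustration):

    #include <stdio.h>

    typedef struct {
        int speed_kmh;
        int stability_on;
    } engine_status;   /* hypothetical status snapshot */

    /* One canned template per query type; slots filled at run time. */
    static void respond_speed(const engine_status *s)
    {
        printf("Current speed is %d km/h. Stability control is %s.\n",
               s->speed_kmh, s->stability_on ? "active" : "inactive");
    }

    int main(void)
    {
        engine_status st = { 88, 1 };
        respond_speed(&st);
        return 0;
    }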

On top of all this, these components are all highly language-dependent; there is no single cross-language element that copes with every issue.

Those were exactly the kinds of problems I was facing when, 6 years ago, I tried to design and build a conversational, self-contained robot for industrial usage. This is why I have been developing, ever since, a complete framework to address all of the above-mentioned issues; only ASR was left out, because that part is hardware + software dependent and the existing solutions are good enough.

Currently I am targeting .NET, which can also be industrially embedded on many processors.

If you like, I can disclose some of the architecture behind the development; it reaches >600k lines of code across 34 projects, ranging from robust morphological text analysis, POS tagging, and robust context-aware GLR chunking, to anaphora resolution tied to a GLR engine (capable of handling unknown verb/noun/adjective inflections) for textual responses.
All this is wrapped with an event-based, high-level compiled language running on an AI runtime environment.

 

 
  [ # 29 ]

Andres has some excellent points.

 

 
  [ # 30 ]

Hi Andres

  Thanks for the detailed response.  I think (after reading your response) the “scope” of the AI we are asking for is not as broad as you suggest.  The two scenarios I explained above will probably be the extent of the scope.  Dan (Hughes), I believe, has the better understanding of the scope after reading those scenarios; he said it best: what we need is a “narrow” AI.  Make no mistake, I don’t doubt your assessment of an AI and what is required, and I’m quite sure that your brief technical description is accurate; hopefully we can find a mean between the extremes for our project.  Perhaps there is some way to encapsulate a broad AI into a narrower version, or a simpler copy.

 
