Chatbot Suzette Fools One Judge During Loebner Prize Contest

My chatbot, Suzette, won this year’s Loebner Prize contest and even confused a judge into voting for her over a human (or should I say he confused himself). But here is the blow-by-blow of this weird event.

When I arrived at the contest, I figured I had good odds to win if nothing went horribly wrong. Yes, Suzette had easily qualified over the 3 other competitors (her score 11 pts, the nearest competitor 7.5). Her design and data naturally gave her an edge over her competitors on the human knowledge test questions of the qualifiers. But human judge chat was an entirely different matter than the qualification test. Still, I felt she could carry on a detailed conversation better than the others and should win.

Initial installation of the programs occurred on Friday. From prechat conversations with the other contestants I learned that A.L.I.C.E. came with 3 redundant disks. Yet all three turned out to be blank! What a scare that must have been. Dr Wallace managed to install by retrieving the program over the Internet. Cleverbot is now at 45 million lines of memorized user chat (at a rate of doubling every year). And UltraHal is now listening to tweets, so has 300K of user chat it learned and 400K of tweets it has accepted for learning (code decides if the user has had enough responses and doesn’t trigger any red flags).

Then we get to the competition. While the CalState organizers had initially planned to have various interdepartmental professors act as judges (like English dept, etc), those professors backed out at the last minute, so all the judges were from the Engineering/Computer Science dept. Talk about guys who might know what to expect from chatbots! And all the humans were students from the same departments. What a weird mixture to compete in. And then, each round was 25 minutes. That’s bad if you want to confuse a judge about who is human. But really, the programs have no chance for that. So it’s good because it gives the human judge time to compare each program against the others. Though it’s not clear to me that the judges tried to use their time to do that.

And the students didn’t really understand their role. It was merely to BE HUMAN and convince the judges of that. Before startup there was informal chatting between humans and judges, which was obviously inappropriate. It was then pointed out to the humans that, since the judges already knew their names, they had best use false ones in the competition.

So, Round 1. After a few exchanges, somehow Suzette got stuck repeating exactly what the judge said for the rest of the round. I have no idea how. The round is a total disaster. I’ve never seen such a bug before. Maybe it’s in my only-lightly-tested protocol for the competition. I have no idea. But it completely derails my hopes for Suzette. She could still win on points only if she outdoes her opponents for every other judge and the other contestants vary all over the place.

Round 2, a great demonstration of Suzette. She should win on this round alone.

Round 3 gets off to a horrible start. Somehow, Suzette can hear the judge but the judge can’t hear Suzette. Makes no sense. A couple of restarts of Suzette don’t fix this. Eventually they restart the judge program, and that clears it (not that that makes any sense either). Then, after a few rounds, it’s clear Suzette has the judge from hell. He wants to know who she’s going to vote for in the upcoming election (the unspecified California governor’s race). And when she has no useful answer he wants her to name a candidate in the race. And when she has no answer to that, he simply keeps repeating the question ad nauseam, insisting she answer it. Suzette gets irritated. Then she gets angry. Suzette then gets bored. Suzette threatens to hang up on him. The judge doesn’t back down until the last seconds of the round. I figure that’s the end of life as we know it.

Round 4 is a mixed bag. Suzette is ok but not great. It’s all over.

When the scores are tallied, Suzette ties with Rollo Carpenter’s Cleverbot for 2nd-3rd. Yet, it turns out, the 3rd round judge got the human subject from hell. Poetic justice! The human was all over the place: confusing, vague. The judge voted irritated/angry/bored Suzette as human. Instant win, since no other program swayed the judges.

What more can I say?
