What's wrong with the Turing test

Posted by Sam on Nov 14, 2007 at 08:30 AM UTC - 5 hrs

The Turing Test was designed to test the ability of a computer program to demonstrate intelligence. (Here is Alan Turing's proposal of it.) It is often described this way: if a computer can fool a person into believing it too is a person, the computer has passed the test and demonstrated intelligence. That description is a simplification.

Quoth Wikipedia about the rules:

A human judge engages in a natural language conversation with one human and one machine, each of which try to appear human; if the judge cannot reliably tell which is which, then the machine is said to pass the test.

Specifically, the test should be run multiple times. If the judge correctly identifies the human only about 50% of the time, which is no better than guessing, you might say he cannot tell the difference, and therefore the machine has demonstrated intelligence and the ability to hold a conversation with a human.
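
To make "about 50% of the time" a little more concrete, here is a minimal sketch of one way to check whether a judge's record is distinguishable from coin-flipping, using an exact two-sided binomial test. The function names and the 0.05 threshold are assumptions for illustration; neither Turing nor the Wikipedia summary specifies a statistical procedure.

```python
from math import comb


def binomial_two_sided_p(correct: int, trials: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely than the
    observed one, assuming the judge is guessing with success chance p.
    """
    pmf = [comb(trials, k) * p**k * (1 - p) ** (trials - k)
           for k in range(trials + 1)]
    observed = pmf[correct]
    # Small tolerance so symmetric outcomes are not dropped by float rounding.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, total)


def machine_passes(correct: int, trials: int, alpha: float = 0.05) -> bool:
    """True if the judge's hit rate is not significantly different from chance.

    alpha is an assumed cutoff, not part of any official test definition.
    """
    return binomial_two_sided_p(correct, trials) >= alpha
```

Under these assumptions, a judge who picks the human in 52 of 100 runs looks like a guesser, while one who is right in 90 of 100 runs clearly can tell the difference.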

I bring this up because recently Giles Bowkett said that poker bots pass the test, and pointed to another post where he said the Turing test was beaten in the 1970s by a paranoid repeater named PARRY.

I suppose the idea of poker bots passing the test comes about because (presumably) the human players at the table don't realize they are playing against a computer. But if that is the case, even a losing bot would qualify - human players may think the bot player is just an idiot.

More interesting is his idea that the Turing test was beaten in the 1970s.

In defense of that thought, Giles says that insisting on the requirement that the questioner "actively [try] to determine the nature of the entity they are chatting with" is splitting hairs. Even if you agree with that, the poker bot certainly does not qualify - there is no chat, and there is certainly no natural language.

If that still constitutes hair splitting, I'm sure we can eventually split enough hairs to reduce the test so that almost any program can pass. Give a program that computes sum(x, y) for x, y ∈ {0..9} to a primary school math teacher, and the fact that it can add correctly may lead the teacher to believe it is a smart kid. Then add in the occasional random wrong answer to make it more believable.
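
The reduction to absurdity is easy to spell out. A rough sketch, where the function name, the 10% error rate, and the off-by-one style of mistake are all made up for illustration:

```python
import random


def schoolkid_sum(x: int, y: int, error_rate: float = 0.1, rng=None) -> int:
    """Add two single digits, occasionally getting it wrong on purpose.

    error_rate and the off-by-one mistake are illustrative assumptions.
    """
    if not (0 <= x <= 9 and 0 <= y <= 9):
        raise ValueError("x and y must be single digits (0..9)")
    rng = rng or random
    correct = x + y
    if rng.random() < error_rate:
        # Slip by one; never go below zero so the mistake stays plausible.
        offset = 1 if correct == 0 else rng.choice([-1, 1])
        return correct + offset
    return correct
```

Nothing here is intelligent in any useful sense, which is the point: the weaker the test conditions, the less a "pass" tells you.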

In any case, if you accept the hair-splitting argument about the program that actually did chat in natural language, then certainly PARRY passed the test. On the other hand, while you may consider the test passed in fact, it was not passed in spirit.

I'll admit that I think the requirement of a tester talking to both a human and a computer program is an important one. Even disregarding the "no occasion to think they were talking to a computer" condition, if we limit the computer's responses to a particular set of people who think they are in a particular context, we could (for example) use Markov-model generated sentences in which some characters are replaced by their neighbors on the keyboard. We now have a drunk typist. Perhaps in his drunken stupor he is not at his sharpest, but he would likely still be considered an intelligent being.
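
A minimal sketch of such a drunk typist, assuming a word-level bigram Markov chain and a hand-written, deliberately partial QWERTY neighbor map. All names, the sample corpus in the usage note, and the typo rate are hypothetical:

```python
import random
from collections import defaultdict

# Partial QWERTY adjacency map (an assumption; extend it for real use).
NEIGHBORS = {
    "a": "qwsz", "e": "wsdr", "i": "ujko", "o": "iklp",
    "t": "rfgy", "n": "bhjm", "s": "awedxz", "r": "edft",
}


def build_chain(text: str) -> dict:
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate(chain: dict, start: str, length: int, rng: random.Random) -> str:
    """Walk the chain from start, emitting at most `length` words."""
    word, out = start, [start]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:  # dead end: the word never had a successor
            break
        word = rng.choice(followers)
        out.append(word)
    return " ".join(out)


def drunk_type(sentence: str, typo_rate: float, rng: random.Random) -> str:
    """Replace some characters with a key adjacent to them on the keyboard."""
    chars = []
    for ch in sentence:
        if ch in NEIGHBORS and rng.random() < typo_rate:
            chars.append(rng.choice(NEIGHBORS[ch]))
        else:
            chars.append(ch)
    return "".join(chars)
```

For example, `drunk_type(generate(build_chain(corpus), "the", 8, rng), 0.15, rng)` with `rng = random.Random()` and any `corpus` string containing the word "the" produces a short, slurred sentence built only from word pairs seen in the corpus.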

If the human tester is able to ask questions of both the computer and another human, do you think he would ever choose the computer program that rephrases his questions to sound paranoid as the human over the person who answers to the best of his ability?

The point is that without that requirement, the test becomes meaningless. Giles notes that

All the test really does is expose how little we know about what makes people people. The fact that it's easy to fool people into thinking a computer is human doesn't actually teach you anything about the difference between computers and humans; all it does is teach you that it's easy to fool people.

I think that's only true if you conceive of the test as he does. By keeping all of the requirements Turing proposed, it becomes quite a bit harder to pass, and retains its utility. Further, removing that requirement changes the essence of the test. Since we can assume the test was designed to be useful, can we really say that insisting the test retain its essence is unreasonable?

What do you think?

Last modified on Nov 14, 2007 at 08:32 AM UTC - 5 hrs

Comments

Hi Sam - always fun to see a fellow ColdFusioner ramble OT on more philosophical tangents.

I'm not entirely sure I follow your take on the Turing test, but given your characterization of it, I can't help but wonder whether you might be missing an important point. The Wikipedia quote is a bit limited - it should read:

"If the judge cannot reliably tell which is which, then the machine is said to pass the test..." ...in the context alone of being accessible only via teletype and providing no additional visual or other communicative cues. Passing that test in that context does not translate into passing other tests in other contexts.

The Turing Test was deliberately misnamed - it was only an observation about the assumptions you must make in navigating the world in real-time.

You assume a fictional center of gravity in the soccer ball coming down on your head because if you took the time to calculate its mass, trajectory, wind conditions, etc., it would fly right past you.

Similarly, you assume I'm not an insulted PARRY coming back to flame you because actually taking the time to confirm my identity would consume your afternoon if it could be done at all.

Which brings me to the real point Turing was getting at: the Turing Test is really about the illusion of intelligence and of self. In a certain context, convincing you of intelligence is the same thing as being intelligent by definition.

The crazy thing is that you do this with your own fictional self narrative too. As Daniel Dennett writes: "all introspection is nothing but impromptu theorizing."

Anyway, like I said, fun to take a break from my own spaghetti code and think deep thoughts a bit - thanks for the opportunity.

Posted by Jason on Nov 28, 2007 at 08:16 AM UTC - 5 hrs

@Jason - it's sort of on topic. I /do/ have an AI/Machine learning category on here ;)

I like your points. I guess I wasn't talking about the original test as Turing explained it as I made it seem, but more about what the test had become.

I get that "In a certain context, convincing you of intelligence is the same thing as being intelligent by definition." In fact, those are good words to use. I guess my main "beef" was that dropping the requirements as Giles suggested took away that certain context for me. I hoped to show that with a little reduction to absurdity... dunno if that was convincing though.

Anyway, thanks to you for the thoughtful comment as well! It's definitely given me the urge to go read the paper in full again!

Posted by Sammy Larbi on Nov 28, 2007 at 04:28 PM UTC - 5 hrs

I believe this might be relevant:
...
In Turing Test Two, two players A and B are again being questioned by a human interrogator C. Before A gave out his answer (labeled as aa) to a question, he would also be required to guess how the other player B will answer the same question and this guess is labeled as ab. Similarly B will give her answer (labeled as bb) and her guess of A's answer, ba. The answers aa and ba will be grouped together as group a and similarly bb and ab will be grouped together as group b. The interrogator will be given first the answers as two separate groups and with only the group label (a and b) and without the individual labels (aa, ab, ba and bb). If C cannot tell correctly which of the aa and ba is from player A and which is from player B, B will get a score of one. If C cannot tell which of the bb and ab is from player B and which is from player A, A will get a score of one. All answers (with the individual labels) are then made available to all parties (A, B and C) and then the game continues. At the end of the game, the player who scored more is considered had won the game and is more "intelligent".
...

http://turing-test-two.com/ttt/TTT.pdf
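
One reading of the scoring rule quoted above, sketched in code. This is an interpretation of the excerpt, not the paper's own implementation; the class and function names are hypothetical, and "C cannot tell" is simplified to "C picked the wrong item as the player's own answer".

```python
from dataclasses import dataclass


@dataclass
class Round:
    # Group a holds A's own answer (aa) and B's guess at it (ba).
    c_found_a: bool  # did C correctly pick aa as A's own answer?
    # Group b holds B's own answer (bb) and A's guess at it (ab).
    c_found_b: bool  # did C correctly pick bb as B's own answer?


def score_round(r: Round) -> tuple:
    """Return (points for A, points for B) for a single question."""
    # A scores when A's guess of B's answer fooled C in group b.
    a_points = 0 if r.c_found_b else 1
    # B scores when B's guess of A's answer fooled C in group a.
    b_points = 0 if r.c_found_a else 1
    return a_points, b_points


def play(rounds) -> tuple:
    """Total the scores over all questions; the higher total wins."""
    a_total = b_total = 0
    for r in rounds:
        a, b = score_round(r)
        a_total += a
        b_total += b
    return a_total, b_total
```

Note that each player is rewarded for modeling the other player well enough to fool the interrogator, not for the quality of his or her own answer.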

Posted by huoyangao on Dec 26, 2007 at 06:27 PM UTC - 5 hrs

Interesting paper. Do you have plans to keep it there forever?

How do you deal with the problem that arises:

1) What is your gender?
aa - I am male
ba - I am female

If the interrogator decides ba is the correct answer, then on any later question that might hint at gender, B's answers, if they are more suited to a female, would likely be chosen by the interrogator as the correct ones, even if A's answers are really good.

Of course, it doesn't just have to be gender - there are multitudes of questions that could be similarly biased.

Posted by Sammy Larbi on Dec 27, 2007 at 10:12 AM UTC - 5 hrs

Yes, I plan to keep the paper there forever.

On the gender question.
Let's say A is male but somehow C (mistakenly) thought A was more likely to be female. In that case, B does score one point. However, according to the TTT protocol, once this question is finished and scored, C will be informed of all the labels, so C will then know A is male. Therefore this one mistake by C will not affect future scoring.

Things will be clearer if the test is done in multiple-choice format. Since B failed to guess A's answer, B will not score any points.

In short, TTT is not simply two human beings taking a Turing Test.

Thanks for your comment and happy new year.

Posted by huoyangao on Dec 31, 2007 at 06:28 PM UTC - 5 hrs

Ok, that's what I thought but I wanted to be sure that's what you meant to get across in the paper. In that case, how do you get around the opposite bias? (Or why doesn't it matter?)

I understand it's not two humans - but I was thinking you might still ask a question like that - the goal for the machine is to come up with intelligent answers that make it sound human, so the response to "are you male or female" probably should not be "I am a machine." =)

Happy new year to you as well!

Posted by Sammy Larbi on Jan 01, 2008 at 11:31 AM UTC - 5 hrs
