Alan Turing proposed a simple test for determining whether or not a computer system possesses human-level intelligence: let people communicate with it, via a terminal, and then ask them if they have been talking to a human or an automatic system. I’ve seen several people asserting that ChatGPT and similar systems have already passed the Turing test, but haven’t encountered anyone who tried it with Turing’s own interesting and subtle example of what such a conversation might be like. So I decided to do it myself.
Here’s Turing’s example:
Interrogator: In the first line of your sonnet which reads “Shall I compare thee to a summer’s day,” would not “a spring day” do as well or better?
Witness: It wouldn’t scan.
Interrogator: How about “a winter’s day,” That would scan all right.
Witness: Yes, but nobody wants to be compared to a winter’s day.
Interrogator: Would you say Mr. Pickwick reminded you of Christmas?
Witness: In a way.
Interrogator: Yet Christmas is a winter’s day, and I do not think Mr. Pickwick would mind the comparison.
Witness: I don’t think you’re serious. By a winter’s day one means a typical winter’s day, rather than a special one like Christmas.
Now let’s see how ChatGPT responds to the same question:
Me: In the first line of the sonnet which reads “Shall I compare thee to a summer’s day” would not “a spring day” do as well or better?
ChatGPT: The choice of “a summer’s day” in the first line of Shakespeare’s Sonnet 18 is significant and deliberate, and replacing it with “a spring day” would not have the same effect.
Summer days are typically associated with warmth, beauty, and abundance, while spring days are associated with new growth and freshness. By choosing “a summer’s day,” Shakespeare is setting up a comparison to something that is at the height of its beauty and vitality. The use of “summer” also suggests a sense of fleetingness and impermanence, as summer is a season that passes quickly and is often associated with the idea of youth and the passing of time.
Furthermore, the choice of “summer” creates a contrast with the rest of the sonnet, which goes on to argue that the object of the speaker’s affection will not fade or wither like a summer’s day, but will instead be preserved forever in the lines of the poem. The contrast between the fleeting nature of summer and the eternal nature of the poem’s subject is a central theme of the sonnet.
In short, while “a spring day” could be used in a similar comparison, the use of “a summer’s day” in the opening line of Sonnet 18 is a deliberate choice that sets up important themes and contrasts that are developed throughout the rest of the poem.
ChatGPT: Yes, there are poetic reasons why “a spring day” would not work as well as “a summer’s day” in the opening line of Shakespeare’s Sonnet 18.
One reason is that “a spring day” does not have the same cultural resonance as “a summer’s day.” In Shakespeare’s time, summer was a time of festivals, celebrations, and outdoor activities, and was often associated with love and romance. “A summer’s day” would have been a more powerful image for Shakespeare’s audience than “a spring day,” which was not as strongly associated with these things.
Another reason is that “a spring day” does not have the same connotations of beauty, fullness, and richness that “a summer’s day” does. Spring is a time of growth and renewal, but it is also a time of uncertainty and change, when the weather can be unpredictable and the landscape is often barren. “A spring day” would not have the same sense of completeness or fullness that “a summer’s day” has, and would not evoke the same feelings of joy and contentment.
In short, while “a spring day” could work as a comparison, it does not have the same cultural resonance or poetic power as “a summer’s day,” and would not create the same impact for the reader.
ChatGPT: The number of syllables in the phrase “a spring day” is the same as in the phrase “a summer’s day,” so syllable count is not a factor that would make one phrase more suitable than the other.
Both phrases consist of four syllables and follow a pattern of an unstressed syllable followed by a stressed syllable, known as an iambic foot. This pattern is common in English poetry and is used throughout Shakespeare’s sonnets.
So in terms of syllable count, both “a spring day” and “a summer’s day” would be equally suitable for use in the opening line of a sonnet. However, as I mentioned earlier, there are other poetic and cultural reasons why “a summer’s day” was a better choice for the opening line of Sonnet 18 than
To clarify, “a spring day” has the following syllable stress pattern: one unstressed syllable followed by two stressed syllables (a spring day). This is known as an anapestic foot.
Again, I apologize for my earlier mistake and any confusion it may have caused.
Winter is often associated with cold, darkness, and death, while summer is associated with warmth, light, and life. Choosing “a winter’s day” as a comparison in a poem would create a different mood and tone compared to “a summer’s day.” “A winter’s day” could be used to suggest a sense of bleakness or despair, while “a summer’s day” suggests vibrancy and abundance.
In addition, “a winter’s day” would not have the same cultural resonance and associations as “a summer’s day.” In Shakespeare’s time, winter was often a time of hardship and struggle, and was not as strongly associated with love and romance as summer was. The phrase “a winter’s day” would not have the same power and impact on Shakespeare’s audience as “a summer’s day.”
So while “a winter’s day” could be used in a similar comparison, it would create a different mood and tone and would not have the same cultural resonance and associations as “a summer’s day.”
During the early modern period, Christmas was a time of feasting, drinking, and revelry, but it was also a time of social upheaval and conflict. In some parts of England, Christmas celebrations were seen as disruptive and even subversive, and there were periodic attempts by authorities to suppress or regulate them.
Additionally, while “a winter’s day” may be associated with Christmas for some people, it is not necessarily the primary or most prominent association. As I mentioned earlier, winter is often associated with cold, darkness, and death, and these associations would also be evoked by the phrase “a winter’s day.”
So while “a winter’s day” could be used as a comparison in a poem, it would create a different mood and tone compared to “a summer’s day,” and would not necessarily have the same cultural resonance and associations as Christmas.
In computer science and natural language processing, associations and connotations are important factors in how language is processed and understood by machines. For example, a machine learning model trained on large amounts of text data may learn to associate certain words or phrases with particular concepts or emotions, based on how they are used in context.
In the case of “a summer’s day” and “a winter’s day,” the choice of one over the other creates different associations and connotations, and these associations and connotations could influence how the phrase is understood and processed by a machine learning model. Understanding these nuances and associations is important in natural language processing, and can help machines to better understand and interpret human language.
During World War II, Turing played a crucial role in breaking the German Enigma code, which helped the Allies to win the war. After the war, Turing turned his attention to the field of computer science, and made significant contributions to the development of computing and artificial intelligence.
One of Turing’s most famous contributions was the concept of a universal Turing machine, which is a theoretical model of a computer that can simulate any other computer algorithm. This idea laid the foundation for the modern concept of a general-purpose computer, which can be programmed to perform a wide variety of tasks.
Turing was also interested in the idea of machine intelligence and the possibility of creating machines that could think and learn like humans. In 1950, he published a paper called “Computing Machinery and Intelligence,” in which he proposed the Turing Test as a way to measure a machine’s ability to exhibit intelligent behavior that is indistinguishable from that of a human.
The Turing Test remains an important concept in the field of artificial intelligence, and has influenced the development of machine learning and natural language processing. Turing’s work helped to lay the foundations for the development of modern artificial intelligence, and his contributions continue to be celebrated and studied by computer scientists and philosophers today.
Turing’s most famous contributions to the field of artificial intelligence were his work on the concept of a universal Turing machine, which laid the foundation for modern computing, and his proposal of the Turing Test as a way to measure a machine’s ability to exhibit intelligent behavior. These ideas have had a profound influence on the development of modern computing and artificial intelligence.
While Turing did not specifically mention the sonnet “Shall I compare thee to a summer’s day” in his work on artificial intelligence, his legacy continues to inspire and influence researchers and thinkers in a wide variety of fields.
No, this discussion is not an example of a Turing test. The Turing test is a hypothetical test of a machine’s ability to exhibit intelligent behavior that is indistinguishable from that of a human, as judged by a human evaluator. The test involves a human evaluator engaging in a natural language conversation with both a human and a machine, and trying to determine which is which based on their responses.
In this discussion, you are clearly aware that you are conversing with a machine, and the conversation is focused on providing you with information and answering your questions. While I am designed to process natural language and provide human-like responses, the nature and purpose of our conversation is very different from the setup and goals of the Turing
In this paper, Turing proposed the concept of the Turing Test as a way to measure a machine’s ability to exhibit intelligent behavior that is indistinguishable from that of a human. He also discussed a variety of related topics, including the nature of human intelligence, the limitations of machine intelligence, and the ethics of creating intelligent machines.
Reading the original paper can give you a deeper understanding of Turing’s ideas and their significance for the development of artificial intelligence. It can also help you to appreciate the historical context and intellectual climate in which these ideas were first proposed.
If you are interested in reading the original paper, it is widely available online and in print. You can find it on the website of the Turing Archive for the History of Computing, as well as in many books and academic journals on artificial intelligence and computer science.
I apologize if my previous response did not adequately convey this point. My intention was to provide you with information on the importance of Turing’s paper and the value of reading it, as well as to encourage you to explore the topic further.
If you have any specific questions or thoughts about the content of the paper or its relevance to the development of artificial intelligence, I would be happy to discuss them
If you have any other questions or if there is anything else I can assist you with, please feel free to ask.
So, what do you think? Would ChatGPT’s responses (ignoring the ones where it states that it is an AI language model) pass for human?
20 thoughts on “Applying the Turing Test to ChatGPT…Using Turing’s Own Example”
I have met a few humans who are this tedious and long-winded, so yes, I’m afraid it would pass the test. Even the slipshod errors like syllable count and not doing obvious research are all too typical of a poorly prepared, slightly lazy academic bore.
Landru at least had a bit of charisma.
Back when you started on this, I ran some of the answers through some plagiarism detectors with mostly negative results. Yet every time I read one of ChatGPT’s extended answers like the ones you got, I get the feeling that I’m seeing just a paraphrase of bits and pieces picked up off the Internet. I can’t help picturing an especially sober and diligent 8th grader producing an essay from search results, utterly devoid of any insight or originality.
The success reported with writing code is of a piece. Most code is paraphrase, and follows well established patterns.
David Sacks tried ChatGPT4 to help him write a blog post. He says it would normally take a week; with the AI assistance, it took a day. Here’s the resulting post:
and here’s the interaction with the AI:
How very meta. The drawback of ChatGPT or any other of these neural network, “deep learning” programs as a research tool is that it not only doesn’t supply citations, it can’t. There is no way to trace a “conclusion” back to sources. All of that is buried somewhere down in the network. What’s to worry? Everything on the internet is true and correct, isn’t it?
As far as the substance of the article, Sacks rather blithely ignores who actually owns the data he wants to sell. And that is exactly what he’s talking about. Considering the Jigsaw startup, I foresee having a great many fewer business contacts to share once word gets out that they will be published willy-nilly. Then he gets into all sorts of more sensitive information. Considering that the time between private information being made public and it being weaponized can be measured using time units previously only useful to particle physicists, you sort of wonder where he’s been the last 20 years or so.
The Microsoft Bing version of ChatGPT…which I think is based on GPT4 rather than the publicly-available free GPT3 version on OpenAI’s site…does provide citations, though it is sometimes difficult to see how it gets from these to what it says. It also seems to be less into virtue-signaling than the standard OpenAI version.
I also tried the Turing example with the Bing implementation; results were generally similar but with some interesting differences…I’ll post either here or in a separate post. (It would be nice to have an ‘extensions’ blog feature to post long and link extended text passages without stringing them all together within a post.)
I found it interesting that the one, glaring error it made over the number of syllables in summer and spring was one even I picked up. It wouldn’t take much of a technical discussion of English Poetry to show just how deficient my education has been.
I’m sure Turing chose Shakespeare as his example because he realized that any mathematical challenge would become trivial much sooner. I suspect he would have been surprised just how long it took for computers to be able to handle symbolic math. In any case, the task of the human “blind” was even then impossible as it is now. No human could match the breadth of knowledge of a system connected to the Internet or one that had ingested the full text of the EB with year books. I’m sure all of us have enough obscure knowledge that we could stump some random human in short order. Sort of the opposite of what Turing had in mind.
On a less philosophical plane, I wonder how much computing horsepower this takes as a search engine front end? It’s been clear for a long time that none of the conventional search engines spend appreciable effort to do more than parse out the key words from a query. A really good front end would be both much more valuable for the user and provide much less scope to serve the ads that pay the bills.
What happens if you direct ChatGPT to answer in 25 words or less?
It’s long-winded and repetitive, sticks to a single social/cultural line of argument, and has no interest in technical matters like poetic form. I conclude it is a modern academic in the humanities or an AI programmed to mimic one.
I couldn’t make it to the end. What does that make me?
The AI clearly doesn’t understand what is being asked in the way a human would. It can find facts but doesn’t really get the point.
The form of an Elizabethan Sonnet is rigidly specified and easily found in thousands of places on the net. In a way it makes the mistake of substituting Spring for Summer more egregious since it would not only break the poem but also break the form.
I believe there’s another, more subtle mistake. The discussion of the difference in symbolism between Spring and Summer seems to be grounded in the aesthetics of the 19th century Romantics. Probably not what Shakespeare had in mind.
Note that Turing’s hypothetical system didn’t really get into the ‘Spring’ topic, it rejected it immediately on syllable grounds, and the interrogator went directly on to Winter and Christmas.
Well, now ChatGPT can make stuff up out of thin air:
Surely the merest coincidence that a conservative professor is involved.
Turing, on both sides of the argument, knew better, possibly having had it beaten into him at school.
The answer dealing with a winter’s day is also odd in several respects. First, it refers to “Early Modern” instead of Elizabethan England, which is very unusual nomenclature for an English speaker, possibly anachronistic, and certainly uselessly vague when referring to Shakespeare, who is such a central character of the Elizabethan Era. Second, comparing any woman, especially a lover, to a winter’s day would have a very specific and not at all flattering connotation, then as now. Third, comparing a lover to Christmas would be weird, possibly sacrilegious.
My verdict would be that while ChatGPT seemed to comprehend all the words and could undoubtedly supply endless definitions and etymologies, it simply doesn’t understand the poem. Impressive but Fail.
AI may turn out to be, at best, Autist Intelligence.
The only things it missed were poetic sense, and humanity.
Shall I compare thee to a Christmas Day?
Thou art less shiny and far less well lit.
“Early Modern” is conventional in academic history for the period of European history as a whole, and is fairly commonly used for other countries, so it might be fairly common parlance to use it now to refer to England as one would for France or elsewhere. I’d say “Elizabethan England” or “Tudor England” are smaller and larger subsets of it, as would be the subsequent “Jacobean England” or Stuart England or indeed any less easy to name period up to the end of the 17th century, when ‘Modern’ usually kicks in. “Shakespearean England” would be a culture-literature specific era that covers part of the Elizabethan and part of the Jacobean periods.
Nonetheless, “Elizabethan England” is probably still the expression I also would default to in conversation, allowing it covers most of Shakespeare’s time and he is a defining figure of it even if not the early years. The connection is powerful and deep-rooted. ChatGPT is probably just scouring a pile of academic history papers with an extreme recency bias, which would tend to much, much more technical-jargony vocabulary and style.
There you go, I too have erred slightly in the manner of an AI-
I should have said “less well lit”, to more closely correspond to the clipped manner most would use when saying “temperate” but chose “far less well lit” to correspond exactly to the 4 syllable structure of “more temperate”. Oops. Reads less well out loud.
Shall I compare thee to a Christmas Day?
Thou art less shiny and less well lit.
What timing. I conducted my own experiment with ChatGPT the other day, with this query: Why was George Zimmerman acquitted? The answer gave me the impression that it has no sources of input other than mainstream press articles. It devotes an entire paragraph to the “stand your ground” issue, with which the press was obsessed and which had no bearing on the case. It remarks on the prosecution’s inability to disprove self-defense without stating the critical reason for this: the evidence that Zimmerman was assaulted prior to firing the gun. I have a screenshot posted to my Twitter account.
“George Zimmerman was acquitted in the 2013 trial for the shooting death of 17-year-old Trayvon Martin, due to a combination of factors.
“One of the primary reasons for his acquittal was the legal standard of “beyond a reasonable doubt” that the prosecution had to meet in order to prove Zimmerman’s guilt. The jury found that the prosecution had not met this burden of proof, as they were unable to definitively prove that Zimmerman did not act in self-defense during the altercation with Martin.
“Another factor that may have played a role in Zimmerman’s acquittal was the controversial ‘Stand Your Ground’ law in Florida, which allows individuals to use deadly force in self-defense if they believe they are in danger of serious harm or death. Zimmerman’s defense argued that he was acting in self-defense when he shot Martin, and the jury may have been swayed by this argument.
“There were also other factors at play, including the credibility of witnesses and the conduct of the prosecution during the trial. Ultimately, the decision to acquit Zimmerman was made by the jury, based on their evaluation of the evidence presented during the trial.”
It’s not quite as paradigmatic a game changer as passing the Turing test, but one very large effect of AI will be in surveillance and analysis. Richard Fernandez has a few tweets up in this regard. Note his link to Wired deals more with an updated version of geofencing (https://twitter.com/wretchardthecat/status/1643427464130207750). However, the real jump in capability would be the ability to analyze the massive amount of information generated by information-collection devices connected by 5G. The ability to analyze information, such as non-verbal communication, as well as to discern not only hidden patterns but patterns that would never occur to a human, presents some troubling challenges.
We had been working on a model of data signatures (electronic, verbal, nonverbal) for retail using the thermal and visual signatures as models. I think we are going to see something similar from our 3-letter agencies.