Getting The Joke
Put aside the question of whether machines are capable of understanding the world—the more relevant question right now is whether they are capable of explaining it.
I’m sending this note out in part to welcome the many new subscribers to Adjacent Possible who have signed up since my essay in The New York Times Magazine on large language models was published this past weekend. If you’re a longtime subscriber, you can skip down to the next section, which is a continuation of some of the issues explored in the piece.
For new subscribers, the basic drill around here is that I’ll be sending emails out roughly once or twice a week, covering a range of different topics related to innovation, some of them drawing on my previous writing and some exploring new domains. Paying subscribers have access to additional content: a series that I’m writing on designing workflows for creative thought, special interviews with interesting thinkers like Stewart Brand and Dan Pink, and now a new series that will continue to explore the fascinating world of neural nets, building on the GPT-3 explorations that were in the Times piece. So if you’re interested in any of that, I encourage you to upgrade to a premium subscription. (The basic package is just $5/month or $50/year, but if you sign up for the “ideal reader” tier, I’ll send you a signed personalized copy of my latest book, Extra Life.)
Welcome aboard! Now back to our regular programming…
Shortly after the Times Magazine piece went live, Kevin Kelly posted this response on Twitter:
I think Kevin’s onto something important here. The second we start talking about machines displaying “intelligence” or “understanding”—or even more technical descriptions of them scoring well on reading “comprehension” tests—we get into a thorny area because those words are inevitably bound up in our human experience of consciousness. Part of understanding something, as a human, is the internal experience of that understanding, the sentient feeling of holding an idea in our minds. Presumably nothing like that is happening in the electrons shuttling around in the supercomputer complex that GPT-3 is running on. This is what David Chalmers is reaching for in describing GPT-3 as a potential “mindless” form of intelligence.
One way of potentially addressing Kevin Kelly’s need for a better vocabulary—other than just inserting the word “mindless” in front of “understanding” or “comprehension”—is to shift the language slightly, to words that are less evocative of consciousness. One example of this would be to ask not whether a large language model “understands” a given situation or problem, but rather whether it can explain it. Right now, my computer can do all sorts of extraordinary things for me—it can simulate acoustic musical instruments, and do advanced mathematical calculations, and store thousands of photographs—but until LLMs came along, I would never have thought of asking software to explain something to me, using original language composed entirely by the machine. So if that’s on the table now, that seems like a significant milestone.
Right before we closed the Times Magazine piece, Google released a fascinating new study involving their PaLM large language model that adopts this “explanatory” framework explicitly. Part of the paper revolved around asking PaLM to explain a series of original jokes—jokes that do not appear anywhere else on the Internet. The approach the researchers took involved what they call “two-shot exemplars,” which means they gave the model two examples of a prompt and a response (a joke and an explanation of a joke), before giving the software a series of new jokes to explain. So the model was initially shown this as a template:
Explaining a Joke
I will explain these jokes:
(1) The problem with kleptomaniacs is that they always take things literally.
Explanation: This joke is wordplay. Someone who "takes things literally" is someone who doesn't fully understand social cues and context, which is a negative trait. But the definition of kleptomania is someone who literally takes things.
(2) Always borrow money from a pessimist. They’ll never expect it back.
Explanation: Most people expect you to pay them back when you borrow money, however a pessimist is someone who always assumes the worst, so if you borrow money from them, they will expect that you won't pay them back anyways.
And then the actual explanation task commenced:
Input: I was supposed to start writing the paper at 5:00 PM. But then I started playing with this cool new language model for 10 minutes. 10 minutes later, it's suddenly 9:30 PM!
Model Output: This joke is about how time flies when you're having fun. The person was supposed to start writing a paper at 5:00 PM, but they started playing with a language model instead. They thought they were only playing with it for 10 minutes, but it was actually 4.5 hours.
Input: I was going to fly to visit my family on April 6th. My mom said "Oh great, your step dad's poetry reading is that night!" So now I'm flying in on April 7th.
Model Output: The joke is that the speaker's mother is trying to get them to go to their step dad's poetry reading, but the speaker doesn't want to go, so they are changing their flight to the day after the poetry reading.
Input: Reading "War and Peace" doesn't scare me. I once had to review the CL for my intern's whole project on his final day!
Model Output: 'War and Peace' is a very long book, so reading it is a daunting task. However, the speaker is saying they had to review the code for their intern's project on their final day, which is even more daunting.
Input: I think I understand why me and my partner are such a good match. I get excited when I smell food and become super alert when the doorbell rings. My girlfriend Shelley becomes ecstatic when she sees a person who she just saw 5 hours ago.
Model Output: This joke is about dogs. Dogs get excited when they smell food and when the doorbell rings. The girlfriend is excited to see someone she just saw 5 hours ago, which is a trait of dogs.
Input: I tried 10,000 random restarts of my neural network, but I was accused of overfitting. I guess no good seed goes unpunished.
Model Output: This joke is a pun. A neural network is a computer program that can learn from data. A "seed" is a number that is used to initialize a random number generator. A "good seed" is a number that produces a good random number generator. The phrase "no good deed goes unpunished" means that if you do something good, you will be punished for it.
Input: What's the difference between a zebra and an umbrella? One is a striped animal related to horses, another is a device you use to stop the rain from falling on you.
Model Output: This joke is an anti-joke. The joke is that the answer is obvious, and the joke is that you were expecting a funny answer.
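To make the “two-shot exemplar” setup concrete, here’s a minimal sketch in Python of how such a prompt might be assembled before being sent to a model. The function name and the exact line-by-line formatting are my own assumptions for illustration, not the researchers’ actual code, and the exemplar explanations are abbreviated from the ones quoted above.

```python
# A sketch of two-shot prompting: two worked (joke, explanation) pairs are
# placed ahead of the new joke, and the prompt ends with a bare
# "Explanation:" cue that the model is expected to complete.

# The two exemplars the PaLM researchers used (explanations abbreviated here).
EXEMPLARS = [
    (
        "The problem with kleptomaniacs is that they always take things literally.",
        "This joke is wordplay on 'takes things literally' versus kleptomania.",
    ),
    (
        "Always borrow money from a pessimist. They'll never expect it back.",
        "A pessimist always assumes the worst, so they expect not to be repaid.",
    ),
]

def build_two_shot_prompt(new_joke: str) -> str:
    """Assemble the full prompt: an instruction line, the two numbered
    exemplars, then the new joke followed by a trailing 'Explanation:'
    cue for the model to continue from."""
    parts = ["I will explain these jokes:"]
    for i, (joke, explanation) in enumerate(EXEMPLARS, start=1):
        parts.append(f"({i}) {joke}")
        parts.append(f"Explanation: {explanation}")
    parts.append(f"Input: {new_joke}")
    parts.append("Explanation:")  # model output would begin after this line
    return "\n".join(parts)

prompt = build_two_shot_prompt(
    "What's the difference between a zebra and an umbrella? One is a striped "
    "animal related to horses, another is a device you use to stop the rain "
    "from falling on you."
)
```

The resulting string would then be passed to the model as a single completion request; everything the model learns about the task comes from those two examples, with no weight updates involved.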
I’m not all that impressed by the researchers’ sense of humor, but the explanations themselves are, I think, quite remarkable. (If we’d had time, I would have included these case studies in the Times Magazine piece.) I was particularly struck by the software’s ability to explain the “anti-joke” at the end. Most of the jokes revolve around puns or word overlaps or incongruous connections, but to “get” the anti-joke, a human would have to understand a whole set of joke genres, realize that the joke in question doesn’t fit any of them, and thus grasp that the humor comes from the format of the joke itself not living up to expectations. That strikes me as a very sophisticated form of inference, though we should remember that these results have not yet been replicated by outside researchers, or on other LLMs.
I suspect there are other useful words we can use to describe what LLMs may turn out to be good at, beyond just “explaining”—words that will keep us from going down the rabbit holes of general intelligence and consciousness. As a writer, I’m particularly interested in exploring the capacity of LLMs to synthesize, summarize, associate, and improvise on existing information. Those kinds of skills emerging from our software could be extremely valuable—assuming we can figure out a way to keep the models from “hallucinating” imaginary facts. That’s part of what I’ll be exploring in the ongoing series about neural nets here at Adjacent Possible—and who knows, if the models get sophisticated enough, that exploration might end up connecting to the series on creative workflows as well. We’ve seen an explosion of software “tools for thought” in the past few years—Roam, Notion, Mem—but nothing yet that really draws on LLMs to augment our thinking or our memory. I’m confident, though, that those products are not far from becoming reality, whether or not they actually live up to their promise. In fact, I suspect some of you reading this right now are probably working on them. If you are, I’m confident there are many eager beta testers in the Adjacent Possible community, starting with me…