For all their talents, neural nets have a strange unwillingness to say three simple words: “I don’t know.”
[A little housekeeping before we get to the main event. This week is the paperback release of Extra Life, my book on the doubling of global life expectancy that came out last year. If you haven’t had a chance to read it, you might want to pick up a copy now. (Here’s Steven Pinker’s review of it from The New York Times if you’re interested in learning more.) And of course, if you’d like a signed personalized copy of the hardcover, you can always sign up for the “Ideal Reader” subscription tier here at Adjacent Possible.
The post below is another in my continuing series exploring the frontier of deep learning and AI, The Prompter; the opening section is available to all subscribers, while the second half, which dives into some of the thornier questions, is accessible only to paying subscribers. Going forward, the newsletter won’t be quite so AI-obsessed — I just had a lot of material to explore in the wake of the Times Magazine piece on large language models from last month.
Okay, now back to our regularly scheduled programming…]
In my essay last month about GPT-3 and large language models, I talked about the propensity of these systems to “hallucinate” facts out of thin air, one of the core challenges that will have to be overcome if they are ever to become genuinely trustworthy. In the piece, I described the entirely fictitious bio of a made-up Belgian chemist that GPT-3 dutifully served up to me, and a strange episode where the software hallucinated that it had somehow spent some time in prison. But a few weeks ago, I had another interaction with GPT-3 that I think underscores the weirdness of this behavior. I happened to be working on a piece that involves Doug Engelbart, the inventor of the computer mouse and one of the key figures behind the graphical interface. In the piece, there’s an opening paragraph describing a formative experience Engelbart had as a Navy technician in late 1945, while stationed on an island in the Philippines. Just for fun, I decided to give GPT-3 the paragraph as a prompt, and see where it would go next.

My hunch is that this kind of improvisational, collaborative work is where GPT-3’s descendants will first become useful to writers: you write for a bit and then ask the software to conjure up a dozen potential new directions to explore, most of which will be useless or too obvious, but some of which will open up new avenues for you. This will all be particularly relevant when you can train future models locally on your own reading/writing history, so that the results are uniquely suited to your own set of interests and influences.
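For the technically curious, that workflow (prompt the model with your draft, sample a dozen continuations, keep the rare useful one) can be sketched with the OpenAI Python library. Everything below is illustrative, not what I actually ran: it assumes the pre-1.0 `openai` package’s Completion endpoint, the model name is a stand-in, and the little filtering helper is a toy placeholder for the writer’s own judgment.

```python
# Hypothetical sketch of the "dozen continuations" workflow described above.
# Assumes the classic (pre-1.0) openai package and an API key in the
# OPENAI_API_KEY environment variable; model name and parameters are
# illustrative assumptions, not a record of the session in this essay.
import os

DRAFT = (
    "In late 1945, a young Navy technician named Doug Engelbart "
    "was stationed on an island in the Philippines..."
)

def pick_interesting(continuations, min_words=30):
    """Drop continuations too short to open a real new direction.
    (A crude stand-in for the writer's judgment, which no filter replaces.)"""
    return [c.strip() for c in continuations if len(c.split()) >= min_words]

if os.environ.get("OPENAI_API_KEY"):
    import openai
    resp = openai.Completion.create(
        model="text-davinci-002",  # illustrative model choice
        prompt=DRAFT,
        n=12,                # a dozen candidate directions, most discardable
        max_tokens=200,
        temperature=0.9,     # sample hot, for more varied riffs
    )
    candidates = [choice.text for choice in resp.choices]
    for i, text in enumerate(pick_interesting(candidates), 1):
        print(f"--- direction {i} ---\n{text}\n")
```

None of this, of course, does anything about the hallucination problem itself; it just makes the collaborative loop cheap enough to be worth the sifting.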
When I fed GPT-3 the Engelbart story, this is what it suggested for the next paragraph:
When he arrived at the weather station, Engelbart was assigned a dormitory room with a couple of other men. One of them was a giant, a man named Ivan Sutherland, who was six feet six and weighed two hundred and fifty pounds. Sutherland was an engineering student at the University of Utah, who had been drafted into the Navy. Sutherland’s hobby was building radio sets out of spare parts, and he and Engelbart would spend hours discussing the technology. “He was a really good teacher,” Engelbart later said. “He taught me a lot about electronics.” Sutherland had also brought with him a stack of Popular Science magazines, and he shared them with Engelbart. “I remember one issue in particular, it was about the future of computers and how it was going to revolutionize society,” Engelbart recalled. “It was talking about how computers were going to be used for all kinds of different things, including playing bridge and chess.”
I nearly spit out my coffee when I first read this paragraph. Usually when you give GPT-3 a prompt involving a relatively well-known figure like Engelbart, the software generates results that are reliable, but a bit boring if you already know something about the person in question. They often feel like a well-crafted summary of a Wikipedia entry, which is not that far from what they actually are, most of the time. But this paragraph was something very different. I happened to know that Ivan Sutherland—who taught at the University of Utah—had, in the early 1960s, invented “Sketchpad”—generally considered the first visually interactive software ever created. So Sutherland and Engelbart had already been associated in my mind, given their shared “60s interface pioneers” bios. But this was a twist in the story that I had never heard! Sutherland and Engelbart had been thrown together by the tides of war on a South Pacific island, and had bonded over their ham radio obsessions decades before they would go on to help invent modern computing.
Now this is impressive, I thought: I’d turned to GPT-3 as a source of ideas for the second paragraph as a lark, but damn if it didn’t actually deliver an historical anecdote that was genuinely useful for my essay. But of course I was using experimental software here, so I started to dig around to figure out the actual facts behind the story. The first thing I uncovered was that Sutherland followed up his work on Sketchpad with a stint at ARPA, which funded some of Engelbart’s research. Further confirmation! I started writing the sentences in my head: Almost two decades after their serendipitous encounter in the Philippines, the two men found themselves puzzling over the same problems that captivated them so many years ago.
The second thing I uncovered was this: Ivan Sutherland was eight years old in 1945.
GPT-3 had just hallucinated the entire thing.
Now, on the one hand you have to give the software credit for its uncanny sense of what would have been a brilliant plot twist in young Doug Engelbart’s life, a random Navy assignment that leads to a world-changing collaboration, exactly the sort of serendipitous link that would have delighted his future biographers, were it actually true. But on the other, much bigger, hand, you’d have to say: what the hell, GPT-3? You just made all that up! I mean, I suppose it’s possible that Engelbart did time in the Navy with another Ivan Sutherland associated with the University of Utah who was not eight years old at the time. (I searched around a bit and couldn’t come up with anything.) But it seems pretty clear that the software was just making it up as it went along—another case study of GPT-3’s troublesome skill as a “bullshit artist,” as Gary Marcus likes to say.