Machine Readable
Language models are opening new avenues for inquiry in historical research and writing. But will they undermine the reading of primary sources?
Last Sunday's edition of The New York Times Magazine was a special issue devoted to answering a particularly helpful question: what is AI actually useful for today? Put aside the long-term debate about radical job loss and superintelligence; we've had more than two years of consumer-grade AI applications, which should be plenty of time for them to prove their merits as functional tools, even if more ambitious applications lie further down the line. So where is AI actually helping people right now?
The issue features a long, probing meditation on how AI is beginning to transform the way we analyze and write about history, written by my long-time editor Bill Wasik. The piece is bookended with a case study of my use of NotebookLM to investigate early ideas for a project about the Gold Rush that I first wrote about here at Adjacent Possible last year. Here's one key section:
Johnson showed me the results of his experiments so far. He started his brainstorming process by giving NotebookLM excerpts from one of the finest existing histories on the Gold Rush, H.W. Brands’s “The Age of Gold.” He thought he might want to focus on the conflict between white gold-seekers and the Native American groups living in the Yosemite Valley in the 1850s, so he uploaded the text of an older source called “Discovery of the Yosemite,” by Lafayette Houghton Bunnell, who was part of the Mariposa Battalion, the militia unit that rode into the valley in 1851. Next, to bring in the Indigenous perspective, he went to public-domain websites and found two accounts about the people whom the battalion expelled from the valley: “The Ahwahneechees: A Story of the Yosemite Indians” and “Indians of the Yosemite Valley and Vicinity.”
Johnson started his conversation with NotebookLM with a little orientation, identifying himself as the author Steven Johnson, so that the A.I. (whose training allows it to understand almost the whole internet, as ChatGPT does, even if it constrains its answers to the uploaded sources) might get a sense for the kinds of books he writes. Then he started peppering it with questions: What in the two sources that focused on the Indigenous experience, he asked, was missing from the other two sources? When the model returned its summary, his eye was caught by the observation that “The Ahwahneechees,” by including short biographies of individual Yosemite Indians, “helps to humanize the people beyond being a tribal mass, which is something that ‘The Age of Gold’ and Bunnell’s book tend to do.” The A.I. listed some of their names, among them Maria Lebrado, granddaughter of the Yosemite chief Teneiya.
That piqued Johnson’s interest. He asked for more information about Lebrado, and the tool returned nearly 600 words of biography — that she was one of the 72 Native people forced to leave the valley by the battalion in March 1851; that she eventually married a Mexican man who ran a pack train down in the Central Valley; that she was “discovered” by a white historian in the 1920s and held up as the last of the original Yosemite Indians.
Right away, Johnson recognized that she would make a great character. He took note in particular of the fact that Lebrado returned to the valley near the end of her life. “I’m like, What’s the [expletive] structure of ‘Titanic’?” he joked. The book could open with what Johnson imagined was Lebrado’s emotional return to the valley at nearly 90 years old, before zooming back in time — to her childhood, to a broader cast of characters and the violent drama of the 1850s.
The essay does a great job of capturing the kind of collaborative brainstorming you can now do with an AI that is grounded in sources you have personally curated. Unless you have experienced it firsthand, it’s hard to explain how different it is from just riffing with a general-purpose model whose knowledge base is not a specific collection of works that you have assembled. (The one thing that I wish had been clearer in the Times essay is that I had already read the Brands and Bunnell books in their entirety -- the "excerpts" from Brands mentioned are specific passages that I had highlighted while reading and imported to Notebook using Readwise.) So much of the early-stage work for a book is really about probing vast archives of information, trying to see if there's a new angle that adds to the existing accounts. I didn't know for sure whether Maria Lebrado would in fact work as a major character after this initial line of inquiry -- I'm still not sure, in fact -- but discovering her story gave me something to anchor on as I read through the material, and suggested a potential structure for the book.
There's another point in the piece where I ask for Notebook's help filling out the details of a slightly wacky structure that had occurred to me later in the research: a variation of my "long zoom" idea in which the first chapter begins a million years before the Gold Rush, and each subsequent chapter gets progressively closer to the main event -- 100,000 years, 10,000 years, and so on. I asked Notebook to help me flesh out what actual events would go in each chapter, based on the existing timeline in the sources I'd collected. It's another variation of this probing exercise: if I structured it this way, how would it work in practice? I find this kind of exercise amazingly valuable, assuming I have done two key things first: 1) explained to Notebook who I am and what kind of book I'm trying to write, and 2) curated a unique set of sources that allows the AI to explore a new configuration of ideas, rather than just resorting to its typical "average" response. Again, it's not giving me a final, definitive outline -- it's helping me quickly draft possibilities. If there's a problem with the existing state of the models for this kind of task, it's that they are tuned to be a little too supportive; you have to deliberately ask them to be more skeptical if you want constructive criticism, and I don't yet trust them to make an independent value judgment about the merits of a specific approach. When I'm in this kind of exploratory mode, I mostly just ask Notebook to help me generate possibilities, and then I evaluate what works and what doesn't on my own.
***
The Times piece mentions a few other authors who have been using AI in comparable ways, as well as a few who are less intrigued by the approach. There's a brilliant line from Stacy Schiff, author of acclaimed biographies of Cleopatra and Véra Nabokov: “To turn to A.I. for structure seems less like a cheat than a deprivation, like enlisting someone to eat your hot fudge sundae for you.” On some level, I completely agree with Schiff here; longtime Adjacent Possible readers might remember that I wrote an entire post about how much I love plotting out the structure for my books. But it's not just about the pleasure of inventing structures; it's actually one of the things that I think I do best as a writer. So it would be completely illogical for me to simply outsource that work to the AI. What Notebook lets me do is experiment more freely, explore different hypotheses, and fill in blank spots where I've either forgotten something or not yet discovered it. I'm still eating the sundae, in Schiff's metaphor -- it's just tastier with Notebook helping me assemble the ingredients.
There's a general point worth making here, which is that in the Gold Rush research example, all the key ideas being generated are coming either from me or from the human-authored works that I have collected. NotebookLM is effectively functioning as a conduit between my knowledge/creativity and the knowledge stored in the source material: stress-testing speculative ideas I have, fact-checking, helping me see patterns in the material, reminding me of things that I read but have forgotten. It's an important point to make, in part because there seems to be a growing concern about offloading the actual research/reading phase to AI-generated summaries. I recently listened to a terrific discussion between Ezra Klein and David Perell (two people whose insights on AI-assisted writing and thinking I greatly admire) where Klein notes that he has found little value for AI in his research workflow so far:
I think I used to conceptualize knowledge the way you see it in the movie The Matrix, where it's like I wanted the port in the back of my mind that the little needle would go into. I had read John Rawls's Political Liberalism. I thought that what you were doing was downloading information into your brain.
Now I think that what you are doing is spending time grappling with the text, making connections. It will only happen through that process of grappling. So, the idea that you could speed run that, the idea that it could just be summarized for you...
Part of what is happening when you spend seven hours reading a book is you spend seven hours with your mind on this topic. The idea that O3 can summarize it for you, in addition to all this stuff you just will not have read, is that you didn't have the engagement. It doesn't impress itself upon you. It doesn't change you. What knowledge is supposed to do is change you, and it changes you because you make connections to it.
I do agree with Ezra here. You should read the things that you need to understand deeply. The time spent engaged in reading is truly transformative. There is no replacement for reading the core primary and secondary texts if you are writing nonfiction journalism or history. Generally, I don’t think of NotebookLM as doing the reading for me; it’s more like Notebook is doing the reading with me.
But in the Gold Rush example, I'm using Notebook not as a replacement for reading, but rather as a tool to help me figure out what is worth reading. As Ezra notes in his conversation with David, you can't read everything. And you particularly can't read everything when you're still trying to figure out what your next book topic should be. Anyone who works with research in one form or another relies on some version of summarization or excerpting to decide whether to read a given work: skimming the jacket copy, or stumbling across a provocative quote in another text. As Bill Wasik points out in the Times essay, the invention of the index was a great boon to scholarship, because sometimes scholars are looking for targeted information that doesn’t require ten hours of engagement: a tool like NotebookLM’s Mind Maps can effectively build you an index on the fly for dozens of documents that you’ve curated. That's helpful if you use it the right way, at the right time. When I load in those sources about the Native Americans of the Yosemite Valley, and ask Notebook to compare them to the existing sources I've already read, it's almost like I'm using the AI to generate bespoke jacket copy, summarizing the texts as they relate to the existing research and thinking I've done on the topic.
If tools like NotebookLM end up being used primarily as a replacement for reading the core sources you need to generate a deep, engaged understanding, that will indeed be a step backwards, and will produce less nuanced, less compelling works of scholarship. Avoiding that scenario is one of the reasons why, from the very beginning, NotebookLM has been designed to facilitate the reading of original sources, in ways that set us apart from nearly every other AI product out there: the entire text of each source is always readable in the app, and Notebook gives you inline citations that take you directly back to the relevant passages, so you are always one click away from the original material.
That said, there are many forms of knowledge work where skimming/excerpting/summarizing can be incredibly valuable, and more appropriate than reading a document from start to finish. You don’t necessarily want to read your entire car manual; you just want to be able to ask a specific question and get an accurate answer. Certain kinds of advice books or reference works could arguably be better explored through a conversational Q&A format as opposed to linear reading. Using Ezra's metaphor, some forms of information transfer are in fact more Matrix-like. We've tried to design NotebookLM to accommodate both modes.
At another point, Ezra says: "Having AI summarize a book or a paper for me is a disaster. It has no idea what I really wanted to know. It would not have made the connections I would have made."
If Ezra is experiencing this limitation in his use of AI tools, I would argue that it is a failure of UI, not AI -- in other words, the user interface is not allowing him to teach the model about his own particular interests and sensibility, the way I do at the beginning of the Gold Rush exploration in the Times piece. You’ll often hear people say that the big risk for scholarship with these tools is that they eliminate the kind of serendipitous discovery that you can only get from analog research. (The same argument was made during the transition from library stacks to digital search.) I encourage anyone who still thinks this to try an experiment: create a notebook with your initial drafts of a project you are working on, and then add a dozen or so new sources that are vaguely in the same space. (You can do this easily with Notebook’s new "Discover Sources" feature.) Then ask Notebook to suggest ten surprising, less obvious connections between your original writing and the new sources. Then go back and read the new source material. In my experience, that’s a far more effective way to uncover serendipitous links than traditional reading without an AI collaborator. And a thousand times more effective than walking around library stacks, which I am old enough to remember doing!
***
One last teaser: near the end of the Times piece, I start speculating on what NotebookLM might look like as a distribution platform for information, and not just a productivity tool for creating that information:
In our conversation in Mountain View, [Johnson] put forward a possible new revenue stream: What if e-books of history came enhanced with a NotebookLM-like interface? Imagine, he went on, that “there’s a linear version of the story with chapters,” but then the primary materials the author used to write the book also come bundled with it. That way, “instead of just a bibliography, you have a live collection of all the original sources” for a chatbot to explore: delivering timelines, “mind maps,” explanations of key themes, anything you can think to ask.
It is perhaps the most brain-breaking vision of A.I. history, in which an intelligent agent helps you write a book about the past and then stays attached to that book into the indefinite future, forever helping your audience to interpret it.
I was deliberately vague about what that might mean in practice when I said those words to Bill Wasik earlier this year. But I'll have something much more solid to share on that front in a few weeks—including some news about how all these developments are going to impact Adjacent Possible. More to come after the July 4 holiday...