Pretending to Think: The Illusion of Deep Research
How LLMs put our minds at ease with the appearance of thought
Recently, I asked Perplexity to do a little Deep Research for me:
Can you do some deep research on the way LLMs try to give an appearance of thought when they are doing prompt-chaining in Deep Research mode?
It replied,
To thoroughly address your query, I will investigate scholarly articles, expert analyses, and recent research discussions on how large language models simulate thought processes during prompt-chaining in Deep Research mode, focusing on their mechanisms for providing status updates that may or may not reflect actual underlying activities, by conducting targeted web searches for relevant literature, technical explanations, and expert commentary on the transparency and interpretability of these models’ internal states and output strategies.
What followed was a series of steps, a process known as Chain-of-Thought prompting, performed for me at dazzling speed. It was like having the Flash as your law clerk. See this screen capture for more.
The session was peppered with visible evidence of search queries tried and sources consulted. Like so:
Followed by various further moves, narrated for me in first person:
- I am searching for information on whether language models, such as chatbots, are capable of genuine reasoning or if their responses are simulated or deceptive, to better understand their true cognitive abilities.
- I am searching for information on how large language models like ChatGPT display their thinking process and verify the authenticity of their status updates, to better understand how they communicate their internal processes and ensure transparency.
- I am searching for recent updates on the appearance and progress indicators of research mode language models to better understand their current status and how they are perceived in the AI research community.
And it wasn’t all searching. There were other activities:
- Reading Sources (which appears in shimmering colors)
- Wrapping up analysis
- Writing research report
The sources amass quickly in an expanding scroll — this one used 84. The steps take but a moment and the whole process just a few minutes. We’re witnessing a performance.
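To make the mechanics a little more concrete, here is a minimal sketch, entirely hypothetical and not any vendor’s actual code, of how a prompt chain can narrate itself. The `call_llm` function is a stand-in for whatever chat-completion API is really in play; the point is that the first-person status lines are generated text in their own right, produced alongside the work rather than reporting on it.

```python
# A minimal, hypothetical sketch of a "Deep Research"-style orchestrator.
# call_llm is a placeholder for a real chat-completion API call.

def call_llm(prompt: str) -> str:
    """Placeholder for a real model call; returns generated text."""
    return f"[model output for: {prompt[:40]}...]"

def deep_research(question: str) -> str:
    # Step 1: ask the model to break the question into search steps.
    plan = call_llm(f"List the web searches needed to answer: {question}")

    findings = []
    for step in plan.splitlines():
        # The first-person status line is itself just more generated text,
        # produced next to (not derived from) the actual work.
        status = call_llm(f"Write a one-sentence 'I am searching...' update for: {step}")
        print(status)  # what the user sees scrolling by
        findings.append(call_llm(f"Summarize likely sources for: {step}"))

    print("Writing research report")  # a canned stage label, not introspection
    return call_llm(f"Write a cited report on {question} from: {findings}")

if __name__ == "__main__":
    print(deep_research("How do LLMs present the appearance of thought?"))
```

Nothing in that loop inspects an internal state; the narration is simply another completion.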
ChatGPT is even more personified. In its activity log, it uses phrases like:
- I’m piecing together
- I’m taking a closer look
- I’m thinking about
- Ok, let me see
ChatGPT is clearly performing a reflective research process. It seems to be taking notes as it goes: “I’m examining the chain-of-thought (CoT) in reasoning AI models….” Now, it does say which source it is consulting at any given moment, which is helpful, though it of course says nothing about the sources already baked in through training that have been helping it create this performance in the first place. At some point, it seems to be evaluating the quality of the sources, reflecting on “headers and sections to ensure clarity and include multiple citations for trustworthiness.” Without irony, it writes, “I’m mapping out the idea that transparency might be an illusion.” All the while a shimmering status message says, “Thinking…”
Where do these reports come from if not also from a generator? And what is their goal?
Audience: Transparency!
AI Producers: LOL
It makes me wonder what the job title is for the programmer tasked with increasing an LLM’s personification.
Apparently, these reasoning models are making choices to misrepresent their work. Anthropic and OpenAI have both published studies that suggest as much. (Antonio Troise has an excellent round-up of related research.) But these interrogations of self-reporting seem to be focused on making the reports more accurate rather than questioning the fundamental representation of what these programs are doing: the personification of algorithms.
ChatGPT is the ultimate kayfabe. You might know this word from the world of professional wrestling. Yes, that professional wrestling. Kayfabe is Pig Latin for fake; it has its origins in carny slang for protecting business secrets but became code for the open sham that is a wrestler’s persona and story arc. So when Hulk Hogan and Randy Savage are revealed to be brothers, we know this is a storyline (although I’m still haunted by that news). Now that the Secretary of Education is a professional wrestling promoter, we might want to take a bit more notice of this concept.
LLMs are engaged in kayfabe when they present this ongoing stream of activities, which I have elsewhere called Bedazzlement, to distract us from what is really going on. OpenAI’s own study admits as much.
What is really going on? Is there a real in this postmodern age?
- Environmental damage from the power demands of data centers.
- Copyrighted material becoming uncredited content.
- Search queries feeding personal information into the training of these systems.
But more importantly, we are once again being lulled into a deeper version of the ELIZA effect. (Much more on ELIZA here). We are once again being distracted by the sweep of the arm of the Chess-Playing Turk, drawn into the illusion, or delusion, that the software with which we chat is reading the way we do, evaluating the way we do, and forming opinions the way we do.
And rather than extending our thinking, accepting the work of LLMs is contributing to the disintegration of our minds, as the recent MIT study has indicated. Accepting LLMs as they represent themselves, I would argue, is likewise debilitating.
Now, I do want to be transparent, or at least attempt it. I do think AI-enhanced search is an inevitable future of search, one that Google began long ago. And I have used it to reach sources and information that would have taken me much more time to find if I had used traditional search engines.
But I’d just like to offer caution against the bedazzlement, against the mesmerizing effects of the interface. Because in my experience, the shiny polish job on the car tends to hide the problems with the engine, just as polished sentences in a student’s writing these days often indicate that they did not write the paper at all.
We live in the age of modified, altered, and now hallucinated surfaces. With more than 8 billion people on the planet, we do not need to mistake algorithms for robot friends. Let’s see computational processes for what they are. Let’s seek an understanding of the way they work. Let’s leave the illusions to Zach King.
And if you want to see my own puppetry of ChatGPT, don’t miss Hallucinate This! The authoritized autobotography of ChatGPT.
