Back in July 2024, I was working with Anthropic’s Claude chatbot, and was getting frustrated with its (and ChatGPT’s) regular undermining of nearly-brilliant answers with ridiculously simple mistakes. I applied the phrase “idiot savant” to this phenomenon, though of course others were already describing it that way. I was also (somewhat relatedly) frustrated with Claude’s inability or refusal to provide sources or citations, not only for answers it generated (that’s somewhat understandable given modern AI models don’t store verbatim copies of their training material — a point often overlooked in public coverage of AI copyright litigation), but even when specifically asked to name two or three papers or researchers working in an area for which Claude had just given me a nice overview.
Attached are two transcripts, one from July 2024, the other from December 2024. In July 2024, I first asked Claude a silly test question about the MICR codes at the bottom of checks, saw that it had overlooked a difference in Canadian MICRs that it clearly already knew about, asked it to provide sources, and then got into a lengthy discussion of the “idiot savant” problem and various proposed solutions. When asked, it refused to give me even a single name of a researcher, much less cite a paper, in fields it clearly knew a lot about. By December, it was willing to provide names and references, albeit with a DYOR caveat.
Of course, when querying an AI chatbot with “How did you come up with that answer?” or “What is your basis for that?” or, more generally, “How do you go about generating such answers?,” you can’t assume that its answers reflect any actual “introspection” into its own implementation,* as opposed to generalized (though often very useful) descriptions based on its training data. Perhaps even the “chain of thought” summaries provided by ChatGPT are open to question as to how much they actually reflect the steps the model went through.
[Added Jan. 2, 2025: I should mention here a more recent chat with Claude about instruction fine-tuning, in the context of Claude Sonnet 3.5’s ability to take an uploaded file of (x, y) pairs of numbers and try to determine what function generated them (i.e., Claude’s ability to perform regression on the fly, which seems quite different from answering a prompt based on next-token completion). When I tried this in Claude, in the course of responding to my prompt (which was nothing more than “I’ve uploaded a txt file with some data. Can you tell what’s going on in here?”), it generated ad hoc JavaScript to visualize the trendline shown by the data. It only tried a linear relationship, and reported that the data represents “y = 0.0787x + 3.2711”, whereas a better fit would have been y = 1.355791x^0.440975 (basically a square root with some noise). But still! We then had an amazing chat trying to answer the question (“is this ability more of the same [in a good way], or something new?”), and it explained, with specific examples, how its instruction fine-tuning (related to FLAN, OpenAI’s InstructGPT, and Anthropic’s “Constitutional AI”; Askell paper) has enabled it to use tools like a REPL, and how much this does and does not stem directly from its basic training. I’ll be writing this up separately soon; in the meantime, see the Claude chat on its data-analysis abilities from instruction fine-tuning here. See also an excellent Nathan Lambert talk on the role of reinforcement learning in new “reasoning” models, and recent “reasoning” models like ChatGPT o1 (which is generally what I’ve been using in ChatGPT-related material on this site), the forthcoming o3, or Google Gemini 2.0 Flash Experimental.]
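To make the linear-vs-power-law point concrete, here is a minimal Python sketch. It is not the JavaScript Claude actually generated, and it uses synthetic stand-in data (roughly a square root plus noise, with made-up coefficients) rather than my uploaded file; it just shows the two candidate fits side by side, with the power-law fit done by least squares in log-log space.

```python
# Minimal sketch: compare a straight-line fit against a power-law fit y = a * x^b
# on noisy, roughly-square-root data. Synthetic data stands in for the uploaded file.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the uploaded (x, y) pairs: ~1.36 * sqrt-ish curve plus noise.
x = np.linspace(1, 100, 200)
y = 1.36 * x**0.44 + rng.normal(0, 0.3, x.size)

# Linear fit: y ≈ m*x + c (the only relationship Claude tried in the chat).
m, c = np.polyfit(x, y, 1)

# Power-law fit via log-log least squares: log y ≈ b*log x + log a.
b, log_a = np.polyfit(np.log(x), np.log(y), 1)
a = np.exp(log_a)

def rss(pred):
    """Residual sum of squares against the observed y values."""
    return float(np.sum((y - pred) ** 2))

print(f"linear:    y = {m:.4f}x + {c:.4f}   RSS = {rss(m * x + c):.2f}")
print(f"power law: y = {a:.4f}x^{b:.4f}   RSS = {rss(a * x**b):.2f}")
```

On data generated this way, the power-law fit comes back close to y ≈ 1.36x^0.44 with a much smaller residual, while the straight line misses the curvature, which is essentially the gap between what Claude reported and the better fit noted above.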
Here’s Claude’s summary of the Dec. 2024 chat:
- Discussed Neuro-Symbolic AI as an approach to the “AI idiot savant” problem, combining neural networks with symbolic reasoning [AS: Note there’s no implication here that this is what new “reasoning” models are doing!]
- Noted evolution in my ability to provide researcher names/citations (with verification caveats), compared to previous reluctance
- Explored Knowledge Grounding, Coherence Modeling, and cited key researchers in these areas
- Clarified that while I can suggest relevant citations, I cannot identify specific sources used in generating my answers
- Discussed Explainable AI (XAI) and key researchers working on making AI systems more interpretable
- Examined concerns about AI systems reproducing training material, including technical aspects (statistical pattern reproduction vs. verbatim storage) and legal/economic implications
- Reviewed papers on extracting training data from language models and legal analysis of AI/copyright issues