And a discussion of the term “idiot savant”:
Andrew Schulman (AS)
When was the phrase “idiot savant” first (or most famously) applied to AI models, by whom, and in what context? And while you’re at it, briefly explain the phrase “idiot savant,” and why it’s sometimes said to apply to brilliant-but-yet-somehow-dumb (sorry) AI models. (I think of it as “wait a minute, you seemed to know this and that amazing thing, but you didn’t know this much simpler thing?!”)*** For bonus points :-) compare and contrast to the “Rain Man” phenomenon.
Claude Sonnet 3.5
The term “idiot savant” originated in 1887 when Dr. John Langdon Down (who also identified Down Syndrome) used it to describe individuals with remarkable abilities in specific areas despite significant cognitive limitations in others.
Gary Marcus notably applied this analogy to AI in his 2001 book “The Algebraic Mind” and subsequent writings, highlighting how neural networks can excel at pattern recognition tasks while failing at simple logical reasoning.
[AS: Claude or Marcus might be making a partially false distinction: perhaps logical reasoning is, at bottom, largely pattern recognition — after all, who says that understanding must be implemented from components that themselves exhibit understanding?** Perhaps no more than 5% or 10% of it (whatever falls into the idiot savant phenomenon and related shortfalls like BS/hallucination) isn’t yet working the way we want, and possibly/probably only that small portion requires something over and above “dumb” neural networks of weights/thresholds. Maybe it’s pattern recognition “all the way down”?]
The “Rain Man” comparison is apt – like Raymond Babbitt’s character (based on real savants), AI models can perform incredible feats of calculation or pattern recognition while struggling with basic social understanding or common sense reasoning. However, unlike human savants who typically excel in one specific domain, modern AI models can be remarkably capable across many domains while still exhibiting surprising gaps in basic knowledge or reasoning.
Since I make claims about specific historical usage here, I encourage verifying these citations independently.
I also asked for a summary of the July 2024 chat:
AS
This chat is from back in July. Can you tell me which version of Claude I was interacting with here? Even if not, can you give me a bullet-point summary of what was discussed in this chat?
Claude
I cannot definitively identify which version of Claude was used in July, as that predates my knowledge cutoff.
Key topics discussed in this chat:
- MICR (Magnetic Ink Character Recognition) on checks – its history, international usage, and routing number formats
- An extended discussion about AI limitations, particularly the “idiot savant” quality where chatbots can be simultaneously impressive and make basic mistakes
- The paradox of AI systems being able to provide some types of factual information (like Gaborone being Botswana’s largest city) but not others (like specific research papers)
- Limitations around source citation and bibliographic information
The conversation highlighted these inconsistencies through direct examples and questioning.
The discussion was notably reflective about AI capabilities and limitations, with the human participant pushing back on some of my explanations for knowledge gaps.
*Humans’ own “introspection” may be more a matter of post facto rationalization.
**Cue some statement from Heidegger about, … mumble …, things proceeding toward the center not requiring the characteristics of the periphery…? [No, it’s Wittgenstein, Zettel 608: “No supposition seems to me more natural than that there is no process in the brain correlated with associating or with thinking; so that it would be impossible to read off thought-processes from brain-processes. I mean this: if I talk or write there is, I assume, a system of impulses going out from my brain and correlated with my spoken or written thoughts. But why should the system continue further in the direction of the centre? Why should this order not proceed, so to speak, out of chaos? …”]
— see conversation with Claude about this quotation: it starts off thinking I mean Leibniz, then quickly switches to Wittgenstein, identifies Zettel, and narrows it down to Zettel 608; it then sort-of misunderstands (I think) what Wittgenstein was saying (and Claude may be too ready to agree with my interpretation, especially given that I don’t know German, and was superficial enough to confuse Heidegger and Wittgenstein); but then we’re off to the races connecting Z608 to neural nets, with Claude starting to ask me questions — even as it acknowledges it can’t actually do anything with my answers; it is very explicit that it is asking questions simply because that’s a behavior that has emerged from its training. Even for someone with a pretty decent sense of how these AI chatbots work (“it’s all just pattern matching and next-word prediction”), it’s pretty shocking that one can have a conversation like this.
***Or: “wait a minute, you seemed to know this and that complicated thing, but you didn’t know this much simpler thing, without which one can’t possibly understand what one is saying about the complicated thing?!” I learned something like this from my late father, who used to describe how a coworker would say “Hey Marvie, that was beautiful what you told me about [something to do with municipal bond refinancing]. I think I could write a f’ing [this was Wall St. in the ’70s] book about it based on what you told me. Now, there’s just one thing I’m having trouble wrapping my mind around…” — and that one thing would, according to Dad, be The One Thing without which none of the rest of it made any sense (Net Future Value or something like that). On the other hand, one of the lessons from modern AI may be that understanding what one is talking about may not be necessary to say something useful and intelligent most of the time. “It has literally no idea what it’s talking about” is a problem, but it may be a much smaller or less global problem than one might think.