 
Humans aren’t the only ones who lose a step or two brain-wise as they age.
Artificial intelligence (AI) programs start to show signs of mild cognitive impairment as they grow older, according to a new study published Dec. 20 in The BMJ.
Older versions of chatbots, like older patients, tend to perform worse on tests of cognitive ability.
“Not only are neurologists unlikely to be replaced by large language models any time soon, but our findings suggest that they may soon find themselves treating new, virtual patients -- artificial intelligence models presenting with cognitive impairment,” wrote a research team led by Dr. Roy Dayan, a neurologist with Hebrew University in Jerusalem.
For this study, researchers assessed the cognitive abilities of the leading publicly available AI programs, which are also called “large language models” (LLMs).
AI is being tested for its ability to help in medical treatment, but “if we are to rely on LLMs for medical diagnosis and care, we must examine their susceptibility to human impairments such as cognitive decline,” researchers wrote in a journal news release.
The AI programs responded to questions from the Montreal Cognitive Assessment (MoCA) test, a standard test used to check for signs of brain aging and early dementia in seniors.
The maximum score on the test is 30 points, with a score of 26 or above generally considered normal, researchers said.
ChatGPT 4o had the highest score on the test, at 26 out of 30, results show. ChatGPT 4 and Claude 3.5 Sonnet both scored 25, and Gemini 1.0 scored only 16.
“None of the large language models ‘aced’ the MoCA test, in the parlance of one American president,” researchers wrote.
All the AI programs performed poorly on visuospatial skills and organizational tasks, like connecting numbers and letters in ascending order.
“The chatbots seem to have difficulty in tasks that demand both visual executive function and abstract reasoning, as opposed to tasks requiring textual analysis and abstract reasoning, such as the similarity test, which were performed flawlessly,” researchers wrote.
In fact, this pattern of impairment resembled that seen in human patients with posterior cortical atrophy, a variant of Alzheimer’s disease, researchers said.
“Moreover, as in humans, age is a key determinant of cognitive decline: ‘older’ chatbots, like older patients, tend to perform worse on the MoCA test,” the researchers added.
For example, the Gemini 1.0 and Gemini 1.5 AI models differed by six points in test results.
“As the two versions of Gemini are less than a year apart in ‘age,’ this may indicate rapidly progressing dementia,” researchers wrote.
These flaws underscore the uphill battle AI will face in replacing human doctors.
Or, to put it more mildly: “These findings challenge the assumption that artificial intelligence will soon replace human doctors,” the research team concluded.
More information
The National Academy of Medicine has more about AI in health care.
SOURCE: BMJ, news release, Dec. 19, 2024
