  • Research article
  • Open access

Speech silence character as a diagnostic biomarker of early cognitive decline and its functional mechanism: a multicenter cross-sectional cohort study



Language deficits frequently occur during the prodromal stages of Alzheimer’s disease (AD). However, the characteristics of linguistic impairment and its underlying mechanism(s) remain to be explored for the early diagnosis of AD.


The percentage of silence duration (PSD) was analyzed in 324 subjects, comprising patients with AD, patients with amnestic mild cognitive impairment (aMCI), and normal controls (NC) recruited from the China multi-center cohort, and its diagnostic efficiency was then replicated in the Pitt center cohort. Furthermore, the specific language network involved in fragmented speech was analyzed using task-based functional magnetic resonance imaging (fMRI).


In the China cohort, PSD increased significantly in aMCI and AD patients. The areas under the receiver operating characteristic curves were 0.74, 0.84, and 0.80 for the classification of NC/aMCI, NC/AD, and NC/aMCI+AD, respectively. In the Pitt center cohort, PSD was verified as a reliable diagnostic biomarker to differentiate mild AD patients from NC. Next, in response to fluency tasks, clusters in the bilateral inferior frontal gyrus, precentral gyrus, left inferior temporal gyrus, and inferior parietal lobule deactivated markedly in the aMCI/AD group (cluster-level P < 0.05, family-wise error (FWE) corrected). In the patient group (AD+aMCI), a higher activation level of the right pars triangularis was associated with higher PSD in both semantic and phonemic tasks.


PSD is a reliable diagnostic biomarker for the early stage of AD and aMCI. As early as the aMCI phase, the brain response to fluency tasks was markedly inhibited, which partly explains why PSD was elevated at the same stage.


Alzheimer’s disease (AD) is the most common neurocognitive disorder, with memory deficits being the earliest and most characteristic symptom, and this is accompanied by other cognitive deficits such as executive dysfunction, apraxia, and aphasia [1]. In the past few decades, major progress has been made in the development of biofluid or neuroimaging biomarkers for AD diagnosis, such as cerebrospinal fluid measures and in situ imaging of Aβ and phosphorylated tau, other neuroimaging techniques, and neuropsychological tests [1]. However, these methods are limited by their high cost and invasive nature.

Language deficits are detectable from the prodromal stages of AD, or amnestic mild cognitive impairment (aMCI), and have been considered a candidate biomarker for early diagnosis [2,3,4]. Most of these studies focus either on identifying characteristic linguistic parameters or on using them to discriminate between healthy older people and those affected by aMCI or AD. They report a large number of language components with high diagnostic value for AD, yet the results are heterogeneous because of the variety of methods and vocal features examined [5], not to mention the potential influence of the different languages, or even dialects, spoken by the subjects. Moreover, no single biomarker accurately diagnoses all cases of AD. Among these language components, pauses are often investigated as a hallmark of lexical-semantic decline during speech production in AD [6, 7] and may be the key factor underlying speech fluency, which is mainly determined by semantic and phonemic fluency [8].

Given that silent pauses involve impairment in multiple cognitive abilities, e.g., word retrieval, working memory, and executive function, we emphasize the most important aspect: lexical-semantic processing and its functional alteration in AD or aMCI [9]. Further, pause frequency in picture-based narratives has been reported to be associated with verbal fluency and with the grey matter density of the anterior temporal lobe [2, 6, 10]. Although task-based functional magnetic resonance imaging (fMRI) is a popular method for visualizing the brain areas that respond to specific cognitive stimuli [11], few studies to date have focused on functional alteration of the language network and its relationship with silent pauses.

Our previous study suggested that the computer-based analysis of certain language components could be a promising diagnostic method for early AD and aMCI [2]. It highlighted percent silence duration (PSD, in which silence is defined as the summed duration of all silent segments of the recording, mainly the various pauses) as a potentially reliable biomarker for the early stage of cognitive decline due to AD, with translingual diagnostic value. To establish the translingual diagnostic value of PSD and the related brain network alterations, we aimed to confirm the diagnostic value of PSD in the Chinese multi-center cohort, to validate it in an English-speaking cohort from the Pitt database, and to explore the brain networks involved in verbal fluency that are related to PSD using a task-based fMRI experiment.


Multi-center Chinese-speaking RSF cohort in China

This is a cross-sectional study with a total of 324 participants recruited from the memory clinics of three hospitals in China (hereafter termed the RSF cohort: Ruijin Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai; Shanghai Sixth Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai; the First Hospital Affiliated to Zhejiang University, Zhejiang). Of these, 113 were normal controls (NC), 95 had aMCI, and 116 were diagnosed with early-phase AD. The study registration number is ChiCTR2000036718. All participants were recruited between August 2020 and July 2021 from the memory clinics of the RSF cohort centers mentioned above; NC participants were recruited among relatives of the aMCI and AD patients and through advertisement. The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008. All procedures involving human subjects/patients were approved by the Ethics Committee of the RSF centers (approval number: 2020-261). All included individuals provided written consent.

Clinical assessment in the RSF center

To rule out other causes of cognitive impairment such as stroke or intracranial space-occupying lesions, cranial MRI or computed tomography (CT) was performed. Serum folic acid, vitamin B12 levels, and thyroid function were tested to exclude endocrine and metabolic disorders. Clinical and demographic data including age, gender, and level of education were also collected. All subjects underwent the following neuropsychological assessments: the Mini-Mental State Examination (MMSE), the Montreal Cognitive Assessment-Basic (MoCA-B), Addenbrooke’s Cognitive Examination-III (ACE-III), the Clinical Dementia Rating scale (CDR), and the Cookie Theft picture description task from the Boston Diagnostic Aphasia Examination [12,13,14].

After clinical assessment, the participants were categorized into three groups: (i) an NC group, who were considered cognitively healthy after the clinical consultation; (ii) an AD group, whose diagnosis was based on the criteria for probable AD issued by the National Institute on Aging-Alzheimer’s Association workgroups in 2011 [15]; and (iii) an aMCI group, in which patients had a memory complaint corroborated by at least one informant and were diagnosed using the Petersen criteria [16]. Participants were excluded if they had any other neurological disease, any systemic disease that can lead to cognitive dysfunction, psychiatric disorders, or severe hearing or vision impairment.

English-speaking Cohort of the Pitt Center

The DementiaBank corpus, an open-access database [4] that is part of the TalkBank project, was used in the present study [17]. The corpus contains recordings of 104 controls and 208 dementia patients collected from July 1983 to April 1988 (last modified in November 2018), in which participants were given a picture description task originally designed for the Boston Diagnostic Aphasia Examination. The task required each participant to describe the events depicted in the picture, the same Cookie Theft picture description task performed by participants in the China RSF center. Our focus is on the language characteristics of aMCI individuals, most of whom will convert to AD several years later. However, DementiaBank contains mainly mMCI (multiple cognitive domain type) records and lacks aMCI records. We therefore used individuals with mild AD and MMSE scores of over 24 to represent the early stage of AD, similar to the MMSE score range of aMCI individuals in the China multi-center cohort. After excluding unavailable records (recordings with a noisy background, speech time of over 60 s, or incomplete recordings), 20 mild AD records remained, and 21 NC records were randomly selected from the control corpus. The diagnostic criteria for “Possible AD” and “Probable AD” were as described by Becker et al. [17]. To be consistent with the China RSF center and previous studies [18, 19], samples labeled “Possible AD” and “Probable AD” were merged to form the AD group in our study.

Recording protocol and speech analysis

Subjects in the China RSF center performed the Cookie Theft picture description task, in which they were given a picture and asked to describe everything they could see happening in it within 1 min while being recorded. The mean duration of the recordings was 39.6 ± 17.7 s. Speech from the RSF cohort was recorded in Cool Edit Pro with the following configuration: a frequency of 160000 Hz, 16-bit mono, and environmental noise limited to under 45 dB. Speech was analyzed with the automatic speech recognition (ASR) software for cognitive impairment v1.3 (developed by our team, China Software Copyright number 2016SR164680), as in our previous study [2]. The Pitt recordings (mean duration 39.0 ± 17.4 s) were converted in Cool Edit Pro to the same audio configuration as the RSF recordings. Each sample was analyzed by the ASR software to extract the speech/silence parameters. The basic parameters in our software were defined according to Pakhomov et al. [20], who developed measurements of spontaneous speech from the Cookie Theft picture description task for patients with dementia. Silence is defined as the summed duration of all silent segments of the recording, including general short pauses, general long pauses, and hesitation-associated pauses. PSD is defined as the ratio of total silent pause duration to total speech duration, expressed as a percentage.
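
The PSD definition can be illustrated with a minimal Python sketch. It is not the ASR software used in this study: the function name, the representation of silent segments as (start, end) pairs in seconds, and the example values are hypothetical, and the denominator is assumed to be the total duration of the recording.

```python
from typing import Iterable, Tuple


def percent_silence_duration(
    silent_segments: Iterable[Tuple[float, float]],
    total_duration_s: float,
) -> float:
    """Return PSD (%) = 100 * summed silence / total recording duration.

    Assumption: segments are non-overlapping (start, end) pairs in seconds.
    """
    if total_duration_s <= 0:
        raise ValueError("total_duration_s must be positive")
    silence_s = 0.0
    for start, end in silent_segments:
        if end < start:
            raise ValueError("segment end precedes segment start")
        silence_s += end - start
    if silence_s > total_duration_s:
        raise ValueError("summed silence exceeds the recording duration")
    return 100.0 * silence_s / total_duration_s


# Hypothetical 40 s recording with three pauses (illustrative values only).
pauses = [(2.0, 4.5), (10.0, 16.0), (30.0, 36.5)]
psd = percent_silence_duration(pauses, 40.0)
print(f"PSD = {psd:.1f}%")
```

In this toy case the pauses sum to 15 s of a 40 s recording. How silent segments are detected in the first place (thresholds, minimum pause length) is outside the scope of the sketch.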

Task-based fMRI experiment

From the current cohort at the Shanghai Ruijin center, a total of 48 right-handed individuals were recruited for a further fMRI study: 17 mild AD patients, 15 aMCI patients, and 16 NC. The inclusion and exclusion criteria were the same as those described above for the cohort. In addition, patients with the following conditions were excluded: (a) moderate-to-severe AD indicated by MMSE < 15; (b) reading disability (or illiteracy); (c) abnormal findings in the brain MRI scan (e.g., tumors, stroke, hydrocephalus); (d) psychiatric disorders diagnosed according to the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (e.g., claustrophobia); and (e) refractive errors that could not be corrected by MRI-compatible eyeglasses. In addition to the neuropsychological scales described above (MMSE, MoCA-B, ACE-III), MRI participants were further screened using the Boston Naming Test (BNT). To take semantic and phonemic deficits into consideration, an fMRI verbal fluency task was adapted, as shown in S Figure 1. The scanning protocol and processing methods are summarized in the supplementary materials.

Statistical analysis

As in our previous study [2], continuous variables were tested for normality and homogeneity of variance. ANOVA (3 groups) or Student’s t test (2 groups) was used for normally distributed variables with equal population variance, and the non-parametric Kruskal-Wallis (3 groups) or Mann-Whitney U (2 groups) test was used for variables with nonhomogeneous variance. When the differences among the three groups were statistically significant (P < 0.05), post hoc multiple comparisons were made: the Bonferroni method was used when the variance was equal; otherwise, the Kruskal-Wallis test was used. Receiver operating characteristic (ROC) curves were plotted for PSD by calculating the sensitivity and specificity of its diagnostic power in NC, aMCI, and AD-type dementia. To explore the correlation between the parameters, correlation analysis and stepwise multiple linear regression were used. All statistical analyses were performed using SPSS.
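
For readers unfamiliar with ROC analysis, the following Python sketch shows one way the AUC, sensitivity, and specificity could be computed for a single marker such as PSD. It does not reproduce the SPSS procedure used here. The function names and the PSD values are hypothetical, higher PSD is assumed to indicate the patient group, and the cutoff is chosen by the Youden index (sensitivity + specificity - 1), which is an assumption because the cutoff criterion is not stated.

```python
from typing import Sequence, Tuple


def roc_auc(neg: Sequence[float], pos: Sequence[float]) -> float:
    """AUC as the probability that a random patient value exceeds a
    random control value (ties count as 0.5)."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sens_spec(
    neg: Sequence[float], pos: Sequence[float], cutoff: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when value >= cutoff is called positive."""
    sensitivity = sum(p >= cutoff for p in pos) / len(pos)
    specificity = sum(n < cutoff for n in neg) / len(neg)
    return sensitivity, specificity


def youden_cutoff(neg: Sequence[float], pos: Sequence[float]) -> float:
    """Observed value that maximizes sensitivity + specificity."""
    candidates = sorted(set(neg) | set(pos))
    return max(candidates, key=lambda c: sum(sens_spec(neg, pos, c)))


# Hypothetical PSD values in percent (not data from this study).
nc_psd = [25.0, 30.0, 33.0, 36.0, 43.0]
patient_psd = [40.0, 42.0, 47.0, 52.0, 60.0]

auc = roc_auc(nc_psd, patient_psd)
cutoff = youden_cutoff(nc_psd, patient_psd)
sensitivity, specificity = sens_spec(nc_psd, patient_psd, cutoff)
print(auc, cutoff, sensitivity, specificity)
```

The pairwise definition of AUC is equivalent to the area under the empirical ROC curve and makes explicit that AUC summarizes ranking over all cutoffs, whereas sensitivity and specificity belong to one chosen cutoff.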


Clinical characteristics of the subjects in the China RSF and Pitt centers

The China RSF multi-center cohort comprised 113 NC, 95 aMCI, and 116 AD participants. Gender (proportion of women: NC 62.8%, aMCI 56.8%, AD 52.8%) and educational level (NC: 12.2 ± 2.9 years, aMCI: 11.5 ± 3.1 years, AD: 11.3 ± 3.5 years) did not differ significantly among the three groups, whereas mean age did (NC: 67.6 ± 7.9, aMCI: 73.0 ± 6.8, AD: 76.4 ± 8.2 years; P < 0.001). There were also significant between-group differences in mean MMSE score (NC: 28.7 ± 1.2, aMCI: 26.2 ± 2.3, AD: 19.0 ± 4.2), MoCA-B score (NC: 26.0 ± 2.5, aMCI: 20.7 ± 3.6, AD: 14.5 ± 4.7), ACE-III performance (NC: 86.2 ± 7.4, aMCI: 73.8 ± 9.3, AD: 54.1 ± 13.6), and the sub-items relating to fluency and language (Table 1, all P < 0.001); the post hoc comparison results are shown in S Table 1.

Table 1 Clinical characteristics of AD patients in the China RSF multi-center cohort

The clinical characteristics of the 20 mild AD patients (MMSE ≥ 24) and the 21 randomly selected NC individuals from the Pitt center are shown in S Table 2. PSD was also significantly different between NC and mild AD patients.

PSD as a biomarker for aMCI and AD

In the China RSF center cohort, aMCI and AD patients had significantly higher PSD than NC subjects (Table 1, Fig. 1, P < 0.001), and PSD correlated inversely with cognitive performance (Fig. 2, S Table 3, P < 0.001). In linear regression analysis, the variables representing the aMCI and AD status of individuals in the cohort remained significantly associated with PSD after adjusting for age (S Table 4). The ROC curves for PSD-based classification among NC, aMCI, and AD patients are shown in Fig. 1A–D. The AUCs were 0.74, 0.84, 0.80, and 0.65 for NC/aMCI, NC/AD, NC/aMCI+AD, and aMCI/AD, and the corresponding sensitivity/specificity values were 0.71/0.71, 0.84/0.70, 0.78/0.79, and 0.85/0.43, respectively. The optimal PSD cutoff for NC/aMCI, NC/AD, and NC/aMCI+AD was around 38.0 for each classifier using the SPSS ROC package. The distribution and comparison of PSD in the NC, aMCI, and AD groups are presented in Fig. 1E. In the Pitt center cohort, PSD was verified as a biomarker to differentiate mild AD patients from NC (AUC for NC/mild AD: 0.70, Fig. 1F), and the difference in mean PSD between NC and mild AD patients was significant (Fig. 1G, P = 0.018).
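
As a worked example of how the reported sensitivity and specificity relate to overall classification accuracy, the short Python sketch below combines them with the group sizes of the RSF cohort. The function name is hypothetical, and the calculation assumes that the reported sensitivity applies to all patients and the reported specificity to all controls at the same cutoff.

```python
def overall_accuracy(
    sensitivity: float, specificity: float, n_patients: int, n_controls: int
) -> float:
    """Fraction of all subjects classified correctly at a given cutoff."""
    correct = sensitivity * n_patients + specificity * n_controls
    return correct / (n_patients + n_controls)


# NC (n = 113) versus aMCI + AD (n = 95 + 116), with the reported
# sensitivity/specificity of 0.78/0.79 for this comparison.
acc = overall_accuracy(0.78, 0.79, 95 + 116, 113)
print(f"accuracy = {acc:.3f}")
```

Under these assumptions the overall accuracy is about 0.78. The AUC of 0.80 for the same comparison is a different quantity, since it summarizes ranking performance across all possible cutoffs rather than the proportion correct at one cutoff.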

Fig. 1

ROC curves and comparison of PSD among NC, aMCI, and AD patients in the China RSF multi-center cohort (A–E) and the Pitt center cohort (F, G). The AUC and PSD cutoff were 0.74 and 38.2 for NC/aMCI (A), 0.84 and 38.0 for NC/AD (B), 0.80 and 38.0 for NC/aMCI+AD (C), and 0.65 and 58.5 for aMCI/AD (D). Panel E compares PSD among NC, aMCI, and AD in the RSF cohort (*P < 0.05 vs NC, #P < 0.05 vs aMCI). In the Pitt center cohort, the AUC was 0.70 with a PSD cutoff of 44.0 for distinguishing NC from mild AD (F), and panel G compares PSD between NC and mild AD patients (*P < 0.05 vs NC)

Fig. 2

Correlation analysis of PSD with cognitive performance. The heatmap (A) and scatter plots of PSD against MMSE (B), MoCA (C), ACE-III (D), ACE-language fluency (E), and ACE-language-other (F)

Verbal fluency-based fMRI network

Demographic, neuropsychological, and language characteristics of fMRI participants are shown in S Table 5. There was no difference in age, gender distribution, or education level among the NC, aMCI, and AD groups (P > 0.05). The results of the neuropsychological assessments and PSD for the fMRI participants were also consistent with those of the RSF center cohort.

Clusters showing significant differences in the ANOVA are presented in Fig. 3 and S Table 6 (cluster-level P < 0.05, FWE corrected). In the semantic task (Fig. 3A), the peak foci were mainly located in the bilateral precentral gyrus (PreCG), left pars opercularis (pOp) and pars triangularis (pTr), left middle occipital gyrus (MOG), and right precuneus and pTr. The BOLD signals of all clusters except the left pOp correlated positively and significantly with the semantic fluency sub-scores of ACE-III (S Table 7). In the phonemic task (Fig. 3B), areas activated differently across groups were confined to the left PreCG, inferior parietal lobule (IPL), inferior occipital gyrus (IOG), and right pTr. Among them, the left IPL, PreCG, and right pTr were associated with the phonemic fluency sub-scores of ACE-III (S Table 7). In the post hoc analysis (Fig. 3C, D; S Table 8), nearly all clusters showed marked deactivation in both AD and aMCI in comparison with the NC group (P < 0.05, Bonferroni corrected). Compared with aMCI, the response of AD patients to fluency tasks declined further in most of these clusters; the exceptions were the left pOp (P < 0.05, Bonferroni corrected) and the right pTr and PreCG (not significant), which were activated at a relatively higher level. The results remained largely robust when the age, gender, and education level of subjects were regressed out as nuisance covariates. In addition, in the semantic fluency > fixation contrast, a cluster in the left cerebellum crus I was observed to deactivate in AD/aMCI (S Fig. 2). No group differences were detected in other contrasts, i.e., repetition > fixation; phonemic fluency > fixation; semantic/phonemic fluency > repetition; semantic fluency > phonemic fluency.

Fig. 3

Group differences in fluency tasks. Clusters that activated at different levels among NC, aMCI, and AD in the semantic task (A and C) and the phonemic task (B and D) are presented respectively (one-way ANOVA, voxel-level P < 0.001, cluster-level P < 0.05, FWE corrected). Coordinates of clusters are listed in S Table 6. Results of the partial correlation analysis between PSD and BOLD signals of the clusters in the patient group (AD+aMCI) are presented in E and S Table 7 (age and gender controlled). L, left; R, right; PreCG, precentral gyrus; pOp, pars opercularis; pTr, pars triangularis; MOG, middle occipital gyrus; PreC, precuneus; ITG, inferior temporal gyrus; MFG, middle frontal gyrus; IPL, inferior parietal lobule; IOG, inferior occipital gyrus. Superscript digits (1, 2) indicate distinct clusters in the same anatomical region

Unexpectedly, in the semantic task, a higher activation level of the right pTr was associated with higher PSD in AD and aMCI (R = 0.43, P = 0.0148). After controlling for the effects of age and gender, the correlation between right pTr activation and PSD was significant in both semantic and phonemic tasks (Fig. 3E, S Table 7). No other cluster correlated significantly with PSD.
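
The idea behind a correlation "controlled for age and gender" can be sketched in Python with the recursive formula for partial correlation. This is only an illustration, not the statistical software used in the study; the function names and the toy values are hypothetical, and the inputs are assumed to be numeric sequences of equal length with no perfectly collinear covariates.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(
    x: Sequence[float],
    y: Sequence[float],
    covariates: Sequence[Sequence[float]],
) -> float:
    """Correlation of x and y after removing the linear effect of each
    covariate, removing one covariate per recursion step."""
    if not covariates:
        return pearson(x, y)
    *rest, z = covariates
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy values (not study data): both variables rise with the covariate,
# so the raw correlation is positive although they are otherwise unrelated.
covariate = [1.0, 2.0, 3.0, 4.0]
signal = [2.0, 1.0, 2.0, 5.0]
marker = [2.0, -1.0, 6.0, 3.0]

raw_r = pearson(signal, marker)
adjusted_r = partial_corr(signal, marker, [covariate])
print(raw_r, adjusted_r)
```

In the toy data the raw correlation is 1/3, while the partial correlation is zero to within floating-point error, which shows how a shared dependence on a covariate such as age can create an association that disappears after adjustment.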


In the present study, we performed a comprehensive analysis of language components, including PSD, in NC, aMCI, and AD individuals from both the China RSF multi-center cohort and the DementiaBank corpus of the Pitt center, together with a task-based fMRI study of the underlying functional neural substrates. Our results show that PSD was both sensitive and specific in the diagnosis of aMCI and AD. Meanwhile, verbal fluency, the counterpart of speech pauses (PSD), was associated with functional alteration in the language network covering the bilateral PreCG, left ITG, and IPL, together with Broca’s area and its counterpart in the right hemisphere.

Language impairment is a core feature of AD [21]. Prior studies have shown a link between AD symptom severity and declining speech and language capability [3]. Data from speech analysis of AD patients indicate that combined language characteristics provide a diagnostic accuracy of over 80% [5]. The first study to use automatic speech analysis to identify MCI and AD patients compared the voices of healthy older adults and patients, extracted features that differed significantly in several tasks, and obtained the best combination through machine-learning methods, with an accuracy of 79% [22]. Another study used features related to duration, speech rate, articulation rate, and pauses to obtain an accuracy of 78.8% for MCI [23]. Our previous study of combined language characteristics showed that PSD better discriminated aMCI from NC, albeit with a limited sample size [2]. However, that sample did not come from multiple centers, and a parameter that could transcend language-specific differences was still lacking. Therefore, in the present study, results from a considerably larger sample of NC, aMCI, and AD subjects in a multi-center cohort and an additional English-speaking validation cohort further confirmed that PSD as a single parameter is a sensitive indicator of aMCI and AD, with an AUC of 0.80 for discriminating NC from aMCI+AD. The ability of PSD to discriminate mild AD was also validated in the Pitt center cohort. These results indicate that PSD is a non-invasive, easily accessed, and reliable biomarker for the diagnosis of early-stage AD that is not restricted to a particular language or dialect, holding in both Chinese- and English-speaking populations. Although PSD differed significantly between aMCI and AD, the poor AUC (aMCI/AD, 0.65) and specificity (aMCI/AD, 0.45) indicate that PSD does not discriminate well between aMCI and AD.

Compared with task-based tests of episodic memory and other cognitive domains, a language task proved to be more sensitive and accurate in the early identification of AD by fMRI [24]. To unveil the mechanism underlying the increase in PSD in AD/aMCI patients, we used a block-design fMRI paradigm focusing on verbal fluency, because pauses are a potential key factor in speech fluency, which is mainly composed of semantic and phonemic fluency [8]. Admittedly, the fMRI differences could still be ascribed to functional alterations in multiple cognitive abilities; we therefore focus the discussion on clusters in canonical areas supporting language processing in young or older adults [25,26,27]. Consistent with previous studies [28,29,30], patients with AD or aMCI recruited fewer brain areas during semantic-lexical processing than did normally aging controls. One unanticipated finding was that this type of deactivation emerged as early as the aMCI phase, whereas the corresponding symptoms did not become evident until the dementia phase [31]. In contrast, increased recruitment of brain resources in response to semantic tasks has been found in older NC with or without a high risk of AD [32, 33], suggesting that the physiological compensation seen in aging may have disappeared by the early phase of AD. In the semantic fluency task, clusters in the left pTr, ITG, and PreCG (more precisely, its lower part at the junction with the pOp), which are components of the semantic network [26], deactivated markedly in AD/aMCI patients, partly explaining why semantic processing is disrupted [24]. Notably, although decreased activation of the left cerebellum was associated with declining fluency scores, the role it plays in language processing remains unknown [34]. Another interesting finding was that, compared with the aMCI group, Broca’s area and its homologous area in the right hemisphere were activated at a relatively higher level in AD patients. Moreover, in AD and aMCI, the activation level of the right pTr was positively related to PSD, suggesting that it may have a crucial role in the pause-related network. Unlike the left pTr, which supports language function, the right pTr is considered a hub region supporting social cognition and the control network [35]. Many studies have found that extra inter-hemispheric recruitment can occur during fluency tasks and that language function can show latent recovery after brain damage [36,37,38]. However, contrary to what is seen in normal aging [33, 39], the reduced lateralization in the pTr/pOp of AD patients resulted in neither enhanced fluency performance nor fewer speech pauses, and is therefore better interpreted as a failed attempt at compensation, or “decompensation”, of the language network.

No group differences were detected in other contrasts, including repetition vs. fixation, semantic/phonemic fluency vs. repetition, and semantic vs. phonemic fluency. We suppose this could be because older adults recruit a more widely distributed language network even in the resting state [40, 41], so that the responses to fixation and to low- and high-difficulty tasks look almost the same.

Strengths and limitations

Firstly, this is a multi-center study identifying PSD differences in the early stages of AD and the associated brain structures, combined with verification of the PSD effect in early AD in the Pitt center cohort; the underlying mechanism of verbal fluency changes related to specific brain areas was explored with task-based fMRI. However, our investigation is a cross-sectional study without observation of longitudinal changes in the patients. Secondly, enrollment of fMRI participants was limited and confined to a single center, and the fMRI stimulus-presenting system lacked an electronic voice-monitoring device that could have shown how participants performed in the scanner. Lastly, the various types of pauses distributed throughout the speech recordings deserve more fine-grained analysis in future studies.

Conclusion and hypothesis

This study provides new evidence that PSD is sensitive for the diagnosis of early-stage AD and aMCI. As early as the aMCI phase, the brain response to fluency tasks was markedly inhibited, which partly explains why PSD was elevated at the same stage.

Availability of data and materials

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.


  1. Scheltens P, Blennow K, Breteler MM, de Strooper B, Frisoni GB, Salloway S, et al. Alzheimer’s disease. Lancet. 2016;388(10043):505–17.

  2. Qiao Y, Xie XY, Lin GZ, Zou Y, Chen SD, Ren RJ, et al. Computer-assisted speech analysis in mild cognitive impairment and Alzheimer’s disease: a pilot study from Shanghai, China. J Alzheimers Dis. 2020;75(1):211–21.

  3. Ahmed S, Haigh AM, de Jager CA, Garrard P. Connected speech as a marker of disease progression in autopsy-proven Alzheimer’s disease. Brain. 2013;136(Pt 12):3727–37.

  4. Ye Z, Hu S, Li J, Xie X, Geng M, Yu J, Xu J, Xue B, Li S. Development of the Cuhk elderly speech recognition system for neurocognitive disorder detection using the Dementiabank corpus. ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2021. p. 6433-7.

  5. Martinez-Nicolas I, Llorente TE, Martinez-Sanchez F, Meilan JJG. Ten years of research on automatic voice and speech analysis of people with Alzheimer’s disease and mild cognitive impairment: a systematic review article. Front Psychol. 2021;12:620251.

  6. Pistono A, Pariente J, Bezy C, Lemesle B, Le Men J, Jucla M. What happens when nothing happens? An investigation of pauses as a compensatory mechanism in early Alzheimer’s disease. Neuropsychologia. 2019;124:133–43.

  7. Patricia Pastoriza-Domínguez IGT, Diéguez-Vide F, Gómez-Ruiz I, Geladó S, Bello-López J, Ávila-Rivera A, et al. Speech pause distribution as an early marker for Alzheimer’s disease. Speech Comm. 2022;136:107–17.

  8. Balogh R, Imre N, Gosztolya G, Hoffmann L, Pakaski M, Kalman J. The role of silence in verbal fluency tasks - a new approach for the detection of mild cognitive impairment. J Int Neuropsychol Soc. 2022;1-13.

  9. Pistono A, Jucla M, Barbeau EJ, Saint-Aubert L, Lemesle B, Calvet B, et al. Pauses during autobiographical discourse reflect episodic memory processes in early Alzheimer’s disease. J Alzheimers Dis. 2016;50(3):687–98.

  10. Yeung A, Iaboni A, Rochon E, Lavoie M, Santiago C, Yancheva M, et al. Correlating natural language processing and automated speech analysis with clinician assessment to quantify speech-language changes in mild cognitive impairment and Alzheimer’s dementia. Alzheimers Res Ther. 2021;13(1):109.

  11. Crosson B, McGregor K, Gopinath KS, Conway TW, Benjamin M, Chang YL, et al. Functional MRI of language in aphasia: a review of the literature and the methodological challenges. Neuropsychol Rev. 2007;17(2):157–77.

  12. Folstein MF, Folstein SE, McHugh PR. “Mini-mental state”. A practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res. 1975;12(3):189–98.

  13. Goodglass H, Kaplan E. Assessment of aphasia and related disorders, 2nd edition. Philadelphia: Lea & Febiger; 1983.

  14. Hsieh S, Schubert S, Hoon C, Mioshi E, Hodges JR. Validation of the Addenbrooke’s Cognitive Examination III in frontotemporal dementia and Alzheimer’s disease. Dement Geriatr Cogn Disord. 2013;36(3-4):242–50.

  15. McKhann GM, Knopman DS, Chertkow H, Hyman BT, Jack CR Jr, Kawas CH, et al. The diagnosis of dementia due to Alzheimer’s disease: recommendations from the National Institute on Aging-Alzheimer’s Association workgroups on diagnostic guidelines for Alzheimer’s disease. Alzheimers Dement. 2011;7(3):263–9.

  16. Petersen RC, Smith GE, Waring SC, Ivnik RJ, Tangalos EG, Kokmen E. Mild cognitive impairment: clinical characterization and outcome. Arch Neurol. 1999;56(3):303–8.

  17. Becker JT, Boller F, Lopez OL, Saxton J, McGonigle KL. The natural history of Alzheimer’s disease. Description of study cohort and accuracy of diagnosis. Arch Neurol. 1994;51(6):585–94.

  18. Fraser KC, Meltzer JA, Rudzicz F. Linguistic features identify Alzheimer’s disease in narrative speech. J Alzheimers Dis. 2016;49(2):407–22.

  19. Hernandez-Dominguez L, Ratte S, Sierra-Martinez G, Roche-Bergua A. Computer-based evaluation of Alzheimer’s disease and mild cognitive impairment patients during a picture description task. Alzheimers Dement (Amst). 2018;10:260–8.

  20. Pakhomov SV, Smith GE, Chacon D, Feliciano Y, Graff-Radford N, Caselli R, et al. Computerized analysis of speech and language to identify psycholinguistic correlates of frontotemporal lobar degeneration. Cogn Behav Neurol. 2010;23(3):165–77.

  21. Forbes-McKay K, Shanks MF, Venneri A. Profiling spontaneous speech decline in Alzheimer’s disease: a longitudinal study. Acta Neuropsychiatr. 2013;25(6):320–7.

  22. Konig A, Satt A, Sorin A, Hoory R, Toledo-Ronen O, Derreumaux A, et al. Automatic speech analysis for the assessment of patients with predementia and Alzheimer’s disease. Alzheimers Dement (Amst). 2015;1(1):112–24.

  23. Toth L, Hoffmann I, Gosztolya G, Vincze V, Szatloczki G, Banreti Z, et al. A speech recognition-based solution for the automatic detection of mild cognitive impairment from spontaneous speech. Curr Alzheimer Res. 2018;15(2):130–8.

  24. Anderson AJ, Lin F. How pattern information analyses of semantic brain activity elicited in language comprehension could contribute to the early identification of Alzheimer’s disease. Neuroimage-Clin. 2019;22:101788.

  25. Friederici AD. The brain basis of language processing: from structure to function. Physiol Rev. 2011;91(4):1357–92.

  26. Vigneau M, Beaucousin V, Herve PY, Duffau H, Crivello F, Houde O, et al. Meta-analyzing left hemisphere language areas: phonology, semantics, and sentence processing. Neuroimage. 2006;30(4):1414–32.

  27. Shafto MA, Tyler LK. Language in the aging brain: the network dynamics of cognitive decline and preservation. Science. 2014;346(6209):583–7.

  28. McGeown WJ, Shanks MF, Forbes-McKay KE, Venneri A. Patterns of brain activity during a semantic task differentiate normal aging from early Alzheimer’s disease. Psychiatry Res. 2009;173(3):218–27.

  29. Paulesu E, Goldacre B, Scifo P, Cappa SF, Gilardi MC, Castiglioni I, et al. Functional heterogeneity of left inferior frontal cortex as revealed by fMRI. Neuroreport. 1997;8(8):2011–7.

  30. Metzger FG, Schopp B, Haeussinger FB, Dehnen K, Synofzik M, Fallgatter AJ, et al. Brain activation in frontotemporal and Alzheimer’s dementia: a functional near-infrared spectroscopy study. Alzheimers Res Ther. 2016;8(1):56.

  31. Vaughan RM, Coen RF, Kenny R, Lawlor BA. Semantic and phonemic verbal fluency discrepancy in mild cognitive impairment: potential predictor of progression to Alzheimer’s disease. J Am Geriatr Soc. 2018;66(4):755–9.

  32. Woodard JL, Seidenberg M, Nielson KA, Antuono P, Guidotti L, Durgerian S, et al. Semantic memory activation in amnestic mild cognitive impairment. Brain. 2009;132(Pt 8):2068–78.

  33. Meinzer M, Flaisch T, Seeds L, Harnish S, Antonenko D, Witte V, et al. Same modulation but different starting points: performance modulates age differences in inferior frontal cortex activity during word-retrieval. PLoS One. 2012;7(3):e33631.

  34. Yuan Q, Li H, Du B, Dang Q, Chang Q, Zhang Z, et al. The cerebellum and cognition: further evidence for its role in language control. Cereb Cortex. 2022;bhac051.

  35. Hartwigsen G, Neef NE, Camilleri JA, Margulies DS, Eickhoff SB. Functional segregation of the right inferior frontal gyrus: evidence from coactivation-based parcellation. Cereb Cortex. 2019;29(4):1532–46.

  36. Jiao Y, Lin F, Wu J, Li H, Fu W, Huo R, et al. Plasticity in language cortex and white matter tracts after resection of dominant inferior parietal lobule arteriovenous malformations: a combined fMRI and DTI study. J Neurosurg. 2020;134(3):953–60.

  37. Cabeza R. Hemispheric asymmetry reduction in older adults: the HAROLD model. Psychol Aging. 2002;17(1):85–100.

  38. Wierenga CE, Stricker NH, McCauley A, Simmons A, Jak AJ, Chang YL, et al. Increased functional brain response during word retrieval in cognitively intact older adults at genetic risk for Alzheimer’s disease. Neuroimage. 2010;51(3):1222–33.

  39. Marsolais Y, Perlbarg V, Benali H, Joanette Y. Age-related changes in functional network connectivity associated with high levels of verbal fluency performance. Cortex. 2014;58:123–38.

  40. Pistono A, Guerrier L, Peran P, Rafiq M, Gimeno M, Bezy C, et al. Increased functional connectivity supports language performance in healthy aging despite gray matter loss. Neurobiol Aging. 2021;98:52–62.

  41. Mohanty R, Gonzalez-Burgos L, Diaz-Flores L, Muehlboeck JS, Barroso J, Ferreira D, et al. Functional connectivity and compensation of phonemic fluency in aging. Front Aging Neurosci. 2021;13:644611.


Acknowledgements

Not applicable.


Funding

This study was supported by grants from the Ministry of Science and Technology of the People's Republic of China (2021ZD0201804), the Shanghai Municipal Education Commission Gaofeng Clinical Medicine Grant Support (20172001), the Shanghai “Rising Stars of Medical Talent” Youth Development Program, Outstanding Youth Medical Talents (2018), and the Natural Science Foundation of Shanghai (219ZR1431500). Grant support for the Pitt corpus includes NIA AG03705 and AG05133. Funding was also provided by the National Natural Science Foundation of China (81971576; 81801652) and the Innovative Research Team of High-level Local Universities in Shanghai. Additional funding was provided by projects ZD2020127, 216Z7702G, and C20210506.

Author information

Authors and Affiliations



Contributions

HLW, RT, RJR, NYH, and GW were responsible for the study design. HLW, RT, RJR, QHG, GPP, HLC, YMZ, JTW, XYX, QH, and JPL were responsible for data collection and verification. HLW, RT, and HLC were responsible for the data analysis. HLW, RT, HLC, and GW were responsible for the figures. HLW and RT were responsible for writing the manuscript. RJR, EBD, QHG, GPP, HLC, YMZ, JTW, XYX, QH, JPL, FHY, SDC, NYH, and GW were responsible for critical review of the manuscript. HLW, RT, RJR, NYH, and GW had access to all the data in the study, and GW had final responsibility for the decision to submit for publication. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Na-Ying He or Gang Wang.

Ethics declarations

Ethics approval and consent to participate

All procedures involving human subjects/patients were approved by the Ethics Committee of the RSF centers. All included individuals provided written consent.

Consent for publication

The manuscript was approved by all authors for publication.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1:

  • Supplemental Figure 1. fMRI task paradigm.
  • Supplemental Figure 2. Group difference in the Semantic Fluency-Fixation contrasts and its behavioral significance. One-way ANOVA, cluster-level P < 0.05, FWE corrected.
  • Supplemental Table 1. Post hoc comparisons of the clinical characteristics in the China RSF multi-center cohort (Bonferroni corrected).
  • Supplemental Table 2. Clinical characteristics of NC and AD patients in the Pitt center cohort.
  • Supplemental Table 3. Correlation between PSD and cognition.
  • Supplemental Table 4. Linear regression analysis of the variables correlated with PSD after adjusting for age.
  • Supplemental Table 5. Demographic and neuropsychological/language characteristics of fMRI participants.
  • Supplemental Table 6. Significant clusters in three linguistic tasks showing different activation by ANOVA.
  • Supplemental Table 7. Results of correlation analysis. # Spearman correlation coefficients were calculated between the average BOLD signal and the relevant fluency score of ACE-III. * Partial correlation coefficients were calculated between the average BOLD signal and PSD in the patient group.
  • Supplemental Table 8. Post hoc analysis of cluster activation. Bold indicates significance (Bonferroni corrected).

Rights and permissions

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. A copy of this licence is available from Creative Commons. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Wang, HL., Tang, R., Ren, RJ. et al. Speech silence character as a diagnostic biomarker of early cognitive decline and its functional mechanism: a multicenter cross-sectional cohort study. BMC Med 20, 380 (2022).
