INTRODUCTION

The dystrophin gene (DMD) is localized on the X chromosome. Variants in DMD have been recognized as a cause of the most common form of muscular dystrophy during childhood, Duchenne muscular dystrophy (DMD).1 This fatal, X-linked disorder leads to progressive muscle weakness and less well-described non-progressive central nervous system (CNS) manifestations.2

A consistent finding among patients with DMD is a reduction in full-scale intelligence quotient. Although most individuals are not intellectually disabled, the risk of cognitive impairment is increased among affected males, and up to 30% of patients have intellectual disability.3, 4, 5 Beyond general intellectual ability, impairments in specific neurocognitive functions have frequently been reported.6 Deficits in short-term memory, executive functions, visuospatial ability and some aspects of attention, as well as problems with narrative, linguistic and reading skills, have been described irrespective of general intelligence.7, 8, 9, 10, 11, 12 Moreover, a higher incidence of neuropsychiatric disorders, such as autism spectrum disorder, attention deficit hyperactivity disorder, obsessive-compulsive disorder and social behavior problems, has been found among affected males.13, 14, 15, 16, 17

To the best of our knowledge, the impact of DMD on cognitive ability in cognitively healthy populations has not been studied. In the current study we therefore investigate whether single-nucleotide variants in DMD are associated with variability in cognitive functions in general populations, which would point to loci in the DMD that contribute to cognition besides the variants that cause Duchenne muscular dystrophy.

MATERIALS AND METHODS

Study populations

Our study population consisted of subjects from the Erasmus Rucphen Family (ERF) study and the Rotterdam Study (RS). ERF is a family-based study that includes inhabitants of a genetically isolated community in the South-West of the Netherlands, studied as part of the Genetic Research in Isolated Populations (GRIP) program.18 The study population includes ~3000 individuals who are living descendants of 22 couples who had at least six children baptized in the community church. All data were collected between 2002 and 2005. The population shows minimal immigration and high inbreeding; therefore, the frequency of rare alleles is increased in this population. All participants gave informed consent, and the Medical Ethics Committee of the Erasmus University Medical Centre approved the study.

The RS is a prospective, population-based study from the well-defined Ommoord district of the city of Rotterdam that investigates the occurrence and determinants of diseases in the elderly.19 The cohort was initially defined in 1990 among ~7900 persons who underwent a home interview and extensive physical examination at baseline and during follow-up rounds every 3–4 years. The cohort was extended in 2000 and 2005.19 RS is an outbred population, predominantly of Dutch origin. The Medical Ethics Committee of the Erasmus Medical Center, Rotterdam, approved the study. Written informed consent was obtained from all participants.

Data collection procedure

Participants from both cohorts underwent extensive neuropsychological examination. In the ERF study, different cognitive domains were assessed using a validated Dutch battery of neuropsychological tests.20,21 We focused on neurocognitive domains that are known to be affected in patients with DMD.8, 9, 10, 11, 12 General cognitive ability was assessed with the Dutch Adult Reading Test (DART). Memory function was measured with a word learning test from which immediate recall and learning scores were derived, while executive function was assessed with the Trail Making Test (TMT) parts A and B22 and verbal fluency tests.22 Visuospatial ability was assessed with the WAIS-III block-design subtest.

In the RS, global cognitive function was assessed with the Mini-Mental State Examination (MMSE), while executive function and information processing speed were assessed with the Letter-Digit Substitution Task (LDST),23 the Word Fluency Test (WFT)24 and the abbreviated Stroop test.25 Examination was performed at baseline (MMSE) and during follow-up rounds (MMSE, LDST and WFT).

Participants from both cohorts who had dementia or clinical stroke were excluded from the analysis, as these conditions can influence neuropsychological assessment.

Genotyping/sequencing

The exomes of 1336 individuals from the ERF population were sequenced ‘in-house’ at the Center for Biomics of the Cell Biology Department of the Erasmus MC, The Netherlands, using the Agilent version V4 capture kit (Agilent Technologies, Santa Clara, CA, USA) on an Illumina Hiseq2000 sequencer (Illumina, San Diego, CA, USA) using the TruSeq Version 3 protocol (Illumina). The sequence reads were aligned to the human genome build 19 (hg19) using BWA and the NARWHAL pipeline.26,27 The aligned reads were processed further using the IndelRealigner, MarkDuplicates and TableRecalibration tools from the Genome Analysis Toolkit (GATK) and Picard (http://picard.sourceforge.net). Genetic variants were called using the Unified Genotyper tool of the GATK. About 1.4 million single-nucleotide variants (SNVs) were called and after removing the low quality variants (QUAL<150) we retrieved 577 703 SNVs in 1309 individuals. Further, for prediction of the functionality of the variants, annotations were performed using the SeattleSeq database (http://snp.gs.washington.edu/SeattleSeq Annotation131).
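The quality filter on called variants (removal of calls with QUAL<150) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: it assumes the calls are available as VCF text lines, the helper name `filter_by_qual` is hypothetical, and the example records, including their positions and QUAL values, are made up.

```python
def filter_by_qual(vcf_lines, min_qual=150.0):
    """Keep VCF header lines and records whose QUAL is at least min_qual."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            kept.append(line)  # header lines pass through unchanged
            continue
        fields = line.rstrip("\n").split("\t")
        qual = fields[5]  # QUAL is the sixth tab-separated VCF column
        if qual == ".":
            continue  # missing quality is treated as low quality (assumption)
        if float(qual) >= min_qual:
            kept.append(line)
    return kept


# Hypothetical records; positions and QUAL values are illustrative only.
example = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "X\t1000\t.\tA\tG\t512.3\t.\t.",
    "X\t2000\t.\tG\tA\t87.0\t.\t.",
]
filtered = filter_by_qual(example)  # keeps the header and the first record
```

Treating a missing QUAL value as a failure is a design choice of this sketch; the text does not say how such records were handled.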

In the RS, exomes of 1764 individuals from the RS-I population were sequenced using the Nimblegen SeqCap EZ V2 capture kit (Roche NimbleGen, Madison, WI, USA) on an Illumina Hiseq2000 sequencer and the TruSeq Version 3 protocol. The sequence reads were aligned to hg19 using the Burrows-Wheeler Aligner.27 Subsequently, the aligned reads were processed further using Picard (http://picard.sourceforge.net), SAMtools28 and GATK.29 Genetic variants were called using the Unified Genotyper tool from GATK. Samples with low concordance to the genotyping array (<95%), low transition/transversion ratio (<2.3) or high heterozygote to homozygote ratio (>2.0) were removed from the data. The final data set consisted of 903 316 SNVs in 1524 individuals.
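The three sample-level exclusion criteria can be expressed as a single predicate. The sketch below uses the cut-offs stated in the text; the `SampleQC` record, its field names and the example values are hypothetical, and concordance is assumed to be stored as a fraction rather than a percentage.

```python
from dataclasses import dataclass


@dataclass
class SampleQC:
    sample_id: str
    array_concordance: float  # fraction of genotypes matching the array, 0..1
    titv_ratio: float         # transition/transversion ratio
    het_hom_ratio: float      # heterozygote to homozygote ratio


def passes_qc(s, min_concordance=0.95, min_titv=2.3, max_het_hom=2.0):
    """Return True if the sample survives all three exclusion criteria."""
    return (
        s.array_concordance >= min_concordance
        and s.titv_ratio >= min_titv
        and s.het_hom_ratio <= max_het_hom
    )


# Hypothetical samples with made-up metrics.
samples = [
    SampleQC("S1", 0.99, 3.1, 1.6),   # passes
    SampleQC("S2", 0.93, 3.0, 1.5),   # fails concordance
    SampleQC("S3", 0.98, 2.1, 1.7),   # fails Ti/Tv
    SampleQC("S4", 0.98, 3.2, 2.4),   # fails het/hom
]
kept_ids = [s.sample_id for s in samples if passes_qc(s)]
```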

Statistical analysis

Baseline descriptive analysis was performed with SPSS version 17 (IBM, New York, NY, USA). Deviation from normality of cognitive functions was assessed by histograms and P-P plots. As the ERF study includes related individuals, all single variants in DMD were tested for association by additive linear mixed modeling with the ‘mmscore’ function in the GenABEL library of the R software, adjusting for age, sex and education.30 The ‘mmscore’ function uses the relationship matrix estimated from genomic data in the linear mixed model to correct for relatedness among the samples. Additionally, for the most interesting results a gender-stratified analysis was performed. As most of the cognitive tests are correlated (the Pearson correlation coefficient ranged from 0.219 to 0.670), to adjust for multiple testing we first calculated the effective number of independent tests from the eigenvalues of the correlation matrix using the Matrix Spectral Decomposition (matSpDlite) software;31 a Bonferroni correction was then applied for the effective number of independent tests. The same strategy was adopted to account for linkage disequilibrium between the SNVs of the DMD. Considering the number of independent cognitive tests and independent variants, the significance threshold was set to 0.05/(4 independent cognitive tests × 124 independent variants)=1.00 × 10−04, whereas the suggestive threshold was set to 1/(4 independent cognitive tests × 124 independent variants)=2 × 10−3. SNVs were coded 0, 1, 2 for genotypes AA, AB, BB in females, respectively, and 0, 2 for genotypes A, B in hemizygous males.
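The multiple-testing arithmetic and the X-chromosome genotype coding can be sketched in a few lines. The assumptions are substantial: the effective number of tests is computed here with Nyholt's eigenvalue-variance formula, which is one estimator of this kind, and the text does not state which estimator was actually used; all function names are hypothetical; and the 2 × 2 correlation matrix is made up. Because the eigenvalues of a correlation matrix sum to its dimension and their squares sum to the sum of squared entries, the eigenvalue variance is obtained without an eigensolver.

```python
def nyholt_meff(corr):
    """Effective number of independent tests (Nyholt's formula, an assumption)."""
    m = len(corr)
    if m == 1:
        return 1.0
    # Eigenvalues of an m x m correlation matrix have mean 1, and the sum of
    # their squares equals the sum of squared matrix entries.
    sum_sq = sum(r * r for row in corr for r in row)
    var_eig = (sum_sq - m) / (m - 1)  # sample variance of the eigenvalues
    return 1 + (m - 1) * (1 - var_eig / m)


def thresholds(n_tests, n_variants=1):
    """Bonferroni-style significance and suggestive thresholds."""
    n = n_tests * n_variants
    return 0.05 / n, 1.0 / n


def code_x_genotype(n_alt_alleles, sex):
    """Additive coding on the X chromosome: females 0/1/2, males 0 or 2."""
    if sex == "F" and n_alt_alleles in (0, 1, 2):
        return n_alt_alleles
    if sex == "M" and n_alt_alleles in (0, 1):
        return 2 * n_alt_alleles  # hemizygous carriers are coded like BB females
    raise ValueError("invalid sex or allele count")


# Made-up correlation of 0.5 between two tests: eigenvalues 1.5 and 0.5.
meff_example = nyholt_meff([[1.0, 0.5], [0.5, 1.0]])

# Single-variant thresholds with 4 independent tests and 124 independent variants.
significant, suggestive = thresholds(4, 124)  # 0.05/496 and 1/496
```

With a single gene-level test per trait (`n_variants=1`), the same helper gives 0.05/4 = 0.0125 and 1/4 = 0.25.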

Since sequencing is likely to reveal several variants that may be population specific, we also performed the gene-based Sequence Kernel Association Test (SKAT), a test specifically designed to analyze rare sequence variation in a specific gene/region.32 By assessing the joint effect of multiple variants within the gene/region, SKAT is proposed as a more powerful approach for rare variants than classical single-variant analysis and several burden tests.32 The significance threshold for the gene-wise analysis was set to 0.05/4 independent cognitive tests=0.0125, while the suggestive threshold was set to 1/4 independent cognitive tests=0.25.

To assess the relationship between the SNVs outside the protein-coding regions with gene expression in the tissue, we used the Genotype-Tissue Expression (GTEx) project database.33

The data were deposited in the GWAS Central database, under the accession number HGVST1824 (http://www.gwascentral.org/study/HGVST1824).

RESULTS

General characteristics of the studied populations are shown in Table 1. The mean age in ERF was 48 years and 39% of the participants were males, while the mean age in RS was around 68 years and 44% of the participants were males. Around 30% of participants in the ERF study had only primary education compared with around 36% of subjects in the RS.

Table 1 Descriptive statistics of the study populations

The number of SNVs in the DMD discovered by exome sequencing was 165 in the ERF and 482 in the RS (Supplementary Table 1). Around 70% of variants in the DMD had a minor allele frequency (MAF) lower than 0.05 in ERF compared with around 98% of variants in the RS.
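For an X-linked gene, allele frequencies have to account for males carrying a single copy. The sketch below shows one way to compute the MAF and the share of variants below the 0.05 cut-off. It assumes sites outside the pseudoautosomal regions, with females contributing two chromosomes and males one; the function names and example genotypes are hypothetical, and the text does not describe how the authors computed MAF.

```python
def x_allele_frequency(female_alt_counts, male_alt_counts):
    """Alternate-allele frequency at an X-linked site.

    female_alt_counts: alternate-allele count per female (0, 1 or 2).
    male_alt_counts: alternate-allele count per male (0 or 1).
    """
    n_chrom = 2 * len(female_alt_counts) + len(male_alt_counts)
    if n_chrom == 0:
        raise ValueError("no genotyped individuals")
    return (sum(female_alt_counts) + sum(male_alt_counts)) / n_chrom


def minor_allele_frequency(alt_freq):
    """Frequency of the less common of the two alleles."""
    return min(alt_freq, 1.0 - alt_freq)


def fraction_below(mafs, cutoff=0.05):
    """Share of variants whose MAF is strictly below the cut-off."""
    return sum(1 for maf in mafs if maf < cutoff) / len(mafs)


# Made-up genotypes: three females and two males, eight X chromosomes in total.
freq = x_allele_frequency([0, 1, 2], [0, 1])
maf = minor_allele_frequency(freq)
share_rare = fraction_below([0.01, 0.20, 0.04, 0.30])
```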

The associations between SNVs in the DMD and cognitive functions that reached nominal significance in the ERF study are presented in Table 2. Although none of the findings surpassed multiple testing correction using a Bonferroni threshold of 1.00 × 10−04, a strong association was observed between rs147546024:A>G and the block-design test (β=1.786, P-value=2.56 × 10−04). Gender-stratified analysis showed nominally significant association in both genders (β=1.796, P-value=0.009 in males and β=1.623, P-value=0.018 in females). This rare A>G variant, with an MAF of 0.011, is localized in intron 1 of the DMD (chrX.hg19:g.33146086A>G) and, although highly conserved across species (conservation score GERP=4.08), has an unknown effect on the protein. On the basis of its localization, we studied the relationship of this variant with gene expression in human tissues using the GTEx database, but no significant eQTLs were found. The family-based design of the ERF study allowed us to check whether the carriers (n=24) of this variant were closely related. All carriers were connected to each other within 10 generations (Figure 1).

Table 2 Association of DMD variants with cognitive abilities in ERF study
Figure 1

Carriers of the SNV that achieved the strongest association in the ERF. Carriers are indicated in black.

Next, we explored the association of rs147546024:A>G in the population-based study (RS). Even though rs147546024:A>G is a previously identified variant in the dbSNP database (present in 6 copies in 1000 Genomes with an MAF of 0.004), it was not present in RS and was not in linkage disequilibrium with any of the other SNVs of DMD. This prompted us to look for overlapping variants between the two studies. Among the 34 overlapping variants, the most interesting finding is shown in Table 3. The variant rs1800273 (chrX.hg19:g.31986607G>A) had a similar MAF in both studies (0.038 in the ERF and 0.033 in the RS), a similar effect size and the same direction of effect in both cohorts, and was suggestively associated with the block-design test in the ERF study (β=−0.424, P-value=0.066) and with MMSE in RS (β=−0.465, P-value=0.002) (Table 3). This G→A variant is localized in exon 45 of the DMD and is classified as a missense variant with a predicted damaging effect on the protein (PolyPhen score=0.99, conservation score GERP=2.52). It is present in 23 copies in 1000 Genomes with an MAF of 0.014. All carriers of the variant in the ERF were connected to each other (Figure 2).

Table 3 Overlapping variant in both cohorts
Figure 2

Carriers of the overlapping SNV in the ERF. Carriers are indicated in black.

In the gene-based analysis using SKAT, suggestive associations were also observed in ERF for DART (P-value=0.087) and in RS for MMSE (P-value=0.074).

DISCUSSION

The aim of this study was to investigate possible impact of genetic variants in the DMD on cognitive ability in the general population. Even though none of the DMD variants surpassed the prespecified significance threshold, rs147546024:A>G was suggestively associated with block-design test in ERF, whereas rs1800273:G>A was nominally associated with MMSE test in the RS and marginally associated with block-design test in ERF.

rs147546024:A>G is localized in the intron 1196 bp from the promoter of the full-length protein isoform (Dp427p), which is expressed predominantly in the Purkinje cells of the hippocampus. The frequency of this variant in 1000 Genomes was 0.005 in individuals of European origin, compared with 0.011 in ERF. This enrichment is expected given the genetic drift and isolation of the ERF population.18 Functional prediction of this variant showed a high conservation score and an unknown effect on the protein, while gene expression analysis found no significant eQTLs in various human tissues. Interestingly, the rare allele of rs147546024:A>G was associated with better cognitive performance on the block-design test, which is designed to assess visuospatial ability. In line with studies that have described a sex difference in cognitive ability with a male advantage in spatial domains,34 males in our study scored slightly, but not significantly, higher on the block-design test. It is known that better performance on the block-design test is associated with autism spectrum disorder35, 36, 37 and DMD is recognized as one of the susceptibility genes for autism.38,39 Suppression of the global configuration in order to process information in a detailed manner, which is essential for this test, is described as a main characteristic of autistic patients.40, 41, 42, 43

Another biologically interesting finding from the search for overlapping variants was the missense G→A variant rs1800273:G>A, which was associated with the block-design test in ERF and with the test of global cognitive ability (MMSE) in RS. This variant was observed at a frequency of 0.033 in individuals of European origin and was absent in those of African and Asian origin. Localized in exon 45 of the DMD, it was classified as a missense variant with a predicted damaging effect on the protein. Since the DMD has three upstream and four intragenic promoters that control expression of full-length (Dp427c, Dp427m and Dp427p) and short protein isoforms (Dp260, Dp140, Dp116 and Dp71), exon 45 is present in four different isoforms (Dp427c, Dp427m, Dp427p and Dp260), among which Dp427c and Dp427p are expressed in the brain.44 Dp427c is expressed predominantly in neurons of the cortex and the CA regions of the hippocampus. It has been shown that this dystrophin isoform colocalizes with inhibitory GABA receptor clusters at the postsynaptic membranes of hippocampal and neocortical pyramidal neurons, where synapse function is modulated.45, 46, 47, 48 According to various studies, this isoform has a stabilizing effect on the GABA receptors by limiting their lateral diffusion outside the synapse.49,50 The importance of GABA receptors for the regulation of cognition, emotion and memory is increasingly being recognized.51,52 Dp427p is expressed in the cerebellar and hippocampal Purkinje cells and in the cortical brain.53,54 However, exon 45 does not affect the three shorter DMD isoforms (Dp140, Dp116 and Dp71) which are known to be associated with cognitive function in DMD.55,56 rs1800273:G>A was detected earlier in DMD patients and is present in the Leiden Muscular Dystrophy database.57 Since the majority of DMD patients have cognitive impairment, the association of rs1800273:G>A with DMD may represent an association with cognitive impairment. However, the co-occurrence of this variant with lack of the dystrophin protein, which can by itself lead to cognitive impairment, would make it difficult to study the separate effect of this variant in DMD patients.

One of the difficulties that our study had to deal with is heterogeneity in the classification of phenotypes. Even though different cognitive tests were used in the studied populations, different cognitive domains can be compared since they are correlated. In the ERF, visuospatial ability was moderately correlated with global cognitive ability (Pearson correlation coefficient of 0.429, P-value<0.0001) and with executive function (Pearson correlation coefficient of 0.460, P-value<0.0001), which is recognized as a central domain of cognitive functioning.58,59 These correlations allow us to compare the association of the most interesting overlapping variant with the block-design test in the ERF and with the MMSE test in the RS.

The majority of variants called in our study were rare variants. Even though there is growing evidence that rare variants contribute to the etiology of different complex traits, the search for rare variants is challenging. Standard methods used to test for association with single common genetic variants are not powerful enough for the analysis of rare variants.60, 61, 62 Therefore, with the available sample size, our study had limited power to detect association. We attempted to overcome this using the recently proposed gene-based analysis (SKAT) designed for rare variant analysis.32 Assessing the cumulative effect of multiple variants in DMD yielded only suggestive P-values in both cohorts. Like other approaches that deal with rare variants, this approach also has limitations in terms of power, but the suggestive P-values generated by SKAT indicate that variants in the DMD may affect cognitive functioning in healthy populations.

In conclusion, analyzing the sequence variants in the exons of DMD in two cognitively healthy cohorts, we find suggestive evidence of association of DMD with cognitive functioning in healthy individuals. Larger studies are required for confirmation.