Table of Contents
Ethical statements
The BABYDIAB and BABYDIET studies were approved by the ethics committee of Bavaria, Germany (Bayerische Landesärztekammer no. 95357 and Ludwig-Maximilians University no. 329/00, respectively). Written informed consent to participate in the study was obtained from all study participants or their legal representatives. None of the participants were compensated.
POInT was approved by the institutional review boards and regulatory authorities of the Technische Universität München, Medical Faculty (326/17 Af), the Medical University of Warsaw (199/2017), the UK Health Research Authority (18/SC/0019), Onderzoek UZ/KU Leuven (S60711) and Regionala Etikprövningsnämnden i Lund (2017/918). Written informed consent to participate in the study was obtained from legal representatives of all study participants. A separate informed consent from the participating families was required to allow biobank storage of material, such as the DNA used in this study. None of the participants were compensated.
The Fr1da study was approved by the institutional review board at the Technical University of Munich (70/14). Written, informed consent was obtained from the children’s parents or legal guardians. None of the participants were compensated.
Cohorts
BABYDIAB/BABYDIET
BABYDIAB and BABYDIET are prospective birth cohorts of individuals with a first-degree family history of T1D and who were born in Germany between 1989 and 200618,19. The primary aim of these studies was to investigate the natural history of islet autoimmunity and T1D. Blood DNA samples were collected at the age of 2 years or at 1 of the next follow-up visits if not collected at age 2 years. We included 958 children in the present analysis for whom consent was given to use their DNA for genetic research and whose DNA was of adequate quality and quantity for EWAS analysis. Among these, 608 children had a mother with T1D and 350 children had a father and/or sibling with T1D and were born to a mother without diabetes (Extended Data Fig. 1a).
POInT
POInT is a randomized, controlled and multicentre clinical trial organized through the Global Platform for the Prevention of Autoimmune Diabetes (GPPAD)20. It is investigating whether daily intake of oral insulin reduces the incidence of islet autoimmunity and/or T1D in children at increased risk of T1D (ClinicalTrials.gov registration no. NCT03364868). Between 2018 and 2021 a total of 1,050 infants were enrolled at 4.0‒7.0 months of age. Infants were eligible if they had a predicted genetic risk of >10% for developing multiple islet autoantibodies by the age of 6.0 years. Blood DNA samples were collected at the age of 1.5 years. We included 794 children in the present analysis for whom consent was given to use their DNA for genetic research, and whose DNA was of adequate quality and quantity for EWAS analysis. Of these, 182 had a mother with T1D, 253 had a father and/or sibling with T1D and 359 had no first-degree relative with T1D (Extended Data Fig. 1b).
Fr1da
The Fr1da study is an ongoing public health screening programme for islet autoantibodies in Bavaria, Germany51. Between February 2015 and March 2024, 194,696 children aged 1.75‒10.99 years were screened for islet autoantibodies by primary care paediatricians during routine medical check-ups. We used whole-blood samples from 133 children who were identified with 2 or more persistent confirmed islet autoantibodies and of 111 age-matched and sex-matched control children who were islet autoantibody negative at screening (Extended Data Fig. 1c and Supplementary Table 1). All children included in the present analysis had mothers without T1D. For all children, consent was given to use their DNA for genetic research and DNA of adequate quality and quantity was available for methylation analysis.
DNA methylation measurement
After extraction, genomic DNA was purified and 750 ng of bisulfite converted using the EZ-96-DNA methylation kit (cat. no. D5008, Zymo Research). We performed a genome-wide DNA methylation analysis using the Infinium MethylationEPIC (850K) Bead-Chip array (Illumina). The methylation measurements were carried out at Helmholtz Center Munich for samples from the BABYDIAB/BABYDIET studies and at Life & Brain for samples from the POInT and Fr1da studies. Samples were randomized across the chips by age and sex. The Bead-Chips were imaged using an Illumina iScan and the raw fluorescence intensities of the images were extracted using the GenomeStudio Methylation module (Illumina).
Preprocessing and quality control of the BABYDIAB/BABYDIET, POInT and Fr1da studies were identically performed using R software (v.4.3.2) with the R package ENmix. The methylation probes from downstream analysis were excluded if their detection P value was >10−6 and if the bead count was fewer than 3 in at least 5% of the sample. Samples where <99% of the probes had a detection P value of <10−6 and samples that had a sex mismatch were also excluded. Data were normalized using quantile normalization as implemented in ENmix. Last, we excluded probes on sex chromosomes, cross-hybridizing probes and probes located near single nucleotide polymorphisms52,53.
Epigenetic age calculation
We estimated the methylation age of the children according to Horvath54 using the methylclock R package. Among various methods to calculate epigenetic age, Horvath’s method has been shown to be the most accurate for children’s blood DNA methylation55.
Meta-analysis of epigenome-wide associations with maternal T1D
Associations between the maternal T1D status of the offspring (no or yes) and the children’s blood DNA methylation were analysed by robust linear regression using the R package limma. Methylation M values (logit-transformed β values) were used for all analyses. To adjust for potential technical effects, the first three principal components, covering over 95% of the variation, were included (Extended Data Fig. 1d). The blood cell compositions of six blood cell types (CD4+ T cells, CD8+ T cells, B cells, NK cells, neutrophils and monocytes) were estimated from a reference panel using the R package FlowSorted.Blood.EPIC. Regression models were adjusted for age at DNA sample, sex, the first three principal components and the six estimated blood cell types. Extreme methylation outliers were excluded on the basis of the Tukey method, as previously described56. Before the meta-analysis, the quality of the EWAS results was checked using the R package QCEWAS. Correction for bias and inflation was performed using the R package bacon. A fixed-effects invariance-weighted meta-analysis using the 686,621 probes common to all studies was conducted using the R package metafor. The meta-analysis showed good model estimates (inflation λ = 1.04, bias µ = 0.07). We excluded 35,350 CpGs with high heterogeneity between the studies (I2 > 75), leaving a total of 651,271 CpGs for the downstream analysis.
In addition to the individual CpG analysis, the DMRs were examined using the R package dmrff, applying the same models as described above. Among the various tools available for DMR analysis, dmrff has been shown to overcome the limitations of other R packages for DMR analysis and to efficiently control the false-positive rates57. The analysis of DMR was performed using the dmr.meta function with default settings (maximal distance between consecutive CpGs = 500 base pairs) in the dmrff package.
Methylation sites were annotated to the human genome (reference genome GRCh37/hg19) using the R package IlluminaHumanMethylationEPICanno.ilm10b4.hg19. The false-discovery rate (FDR) according to Benjamini–Hochberg was used to correct for multiple testing if not specified otherwise, and PFDR < 0.05 (two-sided) was considered statistically significant.
Validation of CpG methylation by bisulfite pyrosequencing
We performed target-specific bisulfite pyrosequencing on a subset of up to 74 children with or without a mother with T1D matched for age (median 2.08 years, IQR 2.0‒2.2 years) and sex. Target CpGs for validation at HOXA5 and eQTM-confirmed T1D susceptibility genes were selected on the basis of significant methylation differences (nominal P < 0.05) at the array between the analysed children with or without a mother with T1D. One microgram of whole-blood DNA was bisulfite converted using the EZ-96-DNA Methylation Gold kit (cat. no. D5008, Zymo Research). Bisulfite-treated DNA was amplified by target-specific PCR using the HotStarTaq kit (cat. no. 203443, Qiagen) or ZymoTaq Premix (cat. no. E2003, Zymo Research) and the following PCR steps: polymerase activation at 95 °C for 15 min (10 min for ZymoTaq), 45 cycles of denaturation at 94 °C for 30 s, annealing at a variable temperature (below) for 45 s, elongation at 72 °C for 1 min and final elongation at 72 °C for 10 min (7 min for ZymoTaq). Target-specific PCR primers were obtained from Qiagen or designed using the PyroMark Assay Design software v.2.0.1.15. The following primer assays were obtained from Qiagen to target CpGs at HOXA5 (cat. no. 978776, GeneGlobeID: Hs_HOXA5_03_PM, targeting cg04863892, PCR annealing temperature 50 °C; GeneGlobeID: Hs_HOXA5_09_PM, targeting cg17432857, cg00969405, PCR annealing temperature 54 °C). The following primers (original sequence 5′ to 3′) were designed to target CpGs at T1D susceptibility genes (cg00106345 at SKAP2: forward primer TTGGCCCTCTAGGCAAGTAGGTCAG, biotinylated reverse primer GAGGCCCTGAACTGTTCATGGCAT, PCR annealing temperature 52 °C, sequencing primer GGGAGACTGGGTGAA; cg27003765 and cg07357081 at SLC44A4/LY6G5B: forward primer GCCCGCTGGGCACAAAGTTGAGAAGAAGGA, biotinylated reverse primer GAACTAAGGAGAGTACTGTGTCCCTGAGGG, PCR annealing temperature 56 °C, sequencing primer CCAGGTCTCCAGGGCTCCAA; cg26266427 and cg01337207 at TNXB/SKIC2, forward primer AGGATTGCGGTGTGAGGCAGTG, biotinylated reverse primer AGAGCCTCCAGCCAGCGCCTGCCCTGGAGG, PCR annealing temperature 52 °C, sequencing primer AGTGCCCGAATGACTGCAGCCAGCA). Designed primers were obtained from Integrated DNA Technologies. All primers were checked for amplicon specificity using the in silico PCR function provided by the UCSC Genome Browser and by agarose gel electrophoresis. Pyrosequencing on the amplified PCR products was performed on a PyroMark Q24 instrument (Qiagen) using the PyroMark Q24 Gold reagents (cat. no. 970802, Qiagen). Analysis was done using the PyroMark Q24 Analysis software v.2.0.7 (Qiagen). Only pyrosequencing runs that passed the integrated quality assessment were included in the analysis. Because of the high input DNA amount per assay, some assays could not be performed in the whole subset. Methylation levels between the groups were statistically compared using Welch’s t-test or Mann–Whitney U-test, depending on normal distribution of the data, and unadjusted P values are given (two-sided).
Heatmaps of CpGs associated with maternal T1D
Methylation M values were adjusted for the same covariables as in the EWAS model, z-scored and plotted by children with a mother with T1D and children without an affected mother. Heatmaps were created using the R package pheatmap using the Euclidean distance method to cluster the CpGs.
Association analysis between variables and CpG methylation
The potential influence of early-life (maternal-related and pregnancy-related) factors on the observed DNA methylation differences in children born to mothers with T1D was assessed using adjusted linear regression between CpG methylation and maternal age at delivery, maternal HbA1c in the last trimester of pregnancy, parity, delivery by Caesarean section and birth weight. Models were adjusted for age at DNA sample, sex, the first three principal components and the six estimated blood cell types. The maternal-related and pregnancy-related variables were analysed in the BABYDIAB/BABYDIET study.
Enrichment analysis of locations and transcription factor motifs
All CpGs included in the analyses were annotated to their genetic region using the R package annotatr and hg19 as the reference genome. Hypergeometric tests were used to determine region enrichment or depletion on the basis of the CpGs per genetic region among the identified CpGs and all CpGs included in the meta-analyses using the phyper function implemented in R. Annotations to CpG island classifications were extracted from the Illumina annotation file. The P values were adjusted for the six region types or island classifications using Bonferroni correction. The Roadmap ChromHMM marks of primary blood cells from peripheral blood and T1D-relevant tissues58 were used to perform enrichment analysis for regulatory elements. The P values were adjusted for the six regulatory element types using Bonferroni correction. Motif enrichment analysis of the transcription factor binding sites was performed using the ELMER package in R using all differentially methylated CpGs found in the offspring of mothers with T1D. Only the motifs of class A and B (strong confidence) were examined. Motifs with an OR > 1 and PFDR < 0.05 were considered significantly enriched.
Functional evaluation of CpGs
To assess whether the identified methylation sites were associated with blood-specific gene expression, we queried the HELIX cis-eQTM catalogue (adjusted for blood cell types) of blood from 832 children with a mean age of 8.1 years (Illumina 450 K methylation array only)59. The EWAS catalogue was queried for protein traits associated with the identified CpGs60.
Gene ontology analysis
Gene ontology analysis was performed using Panther61. Over-representation of ontologies among the input genes was assessed using Fisher’s exact test (PFDR < 0.05).
Identification of genes, proteins and methylation loci
To determine whether the CpG gene targets were previously identified as T1D susceptibility genes, we queried the Harmonizome database for T1D, which included 144 distinct genes associated with T1D62 and the Genome-Wide Association Studies catalogue, which included 335 genes. For proteins related to T1D development, we searched PubMed using the terms ‘protein biomarkers’ AND ‘type 1 diabetes’ (search included publications until 30 May 2024) and compared the identified biomarker proteins in matching studies35,63,64,65,66,67 with the associated protein traits. Enrichment among the associated CpGs was determined using a hypergeometric test with Bonferroni correction for multiple testing (P < 1.64 × 10−4). Furthermore, we assessed the overlap between maternal T1D-associated CpGs and DMRs and methylation loci in blood previously linked to T1D development and patients with T1D43,68,69,70,71.
Development of the MPS for offspring of mothers with T1D
We generated MPSs for the offspring of mothers with T1D to capture their methylation differences. The CpGs were first stratified into those associated with T1D susceptibility loci and the remainder. For CpGs strongly correlated with each other (Pearson’s r > 0.8), we randomly excluded one correlated CpG. Random-forest recursive feature elimination was applied using the R package caret to identify the CpGs with the highest importance for predicting offspring of a mother with T1D. We split the data by study into training (BABYDIAB/BABYDIET) and test (POInT) sets. The recommended models were then used to generate the MPSs by a weighted sum of the individual methylation levels using the R package tidymodels. We performed logistic regression with age at DNA methylation and sex to assess the association of the MPSs with the islet autoimmunity outcome, that is, persistent confirmed multiple islet antibodies and/or T1D, in the offspring of mothers without T1D. The association between the MPSs and the islet autoantibody outcome was validated using an independent cohort of 244 children from the Fr1da study (Supplementary Table 1). A P value <0.05 (two-sided) was considered statistically significant.
Statistical analysis
Data collection and analysis were not performed blind to the conditions of the experiments. No samples were excluded after quality control of the methylation data (Extended Data Fig. 1a,b). Extreme methylation values, as defined by the Tukey method, were excluded before the EWAS. In addition, methylation sites showing high heterogeneity (I2 > 75) between the meta-analysed studies were excluded, as described above. We used R software (v.4.3.2) for all statistical analysis and graphical illustrations. Data met the assumptions of the respective statistical tests used. Normality was tested using Shapiro–Wilk test and the F-test applied to check for equal variances. Data were not transformed to achieve normal distribution. Respective statistical tests and multiple testing correction methods are reported within Methods. P values were two-sided, except for enrichment analysis (one-sided).
Figure alignment
Inkscape software (v.1.0.2) was used for schematic illustrations and figure alignment.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.