Variability of multi-omics profiles in a population-based child cohort

Background Multiple omics technologies are increasingly applied to detect early, subtle molecular responses to environmental stressors for future disease risk prevention. However, there is an urgent need for further evaluation of stability and variability of omics profiles in healthy individuals, especially during childhood. Methods We aimed to estimate intra-, inter-individual and cohort variability of multi-omics profiles (blood DNA methylation, gene expression, miRNA, proteins and serum and urine metabolites) measured 6 months apart in 156 healthy children from five European countries. We further performed a multi-omics network analysis to establish clusters of co-varying omics features and assessed the contribution of key variables (including biological traits and sample collection parameters) to omics variability. Results All omics displayed a large range of intra- and inter-individual variability depending on each omics feature, although all presented a highest median intra-individual variability. DNA methylation was the most stable profile (median 37.6% inter-individual variability) while gene expression was the least stable (6.6%). Among the least stable features, we identified 1% cross-omics co-variation between CpGs and metabolites (e.g. glucose and CpGs related to obesity and type 2 diabetes). Explanatory variables, including age and body mass index (BMI), explained up to 9% of serum metabolite variability. Conclusions Methylation and targeted serum metabolomics are the most reliable omics to implement in single time-point measurements in large cross-sectional studies. In the case of metabolomics, sample collection and individual traits (e.g. BMI) are important parameters to control for improved comparability, at the study design or analysis stage. This study will be valuable for the design and interpretation of epidemiological studies that aim to link omics signatures to disease, environmental exposures, or both. Supplementary Information The online version contains supplementary material available at 10.1186/s12916-021-02027-z.


Background
Characterizing early indicators of health and disease trajectories during pregnancy and childhood is at the core of the life course approach [1][2][3][4]. Early life englobes the most critical/sensitive periods for organ development, which makes it especially vulnerable to the effects of environmental exposures [5,6]. The integration of multiple omics data-such as epigenomics, transcriptomics, proteomics and metabolomics-is increasingly applied to detect early, subtle molecular responses to environmental exposures because it employs a holistic view on all cellular processes [7][8][9][10]. However, there is an urgent need for further evaluation of stability and variability of omics profiles, between and within healthy children. Epidemiological studies that incorporate omics profiles to monitor healthy individuals over time need to be informed of technical and biological variability in order to interpret changes in omics profiles, even if they are small. Omics variability may be determined by factors hindering subtle biological changes of interest, such as seasonality, individual characteristics (age and BMI), stage of life (i.e. hormones might vary between prepuberty and adulthood stages) as well as by technical variability (due to measurement error and limited precision of analytic tools), which therefore must be controlled at the design of the study [2,5,6].
Temporal variability in omics profiles has been described previously to assess the reliability of single time-point measurements in cross-sectional studies and to understand ageing and disease processes [11,12]. In this paper, we define "intra-individual variability" as the variability estimated within individuals over time and "inter-individual variability" as the betweenindividuals variability. We also define the "short-term" as a time span of hours or days, the "medium-term" as a time span of months, and the "long-term" when considering years. For example, in the short term, the metabolome in urine and blood is assumed to be more dynamic than other omics as it is the downstream result of in vivo substances and environmental factor influence [4]. On the other hand, DNA methylation is generally considered to be the most stable omics profile over short periods of time [13] and could provide more valuable information for environmental epidemiological purposes than other omics, as some methylomics signatures (e.g. smoking signatures) have been shown to persist over time even when the exposure no longer exists [2,[13][14][15][16][17][18][19]. Previous studies have also shown low levels of intra-individual variability in > 95% of the gene expression profile and in 25.5% of the miRNAs analysed, which are proposed as good biomarkers for many human diseases [20][21][22][23][24][25][26][27]. Overall, the blood proteome is considered quite stable over time with strong inter-individual variability due to genetics, although some proteins are highly influenced by body mass composition and acute inflammation [28].
To date, multi-omics platforms have mainly been used in studies with a small sample size that focused on dietary or physical activity interventions rather than following up healthy people from the general population [2,29,30]. They agreed that inter-individual variation, rather than intra-individual variation, was the main explanatory factor for all omics measurements. However, previous multi-omics profile studies have not considered changes related to short and medium term. Especially, there is a lack of evidence regarding children from the general population, nor single or multi-omics studies, and the contribution of several factors such as age, sex and BMI.
In the present study, we estimated intra-and interindividual variability in multi-omics profiles (blood DNA methylation, gene expression, miRNA, proteins and serum and urine metabolites) in 156 children from five European countries at two time points with a 6-month interval. We further aimed to assess interrelationships between the variability in different omics layers by performing a crossomics correlation network analysis. Finally, we aimed to decompose the variance in multi-omics profiles according to (1) inter-individual characteristics: sex, ancestry, age, maternal education, Mediterranean diet quality index (KIDMED score) and zBMI (which do not change in a 6month period); and (2) intra-individual characteristics: hours of fasting before the visit for blood/ urine sampling, heavy exercise practice the same day or the day before sampling, having a cold at the moment of the visit, hour of sampling, day of the week and season (Fig. 1).

Study design and population
The HELIX (Human Early Life Exposome) study is a collaborative project of six-population based cohorts in different European Countries: UK (BiB: Born in Bradford) [31], France (EDEN: Étude des Déterminants pré et postnatals du développement et de la santé de l'Énfant) [32], Spain (INMA: Infancia y Medio Ambiente) [33], Lithuania (KANC: Kaunus cohort) [34], Norway (MoBa: Norwegian Mother and Child Cohort Study) [35] and Greece (RHEA: Mother Child Cohort study in Crete) [6,36]. The aim of the HELIX study was to measure and describe multiple environmental exposures during early life (pregnancy and childhood) in a prospective cohort and associate these exposures with molecular omics signatures and child health outcomes.
From the six existing European longitudinal population-based birth cohorts studies participating in HELIX, a subcohort of 1301 mother-child pairs was selected to be fully characterized for a broad suite of   [38]. The mean difference between the two visits (A and B) was 6.11 months (standard deviation (SD): 2.18 months). In order to characterize in depth the variability of the omics measurements, the study population was further restricted to children with complete information for both visits (A and B) for at least one of the omics (N= 156).
Prior to the start of data collection, national ethics committees had granted all the required permissions that allowed cohort participant recruitment and followup visits. Additionally, all the participants were asked to sign a HELIX specific informed consent.

Sample collection
Biological samples were collected using the same standardized protocols across all five cohorts. Urine samples were collected twice daily (first morning void and bedtime sample) in high-quality polypropylene tubes. The two urine samples were brought by the participants to the centre in cool packs and stored at − 4°C until processing. After aliquoting, the urine samples were frozen at − 80°C under optimized and standardized procedures. A pooled sample of both the morning and the night urine samples was used for the analysis when available (94.9% of individuals in the first visit and 95.5% in the second visit). In visit A, 7 children only had a morning sample available, and 1 child only had night sample. For visit B, this happened in 4 and 3 children for morning and night samples, respectively.
Eighteen milliliters of blood was collected at the end of the clinical examination of the child, ensuring an approximate 3-h fasting time since the last meal (visit A mean: 3: 34 h (SD: 1:11 h); visit B mean: 2:35 h (SD: 1:31 h)). Blood samples were collected using a 'butterfly' vacuum clip and local anaesthetic and processed into a variety of sample matrices for serum, plasma, whole blood for RNA extraction, red cells and a buffy coat for DNA extraction. After processing, these samples were frozen at − 80°C under optimized and standardized procedures [6].
Summary of laboratory processing of omics signatures, quality control and normalization We performed in-depth omics profiling at two time points 6 months apart for all 156 children. Because only 87 children had complete data of all omics analyses at both visits, we decided to analyse each omics profile independently (i.e. all paired samples available for each specific omics layer). The final sample size for each omics layer was 149, 105, 100, 149, 154 and 154 children for DNA methylation, blood gene expression, miRNA expression, proteins, serum metabolites and urine metabolites, respectively. Details on laboratory and data processing methods are available in Additional file 1 -Supplementary Methods [39][40][41][42][43][44][45][46][47][48]. While DNA methylation, gene and miRNA expression screenings were based on genomewide arrays, the other methods were targeted or semitargeted. From now on, we use the term "features" to refer to the omics variables in our study: CpGs, gene and miRNA transcripts, proteins and metabolites. Because our study did not have technical replicates (biological samples systematically aliquoted in two replicates before sample preparation), which would be the ideal way to measure technical variability, potential technical variability was filtered out as much as possible in each omics layers, before fitting the variance partition models.
All samples were randomized in the arrays by sex and cohort, and in addition, the samples from the same individual in the microarray-based platforms were paired in the same plate/array (see Supplementary Methods). In the methylation, gene and miRNA expression data, we corrected remaining technical batch effects and blood cell composition by calculating surrogate variables (SVs) while protecting for cohort, sex and age with the SVA and SmartSVA methods [44,49]. We used residuals from the correction process to analyse blood DNA methylation, gene and miRNA expression. We excluded probes for CpGs that did not reach a 62.5% interclass correlation coefficient (ICC) to minimize technical variability, based on a previous analysis with technical replicates [46]. Gene and miRNA expression were filtered out based on call rate or other omics based on other technical parameters (see Table 1). All omics measurements were normalized and log2 transformed except for DNA methylation.

Statistical analyses
All statistical analyses were performed using R version 3.6.3 [50].

Linear mixed effect models
Variability present in the different omics layers was calculated with the variancePartition R package [47]. Briefly, it fits a linear mixed effect model to partition the variance attributed to multiple variables in the data. As this analytical process uses a multiple regression model, the effect of each variable is assessed while correcting for the others. Therefore, variancePartition assessed the contribution of each meta-data variable to variation in each feature.
We considered two mixed effect models: (1) a model to estimate the proportion of variance attributable to intra-individual, inter-individual and cohort variability and (2) the same model adjusting for several explanatory variables (inter-and intra-individual variability-related variables, see list below) to determine the proportion of variance they accounted for. Individual IDs were entered in the models to account for inter-individual variability, whereas we took residuals as a measure of intraindividual variability.
The following explanatory variables were added to the model as a measure of (1) inter-individual variation: sex, ancestry of the child, age, maternal education-as a general measure of socio-economic status-KIDMED score as a measure of healthy diet pattern [51] and zBMI, as we did not observe significant changes in the 6-month period; and of (2) intra-individual variation: time to last meal (hours of fasting), heavy exercise practice the same day or the day before the sample collection, having a cold at the moment of sampling, hour of sample collection, run order for the urine metabolome model, day of the week and season at which the samples were collected. These were biological traits and sample collection parameters that were obtained through questionnaires. All omics, except the urine metabolome, were corrected for omics platform technical variables; therefore, we only included the run order as a covariate in the urine metabolome model. Time to last meal and hour of sample collection were not included in the urine metabolome model because we used pooled samples (morning and night).
Before running the models, we ensured the absence of collinearity between the explanatory variables by obtaining a collinearity score: if this score were to be > 0.99, the variance partition estimates would produce misleading results and overestimate the contribution of variables modelled as fixed effects [47]. No variables were eliminated due to collinearity.
In the case of the methylome, we aimed to assess the amount of variance attributed to differences in the immunological cell type composition. For this purpose, an extra model was performed with a dataset corrected for batch effect using the ComBat method [52] instead of using SVA method, in order to keep the effect of cell type composition.
Additionally for the methylome, we checked whether CpGs located in the 4th quartile for intra-and interindividual variability were enriched for CpG island relative position (island, shore, shelf, open sea) and for overlap with CpGs associated with exposures/traits in the EWAS Atlas [53].

Gaussian graphical modelling (GGM)-network analysis
A GGM was used to assess direct associations between changes in omics features and elucidate biologically  [35], CV coefficient of variation, BLD below limit of detection relevant associations [54][55][56][57][58]. GGMs were built on the delta matrix calculated as the change in omics features between visits (i.e. the change in feature X for a child between time points is correlated with the change in Y, where the correlation is again calculated across all children). The omics features included: CpG sites with 100% intra-individual variability and proteins, serum and urine metabolites in the highest quartile for intra-individual variability. Gene expression and miRNAs were excluded because the number of participants with complete data including these layers was significantly lower (n=87) and would penalize the identification of biologically relevant associations. The data matrix contained 139 samples × 13,167 features (13,103 CpGs, 9 proteins, 44 serum metabolites and 11 urine metabolites). We computed GGMs using the ggm.estimator.pcor function from the R package "GeneNet" [48]. This function estimates pairwise partial correlation coefficients conditioned against all remaining variables, allowing to filter out indirect associations that may appear in omics data [48,54]. We considered significant partial correlations between features those with p values below the false discovery rate (FDR) threshold of 0.05 (p < 1.28 × 10−7). To construct and visualize the resulting GGM networks, we used Cytoscape 3.8.2 [59]. Edges connecting the nodes represent significant partial correlations and they were weighted using partial correlation coefficients (PCCs). The opacity of each node is based on its connectivity degree (number of edges connecting a particular feature).

Study participants
A total of 156 children from five cohorts across Europe (BIB in the UK, EDEN in France, KANC in Lithuania, RHEA in Greece and INMA in Spain) were followed up for this study, with demographic data detailed in Table 2 Table 2 shows the description of other explanatory variables measured in the study.
Variance partition analysis shows large heterogeneity between and within omics A large heterogeneity in terms of the proportion of variance explained by cohort, inter-and intra-individual variability between omics layers and within the same omics layer was found ( Fig. 2 and Table S1). Overall, omics features presented little variability due to cohort (median variability of features ranging from 0% for the methylome to 15.7% for the proteome).

DNA methylation
Variation in blood DNA methylation was mainly due to intra-individual variability with a median of 62.2% across all CpG sites and a median inter-individual variability of 37.6% (Q1: 4.1%; Q3: 65.8%). We note however a large heterogeneity between the CpGs with the lower and upper quartiles ranging from 33.9 to 95.6% for intraindividual variability and from 4.1 to 65.8%, for inter. This heterogeneity was partially expected due to the intra-experimental variation in Illumina BeadChip data. Indeed, intra-individual variability of each CpG site in our study was associated to previously reported ICCs, described in the same tissue and the same array, but in adults [46] (Additional file 1 - Figure S1). CpG sites with the highest inter-individual variability were enriched for CpG shores, whereas CpG sites with high intra-individual variability were enriched in open sea areas. We further aimed to evaluate the importance of blood cellular composition in blood DNA methylation variability. For this, we applied the same model for variance partition but without residualizing the effect of blood cell proportions (Additional file 1 - Figure S2).

Gene expression
Intra-individual variability explained the majority of the variance in most of the transcript clusters-groups of probes that define the expression of a gene (median: 93%; Q1: 78%; Q3: 100%).

MiRNAs
MiRNAs presented a variance partition pattern similar to the gene expression, although not as pronounced.  Fig. 4A-E, respectively). CpGs cg16076587 and cg08510264 present in the same connected component as glucose are annotated to INPP5A (Inositol polyphosphate-5-phosphatase A) and IRS2 (Insulin receptor substrate 2) genes, respectively.
Biological traits can help to interpret some of the omics variability, especially in the serum metabolome We further aimed to evaluate the association of omics variability with several anthropometric and dietary traits.
Overall, the inclusion of explanatory variables accounted for up to 9% of the serum metabolite inter-individual variability (change from a median of 43% to 34% with additional explanatory variables) and up to 3-4% (median change) of the intra-individual variability in the gene expression, miRNA and proteins (Additional file 1 - Table S1). Percentage of omics features explained by each explanatory variable per omics dataset, considering three different thresholds: ≥ 1%, ≥2% and ≥ 5% of variance explained are also provided (Additional file 1 - Table S2). On average, for all the omics, intra-individual variability was negligibly affected by the inclusion of the explanatory variables. However, these variables explained a large percentage of variance in some particular features (Fig. 5) as described further below. Considering all the omics features as a whole, we identified that age, zBMI and hour of sample collection had a major effect on feature variability.
Generally, DNA methylation was poorly explained by the explanatory variables: only 3% of the CpGs had more than 2% of the variance explained by individual traits; the KIDMED score explained the most variability (≥ 2% of the variance in 9.6% of CpGs).
In gene expression, age, maternal education, KIDMED score, week day and hour of sample collection were responsible for at least 2% of the variance in more than 5% of the transcripts each, with the KIDMED score showing the largest influence. MiRNAs were mainly influenced by the hour of sample collection, age, ancestry, and KIDMED score. Concretely, ancestry and hour of sample collection explained more than 5% of the variance in 4.6% and 8.7% of the miRNAs, respectively. Furthermore, the KIDMED score explained more than 2% of variance in 10.6 % of miRNAs.
Proteins were mostly influenced by zBMI, ancestry, and age, where each trait explained more than 5% of the variance in 22.2%, 16.7% and 16.7% of the proteins, respectively. Among the proteins largely explained by zBMI were leptin and insulin. Similarly, these same proteins showed sex differences, found in higher concentrations in females. Ancestry explained 25-12% of the variability in these proteins (in order of magnitude): interleukin (IL)8, TNF alfa, BAFF, insulin and HGF (see Additional file 2). Insulin appeared to be the only protein significatively influenced by the hours of fasting and CRP was associated with having a cold.
Serum metabolites were mostly affected by ancestry and age, explaining more than 5% of the variance in 24.3% and 16.4% of the metabolites, respectively. By contrast, ancestry and age explained more than 5% of variance in just 2.3% and 9.1% of the urine metabolites, respectively. Our results showed that time to last meal explained more than 2% of the variance in only a 10.7% of the serum metabolites. Among the metabolites most influenced by sex, BMI and hours of fasting, we found sphingomyelin C18:1 and tyrosine for Fig. 3 Network representation of the Gaussian graphical model (GGM) of the DNA methylome, proteins, serum and urine metabolites with high intra-individual variability measured in 157 children from five European countries. Blue nodes represent CpG sites. Red and yellow nodes represent serum and urine metabolites, respectively. The opacity of the nodes is dependent on their degree (number of edges connecting a particular feature). The edge thickness was weighted based on the partial correlation coefficients (PCCs) obtained from the GGM sex; glucose and 4-deoxyerythronic acid for BMI zscore; and alanine for hours of fasting.

Discussion
The current study offers a multi-omics perspective of medium-term omics variability in childhood. We modelled the variability of 6 different types of omics data (blood DNA methylation, gene expression, miRNA, proteome, serum and urine metabolomes) for 156 children from five European countries at two time points with a 6-month interval, and found a large range of intra-and inter-individual variability between and within each omics profiles. We pointed out that overall intraindividual variation accounted for the largest part of the total variation in all omics. While DNA methylation and serum metabolites exhibited stronger stability over time for many features (median inter-individual variability: 37.6 and 43.4%, respectively), gene expression was the less stable omics profile in average (median interindividual variability: 6.6%) and proteins and urinary metabolites were somewhat in the middle, with strong heterogeneity between features. Consequently, DNA methylation and serum metabolites (targeted assay) will better inform epidemiological studies that rely on single measurements to compare individuals in the search of biomarkers, whereas less stable omics profiles such as gene expression will give more reliable information to studies that assess individual trajectories over time (multiple time point measurements). We evidenced that variability of omics features comes from several sources. Besides technical or analytical variability, which we tried to control for, we identified physiological patterns in intra-individual variability through inter-omics network analysis. In all omics profiles, features with high interindividual variability were identified, which can be ascribed to biological (between individuals) variability. While it is always preferable to adhere to standard sample collection conditions, this is not always possible, and omics features with substantial biological variation are potentially robust enough to yield meaningful findings in Fig. 4 Main connected components of the Gaussian graphical model (GGM) network that involve direct associations between features from different omics layers. Blue nodes represent CpG sites. Red and yellow nodes represent serum and urine metabolites, respectively. The opacity of the nodes is dependent on their degree (number of edges connecting a particular feature). The edge thickness was weighted based on the partial correlation coefficients (PCCs) obtained from the GGM spite of collection inconsistencies. The small proportions of variability attributed to cohort demonstrated that standardized sample collection protocols can produce robust results in large-scale omics studies across different countries. Biological traits and sample collection variables, easy to collect in cohort studies, might help to account for the unwanted variability, in particular for metabolomics.
We found, in the case of the methylome, that the most stable CpG sites were enriched in functional methylation regions of the genome. CpG sites with the highest interindividual variability were enriched for CpG shores, which are regions 0-2 kb from CpG islands (CpG rich regions) [33]. On the other hand, CpG sites with high intra-individual variability were enriched in open sea areas, which are isolated CpG sites in the genome that have been linked to chromosomal instability and loss of imprinting [33,34]. Recent studies point out that phenotypically relevant CpGs tend to be located in CpGs shores [13,16]. These CpGs with high inter-individual variability are especially relevant for large-scale epidemiological studies since these probes could be used as reliable biomarker candidates [16]. Our results reinforce that strong methylation differences between individuals already appear in childhood as previously reported [18], even in our population of similar European ancestry children. On the other hand, gene expression may require repeat sampling to account for intra-individual and technical variability, in order to generate stable enough markers to be deployed in epidemiological studies. Previous studies in healthy individuals show that gene expression profiles are mostly stable and repeatable in the short/medium term (< 5% of transcripts with high intra-individual variability) [20][21][22]24]. Our results, in contrast, attributed the majority of the variation to intraindividual effects, potentially due to RNA quality that might differ between visits A and B. Effort has to be put on the initial sample preparation (DNA, RNA extraction and quality) and its harmonization across different centres or time point collection, since it strongly determines the quality of omics measurements and might hinder real biological response. To our knowledge, only two previous studies have estimated miRNA variability in terms of intra-and inter-individual effects, but these studied a longer time span (5 years) [26] or a daily time span (48 h) in cerebrospinal fluid [27]. Despite these differences in study design, both studies agree that there is diversity within the microRNome in terms of stability of its features. Previous proteome studies in healthy human volunteers have demonstrated moderate inter-individual variability (CV ranging from 30% to 50%) [35,36], similar to the urine proteome in seven adult donors [37]. A study that estimated intra-individual variation of plasma adipokines concluded that they may be useful biomarkers of inflammation in population-based studies of obesity-related disease due to their stability over time [38]. These corroborate our results for IL-1β, IL-6, IL-8, leptin, adiponectin, hepatocyte growth factor (HGF) and CRP which presented the highest stability in our study. Finally, metabolomics studies comparing serum and urine metabolomes corroborate in children and adult populations strongly corroborate our findings that the serum metabolome is more stable and captures more inter-individual specific variance, compared to the urine metabolome [39][40][41][42][43].
Biological traits, such as body weight, sex and age, accounted for inter-individual variability in gene expression, similar to results obtained by Hughes et al. (2015) [25], where roughly 2% of total gene expression variation was explained by these traits in placenta samples. Age and hour of sample collection have also been found to be significant sources of variation in blood gene expression patterns of healthy individuals [60]. Previous studies on miRNA variability identified the strong influence of age and sample storage time [15,23], but the time of the day or dietary intake has not been previously studied to our knowledge. Among the proteins largely explained by zBMI were leptin and insulin, hormones related to food intake and fat storage [61] and the pro-inflammatory cytokines IL-6 and IL-1 beta, both known to be elevated in subjects with obesity or with serum lipid concentrations abnormalities, leading to a state of chronic inflammation [62]. Similarly, these same proteins showed sex differences, in particular leptin and IL-1 beta, known to be higher in females and resulting in a higher risk to develop obesity than males [63]. Interestingly, circulating IL-8, a pro-inflammatory cytokine, was strongly influenced by ancestry (6% of children had Pakistani ancestry) in our cohorts (25% of variance explained). IL-8 was previously found to be influenced by genetic polymorphisms in an eastern Indian population, potentially driving individual variations in the host's immune response, in particular to infectious diseases [64]. Intra-individual factors such as hours of fasting and having a cold gave the expected results: insulin appeared to be the only protein significatively influenced by the hours of fasting and CRP was associated with having a cold. This is in line with a previous study that considered the fasting/postprandial state of the samples and revealed that, on average, it explained less than 2% of the total variance [41]. Associations of serum and urine metabolites with sex, BMI and hours of fasting (only for serum) was also previously reported in the cross-sectional study of 1300 HELIX children [39]. Interestingly, high levels of 4deoxyerythronic acid in children, found correlated to higher BMI in our study, have been previously related to early-onset type I diabetes although further understanding of its metabolism is required [65].
A few recent studies have reported changes in multiple omics profiles in clinical settings and after dietary wellness coaching intervention or physical activity [30,66,67], revealing omics signatures that may serve as potential diagnostic markers. However, variability of omics profiles in healthy population and "normal" living conditions remains under-studied. Large observational studies would benefit enormously from this information, because it allows the interpretation of changes that do not match a pattern. This is especially important for earlylife studies that focus on the origin of diseases in children because omics profiles are able to capture very early and subtle molecular responses, even before physiological manifestations appear. Variations in omics profiles due to factors such as BMI, age, physical activity, fasting time and sampling conditions need to be well characterized in order to interpret subtle changes related to environmental exposures such as air pollution or endocrine-disrupting chemicals, the ultimate goal of recent exposome initiatives [60]. Omics measurements in our study are of great clinical relevance as they provide the basis for the discovery of new biomarkers of multiple medical conditions such as cancer, heart disease, neurological disorders or inflammatory diseases [61][62][63][64]. Here, we provide insight into which omics features are stable within individuals and demonstrate sufficient interindividual variation in order to reduce chance findings when conducting epidemiological studies with follow-up for disease outcomes [16].
We emphasize the fact that children from five European countries took part in the study. The multiple locations of our participants allow to generalize the influence of factors such as lifestyle habits, seasonal influences and maternal education, which vary greatly between countries/cultures and not many studies consider.
A major strength of our study lies in the cross-omics approach. We created a network of interacting omics features of four different omics layers (DNA methylome, proteome, serum and urine metabolomes) using Gaussian graphical models (GGMs). GGMs circumvent the selection of indirect associations that usually appear between omics measurements, as Pearson correlations are generally high in these data. GGMs are based on partial correlation coefficients and provide an estimate of conditional dependencies between variables, elucidating direct associations [54]. This enabled us to obtain a holistic view of the biological significance of our results. By studying the interaction of multiple omics features, we exploited the data to their full potential for further disease prediction and prevention studies [56]. Moreover, it allowed us to go a step further by identifying interactions of features across omics layers, which is the main goal of the recent discipline interactome within systems biology. Despite not finding many dependencies across omics, a few metabolites were related to groups of CpGs. These included TMAO, a compound generated by the gut microbiota from diet-derived components, hence with high variability and strongly determined by diet, gut microbial flora and drug administration [65]. Elevated levels of TMAO positively correlate to cardiovascular disease through the development of atherosclerosis in previous studies [68][69][70][71][72][73]. Variation in four CpGs was related to glucose in urine (pool morning and night urine), which level depends on lifestyle factors like diet and exercise behaviours and its dysregulation related to obesity and type 2 diabetes (T2D) in children [74]. Interestingly, two of these CpGs are within genes related to obesity and T2D [53]. This supports previous studies on GGMs that show their ability to reconstruct metabolic pathways [54,75,76], including in this case across omics.
Our study had some limitations. In the first place, we studied omics variability in 156 children and a larger population size would have provided greater statistical power to our model. However, this sample size allowed us to analyse 6 different molecular layers in the same individual and at the same time-point, twice, data rarely obtained in the past. Our study used targeted approaches to measure the proteome and the serum metabolome profiles (semi-targeted for urine). While this approach provided reduced biological coverage, we are confident that this insured reliable annotation and quantification of markers. In contrast, we report high intraindividual effects in a large proportion of genome-wide omics features that are potentially due to technical variability. Despite considering quality control parameters (e.g. filtering for call rate) in order to minimize this effect, we note that precision in the intra-individual variability apportionments can be strengthened by adding technical replicates. This, together with increasing the number of individuals, would be the ideal way to strengthen the biological signal. Moreover, despite having measured many biological traits in the children under study, there are relevant missing variables, such as recent dietary intake in the last 24 h, which is expected to influence the variability of omics such as the urine metabolome. It would be interesting to study if and how other factors account for variability within omics. Our study could also benefit from a more exhaustive collection of repeated samples across short periods of time. For example, having morning and night samples in the same individual at different time points, as done previously for metabolomics [77,78] or miRNAs [27].

Conclusions
We assessed omics profiles variability over the mediumterm in child cohorts from the general population, using a multi-omics approach. We found large heterogeneity within and between omics profiles. Intra-individual variability presented the highest median variability in all cases. The cross-omics analysis we performed provides global insights into how different omics features vary over time, within and between individuals and among cohorts from different countries. This study thereby provides a valuable framework for future epidemiological studies that aim to detect omics signatures linked to disease, environmental exposures or both.

Availability of data and materials
The summarized results generated during the current study are available in supplementary material. The raw data supporting the current study are available from the corresponding author on request subject to ethical and legislative review.

Declarations
Ethics approval and consent to participate Local ethical committees approved the studies that were conducted according to the guidelines laid down in the Declaration of Helsinki. The ethical committees for each cohort were the following: BIB: Bradford Teaching Hospitals NHS Foundation Trust, EDEN: Agence nationale de sécurité du médicament et des produits de santé, INMA: Comité Ético de Inverticación Clínica Parc de Salut MAR, KANC: LIETUVOS BIOETIKOS KOMITETAS, MoBa: Regional komité for medisinsk og helsefaglig forskningsetikk, Rhea: Ethical committee of the general university hospital of Heraklion, Crete. Informed consent was obtained from a parent and/or legal guardian of all participants in the study.

Consent for publication
Not applicable.