Effects of smoking and smoking cessation on human serum metabolite profile: results from the KORA cohort study

Background Metabolomics helps to identify links between environmental exposures and intermediate biomarkers of disturbed pathways. We previously reported variations in phosphatidylcholines in male smokers compared with non-smokers in a cross-sectional pilot study with a small sample size, but knowledge of the reversibility of smoking effects on metabolite profiles is limited. Here, we extend our metabolomics study with a large prospective study including female smokers and quitters.
Methods Using a targeted metabolomics approach, we quantified 140 metabolite concentrations for 1,241 fasting serum samples in the population-based Cooperative Health Research in the Region of Augsburg (KORA) human cohort at two time points: a baseline survey conducted between 1999 and 2001 and a follow-up after seven years. Metabolite profiles were compared among groups of current smokers, former smokers and never smokers, and were further assessed for their reversibility after smoking cessation. Changes in metabolite concentrations from baseline to follow-up were investigated in a longitudinal analysis comparing current smokers, never smokers and smoking quitters, who were current smokers at baseline but former smokers by the time of follow-up. In addition, we constructed protein-metabolite networks with smoking-related genes and metabolites.
Results We identified 21 smoking-related metabolites in the baseline investigation (18 in men and six in women, with three overlaps) enriched in amino acid and lipid pathways, which were significantly different between current smokers and never smokers. Moreover, 19 out of the 21 metabolites were found to be reversible in former smokers. In the follow-up study, 13 reversible metabolites in men were measured, of which 10 were confirmed to be reversible in male quitters. Protein-metabolite networks are proposed to explain the consistent reversibility of smoking effects on metabolites.
Conclusions We showed that smoking-related changes in human serum metabolites are reversible after smoking cessation, consistent with the known cardiovascular risk reduction. The metabolites identified may serve as potential biomarkers to evaluate the status of smoking cessation and characterize smoking-related diseases.


Background
Smoking is responsible for 90% of all lung cancers, accounts for 25% of cancer deaths worldwide [1][2][3] and is a significant risk factor for cardiovascular disease (CVD) [4][5][6][7]. The benefits of smoking cessation are remarkable. Risk of CVD is reduced in former smokers (FS) compared with current smokers (CS) [8][9][10]; mortality and future cardiac events both decline in FS [11,12]. Nevertheless, for cancers, especially for adenocarcinoma, the risk remains high in FS compared with never smokers (NS) [13,14]. Studies have attempted to identify the molecular basis for the influence of smoking and smoking cessation on cardiovascular risks. For instance, smoking is associated with an increase in several CVD-related inflammatory markers, for example, C-reactive protein and fibrinogen [15][16][17], and smoking cessation could largely reduce the level of these markers [18]. However, there is also evidence that other molecular changes associated with smoking are permanent, for example, loss of heterozygosity and hypermethylation in the promoter regions of cancer-related genes [19][20][21][22][23].
The metabolomics approach provides a functional readout of activities located downstream of the gene expression level that are more closely related to the physiological status [24] and, thus, may be particularly useful for the study of environmental influences, namely the 'exposome' [25]. Studying a strong environmental factor, for example a lifestyle-related exposure to smoking, may be considered a very powerful approach for understanding the links between environmental exposure and the metabolome. In human lung epithelial cells, metabolite concentrations have been shown to change under smoke exposure in various pathways, for example, the urea cycle, polyamine metabolism and lipid metabolism [26]. In a pilot study with 283 male participants from the Cooperative Health Research in the Region of Augsburg (KORA) F3 in Germany, we have shown that levels of diacyl-phosphatidylcholines (PCs) were higher in 28 CS compared with 101 NS, except for acyl-alkyl-PCs [1]. The reduced ratios of acyl-alkyl- to diacyl-PCs in CS may be regulated by the enzyme alkyl-dihydroxyacetone phosphate in both ether lipid and glycerophospholipid pathways [1]. However, little has been reported about the reversibility of the metabolite profile upon smoking cessation, which is important for a comprehensive understanding of smoking effects. It is also known that metabolite profiles differ between men and women [25], but whether lifestyle factors such as smoking induce different metabolite patterns in men and women is still unknown.
In this study, we analyzed the association between smoking and the concentration of metabolites in 1,241 serum samples from the KORA baseline survey 4 (S4) and follow-up (F4) study, aiming to extend the knowledge of smoking-associated metabolites beyond our pilot study by including female CS at two time points over seven years, to investigate whether smoking-associated changes in metabolite profile are reversible after smoking cessation, and to provide insights into the pathophysiological consequences of smoking in protein-metabolite networks.

Ethics statement
Written informed consent was obtained from KORA S4 and F4 participants. The KORA study was approved by the ethics committee of the Bavarian Medical Association in Munich, Germany.
For metabolite profiles, serum samples from 1,614 people aged 55 to 74 years old were available [29]. Participants with non-fasting status (N = 216) or missing values (N = 22) were excluded from the analysis. We further excluded 145 people in KORA S4 and 116 people in the longitudinal data of KORA S4 F4, whose spouses were CS, to rule out passive smoking effects. Furthermore, metabolite concentrations of serum samples from 1,036 participants were measured in both KORA S4 and F4.

Metabolite measurements
Liquid handling of serum samples (10 μl) was performed with a Hamilton star robot (Hamilton Bonaduz AG, Bonaduz, Switzerland), and samples were prepared for quantification using the AbsoluteIDQ P180 and P150 kits (BIOCRATES Life Science AG, Innsbruck, Austria) for the KORA S4 and F4 surveys, respectively. This allowed simultaneous quantification of 188 or 163 metabolites using liquid chromatography and flow injection analysis mass spectrometry as described previously [32,33]. The complete analytical process was monitored by quality control steps, reference samples and the MetIQ software package, which is an integral part of the AbsoluteIDQ kit.
Because the two datasets were generated by different platforms, different quality control processes were introduced. The metabolite data quality control procedure for the KORA S4 samples was described in our recently published work [29]. There were 140 metabolites that passed the two quality controls: one hexose, 21 amino acids, eight biogenic amines, 21 acylcarnitines, 13 sphingomyelins (SMs), eight lysoPCs, 33 diacyl-PCs (PC aa Cx:y) and 35 acyl-alkyl-PCs (PC ae Cx:y). Lipid side chain composition is abbreviated as Cx:y, where x denotes the number of carbons in the side chain and y the number of double bonds. The precise position of the double bonds and the distribution of the carbon atoms in different fatty acid side chains cannot be determined with this technology. Concentrations of all analyzed metabolites are reported in μmol/L (μM). The data cleaning procedure for the KORA F4 samples has previously been described in detail [24,30]. In total, 121 metabolites were measured in both S4 and F4, and used for the prospective study.

Statistical analysis
Differences in population characteristics (CS, FS and NS) were tested by a two-tailed Student's t-test. The metabolite concentrations were log transformed for normalization. We tested the cross-sectional association of each metabolite with smoking using logistic regression models adjusted for age, body mass index (BMI) and alcohol consumption (see Figure 1). To correct for multiple testing, the false discovery rate (FDR) was calculated using the Benjamini-Hochberg method [34] and the cut-off for statistical significance was set at FDR <0.05.
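The multiple-testing correction described above can be illustrated with a minimal sketch of the Benjamini-Hochberg adjustment. This is not the authors' code (the study used R); the function name `benjamini_hochberg` and the example P-values are hypothetical and serve only to show how adjusted values are compared with the 0.05 cut-off.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values (FDR), in input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P-values, one per metabolite (illustration only).
raw_p = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(raw_p)
significant = [q < 0.05 for q in fdr]
print(fdr, significant)
```

In R, the same adjustment is available through `p.adjust(p, method = "BH")`.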
Linear regression models were used to investigate whether smoking intensities measured in pack years and cessation time are associated with metabolite concentrations. In the case of CS, the years of smoking were calculated as the time period from starting smoking until the start of the survey. Pack years were calculated as the number of cigarettes per day multiplied by smoking duration and divided by 20 [35]. Cessation time (in years) was calculated according to the questionnaire. The models contained the log-transformed metabolite concentrations as the dependent variable and the smoking intensities as the explanatory variable, with age, BMI and alcohol consumption as covariates. Every unit change of one covariate corresponds to a relative change of the metabolite concentration by Δ (%):
$$\Delta = \left(e^{\beta_i} - 1\right) \times 100\%$$
where β_i indicates the estimate of the ith covariate in the model.
To assess the role of smoking cessation for the quitters, who were CS at S4 but FS at F4, we fitted linear mixed models to the longitudinal data of KORA S4 F4. The models contained the fixed effects of smoking status (CS, FS and NS), age, BMI and alcohol consumption, with a random effect assigned to each participant. All calculations were performed in R (version 2.14.1).
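The pack-year definition and the back-transformation of a regression estimate into a relative change can be sketched as follows. The sketch assumes that concentrations were transformed with the natural logarithm, in which case an estimate β corresponds to a relative change of (exp(β) − 1) × 100%; the function names are hypothetical and not taken from the study.

```python
import math


def pack_years(cigarettes_per_day, years_smoked):
    """Pack years: cigarettes per day times smoking duration, divided by 20."""
    return cigarettes_per_day * years_smoked / 20.0


def relative_change_percent(beta):
    """Relative change (%) per unit of a covariate, assuming a natural-log outcome."""
    return (math.exp(beta) - 1.0) * 100.0


# One pack (20 cigarettes) per day for 10 years gives 10 pack years.
print(pack_years(20, 10))
# A small negative estimate maps to a small percentage decrease.
print(relative_change_percent(-0.0033))
```

For small estimates the relative change is close to 100 × β, which is why a decrease of about 0.33% per pack year corresponds to an estimate of roughly −0.0033 on the natural-log scale.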

Network and pathway analysis
We retrieved protein-protein interactions from the databases of the Search Tool for the Retrieval of Interacting Genes/Proteins [36] and the relationships between enzymes and metabolites from the Human Metabolome Database [37] to construct protein-metabolite networks containing links between metabolites, enzymes and smoking-related genes. Genes and metabolites were connected allowing for at most one intermediate enzyme by Dijkstra's algorithm [38], and optimized by eliminating edges with Search Tool for the Retrieval of Interacting Genes/Proteins scores less than 0.7. Each edge in the networks was manually checked. We have implemented this method in our previous studies [29,39]. The analysis was performed using the R package igraph [40]. The network was visualized using Cytoscape [41]. Pathway analysis was performed by MetaboAnalyst [42].
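The network construction described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' igraph implementation: every edge is given unit weight, protein-protein edges with a score below 0.7 are dropped before the search, and a gene-metabolite link is kept only if the shortest path has at most one intermediate node. All node names and scores below are made-up placeholders.

```python
import heapq


def build_graph(ppi_edges, enzyme_metabolite_edges, min_score=0.7):
    """Undirected unit-weight graph; low-scoring protein-protein edges are dropped."""
    graph = {}

    def add_edge(u, v):
        graph.setdefault(u, {})[v] = 1.0
        graph.setdefault(v, {})[u] = 1.0

    for protein_a, protein_b, score in ppi_edges:
        if score >= min_score:
            add_edge(protein_a, protein_b)
    for enzyme, metabolite in enzyme_metabolite_edges:
        add_edge(enzyme, metabolite)
    return graph


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns the node list of a shortest path, or []."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for neighbor, weight in graph.get(node, {}).items():
            new_dist = d + weight
            if new_dist < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_dist
                prev[neighbor] = node
                heapq.heappush(heap, (new_dist, neighbor))
    if target not in dist:
        return []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def link_gene_to_metabolite(graph, gene, metabolite, max_intermediates=1):
    """Return the path if it uses at most `max_intermediates` nodes in between."""
    path = shortest_path(graph, gene, metabolite)
    if path and len(path) - 2 <= max_intermediates:
        return path
    return None


# Placeholder data (not real interactions or scores).
ppi = [("GENE_A", "ENZ_1", 0.9), ("GENE_B", "ENZ_1", 0.4),
       ("GENE_B", "ENZ_2", 0.8), ("ENZ_2", "ENZ_1", 0.95)]
enzyme_metabolite = [("ENZ_1", "MET_X")]
network = build_graph(ppi, enzyme_metabolite)
print(link_gene_to_metabolite(network, "GENE_A", "MET_X"))
print(link_gene_to_metabolite(network, "GENE_B", "MET_X"))
```

In this toy network GENE_B reaches MET_X only through two intermediate proteins once its low-scoring edge is removed, so it is not linked.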

Characteristics of participants of the cross-sectional KORA S4
Participants were divided into three groups according to their self-reported smoking status. Population characteristics are shown in Table 1. On average, CS were two to three years younger and had a lower BMI than FS and NS. Male CS showed higher alcohol consumption than male NS, but there was no significant difference observed in women. Furthermore, the statistics showed differences in lifestyle factors between men and women. Alcohol consumption was higher in men than women (P = 1.5e-11 (CS); P = 2.2e-18 (FS); P = 9.5e-17 (NS)), and smoking intensity (in pack years) was higher in male than in female CS (P = 6.0e-6).

Metabolomic differences between current, former and never smokers
We identified 18 metabolites in men and six in women that were significantly different (FDR <0.05) between CS and NS. Three metabolites (PC ae C34:3, PC aa C36:1 and glutamate) were identified in both men and women showing the same pattern of variation (higher or lower) (Table 2). Compared with FS and NS, in male CS the concentrations of four unsaturated diacyl-PCs (PC aa C34:1, PC aa C36:1, PC aa C38:3 and PC aa C40:4) and five amino acids (arginine, aspartate, glutamate, ornithine and serine) were higher, whereas three saturated diacyl-PCs, one lysoPC and four acyl-alkyl-PCs, as well as kynurenine showed lower concentrations. In female CS, we found higher levels of carnitine and PC aa C32:1, and a lower level of hydroxysphingomyelin (SM (OH)) C22:2.
Among the 21 smoking-related metabolites (18 in men and six in women), 19 were found to be reversible (that is, significant difference between FS and CS but without significant difference between FS and NS; FDR <0.05). No irreversible metabolite was observed (that is, significant difference between FS and NS). Serine and PC aa C32:3 in men were not classified because their concentrations were not significantly different between CS and FS or between FS and NS (Table 2). A heat map representing the concentration profiles of the 21 identified metabolites in CS, FS and NS is shown in Figure 2, demonstrating the reversibility of metabolites after smoking cessation. In women, SM (OH) C22:2 was significantly associated with cessation time (FDR <0.05); however, there was no such significant metabolite in men (Table S1 in Additional file 1), indicating a non-linear relationship between cessation time and the reversion of metabolite profile. In addition, we grouped the FS by stratified cessation years (0 to 10, 11 to 20, 21 to 30, 31 to 40, over 40 years). For some metabolites (for example, PC ae C38:0, PC aa C36:0 and ornithine), the greatest change of concentration occurred within the first 10 years of cessation compared with CS (Figure 3).
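The classification rule stated above (reversible, irreversible or not classified) can be written out as a small decision function. This is a sketch of the rule as worded in the text, assuming the three pairwise comparisons are summarized by their FDR values; the function name and the example values are hypothetical.

```python
def classify_reversibility(fdr_cs_vs_ns, fdr_fs_vs_cs, fdr_fs_vs_ns, alpha=0.05):
    """Classify one metabolite from the FDR values of three group comparisons."""
    if fdr_cs_vs_ns >= alpha:
        # No difference between current and never smokers.
        return "not smoking-related"
    if fdr_fs_vs_ns < alpha:
        # Former smokers still differ from never smokers.
        return "irreversible"
    if fdr_fs_vs_cs < alpha:
        # Former smokers differ from current smokers but not from never smokers.
        return "reversible"
    # Neither comparison involving former smokers is significant.
    return "unclassified"


# Hypothetical FDR values for a single metabolite (illustration only).
print(classify_reversibility(0.001, 0.01, 0.60))
```

Under this rule, a metabolite such as serine in men falls into the last branch, because its concentration did not differ significantly between CS and FS or between FS and NS.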
Within CS, we found kynurenine, PC ae C34:3, PC ae C38:0 and PC ae C38:6 in men, and PC aa C36:1 in women, showing significant association with pack years. In the linear regression model, pack years showed a negative relation (parameter estimate β <0) to these five metabolites (Table 3); for example, an increase of one pack year corresponds to a decrease of the kynurenine level in CS by 0.33%.
Prospective change of metabolite profiles (from KORA baseline S4 to follow-up F4)
The prospective dataset included 40 CS, 432 NS and 49 quitters (people who were CS in KORA S4 but FS in KORA F4) (Table 4). Among the 16 reversible metabolites in men, 13 (except kynurenine, glutamate and aspartate) were also measured in KORA F4 using a different kit (see Methods). We employed a linear mixed effect model to investigate the effects of smoking cessation on metabolite concentrations. Among these 13 metabolites, 10 metabolites showed a significant variation in quitters, with a period of smoking cessation from one to seven years, which indicated a reverting process. The arginine level decreased by 11.3% and ornithine by 14.8% in quitters compared with CS, whereas PC aa C36:0 increased by 18.5%. Figure 4 shows the prospective changes of the significant metabolites. For women, the same analysis was conducted.   five metabolites that were measured in both KORA S4 and F4 showed borderline significance (P <0.05). However, none of these metabolites was found to be significant considering FDR <0.05 (see Table 5).

Smoking effects on metabolic network
Enrichment analysis of the 21 identified smoking-related metabolites on Kyoto Encyclopedia of Genes and Genomes pathways showed enrichment in a set of amino acid and lipid metabolism pathways (ether lipid, glycerophospholipid, arginine and proline metabolism). In addition, we analyzed the impact of the smoking-related metabolites in each pathway by measuring their structural importance (see Methods). These metabolites had high betweenness centrality and a strong impact on the enriched pathways (Figure 5 and Table S2 in Additional file 2). To systematically investigate how the effects of smoking propagate over the metabolic networks, we evaluated the association between 175 previously reported smoking-related genes [23] and the 21 smoking-related metabolites we found in this study by analyzing protein-metabolite networks (see Methods). In men, 15 metabolites (lysoPC a C18:2, PC aa C32:3, PC aa C34:1, PC aa C36:0, PC aa C36:1, PC aa C38:0, PC aa C38:3, PC aa C40:4, PC ae C34:3, PC ae C38:0, PC ae C38:6, PC ae C40:6, arginine, glutamate and serine) were found to be linked with 11 genes (ADH7, AKR1B1, DHRS3, FTL, GALE, GPC1, KRAS, S100A10, SLC7A11, SULF1, PLA2G10) by related enzymes. In women, four metabolites (PC aa C36:1, PC ae C34:3, PC aa C32:1 and glutamate) were closely linked with nine genes (ADH7, AKR1B1, DHRS3, FTL, GALE, GPC1, S100A10, SULF1, PLA2G10) (Figure 6A and Table S3 in Additional file 3). Similar to the enrichment analysis, the networks in men and in women could be generally divided into glycerophospholipids and tightly associated proteins, as well as amino acids and the associated genes and enzymes. A description of the protein-metabolite and protein-protein interactions is listed in Table S3 in Additional file 3. The smoking effects on the networks were reversible. With regard to gene expression, with the exception of SULF1 and PLA2G10, all changes in the networks were reversible after smoking cessation [23]. All changes in metabolites in the network were also reversible, except serine.

Discussion
In this study, we have used an 'omics' approach to investigate the association of metabolite concentrations with smoking, delineated the reversion of metabolite variations after smoking cessation and demonstrated the results using protein-metabolite networks. We identified strong associations of various metabolites with smoking, and confirmed part of the findings of our pilot study [1]. Among the 23 smoking-related metabolites identified in the pilot study, 11 metabolites were measured in this study, five of which (four unsaturated diacyl-PCs and one acyl-alkyl-PC) were validated in men, based on about five-fold larger CS samples. Consistent patterns of smoking effects on metabolite profile were observed in the current study. Among all the smoking-related metabolites, in CS we found higher unsaturated diacyl-PCs, but lower acyl-alkyl-PCs and saturated diacyl-PCs, which may indicate generally increased levels of unsaturated fatty acids in CS. Unsaturated fatty acids are more vulnerable to lipid peroxidation and influence the risk of different diseases [43,44].
Table caption: Results of linear regression of smoking intensity (pack years) on metabolite concentrations in men and women, adjusted for age, body mass index and alcohol consumption. All smoking-related metabolites presented in Table 2 are listed (*P <0.05). aa: diacyl-; ae: acyl-alkyl-; CS: current smokers; FS: former smokers; lysoPC: acyl-phosphatidylcholine; NS: never smokers; PC: phosphatidylcholine; SM (OH): hydroxysphingomyelin.

Smoking-related metabolites and cardiovascular disease
The study results imply the potential of metabolomics in revealing the role of an environmental factor, for example a smoking lifestyle, in the pathogenesis and prognosis of CVD. One study on the peripheral blood metabolite profile showed an association of coronary artery disease and urea cycle-related metabolites, including arginine and glutamate [45], which were also identified in our study as smoking-related metabolites. By scrutinizing the smoking-related metabolites in metabolic pathways, we found further support for the pathophysiological relation between these metabolites and CVD. Previous findings indicated that the glutamate transporter in human lung epithelial cells, encoded by the SLC7A11 gene, is activated in CS [23,46], which increases the transportation of glutamate and subsequently raises the levels of the downstream metabolites, arginine and ornithine (Figure 6B). The activation of the cystine-glutamate transporter (encoded by SLC7A11) and the increased glutamate level as a response to oxidative stress are also of great importance to the endothelial dysfunction involved at all stages of atherosclerotic plaque evolution, which leads to CVD [47,48].
Ether lipid and glycerophospholipid metabolisms are associated with smoking [1,49]. The decreased level of lysoPC a C18:2 reflects the inhibition of upstream synthesis and activation of downstream hydrolysis. As shown in Figure 6C, upregulation of S100A10 and GPC1 inhibits cytosolic phospholipase A2, which plays a role in the synthesis of lyso-PCs. The lysophospholipase I isoform, which hydrolyses lysoPC into glycerophosphocholine, is upregulated in CS [23]. Interestingly, one recent study showed that a disorder of phosphatidylcholine metabolism would promote CVD [50], which may establish a link between smoking-related phosphatidylcholine variation and cardiovascular events. For example, phosphatidylcholine hydroperoxide promotes angiogenesis in endothelial cells, which is associated with atherosclerotic development [51].
Table caption: Result of smoking status on metabolite concentrations using linear mixed model for S4 F4 longitudinal data, adjusted for age, BMI and alcohol consumption. PC: phosphatidylcholine; aa: diacyl-; ae: acyl-alkyl-; lysoPC: acyl-phosphatidylcholine; SM (OH): hydroxysphingomyelin. a: FDR <0.05.
The reversibility of metabolite concentrations in a small time window may reveal a reduced risk of smoking-related diseases after stopping smoking. Concentrations of arginine and glutamate that are associated with both smoking and coronary artery diseases quickly returned to normal levels (within seven years) after smoking cessation, which is in line with epidemiological findings that the smoking effects on CVD are quickly and largely reduced after smoking cessation [8,9,52]. The reversed glutamate level indicates reduced oxidative stress after smoking cessation, and the reversion of arginine and ornithine reflects a reversion of functioning in the urea cycle. Our findings provide metabolic insight into the reduced risk of CVD after smoking cessation and provide support for the remarkable benefits people would gain by stopping smoking.

Concordance of reversibility in metabolic network
The protein-metabolite interaction network shows that the reversibility of metabolite concentrations also coincided with that of gene expression (Figure 6A). Arginine and glutamate were quickly reversed after smoking cessation, which was in line with the quick reversibility of SLC7A11 expression. Expression of enzyme-coding genes for the hydrolysis of diacyl-PCs and acyl-alkyl-PCs, for instance lysophospholipase, cytosolic phospholipase A2 and S100 calcium binding protein A2, was quickly reversible, and smoking-related diacyl-PCs and acyl-alkyl-PCs shared the same reversal pattern.

Gender-specific effects of smoking
In this study, we found gender-specific effects of smoking on metabolite profiles (Table S1 in Additional file 1). This result supports the assumption that differences in smoking effects on men and women are not solely based on smoking intensity but are also gender-specific. Glutamate was higher in both male and female CS; however, the levels of arginine and ornithine were only higher in male CS. According to a previous study of the metabolomic and genetic biomarkers of sexual dimorphisms [30], certain single nucleotide polymorphisms in the CPS1 gene, which regulates the formation of arginine, act in a gender-specific manner, with stronger effects in women than in men. The gender-specific genetic effect might cause a lower efficiency in women in regard to the transformation of extra glutamate to citrulline (Figure 6C).

Strengths and limitations
We used a systematic targeted metabolomics approach with 140 metabolites in a large population-based cohort. Analyzing the effects of smoking and smoking cessation in this prospective manner (follow-up of seven years) provides more power to investigate smoking effects by ruling out individual differences. However, our study is based on a limited range and number of metabolites and cannot fully represent the whole metabolome. Thus, an improved metabolomics technique measuring more metabolites is urgently needed for a comprehensive understanding of both reversible and permanent effects of smoking on human metabolism. It would be interesting for future studies to also include data on other environmental factors such as diet and lifestyle, which are known to have effects on the human metabolome [53,54].

Conclusions
Our study shows the power of the metabolomics approach in investigating the molecular signature of lifestyle-related environmental exposures. We demonstrated that smoking is associated with concentration variations in amino acids, ether lipid and glycerophospholipid metabolism at an 'omics' level. The smoking-related changes in the human serum metabolite profile are reversible after stopping smoking. This indicates the remarkable benefits of smoking cessation and provides a link to CVD benefits. Furthermore, linking metabolomic knowledge to other 'omics' approaches, for example, transcriptomics, may have the potential to identify novel biomarkers as well as new risk assessment tools.
Additional file 2: Table S2: Enrichment and impact of smoking-related metabolites in Kyoto Encyclopedia of Genes and Genomes pathways. The table shows the enrichment and impact scores of smoking-related metabolites in Kyoto Encyclopedia of Genes and Genomes pathways. The pathway analysis consists of an enrichment analysis and a structural impact analysis, both based on the Kyoto Encyclopedia of Genes and Genomes database. The -log (P) was considered as the enrichment score. Impact, scored between 0 and 1, indicated the pathway topological importance of the metabolites. In particular, the parameter Total is the total number of compounds in the pathway; the parameter Hits is the actual number of metabolites with significant variations in the pathway; the Raw P was the original P-value calculated from the enrichment analysis; the FDR was calculated as the P-value adjusted using the Benjamini-Hochberg method.
Additional file 3: Table S3: Links between smoking-related metabolites, enzymes and genes. The table describes the links shown in Figure 6 of the main text. The smoking-related metabolites, enzymes and genes are listed in the first and second columns. The score of interaction is given according to the definition by the Search Tool for the Retrieval of Interacting Genes/Proteins [1]. A reference for each link and a short description is provided. The Reaction column shows the possible biochemical reaction of the corresponding link or the type of protein interaction. The enzymes include phospholipase A2, membrane associated (GIIC sPLA2), cytosolic phospholipase A2 (cPLA2), group 10 secretory phospholipase A2 (PLA2G10), lysophospholipase I (LYPLA1), apolipoprotein A-V (APOA5), uteroglobin (SCGB1A1), lecithin retinol acyltransferase (LRAT), nitric oxide synthase 1 (NOS1), solute carrier family 3 member 2 (SLC3A2), serine dehydratase (SDH), 3-hydroxybutyrate dehydrogenase, type 1 (BDH). The smoking-related genes/proteins include S100 calcium binding protein A10 (S100A10), glypican 1 (GPC1),