Genomic variation in myeloma: design, content, and initial application of the Bank On A Cure SNP Panel to detect associations with progression-free survival
- Brian Van Ness1,
- Christine Ramos1,
- Majda Haznadar1,
- Antje Hoering2,
- Jeff Haessler2,
- John Crowley2,
- Susanna Jacobus3,
- Martin Oken4,
- Vincent Rajkumar5,
- Philip Greipp5,
- Bart Barlogie6,
- Brian Durie7,
- Michael Katz8,
- Gowtham Atluri9,
- Gang Fang9,
- Rohit Gupta9,
- Michael Steinbach9,
- Vipin Kumar9,
- Richard Mushlin10,
- David Johnson11 and
- Gareth Morgan11
© Van Ness et al; licensee BioMed Central Ltd. 2008
Received: 29 July 2008
Accepted: 08 September 2008
Published: 08 September 2008
We have engaged in an international program designated the Bank On A Cure, which has established DNA banks from multiple cooperative and institutional clinical trials, and a platform for examining the association of genetic variations with disease risk and outcomes in multiple myeloma.
We describe the development and content of a novel custom SNP panel that contains 3404 SNPs in 983 genes, representing cellular functions and pathways that may influence disease severity at diagnosis, toxicity, progression or other treatment outcomes. A systematic search of national databases was used to identify non-synonymous coding SNPs and SNPs within transcriptional regulatory regions. To explore SNP associations with PFS we compared SNP profiles of short term (less than 1 year, n = 70) versus long term progression-free survivors (greater than 3 years, n = 73) in two phase III clinical trials.
Quality controls were established, demonstrating an accurate and robust screening panel for genetic variations, and some initial racial comparisons of allelic variation were done. A variety of analytical approaches, including machine learning tools for data mining and recursive partitioning analyses, demonstrated predictive value of the SNP panel in survival. While the entire SNP panel showed genotype predictive association with PFS, some SNP subsets were identified within drug response, cellular signaling and cell cycle genes.
A targeted gene approach was undertaken to develop an SNP panel that can test for associations with clinical outcomes in myeloma. The initial analysis provided some predictive power, demonstrating that genetic variations in the myeloma patient population may influence PFS.
The draft sequence of the human genome published in 2001 [1, 2], followed by the more recent improved sequence release of the International Human Genome Consortium, has shown that there is extensive genetic variation (polymorphism) in the human genome. Unlike somatic mutations, polymorphisms are stable and heritable. Polymorphisms include single nucleotide polymorphisms (SNPs), micro- and minisatellites, and heritable insertions and deletions (indels). Significantly, SNPs account for over 90% of genetic variation in the human genome. An important principle that has emerged from the consideration of genetic variation is that disease risk and clinical outcomes can be influenced by individual genetic backgrounds. Thus, while many diseases may have their unique genetic signatures, individual patient outcomes are dependent on heritable variations in a wide variety of genes and pathways affecting cellular functions and drug responses. Moreover, genetic variations in such global functions as inflammation, immunity and cellular signaling in the tumor microenvironment can have an impact on diverse clinical responses.
Multiple myeloma (MM) is a universally fatal disease characterized by the accumulation of malignant plasma cells in the bone marrow. It accounts for 2% of all cancer deaths and 15% of all hematologic malignancies, with about 13,000 deaths per year in the USA. While there are certain common clinical features such as anemia, bone lesions, hypercalcemia, immunodeficiency and renal failure, the disease shows significant heterogeneity with regard to morphology, disease progression, response to therapy and incidence of secondary malignancies. This heterogeneity likely is due, in part, to differences in genetic abnormalities within the malignant clone, as shown in many studies on chromosomal abnormalities and gene expression profiles [6–8].
The growth of MM plasma cells is dependent on a complex interplay among various growth factors, adhesion molecules and other factors in the tumor microenvironment. Thus it might be expected that genetic variations in this interplay could have a profound influence on disease initiation, progression, associated bone complications, and response. Moreover, genetic variation in immunity and inflammation is an important consideration, as are variations in genes coding for drug metabolism and transport. Indeed, death from MM commonly results from infections associated with a severely compromised immune system resulting, in part, from therapeutic toxicities that may be related to variable rates of drug metabolism.
In order to address these issues we have engaged in an international program designated as the Bank On A Cure (BOAC). A cooperative program was established to bank DNA from multiple cooperative groups and institutional trials, and to develop a platform for examining the association of genetic variation with disease risk and outcomes. BOAC receives samples through Material Transfer Agreements, and clinical outcomes are provided through agreements with the Cancer Research and Biostatistics Group (Seattle) and the University of Minnesota (with Institutional Review Board, IRB, approval). Currently, the bank has over 2100 samples from the USA, representing six different clinical trials, patient-provided BOAC buccal cell kit samples, and unaffected controls accumulated since 1987. In this report we describe the development of a novel custom SNP panel based on the Affymetrix/Gene Chip Targeted Genotyping Platform, which contains 3404 SNPs representing variations in a variety of cellular functions and networks, and its initial application to myeloma DNA samples collected in the BOAC bank. We examined population frequencies in affected and unaffected individuals among different ethnic groups, and we developed some novel early approaches in using the SNP panel to determine whether genomic variations in the patient population influence survival.
Control and patient samples
DNA was prepared from 102 Coriell cell lines, representing 31 Caucasian, 24 African American, 23 Hispanic, and 24 Asian individuals (unaffected by myeloma). DNA samples were also prepared from 143 myeloma patients enrolled in phase III clinical trials (E9486, n = 52, and S9321, n = 91), with informed consent, and from 34 unaffected spousal controls (all Caucasian). E9486 patients ranging in age from 55 to 70 years were treated with Vincristine, Busulfan, Melphalan, Cyclophosphamide, Prednisone (VBMCP) followed by randomization to no further treatment, IFN-α, or cyclophosphamide; and, although there was variation in survival among all patients, no significant differences in survival were noted among the three arms of the trial. Patients included in this study from S9321 were in the same age range, and received Vincristine, Adriamycin, Dexamethasone (VAD) induction followed by VBMCP. S9321 patients in the trial arm randomized to high dose melphalan+TBI followed by transplant were not included. Patients for this analysis were selected based on progression-free survival (PFS) of less than 1 year (n = 70) or greater than 3 years (n = 73).
Custom BOAC SNP chip design and content
The Human Gene Mutation Database contains a searchable database of polymorphisms associated with diseases cited in the literature. This database was used in conjunction with SNP500, SNPper, and MutDB to obtain the SNP id (rs number) of polymorphisms in the gene lists. A systematic search for all non-synonymous SNPs (ie, resulting in coding change) with a validated minor allele frequency greater than 2% in all of the candidate genes was completed using the SNP500, dbSNP, and Affymetrix databases. SNPs failing to meet a 2% population frequency were included if the frequency was higher than 5% in selected racial subgroups (eg, Asian, African American, Caucasian).
The promoter regions of all the candidate genes were searched systematically, using the PromoLign database, for SNPs that lie in regions homologous between human and mouse and have a minor allele frequency greater than 2%. Many of the SNPs selected by this method were seen to lie in or adjacent to transcription binding sites. Some additional promoter SNPs that may affect transcription binding sites were also identified using the FESD database. Affymetrix provided several in-house validated SNP lists, including: inflammation and immunity, drug metabolism, and cancer lists. Three groups of admixture SNPs, which differ in frequency between Asian, African and European groups, were added to allow corrections in data analyses for racial specific variations. TagSNPs in genes influencing drug metabolism and transport were added from the supplementary data from Ahmadi et al. The full SNP panel includes 3404 SNPs in 983 genes.
Genotyping was performed using the Affymetrix® GeneChip® Scanner 3000 Targeted Genotyping System (GCS 3000 TG System), which utilizes molecular inversion probes to simultaneously identify the 3404 pre-selected SNPs. The protocol has previously been described. All genotyping experiments were performed in strict adherence to the manufacturer's protocol.
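The frequency rule described above (overall minor allele frequency above 2%, or above 5% in at least one racial subgroup) can be sketched as a simple filter. This is only an illustration: the record layout, the function name and the example identifiers are hypothetical and are not taken from the databases named in the text.

```python
from dataclasses import dataclass, field


@dataclass
class SnpRecord:
    rs_id: str            # hypothetical identifier, for illustration only
    overall_maf: float    # minor allele frequency across all populations
    subgroup_maf: dict = field(default_factory=dict)  # MAF per racial subgroup


def passes_frequency_rule(snp, overall_cutoff=0.02, subgroup_cutoff=0.05):
    """Keep a SNP if its overall MAF exceeds 2%, or if its MAF exceeds 5%
    in at least one subgroup (the rescue rule described in the text)."""
    if snp.overall_maf > overall_cutoff:
        return True
    return any(maf > subgroup_cutoff for maf in snp.subgroup_maf.values())


# Made-up candidates: one common SNP, one rescued by a subgroup, one rejected.
candidates = [
    SnpRecord("rs_example_1", 0.10),
    SnpRecord("rs_example_2", 0.01, {"Asian": 0.08}),
    SnpRecord("rs_example_3", 0.01, {"Asian": 0.02, "Caucasian": 0.01}),
]
selected = [s.rs_id for s in candidates if passes_frequency_rule(s)]
```

The second record shows the subgroup exception: it fails the 2% overall cutoff but is retained because one subgroup exceeds 5%.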
Patients from the Eastern Cooperative Oncology Group (ECOG) and Southwest Oncology Group (SWOG) trials were selected using the following criteria: they were all Caucasian and between 55 and 70 years of age at diagnosis; patients with IgA subtype were excluded (as this is an independent, poor prognosis variable). Patients with the longest progression-free survival (PFS > 3 years) and patients with the shortest progression-free survival (PFS < 1 year) were selected.
Leave-one-out cross-validation. In this approach, the original data set of 143 patients was divided into two groups: one consisting of a single patient and one consisting of the remaining 142 patients. A classification model was built using the 142 patients as a training set and then this classification model was used to classify the single 'left out' patient. For this study, as well as the class label study below, we used a support vector machine (SVM) classifier, as implemented by the Weka package, and specified a linear kernel.
Randomization of class labels. For the original data and labels, we followed the standard practice for building and evaluating a classifier, that is, we compared the performance of a classifier using the original and randomly shuffled class labels (permutations). There were 143 subjects, consisting of 73 cases and 70 controls. The training set was created by randomly selecting 50 cases and 50 controls and using the remaining subjects as a test set. One hundred runs were performed for the original data and class labels.
We also analyzed each clinical trial data set separately and used the other clinical trial data set as a validation set. Fisher's exact test was used as a univariate screening tool to rank the SNPs by how strongly they are associated with PFS. The 50 SNPs with the smallest p-values in each trial were selected and used in a recursive partitioning analysis. For this recursive partitioning analysis we used RPART from the R software package, a language and environment for statistical computing. The tree-based library RPART was developed as described. The regression tree resulting from the analysis was subsequently pruned in order to avoid over-fitting. This regression tree was applied both to the trial on which it was developed and to the other trial for validation purposes. Specificity and sensitivity were determined for each data set.
Finally, we attempted univariate ranking and recursive partitioning of the conglomerate data set (both trials combined) using random forests. Validation was examined by randomly mixing survival data sets and determining and comparing the predictive accuracy of true survival subsets and random subsets.
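The leave-one-out procedure can be sketched in a few lines. To keep the example self-contained, a nearest-centroid rule stands in for the classifier; this is an assumption made for illustration and is not the Weka SVM used in the study. The toy genotypes (coded 0, 1, 2) and the function names are likewise hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Stand-in classifier (NOT the study's SVM): assign the class whose
    mean genotype vector is closest to the sample."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(sample, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(x, y, predict=nearest_centroid_predict):
    """Hold out each sample once, train on the rest, and score the held-out call."""
    correct = 0
    for i in range(len(x)):
        train_x = x[:i] + x[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if predict(train_x, train_y, x[i]) == y[i]:
            correct += 1
    return correct / len(x)


# Toy data: six patients, three SNPs each (minor allele counts).
genotypes = [[0, 0, 1], [0, 1, 0], [0, 0, 0], [2, 2, 1], [2, 1, 2], [1, 2, 2]]
labels = ["short", "short", "short", "long", "long", "long"]
accuracy = leave_one_out_accuracy(genotypes, labels)
```

With 143 patients the loop would train 143 models, each on 142 patients, which matches the description above.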
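The sampling scheme for the class label study can be sketched as follows. The function names and the fixed seed are assumptions for illustration; the classifier itself is left out, since only the split and the label shuffling are shown.

```python
import random


def balanced_split(labels, n_per_class, rng):
    """Draw n_per_class training indices from each class; all remaining
    indices form the test set."""
    train = []
    for cls in sorted(set(labels)):
        members = [i for i, lab in enumerate(labels) if lab == cls]
        train.extend(rng.sample(members, n_per_class))
    train_set = set(train)
    test = [i for i in range(len(labels)) if i not in train_set]
    return sorted(train), test


def permuted(labels, rng):
    """Return a shuffled copy of the labels, which breaks any real link
    between genotype and survival class."""
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return shuffled


rng = random.Random(0)  # fixed seed (an assumption) so the sketch is repeatable
labels = ["long"] * 73 + ["short"] * 70
train_idx, test_idx = balanced_split(labels, 50, rng)
null_labels = permuted(labels, rng)
```

With 73 and 70 subjects and 50 drawn from each class, every run leaves 43 subjects for testing. Repeating the split with `labels` and again with `null_labels` gives the two accuracy distributions that are compared.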
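The univariate screening step can be sketched with a small implementation of Fisher's exact test. Several things here are assumptions: each SNP is collapsed to a 2x2 table (for example, carriers versus non-carriers of the minor allele against short versus long PFS), the test is two-sided, and the SNP names and counts are made up. The text does not state how genotypes were tabulated.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]:
    sum the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= observed * (1 + 1e-9))


# Hypothetical tables: (carriers short PFS, carriers long PFS,
#                       non-carriers short PFS, non-carriers long PFS)
snp_tables = {
    "snp_A": (10, 5, 5, 10),
    "snp_B": (14, 1, 2, 13),
    "snp_C": (8, 7, 7, 8),
}
ranked = sorted(snp_tables, key=lambda s: fisher_exact_two_sided(*snp_tables[s]))
top_hits = ranked[:2]  # the study kept the 50 smallest p-values per trial
```

The ranked list would then be passed to the recursive partitioning step, which is not shown here.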
SNP chip panel design
Functional categories on the SNP panel
Cell-To-Cell Signaling and Interaction
Cellular Growth and Proliferation
DNA Replication, Recombination, and Repair
Nucleic Acid Metabolism
Skeletal and Muscular Disorders
Skeletal and Muscular System Development and Function
Signaling Kinase, Phosphatase, Transferase
Inflammation & Immunity
It is instructive to compare the content of the BOAC SNP chip with the SNPs represented on the Affymetrix 500K Array genome wide scan. The 500K Array panel is primarily derived from two restriction enzyme cleavage fragmentations, with SNP representation for each fragment, providing a comprehensive, global SNP panel. Well over 95% of the panel is intragenic, non-coding; thus, its primary use is to identify copy number, chromosomal regions, and linkage. Indeed, of the 3404 SNPs on the BOAC SNP chip, only 401 are present on the 500K Array panel. Thus, while the BOAC SNP chip does not have gene wide coverage, it does have a higher density of coding and regulatory content.
Samples and quality control assessments
For this study, a total of 279 DNA samples were profiled by the BOAC SNP chip. One hundred and thirty-six unaffected controls from the Coriell panel and spouses of myeloma patients were profiled. The Coriell panel included 31 Caucasian, 24 Asian, 23 Hispanic, and 24 African American samples of unaffected individuals. One hundred and forty-three myeloma samples were profiled, from the phase III clinical trials, ECOG E9486 (n = 52) and the chemotherapy arm of the ECOG-SWOG intergroup trial S9321 (n = 91). Treatment protocols are given in the Methods section. This study was in compliance with the Helsinki Declaration, and approved by the IRB at the University of Minnesota (approval # 0311M53428), with patient consents collected by the clinical cooperative groups' trial offices. Among all samples profiled, we had an average SNP call rate of 96%. The profiles of the Coriell panel allowed us to determine allelic frequencies in racial groups and unaffected populations. Of the 3404 SNPs on the BOAC panel, 786 were contained in the SNP500 cancer database, allowing us to determine concordance between the two Coriell data sets. We found very good agreement between our data set and the national database, with an average of > 97% concordance. We also duplicated the profile of a number of samples (n = 10), and found better than 99.7% reproducibility between duplicate samples. This concordance and duplication rate was also equivalent when comparing the BOAC SNP panel run in the USA and UK facilities, providing a cross validation between BOAC laboratory sites. Finally, for every batch run of 24 samples, the Affymetrix platform includes a control DNA sample, and this provided continuous monitoring and quality assurance across the study.
Allelic variations by race
Allelic variations associated with progression-free survival
Although genetic deregulation within the tumor population has been shown to stratify clinical outcomes [5–8], a significant impact on therapeutic outcomes may result from genetic variations in germline DNA affecting a number of important functions, including drug metabolism, transport, DNA repair, immune response, growth factors, angiogenesis, etc. To explore the SNP associations on the BOAC SNP chip we chose to examine an extreme phenotype comparison in two phase III clinical trials with similar chemotherapeutic treatments. E9486 patients ranging in age from 55 to 70 years were treated with VBMCP followed by randomization to no further treatment, IFN-α, or cyclophosphamide; and, although there was variation in survival among all patients, no significant differences in survival were noted among the three arms of the trial. Patients included in this study from S9321 were in the same age range, and received VAD induction followed by VBMCP. S9321 patients in the trial arm that received high dose melphalan+TBI and went on to transplant were not included. The goal was to identify SNPs that may distinguish short term (less than 1 year) from long term (greater than 3 years) progression-free survivors.
Predicted vs. actual survival classes for patients. Columns: actual patient PFS < 1 year, actual patient PFS > 3 years. Rows: predicted patient PFS < 1 year, predicted patient PFS > 3 years.
The classification accuracy of the leave-one-out approach is (50+45)/143 = 0.66. If there were no true discriminating signal in the data, the classifiers built by the leave-one-out procedure should produce a table with a relatively even distribution of entries among the four cells, since the classes are of roughly the same size and the predictions should be random. However, the observed table is far from that random distribution. By using Fisher's exact test it is possible to compute the probability (p-value) of obtaining a table with the same or better accuracy of prediction by random chance. Specifically, the p-value is 7.7 × 10⁻⁵, which strongly indicates that the result is not due to random chance. The calculated odds ratio (OR) for survival is 3.9 (CI 2.0 to 7.8). We subsequently focused on SNP subsets that might provide more directed functional associations and found that the best predictor of survival was achieved when just the non-synonymous SNPs and the PromoLign SNP subset in introns were used. The accuracy of prediction increased to 75.5% (OR = 9.6; CI 4.5 to 20.5).
Another approach that is commonly used for classification is the generation of random subsets for training and validation. A classification model is built on the training subset, and is evaluated on a separate test set. The process is repeated and yields a distribution of accuracy on the data set labels (eg, short versus long PFS). This is then repeated with random shuffling of the survival data sets to determine whether there is true signal accuracy. For this analysis, we again used a support vector machine classifier, as implemented by the Weka package, using a preset linear kernel option. The training set consisted of repeated samples of 50 short term and 50 long term survivors, with the remaining patient samples used for test validation. One hundred runs were performed for the short and long term classifiers, and the average accuracy on test set analysis was 61.4% ± 7.1%. One hundred runs of random mixed set comparisons generated an average accuracy of 47.5% ± 7.3%. This further suggests that there are true differences in the genotypes that impact survival classification. A t-test was performed to evaluate the difference between the classification results based on the original and randomized class labels, resulting in a p-value for survival classification of less than 0.0001. This further indicates that, as a group, the SNPs are providing a measure of true discrimination of survival.
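The accuracy and odds ratio arithmetic can be sketched as below. The four cell counts are an assumption: the individual cells are not given in the text, and the values used here were chosen only because they agree with the reported group sizes (70 and 73), the 50 + 45 correct calls, and the reported odds ratio and interval. The Woolf (log-scale) interval at z = 1.96 is also an assumption, since the text does not say how the interval was computed.

```python
from math import exp, log, sqrt


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio of the 2x2 table [[a, b], [c, d]] with a Woolf
    (log-scale) confidence interval."""
    odds_ratio = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = exp(log(odds_ratio) - z * se_log)
    high = exp(log(odds_ratio) + z * se_log)
    return odds_ratio, low, high


# ASSUMED cell counts (not stated in the text), laid out as
# rows = predicted class, columns = actual class.
pred_short_actual_short, pred_short_actual_long = 45, 23
pred_long_actual_short, pred_long_actual_long = 25, 50

total = (pred_short_actual_short + pred_short_actual_long
         + pred_long_actual_short + pred_long_actual_long)
accuracy = (pred_short_actual_short + pred_long_actual_long) / total
odds_ratio, ci_low, ci_high = odds_ratio_with_ci(
    pred_short_actual_short, pred_short_actual_long,
    pred_long_actual_short, pred_long_actual_long,
)
```

Under these assumed counts the sketch gives an accuracy of about 0.66 and an odds ratio of about 3.9 with an interval of roughly 2.0 to 7.8, in line with the figures quoted above.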
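The comparison of the two accuracy distributions can be checked from the summary figures alone. This sketch assumes that the ± values are standard deviations over the 100 runs and that an unequal-variance (Welch) t-test is appropriate; the text does not specify which form of the t-test was used.

```python
from math import sqrt


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic and approximate degrees of freedom computed
    from summary statistics of two independent samples."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Reported figures: true labels 61.4% +/- 7.1%, shuffled labels 47.5% +/- 7.3%,
# 100 runs each.
t_stat, dof = welch_t(61.4, 7.1, 100, 47.5, 7.3, 100)
```

Under these assumptions the statistic is about 13.6 on roughly 198 degrees of freedom, which is consistent with the reported p-value of less than 0.0001.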
Top SNPs ranked by univariate analysis for trial S9321
Flavin containing monooxygenase 3
PTEN induced putative Kinase 1
Glutathione S-transferase A4
Cyclin-dependent kinase 2
ATP-binding cassette, sub-family G (WHITE), member 8 (sterolin 2)
P21 (CDKN1A)-activated kinase 7
Ligase III, DNA, ATP-dependent
Dual specificity phosphatase 1
ATP-binding cassette, sub-family B (MDR/TAP), member 1
Interleukin 12A (natural killer cell stimulatory factor 1, cytotoxic lymphocyte maturation factor 1, p35)
ATPase, Cu++ transporting, beta polypeptide (Wilson disease)
Polymerase (DNA directed), beta
Carbohydrate (chondroitin 6) sulfotransferase 3
Tumor necrosis factor receptor superfamily, member 17
Superoxide dismutase 2, extracellular
Nuclear factor of activated T-cells, cytoplasmic, calcineurin-dependent 2
Top SNPs ranked by univariate analysis for trial E9486.
unknown, TAG ERROR
Farnesyl-diphosphate farnesyltransferase 1
Cytochrome P450, family 4, subfamily F, polypeptide 2
Protein kinase STYK1
5, 10-methylenetetrahydrofolate reductase (NADPH)
Cyclin-dependent kinase 5
Tumor necrosis factor (ligand) superfamily, member 10
Cytochrome P450, family 1, subfamily A, polypeptide 1
Intercellular adhesion molecule 1 (CD54), human rhinovirus receptor
Steroid-5-alpha-reductase, alpha polypeptide 1 (3-oxo-5 alpha-steroid delta 4-dehydrogenase alpha 1)
Conserved helix-loop-helix ubiquitous kinase
TNF receptor-associated protein 1
B-cell CLL/lymphoma 6 (zinc finger protein 51)
ATP-binding cassette, sub-family C (CFTR/MRP), member 1
Plasminogen activator, urokinase
Colony stimulating factor 1 (macrophage)
Steroidogenic acute regulator
5, 10-methylenetetrahydrofolate reductase (NADPH)
In the univariate ranking, we did not correct for multiple comparisons; that would certainly reduce the p-value significance, but would not alter the rank order comparison. This approach examines the association of each SNP individually; it is more likely that complex interactions drive the association of groups of SNPs, which univariate ranking does not reveal. Nevertheless, among the top ranked individual SNP variations in both trials were those associated with drug metabolism/detoxification/transport, including: cyp genes, multiple variants of GSTA4, SLCO, UGT1, NAT2, ABCB genes; as well as genes impacting cellular response, including: BMP2 (inducing myeloma apoptosis), cathepsin B (inducing IL-8 dependent cellular migration and angiogenesis) [31, 32], XRCC5 (DNA repair); and genes associated with proliferative responses (PCNA, MAPK, cyclin kinase). The association of multiple alleles of GSTA4 is particularly compelling, suggesting consistency in its impact across several variant alleles. In addition, several alleles are in linkage disequilibrium and appear as a cluster in the list, providing a quality control (as linked genes would be expected to show the same association).
As an exploratory approach, we combined the data sets from both trials, then used Fisher's exact test as a univariate screening tool to determine the association of each SNP with survival. When we treated the top 165 SNPs (univariate p < 0.02) as a set for short versus long PFS classification prediction using a random forest multiple sampling approach, we found a 79% correct classification rate. However, similar classification accuracy could be achieved with random class labels, demonstrating the potential for false positive associations in such complex data sets.
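The point about multiple comparisons can be made concrete with a short sketch. A Bonferroni threshold is used here as one common choice; the text does not name a correction method, so this is an assumption, and the p-values below are made up for illustration.

```python
N_SNPS = 3404          # number of SNPs on the panel, from the text
ALPHA = 0.05           # conventional family-wise error rate (assumption)

# Bonferroni: a single SNP must reach alpha / number_of_tests.
threshold = ALPHA / N_SNPS

# Hypothetical raw p-values for four SNPs.
raw_p = {"snp_A": 0.0004, "snp_B": 0.00001, "snp_C": 0.02, "snp_D": 0.003}

# The rank order depends only on the raw p-values, so the correction
# changes which SNPs pass the threshold but not their ordering.
ranked = sorted(raw_p, key=raw_p.get)
survivors = [snp for snp in ranked if raw_p[snp] < threshold]
```

With 3404 tests the per-SNP threshold is about 1.5 × 10⁻⁵, so only very strong individual associations would remain significant, while the ranking used to select top SNPs is unchanged.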
We have designed a novel SNP panel, containing 3404 genetic variations associated with 983 genes involved in a variety of cellular functions that could impact population variations in tumor progression and response (Table 1). This approach is distinct from using genome-wide SNP arrays of 500,000 SNPs. The Affymetrix 500K SNP Panel is based on restriction enzyme cleavage sites and representative spacing on the chromosomes. While it has significantly greater content, over 90% of the SNPs on the whole genome array are intragenic, and the chip is most often analyzed for linkage associations. The multiple comparison false positive error rate is large, and the technology is considerably more expensive. Indeed, of the 3404 gene-associated SNPs on the BOAC SNP panel, only 401 are contained on the 500K SNP panel.
There are limitations to the BOAC SNP panel as well. The public and Affymetrix databases used to construct the chip content are constantly updating, so that missing elements may be noted. While we targeted SNPs in non-synonymous coding sequences or highly conserved regulatory sequences, many of the SNPs have not yet been functionally documented for effects. As such, SNP associations in the BOAC panel represent a first step in exploring the genome for clinically relevant genetic variations that will require both extensive validations as well as functional assays to confirm their effect.
We made a considerable effort to ensure that quality controls were in place. The Affymetrix platform provided a high call rate (96%) as well as very high concordance in replicate samples, even those run at different facilities. The concordance extended to 786 SNPs on the panel that were documented for the Coriell cell lines we have included. All of the samples we analyzed had high quality DNA (A260/280 ratios > 1.7, and little DNA degradation). In subsequent unrelated studies, we found that even highly degraded DNA provides robust, high call rates and reproducibility (not shown), probably because the initial amplifications are across 100–150 bp of DNA. The most likely source of quality control error is sample misidentification or misplacement in multi-well plates. To control for this, we routinely incorporate randomly positioned controls and replicates.
Within the Coriell cell line panel is a distribution of racial groups. It is striking how much allelic frequencies differ between the African American and Caucasian racial groups. It is likely that allelic variations can be refined further by geographically based lineages, as racial definitions are somewhat subjective and often self reported. Importantly, as the BOAC database increases, multiple comparisons can be done with appropriate corrections for allelic variations among races. It will be important to include the full spectrum of patients as the database expands.
Disease progression, response and survival vary widely among patients. There are a number of studies that have examined variations in tumor cell chromosomal abnormalities and gene expression profiles [6–8]. The evidence strongly suggests that patient outcomes are impacted by these tumor cell variations. However, patient populations show considerable germline variation that could influence the microenvironment, immune status, and drug metabolism or transport. For example, the authors (DJ, GM) have presented evidence that germline variations in GSTP1 alter melphalan metabolism, and have been associated with different outcomes in patients receiving high dose melphalan therapies. Numerous examples of variations in drug metabolism, transport, and DNA repair have been documented, with emerging associations with therapeutic outcomes.
Our approach was to provide a more global germline analysis that was driven by bioinformatic searches for potentially relevant variations in multiple genes and gene functions. This is still an exploratory approach to identify potential variations of functions that impact upon therapeutic responses and disease progression that may result in differences in survival outcomes. Rather than a linear progression of survival, we chose to examine two extreme ends of the PFS spectrum, to maximize the first steps in identifying potential functional variations. Patients were stratified by short (< 1 year) versus longer (> 3 years) PFS groups. Nevertheless, it is likely that survival is a complex endpoint resulting from both tumor progression and therapeutic failure that may impact upon multiple organ systems. Moreover, we recognized that a) tumor variation among patients may have dominant effects that are associated with survival; b) the trials we examined used multi-drug regimens, and each drug response may be impacted upon by complex genetic variations in transport, metabolism, and export; and c) sample number is still limiting statistical power. Thus, our initial approaches in this study were to determine whether germline variations had any measurable influence on survival.
We felt it was important to determine, first, if there were any true discrimination of the SNP panel in the two PFS groups, when the complete SNP profile was considered. Using a variety of methods that were tested against randomly mixed sample analysis, we found the SNP panel had true signal to discriminate the short and long progression-free survivors, although the accuracy did not reach the level of prediction that would allow clinical application. Notably, a smaller subset increased the predictive power. Significantly, no individual genetic variation provided a strong, independent prediction of survival. This likely reflects the fact that individual germline variations may impact upon response, but are not solely responsible; and it is likely that such variations are the result of complex interactions. Indeed, genetic variations in the tumor cell may play a dominant role in response and survival. Thus, patient responses are likely to involve interactions affecting multiple functions within the tumor cell as well as external factors affecting tumor progression and drug response. Nevertheless, our analysis of the SNP panel as a group suggests it is likely that germline variations impact upon patient survival and deserve further attention.
Recognizing the limited statistical power to detect single SNPs associated with PFS, we did perform a univariate analysis to rank order the SNPs that individually best discriminated the groups in the two similar phase III clinical trials. We did not correct for multiple comparisons, which would certainly reduce the p-value significance but would not alter the rank order comparison. This approach also assumes association for the individual SNP. It is more likely that complex multi-SNP groupings influence response. Nevertheless, among the top SNP variations in both trials were those associated with drug metabolism/detoxification/transport, including: cyp genes, multiple variants of GSTA4, SLCO, UGT1, NAT2, ABCB genes; as well as genes impacting cellular response, including: BMP2 (inducing myeloma apoptosis), cathepsin B (inducing IL-8 dependent cellular migration and angiogenesis) [31, 32], XRCC5 (DNA repair); and genes associated with proliferative responses (PCNA, MAPK, cyclin kinase). The association of multiple alleles of GSTA4 is particularly compelling, suggesting consistency in its impact across several variant alleles. In addition, several alleles are in linkage disequilibrium and appear as a cluster in the list, providing a quality control (as linked genes would be expected to show the same association). Surprisingly absent from the SNP association lists are cytokines, growth factors and receptors that might be expected to cause variations in disease progression and resistance, with the exception of IL-10, which has been reported in previous studies.
While still an exploratory analysis, the paired SNPs identified by recursive partitioning in each trial have some intriguing possible connections to PFS. COMT (catechol-O-methyltransferase) metabolizes catechol drugs, and has been linked to breast cancer risk and survival; GHRL has been shown to stimulate angiogenesis and regulate bone formation through osteoblasts [38, 39]; FDFT is the farnesyl transferase that may regulate important signaling (eg, ras) [40, 41]; and ABCC is among a class of transporters that may influence multi-drug resistance. It is noted that strong association in one trial was significantly reduced in the validation trial. Nevertheless, the functional impact of these genetic variations may warrant further investigation.
The exploratory analyses provide some of the first attempts to use larger, targeted SNP panels to develop models of genomic variations that may influence treatment outcomes, and that may deserve further analysis of functional significance. Not surprisingly, among the most significant variations correlating with survival were genes that could be functionally categorized as pharmacologic. However, the group analysis suggests various functions may interplay in disease progression and response. It is important to consider the fact that we could not identify a small driver set of SNPs that strongly associated with survival, particularly with the limited sample size. However, we note that, as a group, germline genomic variations do have impact on event-free survival. As the Bank On A Cure data set is expanding, SNP associations are being analyzed for more specific phenotypes in response, disease complications (eg, bone disease), and adverse or toxic drug effects (eg, thrombotic events associated with thalidomide).
Heterogeneity in tumor gene deregulation certainly contributes to variation in disease outcome. It would seem appropriate to combine an understanding of tumor heterogeneity (chromosomal and expression profiles) with germline variations (eg, SNP variations associated with pharmacologic functions or disease complications), which could lead to the development of more individualized therapies that take into account both tumor and population variations.
BOAC: Bank On A Cure
ECOG: Eastern Cooperative Oncology Group
IRB: Institutional Review Board
SNP: single nucleotide polymorphism
SVM: support vector machine
SWOG: Southwest Oncology Group
VAD: Vincristine, Adriamycin, Dexamethasone
VBMCP: Vincristine, Busulfan, Melphalan, Cyclophosphamide, Prednisone.
This work was supported by the International Myeloma Foundation; a grant from the National Institutes of Health to the Eastern Cooperative Oncology Group PO1 CA62242 (to BVN) and the Southwest Oncology Group 5U10CA038926 (to JC); and NSF Grant CNS-0551551 (to VK).
- Lander ES, from the International Human Genome Sequencing Consortium, et al: Initial sequencing and analysis of the human genome. Nature. 2001, 409: 860-921.
- Venter JC, et al: The sequence of the human genome. Science. 2001, 291: 1304-1351.
- International Human Genome Consortium: Finishing the euchromatic sequence of the human genome. Nature. 2004, 431: 931-945.
- Kyle RA, Rajkumar SV: Plasma cell disorders. Cecil textbook of medicine. Edited by: Goldman L, Ausiello DA. 2004, Philadelphia: W.B. Saunders, 1184-1195. 22nd edition.
- Bergsagel PL, Kuehl WM: Molecular pathogenesis and consequent classification of multiple myeloma. J Clin Oncol. 2005, 23 (26): 6333-6338.
- Shaughnessy JD, Zhan F, Burington BE, Huang Y, Colla S, Hanamura I, Stewart JP, Kordsmeier B, Randolph C, Williams DR, Xiao Y, Xu H, Epstein J, Anaissie E, Krishna SG, Cottler-Fox M, Hollmig K, Mohiuddin A, Pineda-Roman M, Tricot G, van Rhee F, Sawyer J, Alsayed Y, Walker R, Zangari M, Crowley J, Barlogie B: A validated gene expression model of high-risk multiple myeloma is defined by deregulated expression of genes mapping to chromosome 1. Blood. 2007, 109 (6): 2276-2284.
- Jenner MW, Leone PE, Walker BA, Ross FM, Johnson DC, Gonzalez D, Chiecchio L, Dachs Cabanas E, Dagrada GP, Nightingale M, Protheroe RK, Stockley D, Else M, Dickens NJ, Cross NC, Davies FE, Morgan GJ: Gene mapping and expression analysis of 16q loss of heterozygosity identifies WWOX and CYLD as being important in determining clinical outcome in multiple myeloma. Blood. 2007, 110 (9): 3291-3300.
- Chng WJ, Kumar S, Vanwier S, Ahmann G, Price-Troska T, Henderson K, Chung TH, Kim S, Mulligan G, Bryant B, Carpten J, Gertz M, Rajkumar SV, Lacy M, Dispenzieri A, Kyle R, Greipp P, Bergsagel PL, Fonseca R: Molecular dissection of hyperdiploid multiple myeloma by gene expression profiling. Cancer Res. 2007, 67 (7): 2982-2989.
- Kay NE, Leong TL, Bone N, Vesole DH, Greipp PR, Van Ness B, Oken MM, Kyle RA: Blood levels of immune cells predict survival in myeloma patients: results of an Eastern Cooperative Oncology Group phase 3 trial for newly diagnosed multiple myeloma patients. Blood. 2001, 98 (1): 23-28.
- Packer BR, Yeager M, Burdett L, Welch R, Beerman M, Qi L, Sicotte H, Staats B, Acharya M, Crenshaw A, Eckert A, Puri V, Gerhard DS, Chanock SJ: SNP500 Cancer: a public resource for sequence validation, assay development, and frequency analysis for genetic variation in candidate genes. Nucleic Acids Res. 2006, 34 (Database issue): D617-621.
- Oken MM, Leong T, Lenhard RE, Greipp PR, Kay NE, Van Ness B, Keimowitz RM, Kyle RA: The addition of interferon or high dose cyclophosphamide to standard chemotherapy in the treatment of patients with multiple myeloma: Phase III Eastern Cooperative Oncology Group Clinical Trial EST 9486. Cancer. 1999, 86 (6): 957-968.
- Barlogie B, Kyle RA, Anderson KC, Greipp PR, Lazarus HM, Hurd DD, McCoy J, Moore DF, Sakhil SR, Lanier KS, Chapman RA, Cromer JN, Salmon SE, Durie B, Crowley JC: Standard chemotherapy compared with high-dose chemoradiotherapy for multiple myeloma: final results of phase III US Intergroup Trial S9321. J Clin Oncol. 2006, 24 (6): 929-936.
- Hoffmann R, Krallinger M, Andres E, Tamames J, Blaschke C, Valencia A: Text mining for metabolic pathways, signaling cascades, and protein networks. Sci STKE. 2005, 2005 (283): pe21.
- PharmGKB. [http://www.pharmgkb.org/search/pathway/pathway.jsp]
- BioCarta. [http://www.biocarta.com]
- KEGG. [http://cgap.nci.nih.gov/Pathways/Kegg_Standard_Pathways]
- Stenson PD, Ball EV, Mort M, Phillips AD, Sheil JA, Thomas NS, Abeysinghe S, Krawczak M, Cooper DN: The Human Gene Mutation Database (HGMD®): 2003 Update. Hum Mutat. 2003, 21: 577-581.
- Zhao T, Chang LW, McLeod HL, Stormo GD: PromoLign: A Database for Upstream Region Analysis and SNPs. Human Mutation. 2004, 23: 534-539.
- Kang HJ, Choi KO, Kim BD, Kim S, Kim YJ: FESD: a Functional Element SNPs Database in human. Nucleic Acids Res. 2005, 33 (1): D518-522.
- Miller RD, Phillips MS, Jo I, et al: High-density single-nucleotide polymorphism maps of the human genome. Genomics. 2005, 86 (2): 117-126.
- Ahmadi KR, Weale ME, Xue ZY, Soranzo N, Yarnall DP, Briley JD, Maruyama Y, Kobayashi M, Wood NW, Spurr NK, Burns DK, Roses AD, Saunders AM, Goldstein DB: A single-nucleotide polymorphism tagging set for human drug metabolism and transport. Nat Genet. 2005, 37 (1): 84-89.
- Hardenbol P, Baner J, Jain M, Nilsson M, Namsaraev EA, Karlin-Neumann GA, Fakhrai-Rad H, Ronaghi M, Willis TD, Landegren U, Davis RW: Multiplexed genotyping with sequence-tagged molecular inversion probes. Nat Biotechnol. 2003, 21 (6): 673-678.
- Kearns M, Ron D: Algorithmic stability and sanity-check bounds for leave-one-out cross-validation. Proceedings of the Tenth Annual Conference on Computational Learning theory (Nashville, Tennessee, United States, July 06 – 09, 1997). COLT '97. 1997, ACM, New York, NY, 152-162.
- Dinu Valentin, Zhao Hongyu, Miller Perry: Integrating domain knowledge with statistical and data mining methods for high-density genomic SNP disease association analysis. Biomedical Informatics. 2007, 40: 750-760.
- Witten Ian, Frank Eibe: "Data Mining: Practical machine learning tools and techniques". 2005, Morgan Kaufmann, San Francisco, 2nd edition.
- Therneau Terry, Atkinson Elizabeth: An introduction to recursive partitioning using the rpart routines. Technical report 61, Mayo Clinic. 1997, http://mayoresearch.mayo.edu/mayo/research/biostat/techreports.cfm, R package available at http://cran.r-project.org/src/contrib/Descriptions/rpart.html
- Agresti A: An introduction to categorical data analysis. 1996, Wiley, NJ.
- Breiman L: Random forests. Machine Learning. 2001, 45: 5-32.
- Efron B, Tibshirani R: An introduction to the Bootstrap. 1994, Chapman & Hall/CRC, FL.
- Kaplan EL, Meier P: Nonparametric estimation from incomplete observations. J Am Stat Assoc. 1958, 53: 457-481.
- Shriver MD, Mei R, Parra EJ, Sonpar V, Halder I, Tishkoff SA, Schurr TG, Zhadanov SI, Osipova LP, Brutsaert TD, Friedlaender J, Jorde LB, Watkins WS, Bamshad MJ, Gutierrez G, Loi H, Matsuzaki H, Kittles RA, Argyropoulos G, Fernandez JR, Akey JM, Jones KW: Large-scale SNP analysis reveals clustered and continuous patterns of human genetic variation. Hum Genomics. 2005, 2 (2): 81-89.
- Kawamura C, Kizaki M, Ikeda Y: Bone morphogenetic protein (BMP)-2 induces apoptosis in human myeloma cells. Leuk Lymphoma. 2002, 43 (3): 635-9.
- Schraufstatter IU, Trieu K, Zhao M, Rose DM, Terkeltaub RA, Burger M: IL-8-mediated cell migration in endothelial cells depends on cathepsin B activity and transactivation of the epidermal growth factor receptor. J Immunol. 2003, 171: 6714-6722.
- Yanamandra N, Gumidyala KV, Waldron KG, Gujrati M, Olivero WC, Dinh DH, Rao JS, Mohanam S: Blockade of cathepsin B expression in human glioblastoma cells is associated with suppression of angiogenesis. Oncogene. 2004, 23: 2224-2230.
- Bamshad M: Genetic influence on health: does race matter?. JAMA. 2005, 294: 937-946.
- Dasgupta RK, Adamson PJ, Davies FE, Rollinson S, Roddam PL, Ashcroft AJ, Dring AM, Fenton JA, Child JA, Allan JM, Morgan GJ: Polymorphic variation in GSTP1 modulates outcome following therapy for multiple myeloma. Blood. 2003, 102 (7): 2345-2350.
- Zheng C, Huang D, Liu L, Wu R, Bergenbrant Glas S, Osterborg A, Bjorkholm M, Holm G, Yi Q, Sundblad A: Interleukin-10 gene promoter polymorphisms in multiple myeloma. Int J Cancer. 2001, 95 (3): 184-188.
- Long JR, Cai Q, Shu XO, Cai H, Gao YT, Zheng W: Genetic polymorphisms in estrogen-metabolizing genes and breast cancer survival. Pharmacogenet Genomics. 2007, 17 (5): 331-338.
- Li A, Cheng G, Zhu GH, Tarnawski AS: Ghrelin stimulates angiogenesis in human microvascular endothelial cells: Implications beyond GH release. Biochem Biophys Res Commun. 2007, 353 (2): 238-243.
- Kim SW, Her SJ, Park SJ, Kim D: Ghrelin stimulates proliferation and differentiation and inhibits apoptosis in osteoblastic MC3T3-E1 cells. Bone. 2005, 37 (3): 359-69.
- Fukushima N, Hanada R, Teranishi H, Fuku Y: Ghrelin directly regulates bone formation. J Bone Miner Res. 2005, 20 (5): 790-798.
- Alvarado Y, Giles FJ: Ras as a therapeutic target in hematologic malignancies. Expert Opin Emerg Drugs. 2007, 2: 271-284.
- Hu L, Shi Y, Hsu JH, Gera J, Van Ness B, Lichtenstein A: Downstream effects of oncogenic ras in multiple myeloma cells. Blood. 2003, 101 (8): 3126-3135.
- Pol van der MA, Broxterman HJ, Pater JM, Feller N, Mass van der M, Weijers GW, Scheffer GL, Allen JD, Scheper RJ, van Loevezijn A, Ossenkoppele GJ, Schuurhuis GJ: Function of the ABC transporters, P-glycoprotein, multidrug resistance protein and breast cancer resistance protein, in minimal residual disease in acute myeloid leukemia. Haematologica. 2003, 88 (2): 134-147.
- The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1741-7015/6/26/prepub
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.