Databases
The California Cancer Registry (CCR) is the statewide population-based cancer surveillance system that maintains records of all malignancies diagnosed in California, with the exception of basal and squamous cell carcinoma of the skin. The CCR estimates that >98% of cancer diagnoses in California are captured. CCR data include the date of diagnosis, primary anatomic site, histologic type, American Joint Committee on Cancer (AJCC) stage, initial type of treatment, and sociodemographic information.
The pesticide use reporting (PUR) program contains records of all agricultural pesticide use in California since 1990 and is maintained by the California Department of Pesticide Regulation. Under this state-mandated program, all pesticide applications to parks, golf courses, cemeteries, rangeland, pastures, and along roadsides and railroads, and all postharvest pesticide treatments of agricultural commodities, must be reported monthly. Primary exceptions to reporting requirements include residential and some industrial and institutional applications.
For this study, we used CalEnviroScreen 3.0, California’s science-based mapping tool developed by the Office of Environmental Health Hazard Assessment and the California Environmental Protection Agency, which uses the PUR database to obtain scores of average pollution burden for every census tract in the state.
In addition, data from PUR were integrated with data from land-use surveys [20, 21] (based on California’s Public Land Survey System (PLSS), specifying the exact location of crops on which pesticide was most likely used) to estimate cumulative exposure to specific pesticides. Specifically, the locations of agricultural pesticide applications were reported according to the PLSS, a grid that parcels land into sections with an area of approximately one square mile and is used in the 30 westernmost states formed from lands in the public domain. To improve the spatial resolution of the available estimates, we combined these data with information from California land-use surveys, which are countywide, large-scale surveys (1:24,000, or 1 inch = 2000 feet) of land use and crop cover conducted every 7–10 years. These data are available electronically, with land-use types existing as contiguous polygons that are individually linked to their respective attribute information (e.g., land use type, acreage) in a database table. The datasets were reconciled using geospatial software that generates point estimates (and estimates of their variation) across a continuous spatial surface from the combined datasets. For each location (e.g., a residential address), we derived a summation of pesticide use from the surrounding area.
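The final step, summing pesticide use around a location, can be illustrated with a deliberately simplified sketch. It assumes each application has already been reduced to a single point (latitude, longitude, pounds applied) and uses a great-circle distance cutoff; the geospatial software used in the study works with land-use polygons and a continuous surface, which this sketch does not reproduce. The function names, coordinates, and amounts are hypothetical.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def pesticide_within_radius(home, applications, radius_m):
    """Sum pounds applied at all application points within radius_m of home.

    home: (lat, lon); applications: iterable of (lat, lon, lbs).
    Simplification: applications are treated as points, not polygons.
    """
    lat0, lon0 = home
    return sum(
        lbs
        for lat, lon, lbs in applications
        if haversine_m(lat0, lon0, lat, lon) <= radius_m
    )


# Hypothetical residence and three hypothetical application points.
home = (38.50, -121.70)
applications = [
    (38.50, -121.70, 10.0),  # at the residence
    (38.51, -121.70, 5.0),   # roughly 1.1 km north
    (38.60, -121.70, 7.0),   # roughly 11 km north, outside the buffer
]
total_lbs = pesticide_within_radius(home, applications, radius_m=2000)
```

Only the first two points fall inside the 2000 m buffer, so the third application does not contribute to the total.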
Patient cohort
Using the CCR, we identified patients with a first primary diagnosis of NHL from 1/1/2010 to 12/31/2016 using specific International Classification of Diseases-Oncology, 3rd edition (ICD-O-3) codes [22,23,24]. Patients with missing data, including date of diagnosis or date of follow-up (n=462), without a known cause of death (n=322) and those with human immunodeficiency virus infection (HIV)/acquired immune deficiency syndrome (AIDS) (n=1338) or a second malignancy (n=3083) after NHL diagnosis were excluded.
Estimating pesticide exposure
Identified NHL patients were linked by census tract to CalEnviroScreen 3.0 pesticide data. Production agricultural pesticide exposure to 70 chemicals was obtained for 2012 to 2014; the total pounds (lbs) of selected active ingredients in each census tract were divided by the area of that tract to obtain lbs per square mile, averaged over the 3 years [25].
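The arithmetic behind the census-tract measure can be sketched as follows. This is an illustration of the calculation as described here, not the CalEnviroScreen implementation; the function name and the example tract values are hypothetical.

```python
from statistics import mean


def tract_pesticide_density(lbs_by_year, area_sq_mi):
    """Average annual lbs of active ingredient per square mile for one tract.

    lbs_by_year: mapping of year -> total lbs of selected active ingredients.
    area_sq_mi: census tract area in square miles (assumed constant over time).
    """
    if area_sq_mi <= 0:
        raise ValueError("tract area must be positive")
    return mean(lbs / area_sq_mi for lbs in lbs_by_year.values())


# Hypothetical tract of 4 square miles with three years of reported use.
density = tract_pesticide_density({2012: 120.0, 2013: 80.0, 2014: 100.0}, 4.0)
```

For this hypothetical tract the yearly densities are 30, 20, and 25 lbs per square mile, giving a 3-year average of 25.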
In addition, NHL patients were linked to data from PUR and land-use surveys to estimate cumulative exposure to specific pesticides previously associated with increased NHL incidence: glyphosate [26, 27], organophosphorus [28,29,30], carbamate [12, 31], phenoxyherbicide [32,33,34], and 2,4-dimethylamine salt [30, 35]. Cumulative pesticide exposure was measured as lbs of pesticide applied per acre/month within 2000 meters of the residence at diagnosis, from 10 years before to 1 year after NHL diagnosis, so as to include a substantial amount of lead time before diagnosis and to extend through treatment of NHL. Pesticide exposure was then grouped into low, mid, and high categories based on the tertile distribution of exposure levels for each pesticide.
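A minimal sketch of the tertile grouping is shown below. The quantile method and the handling of values that fall exactly on a cut point are assumptions, since they are not specified here; the function name and example values are hypothetical.

```python
from statistics import quantiles


def tertile_labels(exposures):
    """Label each exposure value as 'low', 'mid', or 'high' by tertile.

    Assumptions: cut points come from the 'inclusive' quantile method, and a
    value equal to a cut point is assigned to the lower category.
    """
    lower_cut, upper_cut = quantiles(exposures, n=3, method="inclusive")
    labels = []
    for value in exposures:
        if value <= lower_cut:
            labels.append("low")
        elif value <= upper_cut:
            labels.append("mid")
        else:
            labels.append("high")
    return labels


# Hypothetical cumulative exposures (lbs per acre/month) for nine patients.
example_exposures = [1, 2, 3, 4, 5, 6, 7, 8, 9]
example_labels = tertile_labels(example_exposures)
```

With many patients sharing the same exposure value (for example, zero), the three groups would not be of equal size, and a different tie rule might be preferable.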
Covariates
From the CCR, we obtained patient demographics, including gender, race/ethnicity, age and stage at diagnosis, modality of initial therapy, health insurance status, neighborhood socioeconomic status (SES), and rural or urban medical service study area (MSSA) at diagnosis. Initial treatment was identified as chemotherapy and/or radiation therapy. In addition, patients were broadly categorized into the NHL subtypes of DLBCL, follicular lymphoma, Burkitt lymphoma, mantle cell lymphoma, marginal zone lymphoma, small lymphocytic lymphoma, lymphoblastic lymphoma, other B-cell lymphomas, T/NK-cell neoplasms, and unspecified lymphomas.
Statistical analyses
Multivariable Cox proportional hazards regression models were used to identify the association between pesticide exposure and lymphoma-specific and overall survival. Models considered five common types of pesticide (glyphosate, organophosphorus, carbamate, phenoxyherbicide, and 2,4-dimethylamine salt) and NHL subtype separately. Models included variables with a priori reasons for inclusion based on previous studies: gender, race/ethnicity, age and stage at diagnosis, modality of initial therapy, health insurance status, neighborhood SES and rural or urban MSSA [36,37,38]. We also examined whether associations between pesticide exposure and survival differed by race/ethnicity in multivariable Cox models performed separately in Hispanic/Latino, non-Hispanic white, Asian/Pacific Islander, and African American patients.
Lymphoma-specific survival was measured from the date of diagnosis to the date of death from lymphoma, whereas overall survival considered death from all causes. Patients who died from causes other than lymphoma were censored at the time of death in the analysis of lymphoma-specific survival. Patients alive at the study end date (12/31/2016) were censored at that date or at the date of last known contact. Median follow-up time was calculated using the reverse Kaplan-Meier method [39, 40].
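The reverse Kaplan-Meier idea can be sketched in a few lines: the roles of death and censoring are swapped, so that censoring becomes the "event" and death becomes the censoring mechanism, and the median of the resulting curve is taken as the median follow-up. The sketch below is illustrative only and is not the SAS procedure used in the study; the function name, the time unit, and the rule for reading off the median (the first time the curve reaches 0.5 or below) are assumptions.

```python
from fractions import Fraction


def reverse_km_median_followup(times, died):
    """Median follow-up time by the reverse Kaplan-Meier method.

    times: follow-up time for each patient (any consistent unit).
    died: True if the patient died at that time, False if censored.
    Returns the first time at which the reverse KM curve is <= 0.5,
    or None if the curve never drops that far.
    """
    records = sorted(zip(times, died))
    at_risk = len(records)
    surv = Fraction(1)  # exact arithmetic avoids rounding at the 0.5 boundary
    i = 0
    while i < len(records):
        t = records[i][0]
        n_tied = 0
        n_censored = 0  # censored observations are the "events" here
        while i < len(records) and records[i][0] == t:
            n_tied += 1
            if not records[i][1]:
                n_censored += 1
            i += 1
        if n_censored:
            surv *= Fraction(at_risk - n_censored, at_risk)
            if surv <= Fraction(1, 2):
                return t
        at_risk -= n_tied  # deaths leave the risk set without changing the curve
    return None


# Hypothetical follow-up times in months for six patients.
median_followup = reverse_km_median_followup(
    times=[2, 4, 6, 8, 10, 12],
    died=[False, True, False, False, True, False],
)
```

In this hypothetical example the reverse curve falls to 5/6 at month 2, 5/8 at month 6, and 5/12 at month 8, so the median follow-up is 8 months.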
For all regression analyses, the proportional hazard assumption was assessed using Schoenfeld residuals [41]. Variables that violated the proportional hazard assumption, including stage at diagnosis and modality of initial treatment, were included as stratification variables. Survival analyses were performed using SAS 9.4, and all statistical tests were two-sided; a P-value of less than 0.05 was considered statistically significant. This study was approved by the University of California, Davis Institutional Review Board.
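For reference, a stratified Cox model of the kind described in this section can be written in general form as below. The notation is introduced here for illustration and is not taken from the study: s indexes the strata formed by the stratification variables, e_mid and e_high are indicators for the mid and high exposure tertiles (using the low tertile as the reference category is an assumption), and z is the vector of remaining covariates.

```latex
h_{s}(t \mid e, \mathbf{z})
  = h_{0s}(t)\,
    \exp\!\left(
      \beta_{\text{mid}}\, e_{\text{mid}}
      + \beta_{\text{high}}\, e_{\text{high}}
      + \boldsymbol{\gamma}^{\top} \mathbf{z}
    \right),
\qquad
\text{HR}_{\text{high vs. low}} = \exp\!\left(\beta_{\text{high}}\right)
```

Each stratum has its own baseline hazard h_{0s}(t), so no hazard ratio is estimated for the stratification variables themselves, while the coefficients for exposure and the other covariates are shared across strata.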