Study population
This study was conducted with data from MESA and its ancillary studies. MESA is a prospective cohort assembled initially to understand the relationship between subclinical cardiovascular disease markers and the incidence of cardiovascular disease. At baseline, the MESA cohort comprised 6814 non-institutionalized adults who were cardiovascular disease-free, aged 45-84 years, who self-identified as white, black, Hispanic, or Chinese from 6 study sites: New York, NY, Baltimore, MD, Forsyth County, NC, Chicago, IL, St. Paul, MN, and Los Angeles, CA [17]. In this analysis, we further restricted the analytic sample to the 5574 participants who were diabetes free at baseline. The study included follow-up phone questionnaires every 9-12 months. The baseline exam occurred from July 2000 through August 2002, and five follow-up in-person exams occurred an average of 1.6, 3.2, 4.8, 9.5, and 15.8 years later. The cohort included 5413; 5184; 5080; 4155; and 2962 participants without diabetes at each of the follow-up exams, respectively. Informed consent was obtained from all participants, and the study was approved by the institutional review boards at each study site.
Outcome
Type 2 diabetes status was ascertained at each follow-up clinical visit (exam) for study participants, and all participants with prevalent type 2 diabetes at baseline were excluded from this analysis. Using criteria from the American Diabetes Association [18], incident cases were defined as having a fasting glucose level of at least 126 mg/dL at the exam, use of insulin, or use of hypoglycemic medication. This study considered all incident cases as type 2 diabetes; however, these cases may include patients who developed Latent Autoimmune Diabetes in Adults (LADA). Additionally, because participants were evaluated for diabetes at exam intervals, the diabetes diagnosis dates were subject to interval censoring. Due to the uncertainty of the exact onset of diabetes, we assigned the date of diabetes diagnosis as the midpoint between the prior exam and the exam where the diagnosis was given.
Exposure
Greenspace was characterized using the Normalized Difference Vegetation Index (NDVI), updated at each exam to match changes in residence. NDVI is a commonly used measure of greenspace derived from satellite imagery [19,20,21]. For our analysis, NDVI from 2006 satellite imagery with a 250 m resolution was considered on a scale from 0 to 255-pixel brightness [22]. On this scale, areas with dense vegetation have values around 200, and non-vegetative surfaces such as roads and buildings have values less than 50.
We obtained participant residence information from the MESA Neighborhood Ancillary study. Home addresses were obtained at baseline and during all follow-up phone calls during the study period. At each participant residence, we characterized greenspace within a varying radius (buffer) to capture greenspace in the participant’s residential neighborhood. For our primary analysis, we used the annual median NDVI within a 1 km radius as our a priori exposure, based on prior research indicating that a 1 km radial buffer had the highest correlation with self-described neighborhood for the corresponding spatial resolution of the NDVI data [21]. NDVI was linked to the address reported at each exam to account for differences in exposure across the study period due to a change in residential address. In sensitivity analyses, we examined annual median NDVI within 500 m and 2.5 km, as well as high vegetation season (April 1 – September 30) median NDVI within a 1 km radius. These two time periods were used because NDVI can vary dramatically based on season.
Covariates
MESA ascertained important potential confounders through a combination of interviews, exams, and census data [17]. We included several individual sociodemographic characteristics: sex, age, race and ethnicity (non-Hispanic white, non-Hispanic black, Hispanic/Latinx and Chinese), education level (less than or equal to high school, some college, bachelor’s degree or higher), employment status (working outside the home or not) and income level. Participant income was calculated from the midpoint of total annual household income specified in categories divided by number of household members (<$12,000; $12,000-24,999; $25,000-39,999, $40,000-74,999, >$75,000). We also included several neighborhood-level characteristics to control for neighborhood-level confounding: neighborhood deprivation index (NDI) [3, 4], social cohesion [23,24,25], safety [26,27,28], walkability score [3, 6], and proportion urban all measured at the census-tract level. NDI is comprised of several variables from U.S. Census and American Community Survey data representing characteristics of education, occupation and employment, household income and wealth, poverty, and housing, and was assessed at exams 1, 3, and 4 [29,30,31,32]. Higher index values indicate greater neighborhood deprivation (i.e., lower neighborhood SES) [32]. Social cohesion, safety, and walkability scores were weighted averages by census tract based on survey responses from MESA participants at baseline and a random sample of people living in the same census tracts as MESA participants in 2004, as in prior MESA research [33]. Social cohesion was measured with a four-item scale including items about willingness to help neighbors, getting along with neighbors, neighborhood trust, and sharing values with neighbors. Safety was measured using a two-item scale about feeling safe in one’s neighborhood and violence in the neighborhood. Walkability was measured using a four-item scale about the perceived ease of walking to places in the neighborhood, seeing others walking or exercising, and pleasantness of walking in the neighborhood. Proportion urban measures the percentage of the census tract population living in an urban area, measured using U.S. Census data [30]. Finally, our primary model is additionally adjusted for study site due to the differing population characteristics and NDVI by site.
We additionally adjusted for type 2 diabetes risk factors including family history of diabetes, chronic stress, alcohol consumption, smoking status, body mass index (BMI), and physical activity score. Family history of diabetes was coded dichotomously from responses to a family medical history questionnaire answered at exam 2. Chronic stress was determined from the Chronic Burden scale, a six-item questionnaire given at exams 1 and 3 [34]. Participants were categorized as chronically stressed if they had indicated 6 or more months of strain or difficulties from a relationship, job, or finances, or 6 or more months of medical-related issues for themselves or someone close to them. As in prior MESA research [35], alcohol consumption was categorized as heavy, moderate, and not current drinker and smoking status was categorized as current, former, and never using data available at exams 1-6. BMI was assessed at exams 1-6. Physical activity score was assessed at exams 1-3 and 5, and is the sum of a reverse measure of sedentary behavior, in quartiles (0 = most sedentary, 3 = least sedentary) and a measure of moderate-vigorous physical activity (MVPA), in quartiles (1 = least MVPA, 4 = most MVPA), resulting in a physical activity score from 1 to 7 [36]. Additionally, several of the model covariates varied over each of the 6 exams in order to more accurately capture and control for possible confounding factors. These include: chronic stress, income, employment status, alcohol consumption, smoking status, BMI, physical activity score, and NDI. For all time-varying covariates, data were carried forward if they were missing from future follow-up exams.
Statistical approach
We conducted a survival analysis examining the association between greenspace exposure and incident diabetes in the MESA cohort. We used Cox proportional hazards regression with time-varying covariates with calendar time as the time scale to estimate the hazard ratio (HR) of incident diabetes risk for an interquartile (IQR) increase in NDVI [35, 37]. Participants were considered in the risk set until diagnosis of diabetes, last follow-up visit, or at the end of the study period (exam 6), whichever occurred first. In a sensitivity analysis, we re-ran the models to estimate the HR for a 0.1 increase in NDVI, on a − 1 to 1 scale to compare with results in the literature [5, 38].
We determined confounders a priori, and they were added in stages. Model 1 adjusted for age, sex, race/ethnicity, study site, education level, income, employment status, neighborhood deprivation index, neighborhood social cohesion, walkability, safety, and percent urban. The primary model (Model 2) adjusted for the covariates in Model 1 as well as family history of diabetes, BMI, physical activity score, chronic stress, drinking, and smoking. We ran an additional model (Model 3) excluding all potential mediators (BMI, physical activity score, walkability, safety, neighborhood social cohesion) to assess any suggestion of mediation by these factors.
We hypothesized a priori that the association between NDVI and incident diabetes may vary importantly by site, based on the skewed nature of NDVI by site, varying urbanicity, and differing population characteristics in each site. To assess the adequacy of the adjustment for study site using a fixed effect, we conducted a sensitivity analysis using a stratified Cox model with stratification by MESA study site. A stratified Cox model allows the baseline hazard to differ by MESA study site.
We evaluated the assumption of proportional hazards by plotting and testing Schoenfeld residuals and assessing the trend over time. In sensitivity analyses, we examined additional NDVI exposures (500 m and 2.5 km buffers, and high vegetation season NDVI), and effect modification by urbanicity and by neighborhood deprivation index based on prior work showing evidence that the association between greenspace and diabetes is different for urban versus rural areas [14], and by measures of neighborhood socioeconomic status [7, 12, 14]. We also examined effect modification by sex to compare results with prior studies [11, 16].