SGS exhibited convergent validity through its clear association with objective measures of green neighborhood qualities, although the agreement was low. Concurrent validity was also demonstrated, but it should be stressed that the association between individual SGS and vitality could not be replicated when objective measures of the neighborhood were used. In contrast, area-aggregated SGS yielded associations consistent with the objective measures and may therefore be a useful approach to avoid bias due to confounded self-assessments.
Strengths and limitations of the study
To the best of our knowledge, this is one of the first studies assessing the validity of individual as well as area-aggregated self-reports of green neighborhood qualities in relation to objective assessments and concurrent health questions in a large and representative sample with detailed adjustment for socio-demographic variables. The multilevel (ecometric) analysis takes the varying sample size across areas into account and also facilitates adjustment for confounding from socio-demographic factors in the self-reports. Another strength of the study was the focus in SGS on perceived qualities of the green neighborhood. Our results suggest that perceived qualities are likely to be relevant for health and well-being in addition to simpler constructs such as perceived availability of a green open space or forest area. An attractive feature of the area-aggregated SGS is that it captures perception of the green environment while being more strongly correlated with objective measures and less susceptible to single-source bias than the individual self-reports.
An important limitation of the study was that we were not able to validate self-reports in the most urbanized inner city areas. The perception of green neighborhood qualities, and their relative importance for health, may very well be different in inner city areas. Furthermore, the most urbanized city areas are likely to accommodate large groups of individuals who could be more dependent on the (green) neighborhood environment they live in, e.g. people who spend a larger amount of their time at home and tenants, who often lack access to a garden of their own.
Another limitation was the cross-sectional study design that limited the ability to assess temporal associations, i.e. predictive validity of the SGS [34, 35]. The low agreement with the self-assessments of green qualities, and the relatively weak association between GIS index score and neighborhood satisfaction after including SGS in the model, may also raise concerns about the GIS-based assessments regarded as gold standard in our study. These objective assessments, developed by experts in landscape planning, show clear associations with neighborhood satisfaction and physical activity, but are not validated constructs. The main points of concern are that (i) the data sources reflect physical attributes, e.g. land use, whereas the definitions of the qualities were originally based on individual preferences, and (ii) the assessments may suffer from inaccuracies and lack of sufficient detail in the land cover classification.
In order to limit the number of analyses we restricted the use of area-aggregated measures to the index score (SGS). With equal weights for all five qualities in SGS, the inherent assumption is that the qualities are all equally important for the health indicators under investigation (i.e., more qualities are always better for health), but this assumption can of course be questioned. The green qualities are distinct entities, e.g. wild environments (plants seem self-sown, lichen and moss-grown rocks, old paths etc.; see Additional file 1: appendix 1) are clearly different from e.g. environments rich in culture (a historical place offering fascination with the course of time; historical sights and remains etc.). Identification of specific elements (aspects, qualities) in natural environments that promote human health is currently an issue of great interest within landscape planning and environmental health [16, 20, 27]. In the present paper, we calculated an area-aggregated proportion for each quality in a two-level model (individual and area) rather than in a three-level model (item, individual and area). Such a two-level model can be used in future studies to assess which attributes are most important for health and well-being. Our previous work on the qualities included in the SGS suggests that these qualities may not be equally associated with health indicators such as neighborhood satisfaction and physical activity.
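The equal-weight index score and a crude area aggregate can be sketched as follows. This is only an illustration under stated assumptions: the function names, the record layout, and the example data are hypothetical, and the plain area mean shown here is not the multilevel (ecometric) model used in the study, which additionally adjusts for socio-demographic factors and for varying sample size across areas.

```python
from collections import defaultdict


def sgs_score(items):
    """Equal-weight index: count the qualities reported as present.

    `items` holds one entry per quality: 1 (present), 0 (absent) or
    None (missing). Missing entries count as zero.
    """
    return sum(1 for value in items if value == 1)


def area_mean_sgs(records):
    """Crude, unadjusted area aggregate: mean individual score per area.

    `records` is an iterable of (area_id, items) pairs (hypothetical layout).
    """
    totals = defaultdict(lambda: [0, 0])  # area_id -> [sum of scores, count]
    for area_id, items in records:
        totals[area_id][0] += sgs_score(items)
        totals[area_id][1] += 1
    return {area: total / count for area, (total, count) in totals.items()}


# Hypothetical responses for three individuals in two areas.
records = [
    ("A", [1, 1, 0, None, 1]),
    ("A", [1, 0, 0, 0, 0]),
    ("B", [None, None, 1, 1, 1]),
]
area_scores = area_mean_sgs(records)
```

Replacing the count in `sgs_score` with a weighted sum would be one way to relax the equal-importance assumption, provided weights can be justified.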
One could argue about the choice to regard missing self-assessments as negative (counted as zero in SGS) and about the choice to use areas of 1,000 square meters. However, associations of the (aggregated) SGS with neighborhood satisfaction and vitality remained similar when individuals with missing assessments were excluded (results not presented). Furthermore, a sensitivity analysis showed that the 1,000 square meter assessments correlated strongly with the 500 (Spearman's rank correlation = 0.91; N = 24,480) and the 2,000 square meter assessments (Spearman's rank correlation = 0.92; N = 24,636; not in results). Estimates from the ordinal regression model also remained similar when we used self-assessments aggregated to 500 and 2,000 meter areas (results not shown). Still, the boundaries of our grids may not correspond with the boundaries that delimit the true collective (e.g. neighborhood) that influences individual health.
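For readers who wish to repeat this kind of grid-size comparison, Spearman's rank correlation can be computed as the Pearson correlation of average ranks. The sketch below uses only the Python standard library; the function names and the small example vectors are hypothetical and are not the study data.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical scores for the same five individuals at two grid sizes.
scores_grid_a = [0, 1, 1, 2, 3]
scores_grid_b = [0, 1, 2, 2, 4]
rho = spearman(scores_grid_a, scores_grid_b)
```

Average ranks are used because index scores with a narrow range produce many ties.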
The results in relation to other studies
The correlations between objectively assessed and self-reported green qualities may seem weak (Spearman's rank correlation range = 0.15-0.32) but are in line with correlations found in cross-sectional settings where exposure-response associations indeed are strong (e.g. correlation between GIS modeled residential road noise and self-reported annoyance, Spearman's rank correlation r = 0.20).
A recent report showed that the perceived environment correlated more strongly with adolescents' physical activity behavior than the objectively assessed environment. In our study, the association between the original (unstandardized) GIS-index score and neighborhood satisfaction was similar to the association with individual SGS, while the association with area-aggregated SGS was more pronounced. However, this can be explained by different scaling and spread in the assessments; a low spread tends to inflate the per-unit odds ratios, whereas a high spread tends to decrease them. When we used standardized measures to compare the three scores we indeed found a more pronounced effect of individual SGS and similar effects of area-aggregated SGS and the GIS-based index score. The effect of the GIS-based index score decreased markedly when area-aggregated SGS was included in the model, which could suggest that perceived attributes of the green environment are more important for neighborhood satisfaction. However, alternative explanations for this finding, such as differences in spatial resolution between the GIS-based index score (300 meters from the individual residences) and area-aggregated SGS (residences aggregated in 1,000 square meter areas), cannot be ruled out.
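The dependence of odds ratios on the spread of a score can be illustrated with a short calculation. In a logistic or ordinal logit model with a linear term for the score, the odds ratio per standard deviation equals the per-unit odds ratio raised to the power of the standard deviation. The sketch below assumes such a model; the function names and the numbers are hypothetical and are not estimates from the study.

```python
import math


def odds_ratio_per_sd(or_per_unit, sd):
    """Convert a per-unit odds ratio to a per-standard-deviation odds ratio."""
    return math.exp(math.log(or_per_unit) * sd)


def odds_ratio_per_unit(or_per_sd, sd):
    """Convert a per-standard-deviation odds ratio to a per-unit odds ratio."""
    return math.exp(math.log(or_per_sd) / sd)


# Two hypothetical scores with the same standardized effect (OR 1.5 per SD)
# but different spreads on their original scales.
narrow = odds_ratio_per_unit(1.5, sd=0.5)  # low spread: one unit spans 2 SD
wide = odds_ratio_per_unit(1.5, sd=2.0)    # high spread: one unit spans 0.5 SD
```

Here the low-spread score shows a per-unit odds ratio of 2.25 and the high-spread score about 1.22, although both have the same effect per standard deviation, which is why the comparison across scores is made on the standardized scale.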
Results for vitality did suggest confounding (single-source bias) since a clear association was present for individual SGS only. This bias was most likely caused by individual characteristics affecting self-reporting behavior which were not fully captured by the included socio-demographic factors. How self-reports aggregated to narrow area units might decrease bias from unmeasured determinants of self-reporting behavior has been demonstrated by simulations. Aggregated self-reports have been used as exposure measures in practice when monitoring or assessing health effects of e.g. neighborhood characteristics (e.g. resources for physical activity, safety, crime, dissatisfaction with green space, availability of parks) [20, 22, 25, 39], air pollution [40–42], traffic noise, and job strain.
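The mechanism can be made concrete with a toy simulation. The sketch below is not the simulation referred to above; it is a minimal illustration under strong assumptions that are entirely hypothetical: an unmeasured individual reporting tendency raises both the self-reported exposure and the self-reported outcome, the true area-level exposure has no effect on the outcome, and all components are standard normal.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate(n_areas=200, n_per_area=30, seed=1):
    """Return (individual-level, area-aggregated) correlation with the outcome."""
    rng = random.Random(seed)
    individual, aggregated, outcome = [], [], []
    for _ in range(n_areas):
        greenness = rng.gauss(0, 1)  # true area-level exposure
        reports, outcomes = [], []
        for _ in range(n_per_area):
            tendency = rng.gauss(0, 1)  # unmeasured reporting tendency
            reports.append(greenness + tendency + rng.gauss(0, 1))
            # Assumed: the outcome depends on the tendency, not on greenness.
            outcomes.append(tendency + rng.gauss(0, 1))
        area_mean = mean(reports)  # aggregated self-report for this area
        for report, value in zip(reports, outcomes):
            individual.append(report)
            aggregated.append(area_mean)
            outcome.append(value)
    return pearson(individual, outcome), pearson(aggregated, outcome)


r_individual, r_aggregated = simulate()
```

Under these assumptions the individual-level correlation is clearly positive even though there is no true effect, whereas the area-aggregated correlation is close to zero, because each individual's reporting tendency contributes only a small share to the area mean.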
Agreement between perceived and objectively assessed availability of the individual green neighborhood qualities was low and comparable to previous studies [22, 45]. However, our study looked at specificity and sensitivity as separate measures of agreement. The sensitivity of the self-reports was generally satisfactory whereas the specificity was low, implying that the perceived availability of green neighborhood qualities within 5-10 minutes walking distance was considerably higher than objectively assessed availability within 300 meters from the residence. One explanation for the low agreement could be that what is perceived as "5-10 minutes walking distance" may vary extensively among study subjects. However, changing distance from 300 to 100 or 500 meters in the GIS-based assessments did not increase agreement noticeably. Another explanation for the low agreement could be that the definitions used for the GIS-assessments were more extensive, and consequently more restrictive, than the phrasings used in the survey questions.
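Sensitivity and specificity, with the GIS-based assessment treated as the reference, can be computed as in the following sketch. The function name and the example data are hypothetical; the example is constructed only to show a pattern of acceptable sensitivity combined with low specificity.

```python
def sensitivity_specificity(self_reported, gis_assessed):
    """Return (sensitivity, specificity) of binary self-reports.

    Both arguments are equally long sequences of 0/1 values; the GIS-based
    assessment is treated as the reference standard.
    """
    pairs = list(zip(self_reported, gis_assessed))
    true_pos = sum(1 for s, g in pairs if s and g)
    false_neg = sum(1 for s, g in pairs if not s and g)
    true_neg = sum(1 for s, g in pairs if not s and not g)
    false_pos = sum(1 for s, g in pairs if s and not g)
    # Guard against an empty reference class.
    sensitivity = (true_pos / (true_pos + false_neg)
                   if true_pos + false_neg else float("nan"))
    specificity = (true_neg / (true_neg + false_pos)
                   if true_neg + false_pos else float("nan"))
    return sensitivity, specificity


# Hypothetical data for ten individuals and one green quality.
gis = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
reported = [1, 1, 1, 0, 1, 1, 1, 1, 0, 0]
sens, spec = sensitivity_specificity(reported, gis)
```

In this example three of the four qualities present according to GIS are also reported (sensitivity 0.75), but the quality is reported in four of the six cases where GIS finds it absent (specificity 0.33), i.e. perceived availability exceeds objectively assessed availability.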
Socio-demographic factors were associated with the number of perceived green qualities in the neighborhood, which might also contribute to the low agreement. Such associations have also been demonstrated for self-reports of neighborhood attractiveness and safety and other environmental factors. Negative perceptions could be related to factors found to be more prevalent in groups with low compared to high socio-economic status, i.e. low social capital, poor health and a more pessimistic world view, but could also be due to the possibility that objective measures do not necessarily capture all environmental attributes that participants take into account in their perceptions. Compared to the individual self-reports, the area-aggregations were more strongly correlated with the GIS-based assessments, indicating a lower amount of confounding and/or random misclassification error.
Implications for further research
Assessing the green qualities on an ordinal rather than binary scale using GIS, also in urban areas, would facilitate a more detailed validation of the self-reported items and would provide opportunities for index scores with wider ranges. Qualities of neighborhood green space in relation to health outcomes merit further investigation in longitudinal settings.