Low dose radiation risks for women surviving the abombs in Japan: generalized additive model
 Greg Dropkin^{1}View ORCID ID profile
DOI: 10.1186/s1294001601913
© The Author(s) 2016
Received: 5 July 2016
Accepted: 26 October 2016
Published: 24 November 2016
Abstract
Background
Analyses of cancer mortality and incidence in Japanese Abomb survivors have been used to estimate radiation risks, which are generally higher for women. Relative Risk (RR) is usually modelled as a linear function of dose. Extrapolation from data including high doses predicts small risks at low doses. Generalized Additive Models (GAMs) are flexible methods for modelling nonlinear behaviour.
Methods
GAMs are applied to cancer incidence in female low dose subcohorts, using anonymous public data for the 1958 – 1998 Life Span Study, to test for linearity, explore interactions, adjust for the skewed dose distribution, examine significance below 100 mGy, and estimate risks at 10 mGy.
Results
For all solid cancer incidence, RR estimated from 0 – 100 mGy and 0 – 20 mGy subcohorts is significantly raised. The response tapers above 150 mGy. At low doses, RR increases with ageatexposure and decreases with timesinceexposure, the preferred covariate. Using the empirical cumulative distribution of dose improves model fit, and capacity to detect nonlinear responses. RR is elevated over wide ranges of covariate values. Results are stable under simulation, or when removing exceptional data cells, or adjusting neutron RBE. Estimates of Excess RR at 10 mGy using the cumulative dose distribution are 10 – 45 times higher than extrapolations from a linear model fitted to the full cohort. Below 100 mGy, quasipoisson models find significant effects for all solid, squamous, uterus, corpus, and thyroid cancers, and for respiratory cancers when ageatexposure > 35 yrs. Results for the thyroid are compatible with studies of children treated for tinea capitis, and Chernobyl survivors. Results for the uterus are compatible with studies of UK nuclear workers and the Techa River cohort.
Conclusion
Nonlinear models find large, significant cancer risks for Japanese women exposed to low dose radiation from the atomic bombings. The risks should be reflected in protection standards.
Keywords
Low dose radiation Cancer Women Thyroid Uterus Abomb GAMBackground
Analyses of cancer mortality and incidence in Japanese Abomb survivors continue to play a key role in establishing radiation risk estimates. The 1958–1998 incidence cohort was analysed by Preston and coworkers [1], whose main model assumes a linear dose response. Their study found some evidence of radiation risk at doses below 150 mGy, and higher risks for women. Previously Pierce and Preston [2] found a significant dose response for all solid cancer incidence in the 0 – 100 mGy range. However, according to the EPA Blue Book [3] “it has not been possible to demonstrate and quantify risk in the Lifespan Study (of Abomb survivors) at doses below about 100 mGy”. The ICRP [4] recommended annual dose limit for occupational exposures is 20 mSv (whole body dose).
Some cancer studies find nonlinear responses to radiation [5–8] and there are some significant results at low doses [9–16]. Low dose radiation risk was reviewed in 2003 [17] and remains a focus of research [18].
I consider cancer incidence in female Japanese Abomb survivors in the dose ranges 0 – 500 mGy, 0 – 100 mGy, and 0 – 20 mGy, using the grouped anonymous (public) incidence data for the 1958 – 1998 cohort.
Attained age is the single most powerful predictor of cancer incidence. Models can be refined using birth cohort, ageatexposure or timesinceexposure, city, urban/rural residence, and dose. The Not In City (NIC) group of Hiroshima and Nagasaki residents who were out of the area at the time of bombing can be included.
The analysis uses the Generalized Additive Model [19] implemented in R [20] through the programme mgcv [21]. GAM is a flexible technique for fitting nonlinear responses and covariate interactions. Models are investigated for all solid cancers, and then applied to the most common anatomical or histological groups and specific sites within the female incidence data: digestive, genital, respiratory, adenocarcinoma, other epithelial, squamous, stomach, colon, liver, pancreas, gallbladder, rectum, uterine, cervical, uterine corpus, ovarian, lung, breast, thyroid, and central nervous system.
The analysis aims to test for linearity of the dose response, explore the interaction of dose with attained age, ageatexposure and timesinceexposure, adjust for the highly skewed distribution of dose, test whether the response is significant below 100 mGy, and estimate the risks at 10 mGy.
Methods
The public dataset LSSinc07 [22] was downloaded from the Radiation Effects Research Foundation (RERF) [23]. Records with unknown dose were excluded, leaving 25,570 cells with grouped data on 17,448 solid cancers among 105,427 survivors with 2,764,735 personyears of followup. The covariate py denotes personyears observation in each cell. logage is the log of age, the attainedage covariate which RERF obtained by weighting each individual contribution by its py within the cell. Likewise agex is the py weighted mean ageatexposure, since is the difference between year, the py weighted mean calendar time, and the exposure date (depending on city), and birth is defined as year  age. The dataset also contains sex, city, and distcat, the latter taking the values 1 (urban), 2 (rural), and 3 (Not in City). The grouped data is stratified by these (and other) categories. The dataset is then restricted by sex = 2, and by distcat < 3 if the NIC group is excluded; city and distcat are converted to factor variables fc and fd. For convenience, covariate names are introduced in bold, but then used in normal font.
When modelling all solid cancers, the instantaneous flash dose is the py weighted mean colon dose in mGy, defined as coldose = 1000*cola02w10 using the DS02 dosimetry and a Relative Biological Effectiveness (for neutrons) of 10, unless otherwise specified. The dataset is then restricted by coldose < 100 (or < 20 or < 500). For convenience A^{+}, B^{+} and C^{+} refer to 0 – 20 mGy, 0 – 100 mGy and 0 – 500 mGy subcohorts including or excluding NIC. Weighting each coldose by the py of its data cell, the mean coldose in A^{+}, B^{+} and C^{+} is 2.43, 3.70, 9.84, 13.85, 36.44, 48.93 mGy respectively.
Case numbers by site and dose range
Site  A^{}  A^{+}  B^{}  B^{+}  C^{}  C^{+}  Mid age  Mid agex  Mid since 

Solid  4239  6271  5474  7506  6712  8744  68.4  33.1  36.9 
Adeno  2477  3722  3188  4433  3928  5173  67.0  28.3  38.6 
Othepi  1016  1460  1310  1754  1583  2027  73.3  38.4  32.7 
Squam  550  823  718  991  881  1154  63.5  33.2  28.6 
Digest  2206  3263  2825  3882  3418  4475  71.7  33.5  38.6 
Genital  660  1008  871  1219  1035  1383  61.8  31.6  28.6 
Resp  324  487  441  604  558  721  73.2  36.5  38.6 
Stom  997  1459  1285  1747  1564  2026  71.6  36.6  33.6 
Colon  373  578  471  676  570  775  71.7  29.3  44.1 
Liver  287  416  356  485  433  562  71.7  32.6  41.4 
Gall  181  266  234  319  273  358  73.2  36.8  38.6 
Panc  134  192  172  230  208  266  73.4  36.6  38.6 
Rect  183  283  245  345  291  391  68.5  31.6  41.3 
Uterus  519  806  689  976  821  1108  61.7  31.5  28.6 
Cervix  384  598  509  723  603  817  61.8  31.7  28.5 
Corpus  84  132  108  156  126  174  61.1  23.1  38.5 
Ovary  115  170  143  198  171  226  63.5  31.6  33.6 
Breast  441  667  554  780  713  939  59.1  22.4  38.6 
Lung  290  437  396  543  503  650  73.3  36.4  38.6 
Thyroid  158  219  203  264  271  332  66.7  28.2  31.9 
Cns  93  117  125  149  151  175  66.2  26.7  37.0 
Modelling seeks to predict the cancer risk, i.e. the expected value of obs/py, as a function of covariates and their interactions. Each data cell is regarded as an independent source of Poisson distributed incident cases, and penalised regression with cubic splines, using a log link, is used to estimate the expected value of obs. Briefly (see [21]):
obs_{j} ∼ Pois(μ _{j}) where j indexes the data cells
log(μ _{j}) = \(\sum \) _{i} β _{i}f_{i}(x_{j}) where i indexes the model components f and coefficients β; f_{i} is a function of covariates x, whose form is specified in the model but whose value varies with x_{j} in cell j; and β _{i} is a numerical coefficient of f_{i}, independent of j, to be estimated during model fitting.
The f_{i} can include offsets, whose β _{i} is fixed at 1; factors; parametric terms; splines derived from the covariates; and tensor products of such splines.
A cubic regression spline with components f_{0}, f_{1}…f_{k1} is obtained from covariate x by placing k knots, for example at equally spaced quantiles of x, forming the natural cubic splines with value 1 at the specified knot and 0 at the other knots, using a centering constraint to achieve \(\sum \) _{j}f_{i}(x_{j}) = 0 when i > 0, and setting f_{0} = 1.
During model fitting, the β _{i} are estimated by minimising the penalised deviance, Dev + β ^{T}S β, where the positive semidefinite penalty matrix S is a direct sum of blocks generated from the splines, each block incorporating a smoothing parameter λ _{b}. For tensor product splines, there are separate smoothing parameters for each margin. Dev is the Deviance, given in the Poisson case (omitting subscripts j) by
2\(\sum \)[obs ·log(μ) μ  I_{obs} ·(obs ·log(obs)obs)] where I_{obs} = 1 if obs > 0 and 0 otherwise
Minimising the penalised deviance is a compromise between approximating the data (lowering Dev), and the “wiggliness” of the fitted model as measured by the penalty. Higher values of the smoothing parameter(s) impose stronger penalties, making the fitted model smoother with lower “effective degrees of freedom” (a penalised version of the number of independent parameters), but allowing it to stray further from the individual data points. As λ _{b} →∞, the model becomes loglinear (only the linear components of the spline terms are unpenalised by S).
Model fitting involves optimising the smoothing parameters affecting S, through a criterion such as Generalized Cross Validation, Restricted Maximum Likelihood, or Marginal Likelihood (see [21, 24] for details). After fitting, inference is based on the Bayesian posterior covariance of β.
With the fitted \(\hat {\beta }_{i}\), the expected value of obs_{j} is μ _{ j }=\(\exp (\sum \) _{i} \(\hat {\beta }\) _{i}f_{i}(x_{j})). The fitted model also predicts the expected value μ _{j,0} if the dose covariate is reset to 0 while preserving all other covariates and the \(\hat {\beta }\) _{i}. In each cell, baseline risk is estimated internally from the fitted model as μ _{j,0} / py_{j} and Relative Risk is RR_{j} = μ _{j} / μ _{j,0}.
The reverse arrow < defines the model by the subsequent expression, with gam denoting a generalized additive model. Here, obs is regarded as a sample from a Poisson distribution whose mean, varying across the data cells, is specified with a (default) log link. The offset term, with coefficient fixed at 1, ensures that the fitted value of obs/py is independent of py. The factor variables and their first order interaction are included as (fc+fd)ˆ2. The smooth term ti(logage) refers to an unknown linear combination of splines derived from logage, and the tensor product term ti(logage, birth) allows for interaction of logage with birth. In each smooth, the basis dimension is set at 10. By default, “cubic regression splines” are used with the ti terms. The smoothing parameter optimisation method indicated by "ML" is Marginal Likelihood. For details see [21] and ?gam and ?ti (within mgcv). Setting family = "quasipoisson" allows a meanvariance relation V= μ ϕ where V is the variance about mean μ and ϕ>0 is a scale factor. For the Poisson family, ϕ = 1, but any positive scale can be set, or estimated as below.
The dose covariate is highly skewed and, in the grouped data, multimodal. Over 70% of py in the full data has dose < 20 mGy, or over 60% if NIC cells are excluded. Likewise over 84% has dose < 100 mGy, or 79% if NIC cells are excluded. The grouped data cells are also stratified by dose category, an analytical decision embodied in the anonymous public data. For example, each cell in A^{} or A^{+} will contain records with dose either below 5 mGy or between 5 and 20 mGy.
Model definitions
Model  Family  ϕ  Model  Family  ϕ  Terms 

P0  poisson  1  Q0  quasipoisson  >0  offset(log(py))+(fd+fc)ˆ2+ti(logage) +ti(birth)+ti(logage,birth) 
P1e  poisson  1  Q1e  quasipoisson  >0  .+ecdos 
P2e  poisson  1  Q2e  quasipoisson  >0  .+ti(ecdos) 
P3e  poisson  1  Q3e  quasipoisson  >0  .+ti(ecdos)+ti(ecdos,logage) 
P4ae  poisson  1  Q4ae  quasipoisson  >0  .+ti(ecdos)+ti(ecdos,agex) 
P4se  poisson  1  Q4se  quasipoisson  >0  .+ti(ecdos)+ti(ecdos,since) 
P5ae  poisson  1  Q5ae  quasipoisson  >0  .+ti(ecdos)+ti(ecdos,logage) +ti(agex)+ti(ecdos,agex) 
P5se  poisson  1  Q5se  quasipoisson  >0  .+ti(ecdos)+ti(ecdos,logage) +ti(since)+ti(ecdos,since) 
P1d  poisson  1  Q1d  quasipoisson  >0  .+dose 
P2d  poisson  1  Q2d  quasipoisson  >0  .+ti(dose) 
P3d  poisson  1  Q3d  quasipoisson  >0  .+ti(dose)+ti(dose,logage) 
P4ad  poisson  1  Q4ad  quasipoisson  >0  .+ti(dose)+ti(dose,agex) 
P4sd  poisson  1  Q4sd  quasipoisson  >0  .+ti(dose)+ti(dose,since) 
P5ad  poisson  1  Q5ad  quasipoisson  >0  .+ti(dose)+ti(dose,logage) +ti(agex)+ti(dose,agex) 
P5sd  poisson  1  Q5sd  quasipoisson  >0  .+ti(dose)+ti(dose,logage) +ti(since)+ti(dose,since) 
The nested chains P0 ⊂ P2e ⊂ P3e ⊂ P5ae, P0 ⊂ P2e ⊂ P4ae ⊂ P5ae, P0 ⊂ P2e ⊂ P3e ⊂ P5se, and P0 ⊂ P2e ⊂ P4se ⊂ P5se give models of increasing complexity using ecdos with agex or since respectively. Corresponding nested chains P0 ⊂ P2d etc. are defined using dose instead of ecdos. P2 refers to the pair P2e and P2d, etc.
The scale for a quasipoisson model M is estimated iteratively via Pearson \(\chi ^{2} = \sum (obsfit)^{2}/fit\), where fit is the fitted value, by seeking \(\hat {\phi } \simeq \chi ^{2}/dfres\) where dfres is the residual degrees of freedom (# of records – effective degrees of freedom).
Additional file 1: Appendix A describes methods for Relative Risk RR, confidence intervals based on the Bayesian posterior distribution of β, dose response curves, and model selection. Simulation tests used to choose the preferred method of smoothing parameter optimisation and to validate the models are outlined in Additional file 1: Appendix B.
Numerical results are given at 2.5 and 10 mGy in A^{+}, 2.5, 10, 20, 40, 60, 80 mGy in B^{+}, and at 2.5, 10, 100, 200, 300, and 400 mGy in C^{+}. Confidence intervals are 90% (lower and upper 95% confidence limits LCL95% and UCL95%) unless otherwise specified, and “borderline” denotes 0.98 ≤ LCL95% ≤ 1.
Initially, no minimum value of the smoothing parameters was imposed. However, if the visual output was “too wiggly”, indicating overfitting, min.sp was increased until the output appeared fairly smooth, a subjective judgement.
In each data cell, a fitted model gives predicted values μ, with \(\sum {\mu } = \sum {obs}\), and predicts values μ _{0} if dose had been reset to 0, or ecdos reset to ecdf(dose)(0). The attributable fraction \(1  \sum {\mu _{0}}/\sum {\mu }\) is the predicted number of additional cases above the baseline, as a proportion of those observed. \(\sum {\mu }/\sum {\mu _{0}}\) estimates the Standardized Incidence Ratio from the fitted model.
The age distribution of cases varies between cancer sites. Define the “mid” age (and agex, since) for a site and dose range as that below which half the cases occur. Table 1 includes the mid values by sites in B^{+}; values for other dose ranges are similar. Results are reported at the mid age rounded to the nearest 5 years, and where appropriate at the rounded mid agex or since.
For a fitted model M, set I_{M} = 1 on any data cell for which LCL95% > 1, and 0 otherwise. The proportion of obs (or py) for which I_{M} = 1 measures the extent to which RR > 1 is significant across the dose range. Since I_{M} = 0 if dose = 0, the proportions are obtained for D_{0}, those cells with dose > 0, or for D_{0.5} (dose > 0.5 mGy). Corresponding proportions are obtained on S_{0.5} (since < 35 and dose > 0.5 mGy), or T_{0.5} (agex > 35 and dose > 0.5 mGy), or S_{0.5} ∩ T_{0.5}. Likewise, the indicator J_{M} = 1 on any data cell for which UCL95% < 1, and 0 otherwise. The proportion of obs (or py) for which J_{M} = 1 measures the extent to which RR < 1 is significant across the dose range.
Selected models for all solid cancers were refitted after excluding exceptional data points as detected by Cook’s Distance, or with thin plate regression splines, or for urban residents, or with RBE as obtained in [25].
For a fitted model M, influence(M) is a vector of leverage at each cell, and plotting influence(M1)/influence(M2) against dose compares model sensitivities in different dose regions. The ability of ecdos and dose models to detect hypothetical doseresponses was compared by simulation (Appendix B). Further simulations considered the uncertainty in dose.
These intensive computations were applied only for all solid cancers. Elsewhere, the Bayesian posterior CIs (Appendix A) are reported without further validation. Some codes with commentary are shown as additional files.
Results
P5ae, P5se, P5ad and P5sd were crossvalidated (Appendix B) in A^{+}, to test the proportion of deviance explained and the estimated RR _{10 mGy} with 90% CIs, see Additional file 1: Table S1.
Additional file 1: Figure S1 shows the dose response in B^{+}. All plots show RR significantly raised, although the response is lower with the dose models. Plots with REML or GCV.Cp optimisation methods are similar. For the ecdos models, the appearance of the response reflects the relation between ecdos and dose (see Fig. 1); corresponding curves of RR against ecdos are regular.
Coverage by P2e and P2d of true dose response
Name  ERR  β  σ  τ  γ  Form  Mean (lhe)  Mean (lhd) 

N1  g(d,β,σ,τ)  0  0  null  95.18%  94.90%  
L1  g(d,β,σ,τ)  0.002  0  linear  81.46%  94.56%  
H1  f(d,β,σ,τ)  0.004  0.2  0.05  hormesis  75.08%  69.86%  
P1  f(d,β,σ,τ)  0  0.2  0.05  plateau  93.36%  69.68%  
C1  f(d,β,σ,τ)  0.0012  0.08  0.5  two phase  90.88%  50.08%  
C2  f(d,β,σ,τ)  0.015  0.75  0.2  two phase  94.80%  13.98%  
R1  g(d,β,σ,τ)  0.002  0.01  0.08  two phase  93.64%  68.14%  
R2  g(d,β,σ,τ)  0.002  0.01  0.04  two phase  93.98%  66.10%  
T1  h(d,γ)  50  threshold  88.40%  87.66%  
T2  h(d,γ)  25  threshold  86.88%  86.90% 
The increased sensitivity of ecdos models at low doses is seen in Additional file 1: Figure S2, plotting the ratio of influence with P5sd to that with P5se, against dose. In B^{+} and C^{+} low doses have less influence on P5sd, reducing its sensitivity to their effects.
Splines constructed from ecdos have better variability than dose splines at very low doses, lowering the internal baseline estimates. Additional file 1: Figure S3 compares splines from P2d, P2e, and P2g constructed from gdos, a smooth approximation to ecdos, when modelling over B^{+}. Although the P2g response curve is simpler, the parameters of gdos vary with dose range and its distribution is multimodal, whereas ecdos has a universal definition and an approximately uniform distribution.
Poisson model selection methods (Appendix A) were applied to choose preferred models. P5se is preferred in all six dose ranges by method 1 (lowest ML score). The three methods chose models with ecdos in 17 of 18 cases. Additional file 1: Table S2 shows the preferred Poisson models for each range and selection method. As appropriate, RR is shown at age 70, agex 35, or since 35. All preferred models show RR elevated at 2.5, 10, 20, 40, 60, 80, 100, 200, 300, and 400 mGy, with LCL95% > 1, except for method 2 in B^{+} where P3d is preferred, with LCL95% ≥ 1. In all cases Deviance < dfres. The proportion of Deviance explained varies from 46.1% in C^{} to 68.1% in A^{+}. The attributable fraction varies from 1.4% (method 2 in B^{+}) to 9.4% in A^{}.
Additional file 1: Table S2 also shows the percentage of cases and py in D_{0} for which RR > 1, LCL95% > 1, or UCL95% < 1; likewise for cells in D_{0.5}, S_{0.5}, T_{0.5}, or S_{0.5} ∩ T_{0.5} (see “Methods”). For each of these preferred models, cell and dose ranges, RR > 1 holds for over 80% of cases and py, and UCL95% < 1 never occurs. Except with method 2 in B^{+}, LCL95% > 1 holds for over 70% of cases in T_{0.5}, over 75% of cases in S_{0.5}, and over 90% of cases in S_{0.5} ∩ T_{0.5}.
The dose models themselves give nonlinear responses over C^{+} (see Fig. 3). With P2d, RR rises approximately linearly below 150 mGy, before tapering. Anova comparison with P1d, in which dose appears as a loglinear unsmoothed term, gives p = 0.015. Below 100 mGy, the Excess Relative Risk (ERR = RR  1) is about 70% higher with P2d than with P1d. With P5sd RR tapers above 150 mGy as with P2d. The term ti(since) has e.d.f. = 3.658 and p = 0.037, and anova(P3d, P5sd) = 0.008 while anova(P4sd, P5sd) = 0.07. Thus P5sd is decidedly nonlinear in dose, whose interaction with logage is borderline, but since has significant impact.
The dose models also show RR raised in A^{+}. For P1d and P2d, in both ranges RR _{10 mGy} = 1.09 (0.99, 1.19). P4sd is preferred amongst dose models by methods 2 and 3. At since 35, in A^{} RR _{10 mGy} = 1.11 (1.00, 1.22) with significant elevation when since < 35. In A^{+} RR _{10 mGy} = 1.09 (0.99, 1.19) with significant elevation when since < 32. In A^{+} P5sd is preferred amongst dose models by method 1. At age 70, since 35 in A^{} RR _{10 mGy} = 1.11 (1.00, 1.23) whilst in A^{+} RR _{10 mGy} = 1.11 (1.01, 1.22) and RR is significantly raised for 37.2% (28.2%) of cases (py) in D_{0.5}, 75.0% (42.3%) in S_{0.5}, and 68.4% (79.6%) in T_{0.5}.
In general, point estimates and CIs are primarily affected by the dose range (less by the inclusion of NIC) and the choice of dose or ecdos.
The stability of these results was tested, first by using thin plate regression splines. RR and CIs were similar to those shown in Additional file 1: Table S2, but cubic regression splines gave lower ML scores. If exceptional cells detected visually by Cook’s Distance (CD) are deleted, RR values and 90% CIs were slightly altered from those in Additional file 1: Table S2, but still significantly raised (borderline for B^{+} method 2). Deletion affected the choice of preferred models only for C^{}, where P5ae was preferred by all methods, and C^{+}, where P5se was preferred by all methods. When preferred models were refitted after excluding cells with the highest 5% of CD values, RR was raised with at least borderline significance across each dose range.
The combined impact of sampling variability and lognormal dosimetry errors was simulated (see Appendix B) using P5se and P5sd in A^{}, shown in Additional file 1: Figure S4. Geometric mean fitted values and 90% CIs are below the values derived from the original data, but still raised with borderline significance; while bootstrapt [26] and stretched CIs derived from the simulated data remain significantly raised.
Models were fitted with RBE as the function of neutron dose D_{n} shown in [25] (Fig. 4 and Eq. (2)). This raises the dose assigned, as RBE > 50 in the ranges considered here (for which D_{n} < 0.003 Gy). Results are shown in Additional file 1: Table S3. In all dose ranges and with all three methods of Poisson selection, RR is elevated with at least borderline significance. In 15 out of 18 cases, and for 6 of 6 with method 1, the same models are selected, and RR and CIs are slightly lower but still comparable to those in Additional file 1: Table S2.
Excluding genital or, separately, respiratory cancers, the remaining solid cancers were modelled over A^{+}, B^{+}, and C^{+}, with results in Additional file 1: Table S4. RR estimates are mainly similar to those in Additional file 1: Table S2. The heightened significance when agex > 35 or since < 35 generally survives these exclusions.
For urban residents in A, B, or C, the preferred Poisson models show RR elevated with at least borderline significance, and a tapering response at higher doses (see Additional file 1: Table S5). With method 3, RR is significantly elevated in each range.
In each dose range, the factor variables fc and fd have significant interaction, lowering the fitted values for rural Nagasaki residents. Modelling separately by city (Additional file 1: Table S6), for Hiroshima, models with since are preferred for every dose range and selection method. None show RR significantly or borderline raised at rounded mid age and since over A^{+} or B^{+}, but over C^{+} RR is significantly raised (method 1) or borderline (at 10 mGy, methods 2 and 3) at age 70 and since 35, and significantly raised (all methods) for high percentages of cases and py.
Several sites, including the uterus, show nonlinear responses. For all solid, genital, respiratory, gallbladder, thyroid, and cns, the response is significantly elevated for both models in all cells in D_{0}. With the preferred covariate, RR is also significantly elevated throughout D_{0} for the uterus and corpus, with weaker evidence for squamous, cervix, and rectum. For lung cancers, using the preferred covariate RR is elevated with borderline significance for over 80% of py in D_{0}.
These 12 sites, along with the breast and liver which are radiosensitive at higher doses but show (nonsignificant) negative response in Fig. 10, were analysed with interaction models. In Additional file 1: Table S8, method 4 is used for each site and dose range, to choose the preferred quasipoisson model. Poisson selection by methods 1, 2, and 3 is also shown. Using the preferred quasipoisson models, RR and 90% CIs are shown along with the percentages of cases and py in D_{0}, D_{0.5}, S_{0.5}, T_{0.5}, and S_{0.5} ∩ T_{0.5} for which RR is elevated, significantly elevated, or significantly lowered.
RR is significantly elevated across the dose range for all solid cancers in A^{}, A^{+} (borderline), B^{+}, and C^{+}. Squamous cancers show significantly elevated RR in A^{}, B^{}, B^{+} (borderline 80 mGy), and C^{+}. The corpus has significant, highly elevated RR in A^{+} and B^{+}. Thyroid RR is significantly elevated in A^{+}, B^{+}, and C^{+}.
Respiratory RR is significantly lowered for 37.2% of py in A^{} (all with agex < 18), 13.7% of py in A^{+} (all with since > 44) and 11.9% of py in C^{} (all with agex < 8); it is significantly raised in S_{0.5} and T_{0.5} in B^{+}, and in D_{0.5}, S_{0.5} and T_{0.5} in C^{+}. For the breast, RR is significantly lowered only for 10.9% of py in B^{}. There is little significant elevation in A^{+} or B^{+}, but RR is significantly raised in D_{0.5}, S_{0.5} and T_{0.5} in C^{}, and for all py in C^{+}. For the liver, RR is significantly lowered for 24.8% of py in A^{}, 26.2% in A^{+}, 8.7% in B^{}, 5.6% in B^{+}, 10.1% in C^{}, and 2.2% in C^{+}.
Additional file 1: Figures S5–S10, for respiratory, uterus, and thyroid, correspond with Figs. 7 and 8 for all solid cancers. For the uterus, RR is a Ushaped function of age. For the thyroid, RR peaks around age 60 and decreases with agex.
Whilst the preferred models vary over the sites, dose ranges, and selection methods, ecdos models are chosen in 215 out of 336 cases, with a similar proportion for each selection method. Models with since are chosen in 194 cases and models with agex are preferred in 80. Among the 24 models for all solid cancers, ecdos is chosen in 22, since in 18, and agex only once. Overall, there is a clear preference for ecdos and since, and models with both covariates are preferred in 140 of 336 cases in total.
The male cohorts in A^{+} and B^{+} were fitted with Poisson models. In A^{+}, P5ad (method 1), P4ae (method 2) and P2e (method 3) are preferred. With P5ad at age 70 and with P4ae, the response increases with agex but lacks borderline significance. With P2e the response is insignificant. In B^{+}, P5se (method 1), P4se (method 2) and P2e (method 3) are preferred. With P5se at age 70, the response decreases with since but fails to reach borderline significance. With P4se, the response decreases with since and is borderline for since ≤ 20. With P2e, the result is insignificant.
The combined cohorts in A^{+} and B^{+} were fitted with Poisson models, controlling for gender, city, distcat and their first order interactions, obtaining a gender averaged dose response. In A^{+}, P5sd (method 1) and P4sd (methods 2 and 3) are preferred. For P4sd at 10 mGy RR is significantly raised (90% CIs) when since ≤ 33. At since 30, RR _{10 mGy} = 1.09 (1.02, 1.17). Results from P5sd at age 70 are nearly identical. In B^{+}, P5se (method 1) and P4se (methods 2 and 3) are preferred. For P4se at since 30, RR _{10 mGy} = 1.07 (1.01, 1.12), but significance is lost for 19 ≤ since ≤ 27.
Discussion
Low dose radiation has significant effects at cell level including some nonlinear responses [27], but cancer risk is typically modelled as a linear function of dose and the risk below 100 mGy is still under debate. The body’s multiple defences may filter out most abnormalities and the doseresponse for disease could be linear.
In analysing the Japanese Abomb survivors incidence or mortality, the Lifespan studies and other reports [1, 3, 28, 29] use a form of Poisson regression in which ERR = RR  1 is modelled as β D, where β may be modified by sex, age and ageatexposure through other parameters; alternatively, ERR ∼β D+γ D ^{2}. If the response were known to take a specific form with the same parameter values at all doses, a model fitted to the full data could predict risk at 0.01 Gy. Otherwise, a model which estimates risk at 1 Gy may be inappropriate at lower doses, as several studies recognise [2, 8, 30].
In one approach to detecting nonlinearity, numerical data can be reduced to categories before smoothing the results [1]. The outcome may depend on the cut points, and information is discarded by categorising data. The Generalized Additive Model [19, 21] is a more sophisticated method. Using GAM, covariates and interactions enter the model through unknown linear combinations of splines.
The public data released by RERF is stratified by essentially arbitrary dose categories with cutpoints 0, 5, 20, 40 mGy …as well as categories of age and ageatexposure, city, sex and location. However the data also contains pyweighted mean values of dose, age, ageatexposure, and calendar year (giving timesinceexposure). The analysis uses these values, whose distribution, particularly for dose, is far from uniform. Stratification causes clumping of mean dose, but the individual exposures are also concentrated at low doses. As Preston observed [1] “One aspect of the LSS cohort that is often neglected is the highly skewed dose distribution and the large number of survivors who received lowdose, albeit at a high dose rate, exposure.”
To compensate for stratification and skew, dose was converted to its empirical cumulative distribution, a monotone (invertible) transformation. The resulting variable ecdos is approximately uniformly distributed. Models using ecdos gave better fit for all solid cancers and many specific sites. This transformation may still be relevant with the individual data as dose remains skewed, though no longer clumped.
If the true doseresponse is linear then modelling with dose is preferable. Whilst there is little difference in detecting a threshold, ecdos models are better able to capture alternatives, including plateau, hormesis, and a variety of curves with rapid initial rise and a smaller increase at higher doses. One such curve (Fig. 4 panels c and d) is loosely based on in vivo data for γH2AX Foci (see [31] (Fig. 1)).
The models have several sources of nonlinearity, including the log link, the relation between ecdos and dose, and the splines. In fact, the fitted models are often loglinear in ecdos as a main effect. However, the interaction of ecdos or dose with logage and since or agex can introduce further nonlinearities.
For all solid cancers, in A^{+}, B^{+} and C^{+} all three Poisson selection methods yield models with RR raised across the dose range at the rounded mid values of covariates, with at least borderline significance. Results are stable when deleting exceptional cells, simulating random errors in dosimetry, adjusting RBE, using different splines, or excluding genital or respiratory cancers. For all solid cancers, results are similar with Poisson or quasipoisson models, which explain around 68% of the deviance in A^{+}, 58% in B^{+}, and 50% in C^{+}.
For all solid cancers in A^{+}, all the dose or ecdos Poisson models show effects which are at least borderline (0.98 ≤ LCL95% ≤ 1) or significant (LCL95% > 1) at 2.5 and 10 mGy, at the rounded mid values of covariates. The preferred models with ecdos refine the results. Significance persists over a wide range of covariates.
With the Nagasaki data, the Poisson model chosen to minimise ML shows RR elevated across the cohort in A^{+}, B^{+}, and C^{+}; and in B^{+} ERR _{10 mGy} is more than twice the value for the combined cohort. Using a very different approach, Sasaki [7] found an exceptional dose response for Nagasaki (male and female).
For all solid cancers, risk estimates are similar with or without the Not In City data. NIC cells are not distinguished by the urban/rural status of the individual members’ normal residence, which makes a significant contribution to the model. Even without including NIC, extremely low nonzero doses arise from distant rural residents. However, including NIC gives more cases and greater statistical power.
For all solid cancers and most individual sites, timesinceexposure was a better predictor than ageatexposure. The fitted dependence of risk on since is not governed by a latency threshold, and for all solid cancers, RR decreases with since at low doses. Risk could rise with since < 12.4 years, the minimum value in the cohort.
Shuryak [32] attributes childhood radiation risk to initiation when cells are proliferating, while older adult risk arises from progression of previously acquired mutations. For all solid cancers, RR increases with agex at low doses, suggesting that for women, low dose radiation is acting primarily via progression.
Estimates of ERR at 10 mGy, age 70, agex 35, since 35
Sex  Range  Method  Model  Scale  Covariates  ERR _{10 mGy, Female}  

90% CI, age 70,  
agex 35, since 35  Ratio  
MF  0–4 Gy  LinExt from [1]  1  s,c,d,dose,age,agex  0.0053 (0.0037, 0.0067)  1.0  
MF  0–0.5 Gy (urban)  WtMovAv from [2]  1  s,dose,age,agex  0.02^{a} (estimate)  3.8  
F  A^{}  1  P5se  1  c,d,ecdos,age,since  0.242 (0.057, 0.459)  45.7 
F  A^{}  2  P4se  1  c,d,ecdos,since  0.225 (0.045, 0.437)  42.5 
F  A^{}  3  P4se  1  c,d,ecdos,since  0.225 (0.045, 0.437)  42.5 
F  A^{}  4  Q4se  0.901  c,d,ecdos,since  0.228 (0.055, 0.430)  43.0 
F  A^{+}  1  P5se  1  c,d,ecdos,age,since  0.230 (0.054, 0.436)  43.4 
F  A^{+}  2  P5se  1  c,d,ecdos,age,since  0.230 (0.054, 0.436)  43.4 
F  A^{+}  3  P4se  1  c,d,ecdos,since  0.202 (0.032, 0.401)  38.1 
F  A^{+}  4  Q4sd  0.868  c,d,dose,since  0.088 (0.001, 0.185)  16.6 
F  B^{}  1  P5se  1  c,d,ecdos,age,since  0.088 (0.017, 0.164)  16.6 
F  B^{}  2  P4se  1  c,d,ecdos,since  0.085 (0.015, 0.160)  16.0 
F  B^{}  3  P4se  1  c,d,ecdos,since  0.085 (0.015, 0.160)  16.0 
F  B^{}  4  Q4se  1.141  c,d,ecdos,since  0.077 (0.011, 0.148)  14.5 
F  B^{+}  1  P5se  1  c,d,ecdos,age,since  0.068 (0.021, 0.117)  12.8 
F  B^{+}  2  P3d  1  c,d,dose,age  0.013 (0.000, 0.026)  2.5 
F  B^{+}  3  P2e  1  c,d,ecdos  0.061 (0.015, 0.110)  11.5 
F  B^{+}  4  Q2e  1.102  c,d,ecdos  0.061 (0.013, 0.112)  11.5 
F  C^{}  1  P5se  1  c,d,ecdos,age,since  0.064 (0.038, 0.090)  12.1 
F  C^{}  2  P5ae  1  c,d,ecdos,age,agex  0.068 (0.045, 0.091)  12.8 
F  C^{}  3  P3e  1  c,d,ecdos,age  0.055 (0.034, 0.076)  10.4 
F  C^{}  4  Q3e  1.079  c,d,ecdos,age  0.051 (0.034, 0.068)  9.6 
F  C^{+}  1  P5se  1  c,d,ecdos,age,since  0.091 (0.055, 0.128)  17.2 
F  C^{+}  2  P4se  1  c,d,ecdos,since  0.057 (0.040, 0.075)  10.8 
F  C^{+}  3  P4se  1  c,d,ecdos,since  0.057 (0.040, 0.075)  10.8 
F  C^{+}  4  Q4se  1.066  c,d,ecdos,since  0.056 (0.039, 0.073)  10.6 
However, the GAM Poisson models give estimates comparable with [1] when fitted over J^{+} and evaluated at 1 Gy. Over this range, the preferred model (method 1, 2 or 3) is P5sd. At 1 Gy, age 70 and since 35, RR = 1.80 (1.60, 2.03) while for P5ad at agex 35 RR = 1.56 (1.40, 1.73), close to Preston’s estimate and CI. The dose response in J^{+} for P5ae, P5ad, P5se, or P5sd is roughly comparable with the genderaveraged ERR shown in [1] (Fig. 3, using smoothed category estimates from the 0 – 2 Gy data). Similar results are obtained over J^{} (excluding NIC).
However, the interactions at lower doses (see Figs. 5, 6, and 8) conflict with the main model in [1], which has RR decreasing with agex. At low doses, RR generally decreases with age when agex is fixed, but increases with agex when age is fixed. Likewise, above 100 mGy in C^{+} RR decreases with age when since is fixed, but increases with age in A^{+} and B^{+}. At age 70, RR decreases with since in A^{+} and B^{+} and when since > 35 in C^{+}. In all the low dose ranges considered here, RR increases with agex. Even in J^{+}, the curve for 100 mGy does not decrease with agex (see Fig. 12 panel b) Thus, the decrease of RR with agex (when agex < 40) in [1] is specific to the higher dose range and the response at 500 mGy or above.
Pierce and Preston [2] applied a simplified linear model to the 0 – 500 mSv urban incidence data for both sexes, finding a higher response than linear extrapolation would predict. Their Fig. 1 shows a smoothed curve obtained as a weighted moving average of category estimates. The sexaveraged RR at agex 30 and 100 mSv ∼ 1.09, from which RR _{female, 100mSv} ∼ 1.12 as the fitted model has ERR _{female} ∼ 2*ERR _{male}. At agex 35, RR _{female, 100mSv} ∼ 1.10. Figure 1 suggests that RR _{female, 10mSv} ∼ 1.02. By comparison, the preferred model (method 1) for the urban subcohort of C^{} is P5se, and at age 70, since 35, RR _{100 mGy} = 1.14 (1.09, 1.18), while RR _{10 mGy} = 1.03 (1.02, 1.04). The interaction with since is very weak, and the preferred model with method 2 or 3 is P3e, for which at age 70 RR _{100 mGy} = 1.14 (1.09, 1.19), as with the preferred quasi model Q3e. Models with either dose or ecdos show RR rising rapidly below 150 mGy before tapering, evidence of nonlinearity in the urban data below 500 mGy.
The values listed in Table 4 are obtained from preferred models. P5ad is never selected, and P5ae is chosen only once. Table 4 largely reflects the predominant selection of ecdos models. Nevertheless, if P5ad and P5ae (with their default knot locations) are fitted over A^{+} and evaluated at 10 mGy, age 70 and agex 35, for P5ad RR = 1.10 with 90% CI (1.00, 1.21) while with P5ae RR = 1.21 (1.04, 1.41). P5ae is again preferred to P5ad by ML score and AIC. The default knots for P5ad in A^{+} are 0, 0.12, 0.33, 2.30, 2.57, 9.69, 10.55, 15.58, giving a much richer structure than 0, 5, 20. The estimate at 10 mGy when fitting P5ad over A^{+} is much higher than the estimate when fitting P5ad over C^{+}.
Preston and coworkers [33] analysed the 1958–1998 solid cancer incidence data for exposure in utero or early childhood (0 – 5 years), including NIC. For women exposed in early childhood, at attained age 50 ERR _{1 Sv} = 2.2 (1.3, 3.4)_{95%} (see [33] Table 5). Linear extrapolation is comparable with results from P5sd over C_{i} ^{+} (agex ≤ 5, dose ≤ 500 mGy) at age 50 and since 45, for which RR _{10 mGy} = 1.03 (1.01, 1.05)_{95%}. With the preferred Poisson model (method 1) P5se, RR _{10 mGy} = 1.20 (1.06, 1.35)_{95%}. Methods 2 or 3 select P2e, giving RR _{10 mGy} = 1.14 (1.05, 1.24)_{95%}, while method 4 chooses Q2e giving RR _{10 mGy} = 1.14 (1.06, 1.22)_{95%}.
For sitespecific risks, some studies of other cohorts have found RR significantly raised after low dose exposure to low LET radiation. Some evidence from the literature on the thyroid, uterus and uterine corpus, is compared with results from quasipoisson models applied to appropriate subcohorts here.
Direct evidence on thyroid cancer risk arises from children given medical irradiation, residents affected by Chernobyl, nuclear workers, radiotherapy patients, and the Abomb survivors. Except as indicated, these studies applied a linear ERR model.
Additional file 1: Table S9 shows some results from [1, 6, 7, 11, 13, 15, 34–37]. In each case, an appropriate subcohort of the female incidence data was analysed for comparison, using the simplest quasipoisson dose model Q2d and the preferred quasi model for the subcohort, chosen to approximate the dose and agex range in the study, or with mean dose close to that at which risk was reported.
In most cases, results from the subcohorts using either model are compatible with the reported estimates, as CIs overlap. There is striking agreement with the tinea capitis studies. Sadetzki [34] used a linear model which, at the mean dose, gives RR _{93 mGy, female} = 2.96 (2.07, 4.31)_{95%}. Fitting the quasi models over C^{+} restricted to agex < 16, Q2d gives RR _{93 mGy} = 2.30 (1.36, 3.89)_{95%}, while the preferred model Q2e gives RR _{93 mGy} = 2.95 (2.05, 4.24)_{95%}. The earlier study by Ron [35] included the estimate RR _{10 mGy} = 1.27 (1.15, 1.42)_{95%}. Fitting the quasi models over B^{+} restricted to agex < 16, for which the mean pyweighted dose = 10.7 mGy, Q2d gives RR _{10 mGy} = 1.18 (1.06, 1.31)_{95%}, and at the mean values of agex and age in [35] the preferred model Q5ad gives RR _{10 mGy, agex 7.4, age 37.6} = 1.28 (1.07, 1.53)_{95%}. These close agreements contrast with the differences seen in the pooled study [11], where the tinea capitis estimates appeared exceptionally high. However, the pooled estimates are compatible with GAM modelling on the appropriate subcohorts.
For Chernobyl survivors with agex < 18, Little [6] considered a linearexponential model with lowdose asymptote ERR = αD, estimating α = 9.72 (2.67, 94.31)_{95%}. At the geometric mean dose as reported in [38], this gives an asymptotic RR _{190 mGy} = 2.85 (1.51, 18.92)_{95%}. Fitting the quasi models over C^{+} restricted to agex < 18, Q2d gives RR _{190 mGy} = 2.62 (1.50, 4.57)_{95%}. At the mean attained age from [39], the preferred model Q3e gives RR _{190 mGy, age 23.7} = 4.27 (1.30, 14.02)_{95%}.
Sasaki [7] reanalysed the incidence data in [1], with a very different nonlinear model. For the thyroid, from his Fig. 3(i) ERR _{0.1 mSv, female} ∼ 0.35, though no CI or interactions are shown. Results from C^{+} appear compatible. The occupational study by Sont [13] did not estimate RR for female thyroid cancer, but found a Standardized Incidence Ratio = 1.42 (1.19, 1.69)_{90%}. This was comparable with internal estimates of SIR (see “Methods”), with CIs by nonparametric bootstrap.
The results from GAM models applied to relevant subcohorts are compatible with [6, 11, 13, 15, 34, 35]. They are higher than extrapolations from [1, 36, 37], which concerned higher dose ranges. However they appear compatible with [7].
Several studies found RR decreasing at higher doses. In the radiotherapy studies, doses were high and the downturn was only apparent above 6 Gy [5] or 10 Gy [36], whilst for Chernobyl survivors risk tapered above 1 Gy [6]. Using the nointeraction dose model Q2d, RR tapers above 400 mGy.
Unlike the thyroid, the uterus is not generally recognised as radiosensitive, although [1] notes an excess of cancers of the uterine corpus at young ages. Occupational studies [40–42] found significant excess mortality from uterine cancers in nuclear workers with a radiation record. The mortality rate increased with dose, but with few cases, mainly endometrial (corpus), ERR/Sv was not reported.
Analysing the National Registry for Radiation Workers, Muirhead [43] found that for mortality (internally calculated SMR) amongst solid cancers, only all uterine cancers showed trend with dose significant at the 5% level using a twosided test. For incidence, there was a significant increasing trend with dose for endometrial cancer. RR was estimated with a linear model. For incidence of uterine ca ERR/Sv = 10.523 (0.27, 39.4)_{90%}. Near the mean dose (24.9 mSv), this corresponds to RR _{25 mSv} = 1.26 (1.01, 1.99). For incidence of corpus ca, the Health Protection Agency [44] reported that Muirhead found ERR/Sv = 14.2 (1.06, 58.56)_{90%}, corresponding to RR _{25 mSv} = 1.36 (1.03, 2.46).
For comparison, models were fitted over B^{+}, restricted to 16 ≤ agex < 60. For the uterus, Q2d gives RR _{25 mGy} = 1.07 (0.99, 1.15). The preferred model Q3d has RR _{25 mGy, 60yrs} = 0.96 (0.75, 1.23), while at age 65 RR = 1.23 (0.98, 1.54), and at age 70, 1.29 (1.04, 1.60). For the corpus, Q2d gives RR _{25 mGy} = 1.08 (0.89, 1.31). The preferred model Q4se gives RR _{25 mGy, since 30} = 5.24 (1.22, 22.55), and 4.90 (1.13, 21.24) and 4.00 (0.93, 17.14) at since 25 and 35 respectively.
Davis [16] studied solid cancer incidence in the Techa River cohort, and found significant effects only for the oesophagus and uterus, using a linear model without interactions. Women constitute 57% of the cohort and age of entry begins at 0. The mean dose was 53 mGy and 90% of doses were below 125 mGy. For the uterus, mean age at diagnosis was 57, and the result for ERR/100mGy = 0.21 (0.01, 0.51)_{95%} corresponds to RR _{50 mGy} = 1.11 (1.01, 1.26)_{95%}.
For comparison, models were fitted over B^{+}. Q2d gives RR _{50mGy} = 1.37 (1.09, 1.72)_{95%}. The preferred model Q5se gives RR _{50mGy, age 60, since 15} = 1.96 (1.12, 3.42)_{95%}, and at since = 20, 25, 30, and 35, the corresponding RR and 95%CI = 1.82 (1.10, 3.02), 1.67 (1.01, 2.76), 1.64 (0.99, 2.73), and 1.70 (1.02, 2.84).
Using a linear model without effect modification, Preston [1] estimated for the uterus ERR/Gy = 0.10 (0.09, 0.33)_{90%}, and for the corpus ERR/Gy = 0.29 (0.14, 0.95). However, for women with agex < 20, ERR/Gy for the uterus = 0.37 (0.001, 0.86) and for the corpus 1.00 (0.14, 2.40). Near the pyweighted mean dose (85.3 mGy), with agex < 20 the model predicts for the uterus RR _{80 mGy} = 1.03 (1.00, 1.07)_{90%} and for the corpus 1.08 (1.01, 1.19).
Fitting over C^{+} restricted to agex < 20, for the uterus Q2d gives RR _{80 mGy} = 1.79 (1.19, 2.68)_{90%}. With the preferred model Q5se, RR _{80 mGy, age 50, since 30} = 2.64 (1.17, 5.97)_{90%}, and at since = 35, 40, 45 and 50 the corresponding RR and 90%CI = 3.07 (1.41, 6.67), 4.51 (2.05, 9.94), 3.76 (1.65, 8.59), and 2.72 (1.11, 6.62). For the corpus Q2d gives 1.03 (0.88, 1.21). For the preferred model Q4ad, RR _{80mGy, agex 10} = 0.88 (0.56, 1.37), whilst at agex 12, 14, 16, 18 and 20 the corresponding values are 1.13 (0.74, 1.74), 1.59 (1.05, 2.43), 2.04 (1.33, 3.11), 2.24 (1.40, 3.60), and 2.18 (0.96, 4.98), i.e. the response is higher after puberty.
I have not used another implementation of GAM or the fully Bayesian methods of MCMC. Confidence intervals, based on Waldtype CIs for the linear predictor, were validated by simulation only for Poisson modelling of all solid cancers. Although the models explain nearly 60% of the deviance for all solid cancers (in B^{+}), the figures for individual cancer sites are lower, given the smaller case numbers. Minimum smoothing parameters imposed for some sites depended on a visual judgement of the output. The public data lacks information on other exposures which may interact with radiation. The effects could be due to confounding by some other factor linked to distance from the hypocentres, although modelling did control for urban, rural, or NIC location. The modelling was not designed to test a particular hypothesis for cellular mechanism, or for the higher risks for women or Nagasaki residents.
Although increased risk with ageatexposure (at fixed attained age) was found for many sites, as Boice observed [45] the flash doses all occurred in August 1945, so ageatexposure is confounded with birth cohort and changes in diet, smoking habits or infection rates, may explain apparent ageatexposure (or timesinceexposure) effects. Protracted or fractionated exposures may also have different effects.
The impact of radiation in this cohort may be affected by blast damage. Stewart and Kneale [46] investigated possible selection bias, as subjects were only eligible to enter the (mortality) cohort if they survived to 1950. Incidence data begins in 1958, with the minimum timesinceexposure exceeding 12 years.
Wing and Richardson [47] point out that the dose estimates, revised several times, also depend on “the ability to elicit accurate information on location, position and shielding [which] was affected not only by traumatization of the survivors and their domestic stigmatization but by their distrust of medical teams working under occupation forces [48].” The Abombs annihilated Hiroshima and Nagasaki, destroying their social fabric.
For all these reasons this cohort may not predict risk elsewhere.
Conclusion
The Generalized Additive Model avoids potential problems in fitting a misspecified parametric model. In fact, the linear model is an inadequate description of relative risk for cancer incidence in female Japanese Abomb survivors exposed to low dose radiation. Contrary to the interactions at high doses, for all solid cancers risk increases with ageatexposure and decreases with timesinceexposure, suggesting that low dose radiation acts through progression of previously accumulated damage.
Timesinceexposure models were generally preferred to ageatexposure models. Transforming dose to its empirical cumulative distribution improved the model fit for all solid cancers and many specific sites. Constructing splines in this way improves their capacity to detect responses at very low doses. Covariate interactions were modelled with tensor product splines. Marginal Likelihood was the preferred method of smoothing parameter optimisation, and confidence intervals obtained from the Bayesian posterior covariance performed well.
The results at 10 mGy are much higher than expected. Whilst higher female cancer risks and the highly skewed distribution of dose are recognised, significantly raised Relative Risk for women exposed to low doses, not found for all solid cancers in men, is new. Most other Abomb survivor studies consider much wider dose ranges.
Whilst the results may be specific to this cohort, some studies of other cohorts with other methods give comparable results, notably for thyroid cancer where estimates are consistent with those derived here, and for the uterus.
Direct epidemiological evidence of risks from doses around 10 mGy is a current research priority [18]. Risk estimates derived here from low dose data are much higher than those on which the ICRP has relied when setting recommended annual dose limits [4], including the annual occupational dose limit of 20 mSv. For all solid cancers, almost all estimates here of Excess Relative Risk for women at 10 mGy are 9.6 – 45.7 times the prediction by linear extrapolation from [1].
Generalized additive models applied to the low dose data for female Japanese Abomb survivors show significant Relative Risks, particularly if the heavily skewed doses are transformed to improve their distribution. Evidence of elevated risk at low doses should contribute to higher radiation risk estimates and improve radiation protection for women.
Abbreviations
 χ ^{2} :

Pearson Chisquared statistic
 \(\hat {\phi }\) :

Pearson estimate of scale for quasipoisson model
 A^{+} :

0  20 mGy including (excluding) NIC subjects
 B^{+} :

0  100 mGy + NIC
 C^{+} :

0  500 mGy + NIC
 CD:

Cook’s Distance
 CI:

Bayesian posterior confidence interval
 D_{ δ } :

data cells with dose >δ
 dfres:

residual degrees of freedom
 e.d.f.:

effective degrees of freedom
 ERR:

Excess Relative Risk
 ecdf:

empirical cumulative distribution function
 ecdos:

empirical cumulative distribution of dose
 GAM:

Generalized Additive Model
 GCV.Cp:

Generalized Cross Validation/Mallows Cp
 J^{+} :

0  4 Gy + NIC
 LCL:

Bayesian posterior lower confidence limit
 ML:

Marginal Likelihood
 NIC:

Not In City
 REML:

Restricted Maximum Likelihood
 RERF:

Radiation Effects Research Foundation
 RR:

Relative Risk
 S_{ δ } :

dose >δ and since < 35
 T_{ δ } :

dose >δ and agex > 35
 UCL:

Bayesian posterior upper confidence limit
Declarations
Acknowledgments
Thanks to Simon Wood for help with GAMs and mgcv, and to David Richardson for encouragement. Thanks to the reviewers Richard Clapp and Kyoji Furukawa for editorial comments, and for alerting me to Dr Furukawa’s analysis.
The data described in this report were obtained from the Radiation Effects Research Foundation (RERF) in Hiroshima, Japan. RERF is a private foundation that is funded equally by the Japanese Ministry of Health and Welfare and the US Department of Energy through the US National Academy of Sciences. The conclusions reached in this report are mine and do not necessarily reflect the scientific judgement of RERF or its funding agencies.
Funding
This work was not funded.
Availability of data and materials
The dataset supporting the conclusions of this article is available in the Radiation Effects Research Foundation Downloadable Data respository, as the Life Span Study Solid Cancer Incidence Data, 1958–1998 and can be downloaded from http://www.rerf.jp/library/dl_e/lssinc07.html
Some extracts from the dataset are included within the article’s additional files.
Authors’ contributions
GD designed the study and carried out the computations.
Competing interests
The authors declare that they have no competing interests.
Consent for publication
Not applicable.
Ethics approval and consent to participate
The study involves only freely available anonymous public data and published literature. I have not sought further Ethics Approval.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Authors’ Affiliations
References
 Preston DL, Ron E, Tokuoka S, Funamoto S, Nishi N, Soda M, et al. Solid cancer incidence in atomic bomb survivors: 1958  1998. Radiat Res. 2007; 168:1–64.View ArticleGoogle Scholar
 Pierce DA, Preston DL. Radiationrelated cancer risks at low doses among atomic bomb survivors. Radiat Res. 2000; 154:178–86.View ArticleGoogle Scholar
 Blue Book: EPA Radiogenic Cancer Risk Models and Projections for the U.S. Population. Washington DC: U.S. Environmental Protection Agency, Office of Radiation and Indoor Air. 2011. EPA 402R11001.
 ICRP. 2007 Recommendations of the International Commission on Radiological Protection (Users Edition): ICRP Publication 103 (Users Edition). Ann ICRP. 2007;37(2–4).
 Ronckers CM, Sigurdson AJ, Stovall M, Smith SA, Mertens AC, Liu Y, et al. Thyroid cancer in childhood cancer survivors: a detailed evaluation of radiation dose response and its modifiers. Radiat Res. 2006; 166:618–28.View ArticleGoogle Scholar
 Little MP, Kukush AG, Masiuk SV, Shklyar S, Carroll RJ, Lubin JH, et al. Impact of uncertainties in exposure assessment on estimates of thyroid cancer risk among Ukrainian children and adolescents exposed from the Chernobyl accident. PLoS One. 2014; e85723:9.Google Scholar
 Sasaki MS, Tachibana A, Takeda S. Cancer risk at low doses of ionizing radiation: artificial neural networks inference from atomic bomb survivors. J Radiat Res. 2014; 55:391–406.View ArticleGoogle Scholar
 Furukawa K, Misumi M, Cologne J, Cullings H. A bayesian semiparametric model for radiation doseresponse estimation. Risk Anal. 2016; 36:1211–23.View ArticleGoogle Scholar
 Stewart AS, Webb J, Giles D, Hewitt D. Malignant disease in childhood and diagnostic irradiation in utero. Lancet. 1956; 268(6940):447.View ArticleGoogle Scholar
 Wing S, Shy CM, Wood JL, Wolf S, Cragle DL, Frome EL. Mortality among workers at Oak Ridge National Laboratory: evidence of radiation effects in followup through 1984. JAMA. 1991; 265:1397–402.View ArticleGoogle Scholar
 Ron E, Lubin JH, Shore RE, Mabuchi K, Modan B, Pottern LM, et al. Thyroid cancer after exposure to external radiation: a pooled analysis of seven studies. Radiat Res. 1995; 141:259–77.View ArticleGoogle Scholar
 Richardson DB, Wing S. Greater sensitivity to ionizing radiation at older age: followup of workers at Oak Ridge National Laboratory through 1990. Int J Epidemiol. 1999; 28:428–36.View ArticleGoogle Scholar
 Sont WN, Zielinski JM, Ashmore JP, Jiang H, Krewski D, Fair ME, et al. First analysis of cancer incidence and occupational radiation exposure based on the National Dose Registry of Canada. Am J Epidemiol. 2001; 153:309–18.View ArticleGoogle Scholar
 Cardis E, Vrijheid M, Blettner M, Gilbert E, Hakama M, Hill C, et al. Risk of cancer after low doses of ionising radiation: retrospective cohort study in 15 countries. BMJ. 2005; 331(7508):77.View ArticleGoogle Scholar
 Kopecky KJ, Stepanenko V, Rivkind N, Voillequé P, Onstad L, Shakhtarin V, et al. Childhood thyroid cancer, radiation dose from Chernobyl, and dose uncertainties in Bryansk Oblast, Russia: a populationbased casecontrol study. Radiat Res. 2006; 166:367–74.View ArticleGoogle Scholar
 Davis FG, Krestinina LY, Preston D, Epifanova S, Degteva M, Akleyev A. Solid cancer incidence in the Techa River incidence cohort: 1956–2007. Radiat Res. 2015; 184:56–65.View ArticleGoogle Scholar
 Brenner DJ, Doll R, Goodhead DT, Hall EJ, Land CE, Little JB, et al. Cancer risks attributable to low doses of ionizing radiation: assessing what we really know. Proc Natl Acad Sci U S A. 2003; 100:13761–6.View ArticleGoogle Scholar
 NAS Nuclear and Radiation Studies Board. Planning towards the BEIR VIII report. 2014. http://dels.nas.edu/PastEvents/PlanningTowardsBEIRVIIIReport/AUTO01484B?bname=nrsb. Accessed 20 Oct 2016.
 Hastie T, Tibshirani R. Generalized additive models (with discussion). Statist Sci. 1986; 1:297–318.View ArticleGoogle Scholar
 R Core Team. R: a language and environment for statistical computing. Vienna: R Foundation for Statistical Computing; 2013. http://www.Rproject.org. Accessed 25 Sept 2013.Google Scholar
 Wood SN. Generalized Additive Models: an introduction with R. Boca Raton, London, New York: Chapman &. Hall/CRC: Taylor & Francis Group; 2006.Google Scholar
 Life Span Study Solid Cancer Incidence Data, 19581998. Hiroshima, Japan: RERF; 2007. http://www.rerf.jp/library/dl_e/lssinc07.html. Accessed 20 Feb 2010.
 Radiation Effects Research Foundation (RERF). A Cooperative JapanUS Research Foundation. http://www.rerf.jp. Accessed 20 Oct 2016.
 Wood SN. Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. J R Stat Soc Series B Stat Methodol. 2011; 73:3–36.View ArticleGoogle Scholar
 Sasaki MS, Nomura T, Ejima Y, Utsumi H, Endo S, Saito I, et al. Experimental derivation of relative biological effectiveness of Abomb neutrons in Hiroshima and Nagasaki and implications for risk assessment. Radiat Res. 2008; 170:101–17.View ArticleGoogle Scholar
 DiCiccio TJ, Efron B. Bootstrap confidence intervals. Statist Sci. 1996; 11:189–228.View ArticleGoogle Scholar
 UNSCEAR. Report to the General Assembly, with scientific annexes. New York: United Nations Scientific Committee on the Effects of Atomic Radiation. 2012;2012.
 Preston DL, Shimizu Y, Pierce DA, Suyama A, Mabuchi K. Studies of mortality of atomic bomb survivors. Report 13: solid cancer and noncancer disease mortality: 1950–1997. Radiat Res. 2003; 160:381–407.View ArticleGoogle Scholar
 National Research Council. Committee to Assess Health Risks from Exposure to Low Levels of Ionizing Radiation. Health risks from exposure to low levels of ionizing radiation: BEIR VII Phase 2. Washington DC: National Academies Press; 2006.Google Scholar
 Nakashima E. Radiation dose response estimation with on low dose range using restricted cubic splines: application to all solid cancer mortality data, 1950–2003, in atomic bomb survivors. Health Phys. 2015; 109:15–24.View ArticleGoogle Scholar
 Beels L, Bacher K, De Wolf D, Werbrouck J, Thierens H. gammaH2AX foci as a biomarker for patient Xray exposure in pediatric cardiac catheterization: are we underestimating radiation risks?Circulation. 2009; 120:1903–9.View ArticleGoogle Scholar
 Shuryak I, Sachs RK, Brenner DJ. Cancer risks after radiation exposure in middle age. J Natl Cancer Inst. 2010; 102:1628–36.View ArticleGoogle Scholar
 Preston DL, Cullings H, Suyama A, Funamoto S, Nishi N, Soda M, et al. Solid cancer incidence in atomic bomb survivors exposed in utero or as young children. J Natl Cancer Inst. 2008; 100:428–36.View ArticleGoogle Scholar
 Sadetzki S, Chetrit A, Lubina A, Stovall M, Novikov I. Risk of thyroid cancer after childhood exposure to ionizing radiation for tinea capitis. J Clin Endocrinol Metab. 2006; 91:4798–804.View ArticleGoogle Scholar
 Ron E, Modan B, Preston D, Alfandary E, Stovall M, Boice JDJ. Thyroid neoplasia following lowdose radiation in childhood. Radiat Res. 1989; 120:516–31.View ArticleGoogle Scholar
 Bhatti P, Veiga LHS, Ronckers CM, Sigurdson AJ, Stovall M, Smith SA, et al. Risk of second primary thyroid cancer after radiotherapy for a childhood cancer in a large cohort study: an update from the childhood cancer survivor study. Radiat Res. 2010; 174:741–52.View ArticleGoogle Scholar
 Richardson DB. Exposure to ionizing radiation in adulthood and thyroid cancer incidence. Epidemiology. 2009; 20:181–7.View ArticleGoogle Scholar
 Likhtarov I, Kovgan L, Masiuk S, Talerko M, Chepurny M, Ivanova O, et al. Thyroid cancer study among Ukrainian children exposed to radiation after the Chornobyl accident: improved estimates of the thyroid doses to the cohort members. Health Phys. 2014; 106:370–96.View ArticleGoogle Scholar
 Tronko MD, Howe GR, Bogdanova TI, Bouville AC, Epstein OV, Brill AB, et al. A cohort study of thyroid cancer and other thyroid diseases after the Chornobyl accident: thyroid cancer in Ukraine detected during first screening. J Natl Cancer Inst. 2006; 98:897–903.View ArticleGoogle Scholar
 Fraser P, Carpenter L, Maconochie N, Higgins C, Booth M, Beral V. Cancer mortality and morbidity in employees of the United Kingdom Atomic Energy Authority, 194686. Br J Cancer. 1993; 67:615–24.View ArticleGoogle Scholar
 Carpenter L, Higgins C, Douglas A, Fraser P, Beral V, Smith P. Combined analysis of mortality in three United Kingdom nuclear industry workforces, 19461988. Radiat Res. 1994; 138:224–38.View ArticleGoogle Scholar
 Atkinson WD, Law DV, Bromley KJ, Inskip HM. Mortality of employees of the United Kingdom Atomic Energy Authority, 194697. Occup Environ Med. 2004; 61:577–85.View ArticleGoogle Scholar
 Muirhead CR, O’Hagan JA, Haylock RG, Phillipson MA, Willcock T, Berridge GL, et al. Mortality and cancer incidence following occupational radiation exposure: third analysis of the National Registry for Radiation Workers. Br J Cancer. 2009; 100:206–12.View ArticleGoogle Scholar
 Independent Advisory Group on Ionising Radiation. Risk of Solid Cancers following Radiation Exposure: Estimates for the UK Population. Health Protection Agency: Chilton; 2011.Google Scholar
 Boice, JD Jr. Models, models everywhere—is there a fit for lifetime risks. J Natl Cancer Inst. 2010; 102:1606–9.View ArticleGoogle Scholar
 Stewart AM, Kneale GW. Abomb survivors: factors that may lead to a reassessment of the radiation hazard. Int J Epidemiol. 2000; 29:708–14.View ArticleGoogle Scholar
 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1241119/pdf/ehp0110a00739.pdf.
 Lindee MS. Suffering made real: American science and the survivors of Hiroshima. Chicago: University of Chicago Press; 1994.View ArticleGoogle Scholar