 Published:
We’re sorry, something doesn't seem to be working properly.
Please try refreshing the page. If that doesn't work, please contact support so we can address the problem.
Reanalysis of cancer mortality in Japanese Abomb survivors exposed to low doses of radiation: bootstrap and simulation methods
Environmental Healthvolume 8, Article number: 56 (2009)
Abstract
Background
The International Commission on Radiological Protection (ICRP) recommended annual occupational dose limit is 20 mSv. Cancer mortality in Japanese Abomb survivors exposed to less than 20 mSv external radiation in 1945 was analysed previously, using a latency model with nonlinear dose response. Questions were raised regarding statistical inference with this model.
Methods
Cancers with over 100 deaths in the 0  20 mSv subcohort of the 19501990 Life Span Study are analysed with Poisson regression models incorporating latency, allowing linear and nonlinear dose response. Bootstrap percentile and Biascorrected accelerated (BCa) methods and simulation of the Likelihood Ratio Test lead to Confidence Intervals for Excess Relative Risk (ERR) and tests against the linear model.
Results
The linear model shows significant large, positive values of ERR for liver and urinary cancers at latencies from 37  43 years. Dose response below 20 mSv is strongly nonlinear at the optimal latencies for the stomach (11.89 years), liver (36.9), lung (13.6), leukaemia (23.66), and pancreas (11.86) and across broad latency ranges. Confidence Intervals for ERR are comparable using Bootstrap and Likelihood Ratio Test methods and BCa 95% Confidence Intervals are strictly positive across latency ranges for all 5 cancers. Similar risk estimates for 10 mSv (lagged dose) are obtained from the 0  20 mSv and 5  500 mSv data for the stomach, liver, lung and leukaemia. Dose response for the latter 3 cancers is significantly nonlinear in the 5  500 mSv range.
Conclusion
Liver and urinary cancer mortality risk is significantly raised using a latency model with linear dose response. A nonlinear model is strongly superior for the stomach, liver, lung, pancreas and leukaemia. Bootstrap and Likelihoodbased confidence intervals are broadly comparable and ERR is strictly positive by bootstrap methods for all 5 cancers. Except for the pancreas, similar estimates of latency and risk from 10 mSv are obtained from the 0  20 mSv and 5  500 mSv subcohorts. Large and significant cancer risks for Japanese survivors exposed to less than 20 mSv external radiation from the atomic bombs in 1945 cast doubt on the ICRP recommended annual occupational dose limit.
Background
Analyses of cancer mortality in survivors of the atomic bombing of Japan in 1945 have played a central role in establishing risk estimates and radiation protection limits for workers and the wider public. While many people exposed at Hiroshima and Nagasaki received fairly high doses, the recommended occupational and environmental dose limits aim to control the risks from much lower doses. Typically, analyses of the full cohort of Japanese survivors, whose doses ranged from 0  4000 mSv or more, are extrapolated downwards to predict the risks from doses below 20 mSv [1]. However, even if a particular model gives a good overall account of the full range of data, fitting that model and extrapolating the results may misrepresent the risks at low doses where the dose response may be quite distinct.
My previous paper [2] considered the cancer mortality data for those survivors whose external (flash) dose from the Abombs was below 20 mSv, the annual occupational dose limit recommended by the International Commission on Radiological Protection (ICRP). I applied several different models, each of which used as their exposure variable the radiation dose lagged by an unknown latency. One model adopted the standard assumption that Excess Relative Risk (ERR) is proportional to dose, in this case the lagged dose. Other models allowed that ERR might be a nonlinear function of dose, as suggested by radiation cell biology studies of the bystander effect. All the models detected ERR values for a dose of 10 mSv which were far higher than predicted by extrapolating the standard analyses of the full cohort. The linear model found large, significant results for the liver, while the preferred nonlinear model found large, significant results for the stomach, liver, and lung. These results occurred over a range of latencies.
The statistical method I used to assess the significance of the findings with nonlinear models was criticised by Mark Little [3]. In this paper the data is reanalysed with other methods of statistical inference for the preferred nonlinear model.
The models use Poisson regression to analyse the grouped data in the 0  20 mSv subcohort of the 19501990 Life Span Study LSS12 [4]. The models take the form λ = λ_{o}(1+ERR) where λ is the cancer risk, λ_{o} depends only on control variables, and ERR depends only on the lagged radiation dose. Profile Likelihood Confidence Intervals for ERR were based in [2] on the Likelihood Ratio Test (LRT) for comparisons between nested models. Once latency is fixed, the models depend smoothly on all other parameters. If Wilks Theorem [5] holds, the asymptotic null distribution of LRT is χ^{2} on d degrees of freedom when the null hypothesis is specified by fixing the values of d parameters in the wider model. For example, the LRT comparison of the linear model ERR = βD with the control model ERR = 0 is asymptotically χ^{2} on 1 d.f. if the control model holds. Here the null hypothesis is β = 0 and Wilks Theorem does apply to this nested pair. As Little [3] pointed out, the nonlinear models are less well behaved. The transient model ERR = σDexp(τD) and the twophase model ERR = βD + σDexp(τD) are indeterminate in τ when σ = 0, so one of the regularity conditions for Wilks Theorem fails. In any case, even if the asymptotic null distribution of LRT were known, the distribution for a given finite set of records may differ from the asymptotic case.
However, the actual distribution of LRT can be estimated by simulation before comparing two competing models or constructing confidence intervals. Bootstrap methods [6, 7] can also be used to construct confidence intervals more directly.
In response to questions raised during peer review I also consider whether the data justifies fitting the twophase model, investigate uncertainties in the "optimal latency" as estimated from the data, extend the analysis to include the covariate "city", and compare results from the 0  20 mSv and 5  500 mSv dose ranges.
Methods
An initial analysis of latency is carried out for all cancer sites with at least 100 cases (deaths) in the 0  20 mSv subcohort. Sites which appear to have a raised dose response are then analysed by the bootstrap and the distribution of LRT is estimated by simulation. A fuller simulation is carried out for the stomach, the site with the largest number of cases in this subcohort. The analysis focuses on the behaviour of the twophase model (defined below) and its relation to the linear model. It aims to compare confidence intervals obtained by bootstrap and LRT simulation methods and to discover without relying on Wilks Theorem, whether the apparent elevation and nonlinearity of the dose response in the 0  20 mSv subcohort are statistically significant. Further analysis considers variation in latency, the impact of "city", male and female subcohorts, and the 5  500 mSv dose range. Computations use the statistical freeware R[8], giving a further check on previous results. Possible sources of error in the Japanese data itself and the potential for confounding by other covariates as discussed in [2], are not reconsidered here.
As previously source data for the 19501990 mortality cohort LSS12 was obtained from RERF [9] via the CEDR [10], and the 0  20 mSv subcohort defined by restricting the weighted adjusted colon dose. The subcohort comprises 3011 data cells with 1690391.75 py. The control model, linear model, and twophase model are defined as in [2]. Briefly, the control model is loglinear in 14 indicator variables for 5 year ageatexposure categories, an indicator for gender, and the numerical variable log mean attained age, each of which is welldefined for the data cells. The control model is specified as λ = λ_{0} = exp(α+Σβ_{j}x_{j}) where α and the β_{j} are unknown parameters, x_{j} are the covariates, exp denotes exponential, and the Poisson parameter in cell i with T_{i} personyears is λ_{i}T_{i} where λ_{i} is evaluated using the covariates in cell i. At this stage, as in [2], "city" (Hiroshima or Nagasaki) is not included as a covariate.
Other models are defined by λ = λ_{0}(1+ERR) where ERR depends only on D_{ϕ}, the radiation dose lagged by a latency parameter ϕ. D_{ϕ} is defined as the weighted adjusted colon dose in 10 mSv units when timesinceexposure ≥ ϕ, and 0 otherwise.
For the linear model, ERR = βD_{ϕ}
For the twophase model, ERR = βD_{ϕ} + σD_{ϕ}exp(τD_{ϕ}) where exp denotes exponential.
The models require 1+ERR > 0 in all cells so that the Poisson distributions are defined. For the twophase model τ > 0, slightly modifying the previous approach where τ ≥ 0. However, τ = 0 reduces to the linear model as does σ = 0.
In fitting a model at latency ϕ by Poisson regression, the unknown parameters α,β_{j} and β, σ, τ (as appropriate to the model) are chosen to maximise the likelihood of the observed data assuming independent Poisson distributions in each cell. Equivalently, fitting minimises the sum over the 3011 data cells of E_{i}  O_{i}ln(E_{i}), where O_{i} and E_{i} = λ_{i}T_{i} are the observed and expected number of cases in cell i.
Latency ϕ is fixed during the minimisation. The models depend smoothly on all other parameters. ERR_{D,ϕ} denotes the Excess Relative Risk evaluated at D for the fitted model with latency ϕ. Note ERR_{1,ϕ} is the estimated risk at 10 mSv lagged by ϕ years. For model I nested within model J, LRT_{JI,ϕ} is the Likelihood Ratio Test computed at latency ϕ.
Code written in R emulates the previous model fitting with ExcelSolver, using the NewtonRaphson minimisation routine "optim" [11]. Results for the stomach at latency ϕ = 5, 6,... 44 are checked against the values obtained previously.
At a given cancer site, the control, linear, and twophase models are fitted at latency ϕ = 5, 6,... 44 to give LRT_{JI,ϕ} for each pair of nested models: linear against control, twophase against control, and twophase against linear. Sites are selected for further analysis if LRT_{lincon,ϕ} > 4 for some ϕ, or if LRT_{2plin,ϕ} > 6 for some ϕ (not assuming that this criterion establishes statistical significance). For sites meeting the criterion for the linear model, Profile Likelihood Confidence Intervals for β = ERR_{1,ϕ} are computed using Wilks Theorem at integer values of ϕ. For sites meeting the criterion for the twophase model, ϕ = ϕ_{mabs} is chosen to maximise LRT_{2pcon,ϕ} (not LRT_{2plin,ϕ}) so that fitting at ϕ_{mabs} gives the absolute Maximum Likelihood Estimate (MLE) for the model. The "optimal latency" ϕ_{m} is chosen to maximise LRT_{2pcon,ϕ} subject to the constraint ERR_{1,ϕ} ≥ 0. Bootstrap confidence intervals and LRT simulations are then computed at ϕ_{m} . Details are given in the next two more technical subsections, and examples with commentary are supplied with Additional File 1. The method for taking account of uncertainty in the "optimal latency" is outlined in a final subsection.
Bootstrap
Fitting the twophase model at ϕ = ϕ_{m} gives a maximum likelihood estimate for the vector of all 20 model parameters and determines fitted values of λ_{i} in each data cell. The parametric bootstrap assumes the observed data arose from sampling this fitted model, and considers the variation arising if other data had been sampled from that same model.
Sampling the independent Poisson distributions with parameters λ_{i}T_{i} gives simulated observations in each data cell. Fitting the twophase model to this simulated data gives simulated parameter values including , and , and therefore a simulated value of ERR = ERR_{D,ϕ} = ). The entire process is repeated B times to give the bootstrap replications ERR*.
There are many wellestablished methods to obtain confidence intervals from bootstrap replications [7, 12]. The two options chosen here are the original "percentile" method, which is easy to describe, and the "Biascorrected accelerated" method which is much faster to compute, and in general more accurate.
The 12α "percentile" confidence interval has endpoints
where denotes the Bα^{th} ordered value of ERR*. With B = 1000 replications and α = 0.025, the endpoints of the 0.95 percentile interval are the 25^{th} and 975^{th} order statistics of ERR*.
For the Biascorrected accelerated (BC_{a}) confidence interval (see [7] Chapter 5 pp. 203207 and p.249) simulation is again carried out using λ_{i}T_{i}. However, the simulated observations are now used to fit the model to the "Least Favourable family" of distributions, obtained by restricting the parameter vector ψ to a line parametrised by ζ, passing through the MLE and defined by where
Here i^{1}() is the inverse of the Fisher Information evaluated at and is the gradient of the function h which defines the variable of interest (in this case ERR) in terms of the model parameters ψ.
For each bootstrap replication, fitting the simulated data along the LF family determines ζ and thus the simulated value ERR = . If ERR*_{ LF }denotes the resulting set of B replications, the BC_{a} confidence interval has endpoints
where
Here Φ is the standard normal cumulative distribution and z^{(α)} is the 100αth percentile point of a standard normal distribution. The parameters w and a are estimated from the bootstrap replicates. The bias correction w is computed as
i.e. the fraction of bootstrap replications along the LF family which are below the original fitted value of ERR, in normal units.
The acceleration a is computed as
where m_{3} and m_{2} are the 3^{rd} and 2^{nd} moments of , the partial derivative of the log likelihood of the simulated data with respect to ζ, evaluated at ζ = 0.
Code for this calculation is shown in Additional File 1, Code File AC1.txt, which calls Additional File 1, Data Files stomdat1.txt, lambda2.txt and theta2.txt. Additional File 1, Commentary File ACom1.doc discusses this code.
Despite the intricate formulas, computation is much faster for BC_{a} than for the simpler percentile method, because optimisation is now restricted to a line within the full 20dimensional parameter space. The percentile CIs are computed at B = 1000, the recommended minimum number, while BC_{a} results are obtained at B = 5000.
Coverage for the percentile and BC_{a} methods is tested using the stomach data. For each of 400 sets of simulated observations generated by sampling the Poisson distributions determined by the original fitted parameters, the model is refitted to the simulated data. Confidence intervals for the simulated ERR are obtained by the bootstrap methods (B = 1000) using the refitted parameters and counted as 1 or 0 if they contain or exclude the "true" value of ERR from the original fitted parameters. To streamline these computations of coverage for the percentile method, data is restricted to the Hiroshima subcohort (N = 1536 cells) and the model simplified to use 10year rather than 5year ageatexposure categories. Estimating coverage for the BC_{a} method uses the full 0  20 mSv dataset and model. The application of both methods to compute CIs from the observed data uses the full dataset and model.
Simulation of LRT
The Likelihood Ratio Test (LRT) for comparing nested models H_{0} ≤ H_{1} (H_{0} contained within H_{1}) given the observed data is LRT = 2[ℒ(H_{1}, )  ℒ(H_{0}, )] where ℒ(H_{i}, ) is the log likelihood of the observed data after fitting the model H_{i}. In wellbehaved situations where Wilks theorem [5] applies, the asymptotic distribution of LRT is known, provided the data was drawn from a distribution which satisfies the null hypothesis H_{0}. In particular, an appropriate critical value t_{0.95} can be chosen for rejecting H_{0} if LRT > t_{0.95} and this will result in a 5% error rate of rejecting H_{0} when it is true. Typically H_{0} fixes the values of d parameters in the full model H_{1} which allows those parameters to vary freely in an open neighbourhood. In this case Wilks theorem says that LRT is asymptotically χ^{2} distributed on d degrees of freedom if the observed data was generated by the model with those d parameters fixed as specified. On that basis, the null hypothesis can be rejected if LRT > t_{0.95} = F^{1}(0.95) where F is the cumulative of a χ^{2} distribution on d degrees of freedom.
Wilks theorem depends on many regularity conditions for the models H_{0} and H_{1}. If these fail, LRT may not be χ^{2} distributed; its distribution may depend on H_{0} rather than just on the number of specified parameters, and there may be no simple universal 95% critical value for rejecting H_{0}. Simulation aims to estimate the distribution of LRT and then to obtain conservative critical values which are sufficient to reject H_{0}. Confidence intervals for functions of the parameters are constructed using these critical values.
For the twophase model is the Likelihood Ratio Test for the hypothesis H_{0}: (β, σ, τ) = (β_{0},σ_{0},τ_{0}) vs. the alternative (β, σ, τ) ≠ (β_{0},σ_{0},τ_{0}), i.e. H_{1} is the union of H_{0} with its alternative, and leaves (β, σ, τ) free to vary, subject only to the defining conditions of the twophase model. For various choices of H_{0} the null distribution of assuming H_{0} holds is sampled by Monte Carlo simulation (see [7] Chapter 4; for another context see [13] p.84) as follows.
One) The model is fitted subject to H_{0}. The resulting parameter estimates give fitted values and therefore T_{i} for each cell, subject to H_{0}.
Two) Poisson distributions with parameters T_{i} are sampled independently to give simulated cases in each cell
Three) The model is fitted to the simulated data subject to H_{0}, yielding new fitted values for each cell. Put .
Four) The model is fitted to the simulated data with (β, σ, τ) allowed to vary freely within the parameter space, yielding new fitted values for each cell. Put K = . The simulated value of is defined as LRT* = K_{0}  K. Note that up to a constant which depends on the simulated data but not on the model, K_{0} = 2ℒ(H_{0}, ) while K = 2ℒ(H_{1}, ).
Five) Steps Two through Four are repeated 500 (or 1000) times to give the sample LRT*
An example of the required code is given in Additional File 1, Code File AC2.txt which calls Additional File 1, Data Files stomdat1.txt, lambda1.txt, theta1.txt, and Additional File 1, Code File AC2s.txt. Additional File 1, Commentary File ACom2.doc discusses this code.
While step Four is timeconsuming, steps One and Three are quick as the model is loglinear in the remaining parameters. Step Two is trivial in R. Note that in step Four the possible values of (β, σ, τ) are still constrained by the requirement that 1+ERR > 0 in all cells. For computability, the constraint is set at 1+ERR ≥ 0.001.
To check whether Wilks theorem holds, LRT* is first tested against a χ^{2} distribution on 3 d.f. by the KolmogorovSmirnov (onesample) test. LRT* is then tested against gamma distributions with unknown shape and rate parameters. A γ distribution is fitted to LRT* using the "fitdistr" subroutine of R (MASS library) [14]. When testing LRT* against the fitted distribution with shape parameter and rate parameter , the Kolmogorov statistic D is obtained from the (one sample) KS test but its pvalue is determined by simulation (as in the Lilliefors test). When LRT* has 500 elements, 500 random deviates are taken from the fitted γ distribution with parameters and to form a sample S, a new gamma distribution Γ is fitted to S, the KS statistic D* for testing S against Γ (onesample) is computed, and the process is repeated 5000× to estimate the probability that D* > D when samples are drawn from γ(, ). R code was adapted from [15].
For the special case (β_{0},σ_{0},τ_{0}) = (0,0, arbitrary) the sample LRT* uses 1000 replications; note that in this case = LRT_{2pcon}. Likewise at the MLE estimate (β_{0},σ_{0},τ_{0}) = (), LRT* is produced during the (percentile) bootstrap with 1000 replications. In both cases the simulations required for the pvalue of D use 1000 random deviates of the fitted γ distribution.
For the liver, lung, pancreas and leukaemia, LRT simulation (×1000) is carried out at (0,0, arbitrary) and (), and (×500) at 3 points chosen at random with σ_{0} = 0 and β_{0} in the intervals 0.5 < β_{0} < 0, 0 < β_{0} < 1, and 1 < β_{0} < 10, and also at 4 points chosen at random in a neighbourhood N_{10} of (). N_{10} comprises those (β_{0},σ_{0},τ_{0}) for which the observed (not simulated) < 10. Each set of simulations gives a sample LRT _{ i }* and corresponding fitted gamma distributions γ_{i}
For the stomach, simulations and corresponding fitted γ_{i} are carried out at the MLE and 50 other random points in the parameter space: (×500) at 11 points with σ_{0} = 0, 4 points in a neighbourhood N_{10} of (), 34 other points in the (β, σ, τ) parameter space, and (×1000) at (0,0, arbitrary) and ().
Estimates of a 95% critical value for the null distribution of LRT are then obtained in three ways. For each cancer site, the fitted gamma distributions and the order statistics of the simulated LRT _{ i }* are used to estimate a global 95% critical value as = max(sup(t_{.95,i}), sup(s_{.95,i})) where t_{.95,i} is the 95^{th} percentile of γ_{i} and s_{.95,i} is the 95^{th} percentile of LRT _{ i }*. That is, at each value of (β_{0},σ_{0},τ_{0}) tested for a particular site, 95% of the simulated LRT values and the 95^{th} percentile of the fitted gamma distribution are both below . An overall conservative estimate t_{.95} is chosen to exceed the corresponding maximum for all simulations and all cancer sites investigated. At each site, a refined estimate = max(sup(t_{.95,j})) is obtained by considering only the fitted gamma distributions and restricting to simulations at points (β_{0},σ_{0},τ_{0}) within N_{10}. At each cancer site ≤ ≤ t_{.95}.
The neighbourhood N_{10} of () is termed "appropriate" if the overall estimate t_{.95} ≤ 10 so that for (β_{0},σ_{0},τ_{0}) outside N_{10}, > t_{.95} . This is not a circular definition; t_{.95} is computed from the behaviour of simulated LRT at points outside as well as inside N_{10} for each site, while for (β_{0},σ_{0},τ_{0}) within N_{10} and even at () simulation may, if N_{10} is inappropriate, give LRT* with 95^{th} percentile > 10.
Profile Likelihood Confidence Intervals for ERR_{1,ϕ} and ERR_{0.025,ϕ} are then constructed using all three estimates for the critical value. With ϕ fixed, let U_{ϕ} be the set of those (β_{0},σ_{0},τ_{0}) which cannot be ruled out with 95% confidence as < t_{.95}(alternatively <, or <) Define the 95%CI for ERR_{D,ϕ} as the range of ERR_{D,ϕ} over U_{ϕ}.
When σ_{0} = 0, is the Likelihood Ratio Test for the hypothesis H_{0}: β = β_{0} and σ = 0 vs. the alternative β ≠ β_{0} and σ = 0, while LRT_{2plin} is the Likelihood Ratio Test for the hypothesis σ = 0 vs. the alternative σ ≠ 0. Note LRT_{2plin} + = . Simulated null distributions of all three LRT's are obtained for points with σ = 0. Conservative, global, and refined estimates T_{.95}, and of the 95% critical value for LRT_{2plin} are then defined in the same fashion as t_{.95}, and for . For each cancer, is the supremum of the 95^{th} percentiles of γ distributions fitted to the simulations of LRT_{2plin}, is the supremum of 95^{th} percentiles of the fitted γ distributions and simulations of LRT_{2plin}, and T_{.95} is an overall conservative value exceeding for all cancer sites considered. Likewise and are defined from the 99^{th} and 99.9^{th} percentiles of the fitted γ distributions. These various estimates are used as critical values for LRT_{2plin} when testing the linear model against the twophase model, given the observed data.
Latency
For latency ϕ = 5, 6,... 44 BC_{a} 95%CIs for ERR_{1,ϕ} and ERR_{0.025,ϕ} are computed.
Since the observed data is regarded as a sample from an underlying distribution, the optimal latency ϕ_{m} as inferred from the data is an estimate, whose distribution is again obtained by simulation. The twophase model is fitted at ϕ_{m} and the fitted model is resampled to provide simulated data, from which a new estimate is determined. To simplify, is restricted to integer values 5, 6,... 44. The process is repeated ×200 for the stomach, liver, lung, pancreas and leukaemia. The resulting set of is compared with the range of latencies for which BC_{a} 95%CIs are strictly positive, and likewise with the range for which LRT_{2plin} exceeds the estimated critical values.
To test whether the distributions of LRT_{2pcon} and LRT_{2plin} depend on latency, null distributions at (β_{0},σ_{0},τ_{0}) = (0,0, arbitrary) are simulated (×500) for the stomach with ϕ = 5, 10, 15,... 40 and compared with the null distribution from simulation (×1000) at ϕ_{m}.
City, dose range, and gender
The models are refitted with "city" as an added control covariate. For each of the 5 cancers, the optimal latency Φ_{m} (restricted to 5, 6,... 44) is determined along with latency ranges over which BC_{a} 95% and 90% CIs for ERR_{1} are strictly positive.
The models including "city" are fitted to the 5  500 mSv subcohort. A limited analysis of the null distributions of LRT_{2pcon} and LRT_{2plin} including "city" is carried out on the 0  20 mSv and 5  500 mSv dose ranges. For each cancer at its Φ_{m}, H_{0}: (β_{0},σ_{0},τ_{0}) = (0,0, arbitrary) is used for simulation of LRT_{2pcon} and H_{0}: (β_{0},σ_{0},τ_{0}) = (,0, arbitrary) where is the fitted parameter for the linear model, is used for simulation of LRT_{2plin}. Gamma distributions fitted to the simulated LRT* are used to estimate 95% critical values and to assign pvalues to the LRT_{2pcon} and LRT_{2plin} arising from the observed data.
The models including "city" are fitted separately to the male and female subcohorts in the 0  20 mSv dose range for the stomach, liver, and lung.
Finally, the linear model including "city" is fitted to the 0  5 mSv, 5  20 mSv, 0  0.5 mSv, and 0.5  20 mSv data for the liver and lung.
The very extensive simulations of LRT were shared across a number of PC's.
Results
Table 1 shows the individual cancer sites with over 100 cases (deaths) in the subcohort.
To check R against Excel, the twophase model is fitted to the stomach data for latencies ϕ = 5, 6,... 44 and LRT_{2pcon,ϕ} computed in both programmes. The difference (LRT_{2pcon,ϕ,R})  (LRT_{2pcon, ϕ, Excel}) has a range of (0.079, 0.000), mean = 0.004, s.d. = 0.014. At the optimal latency ϕ_{m} = ϕ_{mabs} = 11.89, LRT_{2pcon, ϕ} = 21.1903 by either method. The parameter estimates in R are = 0.4586, = 120.0076, = 82.6729, confirming the point estimates reported in [2]. On that basis, the methods are taken as comparable. Results are also comparable when computing LRTbased confidence intervals.
For the linear model ERR = βD_{ϕ} with ϕ = 5, 6,... 44 only the liver and urinary tract showed significant results. Figure 1 shows LRT_{lincon,ϕ} against ϕ for these two cancers, while Figure 2 shows the fitted values ERR_{1} = when ϕ ≥ 30 with 95%CIs (Profile Likelihood, assuming LRT ~ χ^{2} on 1 d.f. for testing β = β_{0} against β ≠ β_{0}). For the liver with this model, ϕ_{m} = ϕ_{mabs} = 38.58, at which = 0.69 (0.25, 1.26). Of the 540 cases in the subcohort, 171 have ϕ > 38.58. For the urinary tract, ϕ is locally optimal at 41.32 at which = 2.14 (0.51, 5.12). Of the 103 cases in the subcohort, 23 have ϕ > 41.32.
The stomach, liver, lung, pancreas, and leukaemia were selected for further analysis as LRT_{2plin,ϕ} > 6 for some ϕ, an indication that the dose response may be nonlinear. For the pancreas and leukaemia, Figure 3 shows LRT_{2plin,ϕ} and LRT_{2pcon,ϕ} while Figure 4 shows ERR_{1,ϕ} = as estimated from fitting the twophase model at latency ϕ. Note that may differ greatly from the value obtained by fitting the linear model at ϕ. Related graphs for the stomach, liver, and lung were shown in [2], though their statistical significance is reconsidered below.
Table 2 shows the latencies ϕ_{mabs} and ϕ_{m} which maximise LRT_{2pcon,ϕ} without and with the constraint ERR_{1,ϕ} ≥ 0, the values of LRT_{2pcon,ϕ} and LRT_{2plin,ϕ} at ϕ_{m} along with the fitted parameters and corresponding point estimates of ERR_{1} and ERR_{0.025} and their bootstrap confidence intervals by the BC_{a} (5000 replications) and percentile (1000 replications) methods. For the stomach, liver, and lung, these confirm the point estimates in [2].
Coverage for the BC_{a} method is estimated for the stomach at 94% for ERR_{1} (376 of 400) and 93% for ERR_{0.025} (372 of 400). Coverage for the percentile method is estimated for the stomach using the 0  20 mSv Hiroshima subcohort with the simplified twophase model (10 year ageatexposure categories) giving 94.3% for ERR_{1} (377 of 400) and 93% for ERR_{0.025} (372 of 400). All CIs reported in Table 2 use the full model (5 year categories, both cities).
For the stomach with latency ϕ = 5, 6,... 21, Figure 5 shows the BC_{a} 95% confidence intervals (B = 5000) for ERR_{1} and ERR_{0.025}. Each of these CIs is strictly positive. Since the minimum timesinceexposure in the subcohort is 6.08 years, for ϕ = 5 or 6 the model involves no correction for latency and ERR is a function of the unmodified colon dose.
More generally, for each site (stomach, liver, lung, pancreas, leukaemia) there is a neighbourhood surrounding ϕ_{m} over which BC_{a} 95% CIs for ERR_{1} and ERR_{0.025} are strictly positive, which was evaluated restricting to integer latencies and using B = 5000 bootstrap replications. For the stomach, both CIs are strictly positive when 5 ≤ ϕ ≤ 21. For the liver, both CIs are strictly positive when 32 ≤ ϕ ≤ 43. For the lung, both CIs are strictly positive when 7 ≤ ϕ ≤ 21. For the pancreas, both CIs are strictly positive when 5 ≤ ϕ ≤ 11. For leukaemia, the 95% CI for ERR_{1} is strictly positive when 24 ≤ ϕ ≤ 26, but the 95% CI for ERR_{0.025} is strictly positive when 19 ≤ ϕ ≤ 36.
For each site, simulating data (200 sets) from the fitted model at ϕ_{m} and redetermining the optimal latency (restricting to integer values) from the simulated data shows the uncertainty in ϕ_{m} itself. For the stomach 5 ≤ ≤ 21 for 196 out of 200 simulated datasets, while 5 ≤ ≤ 21 for 187 out of 200 simulated datasets. For the liver, 32 ≤ ≤ 43 for 200 out of 200 simulated datasets, while 32 ≤ ≤ 43 for 197 out of 200 simulated datasets. For the lung, 7 ≤ ≤ 21 for 184 of 200, while 7 ≤ ≤ 21 for 173 of 200 simulated datasets (nb for the lung ϕ_{m} ≠ ϕ_{mabs}). For the pancreas the results are weaker, as 5 ≤ ≤ 11 for 157 of 200, and 5 ≤ ≤ 11 for 151 of 200 simulated datasets. For leukaemia 24 ≤ ≤ 26 for 132 of 200 and 24 ≤ ≤ 26 for 130 of 200, but 19 ≤ ≤ 36 for 188 of 200 and 19 ≤ ≤ 36 for 184 of 200 simulated datasets.
Additional File 1, Tables S1, S2, and S3 give the simulated null distributions of LRT for the twophase model with the null hypothesis H_{0}: (β, σ, τ) = (β_{0},σ_{0},τ_{0}) vs the alternative (β, σ, τ) ≠ (β_{0},σ_{0},τ_{0}), for the stomach, liver, lung, pancreas, and leukaemia. Additional File 1, Table S1 shows simulation for each cancer at the fitted values (β_{0},σ_{0},τ_{0}) = () and for randomly selected points in the neighbourhood N_{10} of the fitted values (see Methods). Additional File 1, Table S2 shows simulation for each cancer with σ_{0} = 0, where τ_{0} is arbitrary. Additional File 1, Table S3 shows additional simulation for the stomach when σ_{0} ≠ 0, but (β_{0},σ_{0},τ_{0}) is outside the neighbourhood N_{10}. For the simulations in S2, Additional File 1, Table S4 shows the results for the linear model and the comparison with the twophase model. These simulations are carried out at the optimal latency for the specific cancer.
As the control parameters are optimised subject to H_{0} before simulating LRT, perhaps other choices of control parameters while retaining H_{0} might yield different estimates of the null distribution of LRT subject to H_{0}. However, four sets of simulations (×500) for the stomach with (β_{0},σ_{0},τ_{0}) = (0, 0, arbitrary) using very different choices of control parameters and total cases expected, yielded fairly similar LRT* and fitted gamma distributions as shown in Table 3. Optimised control parameters for the observed data are shown in the last row. This is evidence that the null distribution of is approximately independent of the parameters used to simulate the data, provided that H_{0} holds. In the terminology of [7] (Chapter 4 p. 139) is an "approximate pivot" even though Wilks Theorem fails. In turn, this gives the basis for rejecting H_{0} given the observed data.
The simulated null distributions of from Additional File 1, Tables S1, S2, and S3 are then applied to estimate t_{.95}, and as defined in Methods. All simulations showed LRT and its fitted gamma distribution had 95^{th} percentiles qlrt(0.95) < 10 and qgamma(0.95) < 10. Thus t_{.95} = 10 is a conservative estimate throughout the parameter space, for all cancer sites considered here, and the choice of N_{10} is appropriate (see Methods).
Table 4 shows Profile Likelihood 95% confidence intervals for ERR_{1} and ERR_{0.025} at optimal latency ϕ_{m} as computed with the various estimated critical values.
Likewise from Additional File 1, Table S4 and using the notation in that file, all simulations of the null distribution of LRT_{2plin} showed qlrt2(0.95) < 8 and qgamma2(0.95) < 8 so T_{.95} = 8 is a conservative estimate throughout the parameter space, for all cancer sites considered here. For any given site is the supremum of qlrt2(0.95) and qgamma2(0.95) over simulations for that site. is the supremum of qgamma2(0.95) over simulations for that site. Likewise and are the suprema of qgamma2(0.99) and qgamma2(0.999) over simulations for that site. Table 5 shows these critical values and the observed value of LRT_{2plin} (see Table 2) for each site.
Null distributions of and LRT_{2plin} for the stomach with ϕ = 5, 10,... 40 show little variation with latency and are similar to those obtained at ϕ_{m} as shown in Table 6.
Simulation of the optimal latency as described earlier also has implications for nonlinearity. Again restricting to integral latencies, for the stomach LRT_{2plin} > 9.4 when 5 ≤ ϕ ≤ 21 and 196 of 200 simulated datasets gave in this range. For the liver LRT_{2plin} > 16.7 when 34 ≤ ϕ ≤ 38 (198 of 200 simulations). For the lung LRT_{2plin} > 7.7 when 7 ≤ ϕ ≤ 21 (184 of 200 simulations). For leukaemia LRT_{2plin} > 9.4 when 17 ≤ ϕ ≤ 36 (191 of 200 simulations). Each of these results shows nonlinearity (see Table 5) and for the stomach and liver the evidence is very strong. For the pancreas, LRT_{2plin} > 9.5 when 7 ≤ ϕ ≤ 11 but only 156 of 200 simulations give in this range.
The covariate "city" has little impact on the estimates of ERR when taken as an additional control. For the liver including "city", BC_{a} 95%CIs for ERR_{1} and ERR_{0.025} are strictly positive when 29 ≤ ϕ ≤ 43, again restricting to integral latencies. With "city" included in both models LRT_{2pcon} attains its maximum subject to ERR_{1} ≥ 0 when ϕ_{m} = 37 and LRT_{2pcon} = 28.41. At ϕ_{m} = 37, ERR_{1} = 1.26 with BC_{a} 95%CI (0.61, 2.14) and ERR_{0.025} = 0.68 with BC_{a} 95%CI (0.18, 1.25). For comparison, without "city" the optimal integral latency is ϕ_{m} = 37, at which ERR_{1} = 1.22 with BC_{a} 95%CI (0.59, 2.11) and ERR_{0.025} = 0.74 with BC_{a} 95%CI (0.23, 1.43). As a test of nonlinearity (with "city" included in both models) LRT_{2plin} > 6.8 when 29 ≤ ϕ ≤ 41, and LRT_{2plin} > 18.15 when 34 ≤ ϕ ≤ 38, whereas without "city" LRT_{2plin} > 6.86 when 32 ≤ ϕ ≤ 41 and LRT_{2plin} > 16.73 when 34 ≤ ϕ ≤ 38.
Table 7 includes results with "city" for the 0  20 mSv dose range, for all 5 cancers. It also shows that fitting the twophase model with "city" to the 0  20 mSv and 5  500 mSv dose ranges gives comparable results for ERR_{1} and latency, for the stomach, liver, lung, and leukaemia.
A limited analysis by simulation of LRT for the model with "city" was carried out over the 0  20 mSv and 5  500 mSv dose ranges, focused at the optimal (integral) latencies on the null distributions of LRT_{2pcon} and LRT_{2plin} at (β_{0},σ_{0},τ_{0}) = (0, 0, arbitrary) and (β_{0},σ_{0},τ_{0}) = (, 0, arbitrary) where is obtained by fitting the linear model. Results are shown in Table 8 and the fitted gamma distributions are used to assign pvalues to the observed LRT in Table 7.
From Table 8 (with "city") and Additional File 1, Table S4 (without "city"), the criterion LRT_{2plin} > 6 is a test of nonlinearity at 90% level. Accordingly, Table 7 also shows the latency ranges surrounding Φ_{m} for which LRT_{2plin} > 6.
Other aspects of the analysis (simulated null distributions of LRT_{2pcon} and LRT_{2plin} across the parameter space and simulated variation in optimal latency ϕ_{m}) were not repeated with "city".
For the stomach, liver, and lung Table 9 shows the results when the twophase model with "city" is fitted separately to the male and female 0  20 mSv subcohorts. For the pancreas and leukaemia, there were too few cases to split the data by gender.
Table 10 shows the linear model with "city" fitted to the 0  20 mSv, 0  5 mSv, 5  20 mSv, 0  0.5 mSv, and 0.5  20 mSv subcohorts for the liver and lung.
Discussion
This paper develops and partially corrects the approach in [2] and the two should be read in conjunction. I focus initially on the 0  20 mSv dose range, which provides over 60% of the personyears of observation in the Life Span Study 12 cohort. Over the full data range 0  8000 mSv and the truncated range 0  4000 mSv the dose response for cancer mortality is known to be approximately linear and risk estimates for 1000 mSv have been obtained through the ongoing Life Span Study project. The ICRP recommendations for radiation protection are then based on linear extrapolation, as the Excess Relative Risk from 10 mSv is presumed to be 0.01 times the ERR from 1000 mSv.
If we were confident that the approximate linearity of the dose response extends to the low dose region, then linear extrapolation would be justified. But this is not known, and the response may be approximately linear over a wide dose range but highly nonlinear at lower doses. There are many examples of nonlinear dose response in radiation cell biology [16, 17].
Thus, if the Japanese data is used to derive risk estimates for doses such as 10 mSv or 1mSv which are directly relevant to occupational and public exposure, it makes sense to consider the 0  20 mSv data in its own right and in relation to wider dose ranges, rather than to pool the data and apply linear extrapolation.
Secondly, a latency parameter ϕ is used to assess the delay between exposure and cancer mortality. Although it would be better to allow latency to modify dose by some smooth function rather than an abrupt switch from no effect to full effect when TimeSinceExposure passes ϕ, this is much harder to compute and was not attempted here.
As pointed out during peer review, we should consider at the outset whether the data is sufficient for fitting a nonlinear model based on mean dose. Figure 6 shows scatterplots of py observation against mean dose, for the ranges 0  500 mSv, 5  20 mSv, 0  5 mSv, and 0  0.5 mSv. Clearly the 0  500 mSv and 5  500 mSv data contains a wide variety of doses. The 5  20 mSv data shows a spread of mean doses. Whilst the 0  5 mSv data is concentrated below 1 mSv, there is a spread of mean doses in that region. Each dose range determines a very large dataset.
The fact that for the stomach, liver, lung, and leukaemia rather similar results for ERR_{1} and latency are obtained from fitting the twophase model across the 5  500 mSv dose range as from fitting across the 0  20 mSv dose range suggests that there is also sufficient information in the 0  20 mSv data to justify fitting this model. The stomach, with the largest number of cases, shows fairly tight confidence intervals for Excess Relative Risk over a broad range of latencies, which supports the validity of the model.
If the dose response were linear, similar estimates of its slope (β) should be obtained over different dose ranges. As the liver and lung data illustrate (Table 10), fitting the linear model separately to subcohorts of the 0  20 mSv dose range yields very different estimates of the dose response. For the liver, there are significant results at comparable latencies in the 0  20 mSv dose range and all the subranges, but the estimates of β from doses below 5 mSv (or below 0.5 mSv) are completely different from those obtained from doses above 5 mSv (or 0.5 mSv), in fact around 20 times higher when latency = 36 years. For the lung, the linear model detects some significant activity in restricted dose ranges, but the results are very different, and combining the two dose ranges yields insignificant results. By contrast, see Table 7, the twophase model shows significant results for the lung with the 0  20 mSv data for latencies from 10 to 21 years.
It is not unusual for a model to achieve significance on portions of the data but fail to do so when they are combined. Even univariate Ordinary Least Squares regression on a bimodal independent variable may illustrate this, as in Figure 7 with synthetic data.
As discussed in [2] the graphs of ERR against latency show a clear separation into positive and negative regions, as is also apparent with the linear model over restricted dose ranges for the liver and lung. For any given model, the "optimal latency" ϕ_{m} identifies the time lag which maximises LRT within the region of nonnegative ERR_{1}. The existence of a significant raised risk is of interest whether or not other latencies show significant decrease in risk, with possibly higher LRT. Frequently ϕ_{m} coincides with ϕ_{mabs} which maximises LRT without constraint, but the distinction is relevant for the lung (Table 2) and more generally when simulating the variation in latency.
The point estimates of optimal latency ϕ_{m} and ERR obtained previously are confirmed by Table 2. The next stage of analysis concerns confidence intervals for ERR at the optimal latency. Statistical inference in [2] was based on the Likelihood Ratio Test but it was wrongly assumed, as Little [3] pointed out, that Wilks theorem held for the nonlinear models so that LRT would be asymptotically χ^{2} distributed, if the null hypothesis (specifying parameters in the model) were true.
One of the regularity conditions required for Wilks Theorem is that distinct values of the model parameters produce distinct probability distributions. In the twophase model when σ = 0, ERR = βD_{ϕ} irrespective of the value of τ. The asymptotic properties of models with an indeterminate parameter (which cannot be identified from the distribution) have been analysed [18–21] but I was unable to use these methods and therefore took an intensive computational approach, which applies to the given set of records rather than asymptotically.
The parametric bootstrap is an established technique for finding confidence intervals in the absence of any information on the distribution of LRT. The percentile method refits the model to simulated data generated by samples from the model whose parameter values were obtained by fitting to the actual data. The BC_{a} method adjusts for the fact that simulation may produce biased estimates of the statistic of interest (in this case ERR) and of its variance. Both methods gave good coverage with almost identical rates when tested on the stomach data, and comparable confidence intervals for all the sites analysed. The BC_{a} method is much faster and generally preferable.
However, the underlying assumption of the parametric bootstrap is open to question. Perhaps the observed data arose from sampling at parameter values other than those obtained by fitting the model to that data. Confidence Intervals based on LRT avoid this assumption but depend on the critical value t_{.95} used to reject the null hypothesis if LRT > t_{.95}.
If Wilks Theorem had applied, then LRT for testing H_{0}: (β, σ, τ) = (β_{0},σ_{0},τ_{0}) vs. the alternative (β, σ, τ) ≠ (β_{0},σ_{0},τ_{0}) would have the asymptotic distribution LRT ~ χ^{2} on 3 d.f. if H_{0} holds, for any H_{0}. Thus t_{.95} = F^{1}(0.95) = 7.8147 where F is the cumulative χ^{2} on 3 d.f. would be the appropriate critical value, rejecting H_{0} if LRT > t_{.95}.
In the absence of any theoretical prediction, the null distribution of LRT can still be estimated by Monte Carlo simulation but may depend on the choice of H_{0} as seen in Additional File 1, Tables S1, S2, S3 and S4. In fact Additional File 1, Tables S2 and S3 show the null distribution of departs from χ^{2} on 3 d.f. as σ_{0} → 0, while Additional File 1, Table S4 shows that the null distribution of LRT_{2plin} departs from χ^{2} on 2 d.f. everywhere. Table 3 indicates that the simulated null distribution of is fairly independent of the parameters for the control covariates, and depends only on H_{0}: (β, σ, τ) = (β_{0},σ_{0},τ_{0}). The null distributions when σ_{0} = 0 are quite different from those when σ_{0} >> 0, as expected since it is σ_{0} = 0 which causes the indeterminacy of the model and the failure of Wilks Theorem on this hyperplane. Gamma distributions give a reasonable fit to the simulated null distribution of in general (see Additional File 1, Tables S1, S2 and S3). When H_{0} has σ_{0} = 0, typical shape and rate parameters are s ~ 2.24 and r ~ 0.63, while for σ_{0} >> 0 s ~ 1.5 and r ~ 0.5 as predicted by Wilks Theorem. Figure 8, the probablity plot for the stomach with (β_{0},σ_{0},τ_{0}) = (0,0, arbitrary) shows the failure of Wilks Theorem and the close fit of simulated = LRT_{2pcon} with a gamma distribution (s = 2.418, r = 0.604).
The KolmogorovSmirnov test against the fitted distribution is extremely sensitive and even when it distinguishes LRT from the fitted gamma, a probability plot may show close agreement. For example for the liver, with (β_{0},σ_{0},τ_{0}) = (7.297, 0, arbitrary), the KS test has D = 0.058 with simulated p = 0.001 but the plot (Figure 9) is fairly close.
Likewise, fitted gamma distributions generally give a reasonable fit to the null distribution of LRT_{2plin} with typical parameters s ~ 1.79, r ~ 0.68.
For the linear model, Additional File 1, Table S4 shows simulation of the null distribution of generally conforms to a χ^{2} distribution on 1 d.f. as expected from Wilks theorem. However as β_{0} → 0.5, the null distribution approaches a γ distribution with shape = 0.5, rate = 1. Since β_{0} = 0.5 specifies a boundary of the parameter space, the theory of mixture models [22] predicts, at the boundary, the null distribution of ~ 1/2 χ^{2} (1 d.f.) = 1/2 γ (0.5,0.5) = γ (0.5,1). The confidence intervals shown in Figure 2 for liver and urinary cancers involve β values far from the boundary and Wilks theorem is valid in this region.
For all simulations and all cancer sites (Additional File 1, Tables S1, S2 and S3) the 95^{th} percentile of the fitted gamma distributions and the 95^{th} percentile of the simulated itself are all below 10. Additional File 1, Tables S1, S2, and S3 summarise nearly 50,000 simulations of LRT at a wide variety of randomly chosen values of H_{0} . On that basis, t_{.95} = 10 is chosen as a conservative estimate applying throughout the parameter space for all cancer sites considered here. H_{0} can be rejected whenever > 10, i.e. whenever (β_{0},σ_{0},τ_{0}) is outside the neighbourhood N_{10} of the fitted parameters (). N_{10} is thus an appropriate neighbourhood, as defined in Methods.
Inside this neighbourhood Additional File 1, Table S1 shows gamma distributions generally fit the simulated very well, justifying the definition of from fitted gamma distributions within N_{10}. The intermediate choice is based on simulations inside and outside N_{10} and uses both the simulations and the fitted distributions to adjust the critical value upwards as either may be underestimates.
For the stomach, liver, and lung, confidence intervals for ERR obtained by the bootstrap (Table 2) or from simulation of (Table 4) are broadly similar, whichever critical value or bootstrap method is chosen. LRT intervals using show closer agreement with the bootstrap methods. For these 3 cancers, all the LRTbased intervals for ERR_{1} and ERR_{0.025} exclude 0 as do the bootstrap intervals, which are tighter. This validates the claim in [2] that ERR_{1} is significantly elevated for the stomach, liver, and lung.
All five cancers show significant nonlinearity of the dose response by LRT comparison of the twophase and linear models (Table 5). For the stomach, liver, and leukaemia the twophase model is preferable with p << 0.001, while for the lung and pancreas p < 0.01.
The previous discussion concerns behaviour at the optimal latency ϕ_{m}, an unknown parameter estimated from the data. BC_{a} intervals, comparable at ϕ_{m} to CIs obtained by other methods, were used to investigate ERR across a range of latencies, restricted to integers 5, 6,... 44. Each cancer site has latency ranges in which BC_{a} CIs for ERR_{1} and ERR_{0.025} are strictly positive. Figure 5 illustrates this for the stomach and also shows that ERR is significantly raised when the dose is not modified by latency (ϕ = 6 is below the minimum timesinceexposure in the 0  20 mSv cohort). With the linear model, liver and urinary cancers are significantly elevated over a range of long latencies, showing similar variation (Figures 1 and 2).
Fitting the twophase model at ϕ_{m}, simulating new data and finding the simulated optimal latency gives an estimate of the probability that the true value of ϕ_{m} lies in any given region. For the liver and stomach, this approach gives very strong evidence that ϕ_{m} lies in the regions where ERR_{1} and ERR_{0.025} are strictly positive, and where the twophase model is clearly superior (by LRT) to the linear model. For the lung, the evidence on these points is still valid, though weaker. For leukaemia, there is strong evidence that ϕ_{m} lies in the region where the dose response is nonlinear, and likewise in the region where the 95% CI for ERR_{0.025} is strictly positive. However, with the variation in ϕ_{m} we cannot be confident that the 95% CI for ERR_{1} is strictly positive.
Many results for the pancreas are considerably weaker than for other sites. Even at the optimal latency, the confidence intervals are extremely wide. With the BC_{a} method, the lower limit is attained as the minimum bootstrap value, and is assigned a probability > 0.025. The "bias" and "acceleration" of the BC_{a} bootstrap sample are both large, unlike those for other sites. When the uncertainty in the optimal latency is determined by simulation at ϕ_{m} we cannot be confident that the true optimal latency for the pancreas lies within the range where ERR > 0 or where nonlinearity is detected by LRT_{2plin}.
Up to this point, the primary aim was to test the validity of the conclusions in [2], and accordingly "city" was not included as a control covariate and the dose range remained at 0  20 mSv. A limited reanalysis including "city" in the baseline model and comparing the 0  20 mSv and 5  500 mSv dose ranges is shown in Tables 7 and 8. The higher dose range was chosen to include 10 mSv, but to exclude the lowest dose category in the LSS12 dataset. Above 5 mSv the revised dosimetry DS02 generally agrees with the DS86 dosimetry used here (and in [2]). Comparing these ranges tests whether the results from 0  20 mSv simply reflect some anomaly in the lowest dose category. With the exception of the pancreas, there is a striking similarity between results from the 0  20 mSv and 5  500 mSv dose ranges, which show comparable estimates of the optimal latency, latency ranges, and the Excess Relative Risk from 10 mSv.
I do not know if any other approach to baseline risk may explain these effects. However, significant nonlinearity and positive ERR are also found for the stomach, liver and lung when the data is restricted to male or female subcohorts, with "city" included in the model (Table 9). Any possible interaction effects involving gender would disappear with the allmale or allfemale data. As mentioned in [2] the use of log mean attained age was an adequate alternative to the full set of attained age categories, and alternative controls affected the estimates of ERR by no more than a factor of 2.
The results at 0.25 mSv (ERR_{0.025}) show that the risk detected here arises at extremely low doses. If there are doubts about the accuracy of the DS86 dosimetry, the question remains as to why the DS86 dose is a significant predictor of risk.
Overall, the results have several possible interpretations. They may reflect confounding by other covariates not shown in the public dataset, which may be correlated with the external dose used here (and in LSS12) and may also interact with radiation and may vary over time. Differences between the rural and urban populations may contribute to the nonlinear effects. Or, these may be caused by gross underestimates of the external dose (unlikely) or by internal doses arising, for example, from "black rain". Such interpretations would be specific to the Japanese cohort, and would therefore cast doubt on using this data to predict radiation response elsewhere.
However, the results may also indicate that low doses of radiation do impact on cancer mortality, provided that latency is included in the model and the dose response is not constrained to be linear. In radiation cell biology, nonlinear dose response is known for the bystander effect using a variety of endpoints and the twophase model was proposed as a simplified form of a model for the bystander effect [16].
This paper does not reconsider any of these possible interpretations [2]. However, I hope it establishes the statistical validity of this analysis of the available public data for the 19501990 mortality cohort.
Conclusion
This reanalysis validates the main conclusions of [2]. Bootstrap methods and Monte Carlo simulation of LRT show that Excess Relative Risk for cancer mortality in Japanese Abomb survivors exposed to external radiation doses below 20 mSv is positive, large, and significant for various cancers, as detected by models incorporating latency. The dose response is highly nonlinear for the stomach, liver, lung, pancreas, and leukaemia. In each case the twophase model shows large Excess Relative Risk at 10 mSv external dose lagged by the optimal latency for the cancer. LRTbased 95% Confidence Intervals are strictly positive, except for leukaemia. Bootstrap BC_{a} 95% CIs are strictly positive for all five cancers over a range of latencies. Large, positive Excess Relative Risk is also found when the male and female data is analysed separately for the stomach, liver and lung.
When the optimal latency varies by simulation, the stomach, liver, lung and leukaemia still show nonlinear dose response, and likewise ERR > 0 at 95% level for the stomach, liver, lung.
With "city" included as a control covariate in the twophase model, similar estimates of latency and ERR at 10 mSv are obtained for the stomach, liver, lung, and leukaemia whether the dose range is 0  20 mSv or 5  500 mSv. Dose response for the liver, lung, and leukaemia is significantly nonlinear in the 5  500 mSv range, particularly for the lung. Such results cannot be explained by any anomaly of the 0  5 mSv data.
The linear model finds significant results for the liver and urinary tract over a range of long latencies.
This analysis of cancer mortality in Japanese Abomb survivors exposed to low doses of external radiation in 1945 shows significant nonlinearity of dose response and significant large Excess Relative Risk over a range of latencies. These findings do not support the current ICRP recommended annual occupational dose limit of 20 mSv.
Abbreviations
 BC_{a} :

Biascorrected accelerated
 CEDR:

Comprehensive Epidemiological Data Resource
 CI:

Confidence Interval
 DS02:

Dosimetry System 2002
 DS86:

Dosimetry System 1986
 ERR:

Excess Relative Risk
 ICRP:

International Commission on Radiological Protection
 KS:

KolmogorovSmirnov
 LF:

Least Favourable
 LRT:

Likelihood Ratio Test
 LSS:

Life Span Study
 MLE:

Maximum Likelihood Estimate
 mSv:

milliSievert
 py:

personyears
 RERF:

Radiation Effects Research Foundation.
References
 1.
ICRP: 1990 Recommendations of the International Commission on Radiological Protection Publication 60, Annals of the ICRP 21 (13). 1991, Oxford
 2.
Dropkin G: Low dose radiation and cancer in Abomb survivors. Environ Health. 2007, 6: 110.1186/1476069X61.
 3.
Little Mark: Statistical problems with analysis of Dropkin. [http://www.ehjournal.net/content/6/1/1/comments#260548]
 4.
Pierce DA, Shimizu Y, Preston DL, Vaeth M, Mabuchi K: Studies of the Mortality of Atomic bomb survivors. Report 12, Part 1, Cancer: 19501990. Radiat Res. 1996, 146 (1): 127. 10.2307/3579391.
 5.
Dudley R: MIT OpenCourseWare  Mathematics  18.466 Mathematical Statistics, Spring 2003  Lecture Notes. [http://ocw.mit.edu/OcwWeb/Mathematics/18466MathematicalStatisticsSpring2003/LectureNotes/]
 6.
Efron B, Tibshirani RJ: An Introduction to the Bootstrap. 1993, New York: Chapman & Hall
 7.
Davison AC, Hinkley DV: Bootstrap Methods and their Application. 1997, Cambridge: Cambridge University Press
 8.
R Development Core Team: R: A language and environment for statistical computing. 2007, Vienna, Austria: R Foundation for Statistical Computing, [http://www.Rproject.org]
 9.
RERF (Radiation Effects Research Foundation). [http://www.rerf.or.jp/]
 10.
CEDR (Comprehensive Epidemiological Data Resource) Data File Set JALSSA03 File ID: R12CANC. [http://cedr.lbl.gov/cgibin/spiface/find/cedrfile/?FILE=1040], accessed 19 September 2002
 11.
R source code based on Nash JC: Compact Numerical Methods for Computers: Linear Algebra and Function Minimisation. 1990, CRC Press
 12.
DiCiccio TJ, Efron B: Bootstrap Confidence Intervals. Stat Sci. 1996, 11 (3): 189228. 10.1214/ss/1032280214. [http://projecteuclid.org/DPubS/Repository/1.0/Disseminate?view=body&id=pdf_1&handle=euclid.ss/1032280214]
 13.
Pinheiro JC, Bates DM: Mixedeffects models in S and SPLUS. 2000, New York: Springer
 14.
Venables WN, Ripley BD: Modern Applied Statistics with S. 2002, New York: Springer, 4
 15.
Geyer C: KolmogorovSmirnov and Lilliefors Tests. lecture notes Statistics. 5601, [http://www.stat.umn.edu/geyer/5601/examp/kolmogorov.html#onetest] , Univ of Minnesota
 16.
Brenner DJ, Little JB, Sachs RK: The bystander effect in radiation oncogenesis: II. A quantitiative model. Radiat Res. 2001, 155 (3): 402408. 10.1667/00337587(2001)155[0402:TBEIRO]2.0.CO;2.
 17.
Geras'kin SA, Oudalova AA, Kim JK, Dikarev VG, Dikareva NS: Cytogenetic effect of low dose γradiation in Hordeum vulgare seedlings: nonlinear doseeffect relationship. Radiat Environ Biophys. 2007, 46: 3141. 10.1007/s004110060082z.
 18.
Davies RB: Hypothesis Testing When a Nuisance Parameter is Present Only Under the Alternative. Biometrika. 1977, 64 (2): 247254. 10.2307/2335690.
 19.
Davies RB: Hypothesis Testing When a Nuisance Parameter is Present Only Under the Alternatives. Biometrika. 1987, 74 (1): 3343. 10.1093/biomet/74.1.33.
 20.
Cheng RCH, Traylor R: NonRegular Maximum Likelihood Problems. J R Stat Soc Series B Stat Methodol. 1995, 57 (1): 344. [http://www.jstor.org/pss/2346086]
 21.
Andrews DWK, Ploberger W: Admissibility of the Likelihood Ratio Test When a Nuisance Parameter is Present Only Under the Alternative. Ann Stat. 1995, 23 (5): 16091629. 10.1214/aos/1176324316.
 22.
Self GS, Liang KY: Asymptotic Properties of Maximum Likelihood Estimators and Likelihood Ratio Tests under Nonstandard Conditions. J Am Stat Assoc. 1987, 82 (398): 605610. 10.2307/2289471.
Acknowledgements
The very extensive simulations were shared out across a bank of PCs. Special thanks to Tom Bimpson, Phil Boyd, Jeremy Hawthorne, Peter Mayne, Paul Grunnill, Helen Marks, Paul Mayne, and Martin Grunnill for their help. Thanks to Chris Roberts for suggesting the use of R and the bootstrap, and to David Richardson for continuing correspondence and encouragement. Thanks to Duncan Thomas for pointing out the uncertainty in optimal latency, and for other criticisms during peer review. Thanks to Mark Little for criticisms of [2] which provoked this paper.
This work was not funded.
The data described in this report were obtained from the Radiation Effects Research Foundation (RERF) via the Comprehensive Epidemiologic Data Resource. RERF is a private foundation that is funded equally by the Japanese Ministry of Health and Welfare and the US Department of Energy through the US National Academy of Sciences. The conclusions reached in this report are mine and do not necessarily reflect the scientific judgement of RERF or its funding agencies.
Author information
Additional information
Competing interests
The author declares that they have no competing interests.
Authors' contributions
GD conceived of and designed the study and carried out the statistical analysis.
Electronic supplementary material
Authors’ original submitted files for images
Below are the links to the authors’ original submitted files for images.
Rights and permissions
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Received
Accepted
Published
DOI
Keywords
 Likelihood Ratio Test
 Null Distribution
 Excess Relative Risk
 Life Span Study
 Optimal Latency
Comments
By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate. Please note that comments may be removed without notice if they are flagged by another user or do not comply with our community guidelines.