An initial analysis of latency is carried out for all cancer sites with at least 100 cases (deaths) in the 0 - 20 mSv subcohort. Sites which appear to have a raised dose response are then analysed by the bootstrap and the distribution of LRT is estimated by simulation. A fuller simulation is carried out for the stomach, the site with the largest number of cases in this subcohort. The analysis focuses on the behaviour of the two-phase model (defined below) and its relation to the linear model. It aims to compare confidence intervals obtained by bootstrap and LRT simulation methods and to discover without relying on Wilks Theorem, whether the apparent elevation and non-linearity of the dose response in the 0 - 20 mSv subcohort are statistically significant. Further analysis considers variation in latency, the impact of "city", male and female subcohorts, and the 5 - 500 mSv dose range. Computations use the statistical freeware R[8], giving a further check on previous results. Possible sources of error in the Japanese data itself and the potential for confounding by other covariates as discussed in [2], are not reconsidered here.
As previously source data for the 1950-1990 mortality cohort LSS12 was obtained from RERF [9] via the CEDR [10], and the 0 - 20 mSv subcohort defined by restricting the weighted adjusted colon dose. The subcohort comprises 3011 data cells with 1690391.75 p-y. The control model, linear model, and two-phase model are defined as in [2]. Briefly, the control model is log-linear in 14 indicator variables for 5 year age-at-exposure categories, an indicator for gender, and the numerical variable log mean attained age, each of which is well-defined for the data cells. The control model is specified as λ = λ0 = exp(α+Σβjxj) where α and the βj are unknown parameters, xj are the covariates, exp denotes exponential, and the Poisson parameter in cell i with Ti person-years is λiTi where λi is evaluated using the covariates in cell i. At this stage, as in [2], "city" (Hiroshima or Nagasaki) is not included as a covariate.
Other models are defined by λ = λ0(1+ERR) where ERR depends only on Dϕ, the radiation dose lagged by a latency parameter ϕ. Dϕ is defined as the weighted adjusted colon dose in 10 mSv units when time-since-exposure ≥ ϕ, and 0 otherwise.
For the linear model, ERR = βDϕ
For the two-phase model, ERR = βDϕ + σDϕexp(-τDϕ) where exp denotes exponential.
The models require 1+ERR > 0 in all cells so that the Poisson distributions are defined. For the two-phase model τ > 0, slightly modifying the previous approach where τ ≥ 0. However, τ = 0 reduces to the linear model as does σ = 0.
In fitting a model at latency ϕ by Poisson regression, the unknown parameters α,βj and β, σ, τ (as appropriate to the model) are chosen to maximise the likelihood of the observed data assuming independent Poisson distributions in each cell. Equivalently, fitting minimises the sum over the 3011 data cells of Ei - Oiln(Ei), where Oi and Ei = λiTi are the observed and expected number of cases in cell i.
Latency ϕ is fixed during the minimisation. The models depend smoothly on all other parameters. ERRD,ϕ denotes the Excess Relative Risk evaluated at D for the fitted model with latency ϕ. Note ERR1,ϕ is the estimated risk at 10 mSv lagged by ϕ years. For model I nested within model J, LRTJ-I,ϕ is the Likelihood Ratio Test computed at latency ϕ.
Code written in R emulates the previous model fitting with Excel-Solver, using the Newton-Raphson minimisation routine "optim" [11]. Results for the stomach at latency ϕ = 5, 6,... 44 are checked against the values obtained previously.
At a given cancer site, the control, linear, and two-phase models are fitted at latency ϕ = 5, 6,... 44 to give LRTJ-I,ϕ for each pair of nested models: linear against control, two-phase against control, and two-phase against linear. Sites are selected for further analysis if LRTlin-con,ϕ > 4 for some ϕ, or if LRT2p-lin,ϕ > 6 for some ϕ (not assuming that this criterion establishes statistical significance). For sites meeting the criterion for the linear model, Profile Likelihood Confidence Intervals for β = ERR1,ϕ are computed using Wilks Theorem at integer values of ϕ. For sites meeting the criterion for the two-phase model, ϕ = ϕmabs is chosen to maximise LRT2p-con,ϕ (not LRT2p-lin,ϕ) so that fitting at ϕmabs gives the absolute Maximum Likelihood Estimate (MLE) for the model. The "optimal latency" ϕm is chosen to maximise LRT2p-con,ϕ subject to the constraint ERR1,ϕ ≥ 0. Bootstrap confidence intervals and LRT simulations are then computed at ϕm . Details are given in the next two more technical subsections, and examples with commentary are supplied with Additional File 1. The method for taking account of uncertainty in the "optimal latency" is outlined in a final subsection.
Bootstrap
Fitting the two-phase model at ϕ = ϕm gives a maximum likelihood estimate
for the vector of all 20 model parameters and determines fitted values of λi in each data cell. The parametric bootstrap assumes the observed data arose from sampling this fitted model, and considers the variation arising if other data had been sampled from that same model.
Sampling the independent Poisson distributions with parameters λiTi gives simulated observations in each data cell. Fitting the two-phase model to this simulated data gives simulated parameter values including
,
and
, and therefore a simulated value of ERR = ERRD,ϕ =
). The entire process is repeated B times to give the bootstrap replications ERR*.
There are many well-established methods to obtain confidence intervals from bootstrap replications [7, 12]. The two options chosen here are the original "percentile" method, which is easy to describe, and the "Bias-corrected accelerated" method which is much faster to compute, and in general more accurate.
The 1-2α "percentile" confidence interval has endpoints
where
denotes the Bαth ordered value of ERR*. With B = 1000 replications and α = 0.025, the endpoints of the 0.95 percentile interval are the 25th and 975th order statistics of ERR*.
For the Bias-corrected accelerated (BCa) confidence interval (see [7] Chapter 5 pp. 203-207 and p.249) simulation is again carried out using λiTi. However, the simulated observations are now used to fit the model to the "Least Favourable family" of distributions, obtained by restricting the parameter vector ψ to a line parametrised by ζ, passing through the MLE
and defined by
where
Here i-1(
) is the inverse of the Fisher Information evaluated at
and
is the gradient of the function h which defines the variable of interest (in this case ERR) in terms of the model parameters ψ.
For each bootstrap replication, fitting the simulated data along the LF family determines ζ and thus the simulated value ERR =
. If ERR*
LF
denotes the resulting set of B replications, the BCa confidence interval has endpoints
where
Here Φ is the standard normal cumulative distribution and z(α) is the 100αth percentile point of a standard normal distribution. The parameters w and a are estimated from the bootstrap replicates. The bias- correction w is computed as
i.e. the fraction of bootstrap replications along the LF family which are below the original fitted value of ERR, in normal units.
The acceleration a is computed as
where m3 and m2 are the 3rd and 2nd moments of
, the partial derivative of the log likelihood of the simulated data with respect to ζ, evaluated at ζ = 0.
Code for this calculation is shown in Additional File 1, Code File AC1.txt, which calls Additional File 1, Data Files stomdat1.txt, lambda2.txt and theta2.txt. Additional File 1, Commentary File ACom1.doc discusses this code.
Despite the intricate formulas, computation is much faster for BCa than for the simpler percentile method, because optimisation is now restricted to a line within the full 20-dimensional parameter space. The percentile CIs are computed at B = 1000, the recommended minimum number, while BCa results are obtained at B = 5000.
Coverage for the percentile and BCa methods is tested using the stomach data. For each of 400 sets of simulated observations generated by sampling the Poisson distributions determined by the original fitted parameters, the model is refitted to the simulated data. Confidence intervals for the simulated ERR are obtained by the bootstrap methods (B = 1000) using the refitted parameters and counted as 1 or 0 if they contain or exclude the "true" value of ERR from the original fitted parameters. To streamline these computations of coverage for the percentile method, data is restricted to the Hiroshima subcohort (N = 1536 cells) and the model simplified to use 10-year rather than 5-year age-at-exposure categories. Estimating coverage for the BCa method uses the full 0 - 20 mSv dataset and model. The application of both methods to compute CIs from the observed data uses the full dataset and model.
Simulation of LRT
The Likelihood Ratio Test (LRT) for comparing nested models H0 ≤ H1 (H0 contained within H1) given the observed data
is LRT = 2[ℒ(H1,
) - ℒ(H0,
)] where ℒ(Hi,
) is the log likelihood of the observed data after fitting the model Hi. In well-behaved situations where Wilks theorem [5] applies, the asymptotic distribution of LRT is known, provided the data was drawn from a distribution which satisfies the null hypothesis H0. In particular, an appropriate critical value t0.95 can be chosen for rejecting H0 if LRT > t0.95 and this will result in a 5% error rate of rejecting H0 when it is true. Typically H0 fixes the values of d parameters in the full model H1 which allows those parameters to vary freely in an open neighbourhood. In this case Wilks theorem says that LRT is asymptotically χ2 distributed on d degrees of freedom if the observed data was generated by the model with those d parameters fixed as specified. On that basis, the null hypothesis can be rejected if LRT > t0.95 = F-1(0.95) where F is the cumulative of a χ2 distribution on d degrees of freedom.
Wilks theorem depends on many regularity conditions for the models H0 and H1. If these fail, LRT may not be χ2 distributed; its distribution may depend on H0 rather than just on the number of specified parameters, and there may be no simple universal 95% critical value for rejecting H0. Simulation aims to estimate the distribution of LRT and then to obtain conservative critical values which are sufficient to reject H0. Confidence intervals for functions of the parameters are constructed using these critical values.
For the two-phase model
is the Likelihood Ratio Test for the hypothesis H0: (β, σ, τ) = (β0,σ0,τ0) vs. the alternative (β, σ, τ) ≠ (β0,σ0,τ0), i.e. H1 is the union of H0 with its alternative, and leaves (β, σ, τ) free to vary, subject only to the defining conditions of the two-phase model. For various choices of H0 the null distribution of
assuming H0 holds is sampled by Monte Carlo simulation (see [7] Chapter 4; for another context see [13] p.84) as follows.
One) The model is fitted subject to H0. The resulting parameter estimates give fitted values
and therefore
Ti for each cell, subject to H0.
Two) Poisson distributions with parameters
Ti are sampled independently to give simulated cases
in each cell
Three) The model is fitted to the simulated data
subject to H0, yielding new fitted values
for each cell. Put
.
Four) The model is fitted to the simulated data
with (β, σ, τ) allowed to vary freely within the parameter space, yielding new fitted values
for each cell. Put K =
. The simulated value of
is defined as LRT* = K0 - K. Note that up to a constant which depends on the simulated data
but not on the model, K0 = -2ℒ(H0,
) while K = -2ℒ(H1,
).
Five) Steps Two through Four are repeated 500 (or 1000) times to give the sample LRT*
An example of the required code is given in Additional File 1, Code File AC2.txt which calls Additional File 1, Data Files stomdat1.txt, lambda1.txt, theta1.txt, and Additional File 1, Code File AC2s.txt. Additional File 1, Commentary File ACom2.doc discusses this code.
While step Four is time-consuming, steps One and Three are quick as the model is log-linear in the remaining parameters. Step Two is trivial in R. Note that in step Four the possible values of (β, σ, τ) are still constrained by the requirement that 1+ERR > 0 in all cells. For computability, the constraint is set at 1+ERR ≥ 0.001.
To check whether Wilks theorem holds, LRT* is first tested against a χ2 distribution on 3 d.f. by the Kolmogorov-Smirnov (one-sample) test. LRT* is then tested against gamma distributions with unknown shape and rate parameters. A γ distribution is fitted to LRT* using the "fitdistr" subroutine of R (MASS library) [14]. When testing LRT* against the fitted distribution with shape parameter
and rate parameter
, the Kolmogorov statistic D is obtained from the (one sample) K-S test but its p-value is determined by simulation (as in the Lilliefors test). When LRT* has 500 elements, 500 random deviates are taken from the fitted γ distribution with parameters
and
to form a sample S, a new gamma distribution Γ is fitted to S, the K-S statistic D* for testing S against Γ (one-sample) is computed, and the process is repeated 5000× to estimate the probability that D* > D when samples are drawn from γ(
,
). R code was adapted from [15].
For the special case (β0,σ0,τ0) = (0,0, arbitrary) the sample LRT* uses 1000 replications; note that in this case
= LRT2p-con. Likewise at the MLE estimate (β0,σ0,τ0) = (
), LRT* is produced during the (percentile) bootstrap with 1000 replications. In both cases the simulations required for the p-value of D use 1000 random deviates of the fitted γ distribution.
For the liver, lung, pancreas and leukaemia, LRT simulation (×1000) is carried out at (0,0, arbitrary) and (
), and (×500) at 3 points chosen at random with σ0 = 0 and β0 in the intervals -0.5 < β0 < 0, 0 < β0 < 1, and 1 < β0 < 10, and also at 4 points chosen at random in a neighbourhood N10 of (
). N10 comprises those (β0,σ0,τ0) for which the observed (not simulated)
< 10. Each set of simulations gives a sample LRT
i
* and corresponding fitted gamma distributions γi
For the stomach, simulations and corresponding fitted γi are carried out at the MLE and 50 other random points in the parameter space: (×500) at 11 points with σ0 = 0, 4 points in a neighbourhood N10 of (
), 34 other points in the (β, σ, τ) parameter space, and (×1000) at (0,0, arbitrary) and (
).
Estimates of a 95% critical value for the null distribution of LRT are then obtained in three ways. For each cancer site, the fitted gamma distributions and the order statistics of the simulated LRT
i
* are used to estimate a global 95% critical value as
= max(sup(t.95,i), sup(s.95,i)) where t.95,i is the 95th percentile of γi and s.95,i is the 95th percentile of LRT
i
*. That is, at each value of (β0,σ0,τ0) tested for a particular site, 95% of the simulated LRT values and the 95th percentile of the fitted gamma distribution are both below
. An overall conservative estimate t.95 is chosen to exceed the corresponding maximum for all simulations and all cancer sites investigated. At each site, a refined estimate
= max(sup(t.95,j)) is obtained by considering only the fitted gamma distributions and restricting to simulations at points (β0,σ0,τ0) within N10. At each cancer site
≤
≤ t.95.
The neighbourhood N10 of (
) is termed "appropriate" if the overall estimate t.95 ≤ 10 so that for (β0,σ0,τ0) outside N10,
> t.95 . This is not a circular definition; t.95 is computed from the behaviour of simulated LRT at points outside as well as inside N10 for each site, while for (β0,σ0,τ0) within N10 and even at (
) simulation may, if N10 is inappropriate, give LRT* with 95th percentile > 10.
Profile Likelihood Confidence Intervals for ERR1,ϕ and ERR0.025,ϕ are then constructed using all three estimates for the critical value. With ϕ fixed, let Uϕ be the set of those (β0,σ0,τ0) which cannot be ruled out with 95% confidence as
< t.95(alternatively <
, or <
) Define the 95%CI for ERRD,ϕ as the range of ERRD,ϕ over Uϕ.
When σ0 = 0,
is the Likelihood Ratio Test for the hypothesis H0: β = β0 and σ = 0 vs. the alternative β ≠ β0 and σ = 0, while LRT2p-lin is the Likelihood Ratio Test for the hypothesis σ = 0 vs. the alternative σ ≠ 0. Note LRT2p-lin +
=
. Simulated null distributions of all three LRT's are obtained for points with σ = 0. Conservative, global, and refined estimates T.95,
and
of the 95% critical value for LRT2p-lin are then defined in the same fashion as t.95,
and
for
. For each cancer,
is the supremum of the 95th percentiles of γ distributions fitted to the simulations of LRT2p-lin,
is the supremum of 95th percentiles of the fitted γ distributions and simulations of LRT2p-lin, and T.95 is an overall conservative value exceeding
for all cancer sites considered. Likewise
and
are defined from the 99th and 99.9th percentiles of the fitted γ distributions. These various estimates are used as critical values for LRT2p-lin when testing the linear model against the two-phase model, given the observed data.
Latency
For latency ϕ = 5, 6,... 44 BCa 95%CIs for ERR1,ϕ and ERR0.025,ϕ are computed.
Since the observed data is regarded as a sample from an underlying distribution, the optimal latency ϕm as inferred from the data is an estimate, whose distribution is again obtained by simulation. The two-phase model is fitted at ϕm and the fitted model is resampled to provide simulated data, from which a new estimate
is determined. To simplify,
is restricted to integer values 5, 6,... 44. The process is repeated ×200 for the stomach, liver, lung, pancreas and leukaemia. The resulting set of
is compared with the range of latencies for which BCa 95%CIs are strictly positive, and likewise with the range for which LRT2p-lin exceeds the estimated critical values.
To test whether the distributions of LRT2p-con and LRT2p-lin depend on latency, null distributions at (β0,σ0,τ0) = (0,0, arbitrary) are simulated (×500) for the stomach with ϕ = 5, 10, 15,... 40 and compared with the null distribution from simulation (×1000) at ϕm.
City, dose range, and gender
The models are refitted with "city" as an added control covariate. For each of the 5 cancers, the optimal latency Φm (restricted to 5, 6,... 44) is determined along with latency ranges over which BCa 95% and 90% CIs for ERR1 are strictly positive.
The models including "city" are fitted to the 5 - 500 mSv subcohort. A limited analysis of the null distributions of LRT2p-con and LRT2p-lin including "city" is carried out on the 0 - 20 mSv and 5 - 500 mSv dose ranges. For each cancer at its Φm, H0: (β0,σ0,τ0) = (0,0, arbitrary) is used for simulation of LRT2p-con and H0: (β0,σ0,τ0) = (
,0, arbitrary) where
is the fitted parameter for the linear model, is used for simulation of LRT2p-lin. Gamma distributions fitted to the simulated LRT* are used to estimate 95% critical values and to assign p-values to the LRT2p-con and LRT2p-lin arising from the observed data.
The models including "city" are fitted separately to the male and female subcohorts in the 0 - 20 mSv dose range for the stomach, liver, and lung.
Finally, the linear model including "city" is fitted to the 0 - 5 mSv, 5 - 20 mSv, 0 - 0.5 mSv, and 0.5 - 20 mSv data for the liver and lung.
The very extensive simulations of LRT were shared across a number of PC's.