An initial analysis of latency is carried out for all cancer sites with at least 100 cases (deaths) in the 0 - 20 mSv subcohort. Sites which appear to have a raised dose response are then analysed by the bootstrap, and the distribution of LRT is estimated by simulation. A fuller simulation is carried out for the stomach, the site with the largest number of cases in this subcohort. The analysis focuses on the behaviour of the two-phase model (defined below) and its relation to the linear model. It aims to compare confidence intervals obtained by bootstrap and LRT simulation methods and to discover, without relying on Wilks theorem, whether the apparent elevation and non-linearity of the dose response in the 0 - 20 mSv subcohort are statistically significant. Further analysis considers variation in latency, the impact of "city", male and female subcohorts, and the 5 - 500 mSv dose range. Computations use the statistical freeware **R** [8], giving a further check on previous results. Possible sources of error in the Japanese data itself and the potential for confounding by other covariates, as discussed in [2], are not reconsidered here.

As previously, source data for the 1950-1990 mortality cohort LSS12 was obtained from RERF [9] via the CEDR [10], and the 0 - 20 mSv subcohort defined by restricting the weighted adjusted colon dose. The subcohort comprises 3011 data cells with 1690391.75 p-y. The control model, linear model, and two-phase model are defined as in [2]. Briefly, the control model is log-linear in 14 indicator variables for 5-year age-at-exposure categories, an indicator for gender, and the numerical variable log mean attained age, each of which is well-defined for the data cells. The control model is specified as λ = λ_{0} = exp(α + Σ_{j}β_{j}x_{j}), where α and the β_{j} are unknown parameters, the x_{j} are the covariates, exp denotes the exponential, and the Poisson parameter in cell i with T_{i} person-years is λ_{i}T_{i}, where λ_{i} is evaluated using the covariates in cell i. At this stage, as in [2], "city" (Hiroshima or Nagasaki) is not included as a covariate.

Other models are defined by λ = λ_{0}(1+ERR) where ERR depends only on D_{ϕ}, the radiation dose lagged by a latency parameter ϕ. D_{ϕ} is defined as the weighted adjusted colon dose in 10 mSv units when time-since-exposure ≥ ϕ, and 0 otherwise.

For the linear model, ERR = βD_{ϕ}

For the two-phase model, ERR = βD_{ϕ} + σD_{ϕ}exp(-τD_{ϕ}) where exp denotes exponential.

The models require 1+ERR > 0 in all cells so that the Poisson distributions are defined. For the two-phase model τ > 0, slightly modifying the previous approach where τ ≥ 0. However, τ = 0 reduces to the linear model as does σ = 0.
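In code, the two dose-response forms are one-liners. A minimal Python sketch (the paper's computations use **R**; the function names and dose values here are illustrative):

```python
import numpy as np

def err_linear(d, beta):
    # Linear model: ERR = beta * D, dose D in 10 mSv units
    return beta * d

def err_two_phase(d, beta, sigma, tau):
    # Two-phase model: ERR = beta*D + sigma*D*exp(-tau*D), with tau > 0
    return beta * d + sigma * d * np.exp(-tau * d)
```

With σ = 0 the two-phase ERR reduces exactly to the linear form, and as τ → 0 it tends to (β + σ)D, which is again linear.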

In fitting a model at latency ϕ by Poisson regression, the unknown parameters α,β_{j} and β, σ, τ (as appropriate to the model) are chosen to maximise the likelihood of the observed data assuming independent Poisson distributions in each cell. Equivalently, fitting minimises the sum over the 3011 data cells of E_{i} - O_{i}ln(E_{i}), where O_{i} and E_{i} = λ_{i}T_{i} are the observed and expected number of cases in cell i.
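The fitting criterion can be written directly from this definition. A Python sketch (illustrative, not the paper's R code; the ln(O_{i}!) term is dropped as it does not depend on the parameters):

```python
import numpy as np

def poisson_fit_criterion(obs, expected):
    # Sum over cells of E_i - O_i*ln(E_i); minimising this is equivalent to
    # maximising the Poisson log likelihood, since the ln(O_i!) term does
    # not depend on the model parameters
    return float(np.sum(expected - obs * np.log(expected)))
```

For a single cell the criterion is minimised at E = O, as expected for a Poisson maximum likelihood fit.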

Latency ϕ is fixed during the minimisation. The models depend smoothly on all other parameters. ERR_{D,ϕ} denotes the Excess Relative Risk evaluated at D for the fitted model with latency ϕ. Note ERR_{1,ϕ} is the estimated risk at 10 mSv lagged by ϕ years. For model I nested within model J, LRT_{J-I,ϕ} is the Likelihood Ratio Test computed at latency ϕ.

Code written in **R** emulates the previous model fitting with Excel-Solver, using the general-purpose minimisation routine "optim" [11]. Results for the stomach at latency ϕ = 5, 6,... 44 are checked against the values obtained previously.

At a given cancer site, the control, linear, and two-phase models are fitted at latency ϕ = 5, 6,... 44 to give LRT_{J-I,ϕ} for each pair of nested models: linear against control, two-phase against control, and two-phase against linear. Sites are selected for further analysis if LRT_{lin-con,ϕ} > 4 for some ϕ, or if LRT_{2p-lin,ϕ} > 6 for some ϕ (without assuming that either criterion establishes statistical significance). For sites meeting the criterion for the linear model, Profile Likelihood Confidence Intervals for β = ERR_{1,ϕ} are computed using Wilks theorem at integer values of ϕ. For sites meeting the criterion for the two-phase model, ϕ = ϕ_{mabs} is chosen to maximise LRT_{2p-con,ϕ} (not LRT_{2p-lin,ϕ}), so that fitting at ϕ_{mabs} gives the absolute Maximum Likelihood Estimate (MLE) for the model. The "optimal latency" ϕ_{m} is chosen to maximise LRT_{2p-con,ϕ} subject to the constraint ERR_{1,ϕ} ≥ 0. Bootstrap confidence intervals and LRT simulations are then computed at ϕ_{m}. Details are given in the next two, more technical, subsections, and examples with commentary are supplied with Additional File 1. The method for taking account of uncertainty in the "optimal latency" is outlined in a final subsection.

### Bootstrap

Fitting the two-phase model at ϕ = ϕ_{m} gives a maximum likelihood estimate for the vector of all 20 model parameters and determines fitted values of λ_{i} in each data cell. The parametric bootstrap assumes the observed data arose from sampling this fitted model, and considers the variation arising if other data had been sampled from that same model.

Sampling the independent Poisson distributions with parameters λ_{i}T_{i} gives simulated observations in each data cell. Fitting the two-phase model to this simulated data gives simulated parameter values, including β̂*, σ̂* and τ̂*, and therefore a simulated value ERR* = ERR_{D,ϕ}(β̂*, σ̂*, τ̂*). The entire process is repeated B times to give the bootstrap replications **ERR***.

There are many well-established methods to obtain confidence intervals from bootstrap replications [7, 12]. The two options chosen here are the original "percentile" method, which is easy to describe, and the "Bias-corrected accelerated" (BC_{a}) method, which is much faster to compute and in general more accurate.

The 1-2α "percentile" confidence interval has endpoints

(ERR*_{(Bα)}, ERR*_{(B(1-α))})

where ERR*_{(k)} denotes the k^{th} ordered value of **ERR***. With B = 1000 replications and α = 0.025, the endpoints of the 0.95 percentile interval are the 25^{th} and 975^{th} order statistics of **ERR***.
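As an illustration of the percentile endpoints, a Python sketch using a synthetic stand-in for the replications **ERR*** (the real replications come from refitting the two-phase model):

```python
import numpy as np

rng = np.random.default_rng(1)
err_star = rng.normal(0.5, 0.1, size=1000)  # stand-in for B = 1000 bootstrap ERR* values
B, alpha = len(err_star), 0.025

ordered = np.sort(err_star)
lo = ordered[round(B * alpha) - 1]          # 25th order statistic
hi = ordered[round(B * (1 - alpha)) - 1]    # 975th order statistic
```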

For the Bias-corrected accelerated (BC_{a}) confidence interval (see [7] Chapter 5 pp. 203-207 and p. 249) simulation is again carried out using λ_{i}T_{i}. However, the simulated observations are now used to fit the model to the "Least Favourable family" of distributions, obtained by restricting the parameter vector ψ to a line parametrised by ζ, passing through the MLE ψ̂ and defined by

ψ_{ζ} = ψ̂ + ζ i^{-1}(ψ̂) ∇h(ψ̂)

Here i^{-1}(ψ̂) is the inverse of the Fisher Information evaluated at ψ̂, and ∇h(ψ̂) is the gradient of the function h which defines the variable of interest (in this case ERR) in terms of the model parameters ψ.

For each bootstrap replication, fitting the simulated data along the LF family determines ζ and thus the simulated value ERR* = h(ψ_{ζ}). If **ERR***_{LF} denotes the resulting set of B replications, the BC_{a} confidence interval has endpoints

(ERR*_{LF,(Bα_{1})}, ERR*_{LF,(Bα_{2})})

where

α_{1} = Φ(w + (w + z^{(α)})/(1 - a(w + z^{(α)}))), α_{2} = Φ(w + (w + z^{(1-α)})/(1 - a(w + z^{(1-α)})))

Here Φ is the standard normal cumulative distribution and z^{(α)} is the 100αth percentile point of a standard normal distribution. The parameters *w* and *a* are estimated from the bootstrap replicates. The bias-correction *w* is computed as

w = Φ^{-1}(#{ERR*_{LF} < ERR̂}/B)

i.e. the fraction of bootstrap replications along the LF family which are below the original fitted value of ERR, in normal units.

The acceleration *a* is computed as

a = m_{3}/(6 m_{2}^{3/2})

where m_{3} and m_{2} are the 3^{rd} and 2^{nd} moments of ℓ̇(0), the partial derivative of the log likelihood of the simulated data with respect to ζ, evaluated at ζ = 0.
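Given the replications along the LF family and the score values, *w*, *a*, and the BC_{a} endpoints follow mechanically. A Python sketch with synthetic stand-ins for **ERR***_{LF}, the score ℓ̇, and the fitted ERR (all values illustrative):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
B, alpha = 5000, 0.025
err_hat = 0.5                            # original fitted ERR (illustrative)
err_lf = rng.gamma(25.0, 0.02, size=B)   # stand-in for the ERR*_LF replications
score = rng.normal(0.0, 1.0, size=B)     # stand-in for d(log lik)/d(zeta) at zeta = 0

# Bias-correction w: fraction of replications below the original fit, in normal units
w = norm.ppf(np.mean(err_lf < err_hat))
# Acceleration a from the 3rd and 2nd moments of the score
m2 = np.mean((score - score.mean()) ** 2)
m3 = np.mean((score - score.mean()) ** 3)
a = m3 / (6.0 * m2 ** 1.5)

def adjusted(alpha_level):
    # BC_a adjusted percentile: Phi(w + (w + z_alpha) / (1 - a*(w + z_alpha)))
    z = norm.ppf(alpha_level)
    return norm.cdf(w + (w + z) / (1.0 - a * (w + z)))

lo = np.quantile(err_lf, adjusted(alpha))
hi = np.quantile(err_lf, adjusted(1.0 - alpha))
```

With the stand-in data the corrections are small, so the endpoints stay close to the plain percentile interval; with skewed replications they shift accordingly.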

Code for this calculation is shown in Additional File 1, Code File **AC1.txt**, which calls Additional File 1, Data Files **stomdat1.txt**, **lambda2.txt** and **theta2.txt**. Additional File 1, Commentary File **ACom1.doc** discusses this code.

Despite the intricate formulas, computation is much faster for BC_{a} than for the simpler percentile method, because optimisation is now restricted to a line within the full 20-dimensional parameter space. The percentile CIs are computed at B = 1000, the recommended minimum number, while BC_{a} results are obtained at B = 5000.

Coverage for the percentile and BC_{a} methods is tested using the stomach data. For each of 400 sets of simulated observations generated by sampling the Poisson distributions determined by the original fitted parameters, the model is refitted to the simulated data. Confidence intervals for the simulated ERR are obtained by the bootstrap methods (B = 1000) using the refitted parameters and counted as 1 or 0 if they contain or exclude the "true" value of ERR from the original fitted parameters. To streamline these computations of coverage for the percentile method, data is restricted to the Hiroshima subcohort (N = 1536 cells) and the model simplified to use 10-year rather than 5-year age-at-exposure categories. Estimating coverage for the BC_{a} method uses the full 0 - 20 mSv dataset and model. The application of both methods to compute CIs from the observed data uses the full dataset and model.

### Simulation of LRT

The Likelihood Ratio Test (LRT) for comparing nested models H_{0} ≤ H_{1} (H_{0} contained within H_{1}) given the observed data O is LRT = 2[ℒ(H_{1}, O) - ℒ(H_{0}, O)], where ℒ(H_{i}, O) is the log likelihood of the observed data after fitting the model H_{i}. In well-behaved situations where Wilks theorem [5] applies, the asymptotic distribution of LRT is known, provided the data was drawn from a distribution which satisfies the null hypothesis H_{0}. In particular, an appropriate critical value t_{0.95} can be chosen for rejecting H_{0} if LRT > t_{0.95}, and this will result in a 5% error rate of rejecting H_{0} when it is true. Typically H_{0} fixes the values of d parameters in the full model H_{1}, which allows those parameters to vary freely in an open neighbourhood. In this case Wilks theorem says that LRT is asymptotically χ^{2} distributed on d degrees of freedom if the observed data was generated by the model with those d parameters fixed as specified. On that basis, the null hypothesis can be rejected if LRT > t_{0.95} = F^{-1}(0.95), where F is the cumulative distribution function of a χ^{2} distribution on d degrees of freedom.
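As a concrete instance, if Wilks theorem applied to the three parameters (β, σ, τ) of the two-phase model, the 95% critical value would be the χ^{2} quantile on 3 d.f. (a Python check, scipy assumed):

```python
from scipy.stats import chi2

# 95% critical value under Wilks theorem with d = 3 fixed parameters
t95 = chi2.ppf(0.95, df=3)
```

Whether this value actually applies here is exactly what the simulations below are designed to test.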

Wilks theorem depends on many regularity conditions for the models H_{0} and H_{1}. If these fail, LRT may not be χ^{2} distributed; its distribution may depend on H_{0} rather than just on the number of specified parameters, and there may be no simple universal 95% critical value for rejecting H_{0}. Simulation aims to estimate the distribution of LRT and then to obtain conservative critical values which are sufficient to reject H_{0}. Confidence intervals for functions of the parameters are constructed using these critical values.

For the two-phase model, LRT_{(β_{0},σ_{0},τ_{0})} is the Likelihood Ratio Test for the hypothesis H_{0}: (β, σ, τ) = (β_{0},σ_{0},τ_{0}) vs. the alternative (β, σ, τ) ≠ (β_{0},σ_{0},τ_{0}); i.e. H_{1} is the union of H_{0} with its alternative, and leaves (β, σ, τ) free to vary, subject only to the defining conditions of the two-phase model. For various choices of H_{0}, the null distribution of LRT_{(β_{0},σ_{0},τ_{0})} assuming H_{0} holds is sampled by Monte Carlo simulation (see [7] Chapter 4; for another context see [13] p.84) as follows.

One) The model is fitted subject to H_{0}. The resulting parameter estimates give fitted values λ̂_{i} and therefore Poisson parameters λ̂_{i}T_{i} for each cell, subject to H_{0}.

Two) Poisson distributions with parameters λ̂_{i}T_{i} are sampled independently to give simulated cases O*_{i} in each cell.

Three) The model is fitted to the simulated data subject to H_{0}, yielding new fitted values E^{0*}_{i} for each cell. Put K_{0} = 2Σ_{i}[E^{0*}_{i} - O*_{i}ln(E^{0*}_{i})].

Four) The model is fitted to the simulated data with (β, σ, τ) allowed to vary freely within the parameter space, yielding new fitted values E*_{i} for each cell. Put K = 2Σ_{i}[E*_{i} - O*_{i}ln(E*_{i})]. The simulated value of LRT_{(β_{0},σ_{0},τ_{0})} is defined as LRT* = K_{0} - K. Note that, up to a constant which depends on the simulated data but not on the model, K_{0} = -2ℒ(H_{0}, O*) while K = -2ℒ(H_{1}, O*).

Five) Steps Two through Four are repeated 500 (or 1000) times to give the sample **LRT***.
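Steps One to Five can be sketched in miniature. The Python toy below replaces the full 20-parameter model with a single ERR parameter β and a known background rate, so only Step Four needs numerical optimisation; all numbers are illustrative (the paper's actual code, in **R**, is in Additional File 1):

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(3)
T = np.full(50, 100.0)                 # person-years per cell (illustrative)
d = rng.uniform(0.0, 2.0, size=50)     # lagged dose per cell, 10 mSv units
lam0 = 0.01                            # background rate, taken as known here
beta0 = 0.0                            # H0: beta = beta0

def K(obs, beta):
    # Twice the fitting criterion, 2*sum(E_i - O_i*ln(E_i)); differences of K
    # between nested fits reproduce the LRT
    E = lam0 * T * (1.0 + beta * d)
    return 2.0 * float(np.sum(E - obs * np.log(E)))

lrt_star = []
for _ in range(200):
    # Step Two: sample Poisson observations from the H0 fit
    obs = rng.poisson(lam0 * T * (1.0 + beta0 * d))
    # Step Three: refit under H0 (beta is fixed and the background is known,
    # so K_0 is computed directly)
    K0 = K(obs, beta0)
    # Step Four: fit beta freely, keeping 1 + ERR bounded away from zero
    res = minimize_scalar(lambda b: K(obs, b), bounds=(-0.49, 10.0), method="bounded")
    lrt_star.append(K0 - res.fun)
lrt_star = np.asarray(lrt_star)        # Step Five: the sample LRT*
```

In this one-parameter toy the sample should behave roughly like χ^{2} on 1 d.f.; the point of the paper's simulations is precisely that the full two-phase model need not behave so simply.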

An example of the required code is given in Additional File 1, Code File **AC2.txt** which calls Additional File 1, Data Files **stomdat1.txt**, **lambda1.txt**, **theta1.txt**, and Additional File 1, Code File **AC2s.txt.** Additional File 1, Commentary File **ACom2.doc** discusses this code.

While step Four is time-consuming, steps One and Three are quick as the model is log-linear in the remaining parameters. Step Two is trivial in **R**. Note that in step Four the possible values of (β, σ, τ) are still constrained by the requirement that 1+ERR > 0 in all cells. For computability, the constraint is set at 1+ERR ≥ 0.001.

To check whether Wilks theorem holds, **LRT*** is first tested against a χ^{2} distribution on 3 d.f. by the Kolmogorov-Smirnov (one-sample) test. **LRT*** is then tested against gamma distributions with unknown shape and rate parameters. A γ distribution is fitted to **LRT*** using the "fitdistr" subroutine of **R** (MASS library) [14]. When testing **LRT*** against the *fitted* distribution with shape parameter â and rate parameter b̂, the Kolmogorov statistic D is obtained from the (one-sample) K-S test, but its p-value is determined by simulation (as in the Lilliefors test). When **LRT*** has 500 elements, 500 random deviates are taken from the fitted γ distribution with parameters â and b̂ to form a sample S, a new gamma distribution Γ is fitted to S, the K-S statistic D* for testing S against Γ (one-sample) is computed, and the process is repeated 5000× to estimate the probability that D* > D when samples are drawn from γ(â, b̂). **R** code was adapted from [15].
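The Lilliefors-style p-value computation can be sketched as follows, here in Python with a synthetic χ^{2} sample standing in for **LRT*** and 500 rather than 5000 repetitions for brevity:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
lrt_star = rng.chisquare(3, size=500)   # synthetic stand-in for LRT*

# Fit a gamma distribution by maximum likelihood (shape a_hat, rate 1/scale)
a_hat, _, scale = stats.gamma.fit(lrt_star, floc=0.0)
D = stats.kstest(lrt_star, stats.gamma(a_hat, scale=scale).cdf).statistic

# Lilliefors-style p-value: refit the gamma to each simulated sample before
# computing its K-S statistic, mirroring how D itself was obtained
n_rep, count = 500, 0                   # the paper repeats 5000 times
for _ in range(n_rep):
    s = stats.gamma.rvs(a_hat, scale=scale, size=lrt_star.size, random_state=rng)
    a_s, _, sc_s = stats.gamma.fit(s, floc=0.0)
    count += stats.kstest(s, stats.gamma(a_s, scale=sc_s).cdf).statistic > D
p_value = count / n_rep
```

Refitting before each K-S test is the essential point: testing against the already-fitted γ(â, b̂) with tabulated K-S p-values would be anti-conservative.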

For the special case (β_{0},σ_{0},τ_{0}) = (0, 0, arbitrary) the sample **LRT*** uses 1000 replications; note that in this case LRT_{(β_{0},σ_{0},τ_{0})} = LRT_{2p-con}. Likewise at the MLE (β_{0},σ_{0},τ_{0}) = (β̂,σ̂,τ̂), **LRT*** is produced during the (percentile) bootstrap with 1000 replications. In both cases the simulations required for the p-value of D use 1000 random deviates of the fitted γ distribution.

For the liver, lung, pancreas and leukaemia, LRT simulation (×1000) is carried out at (0, 0, arbitrary) and (β̂,σ̂,τ̂), and (×500) at 3 points chosen at random with σ_{0} = 0 and β_{0} in the intervals -0.5 < β_{0} < 0, 0 < β_{0} < 1, and 1 < β_{0} < 10, and also at 4 points chosen at random in a neighbourhood N_{10} of (β̂,σ̂,τ̂). N_{10} comprises those (β_{0},σ_{0},τ_{0}) for which the observed (not simulated) LRT_{(β_{0},σ_{0},τ_{0})} < 10. Each set of simulations gives a sample **LRT***_{i} and a corresponding fitted gamma distribution γ_{i}.

For the stomach, simulations and corresponding fitted γ_{i} are carried out at the MLE and 50 other random points in the parameter space: (×500) at 11 points with σ_{0} = 0, 4 points in a neighbourhood N_{10} of (β̂,σ̂,τ̂), 34 other points in the (β, σ, τ) parameter space, and (×1000) at (0, 0, arbitrary) and (β̂,σ̂,τ̂).

Estimates of a 95% critical value for the null distribution of LRT are then obtained in three ways. For each cancer site, the fitted gamma distributions and the order statistics of the simulated **LRT***_{i} are used to estimate a global 95% critical value as t̂_{.95} = max(sup_{i}(t_{.95,i}), sup_{i}(s_{.95,i})), where t_{.95,i} is the 95^{th} percentile of γ_{i} and s_{.95,i} is the 95^{th} percentile of **LRT***_{i}. That is, at each value of (β_{0},σ_{0},τ_{0}) tested for a particular site, 95% of the simulated LRT values and the 95^{th} percentile of the fitted gamma distribution are both below t̂_{.95}. An overall conservative estimate t_{.95} is chosen to exceed the corresponding maximum for all simulations and all cancer sites investigated. At each site, a refined estimate t̃_{.95} = max(sup_{j}(t_{.95,j})) is obtained by considering only the fitted gamma distributions and restricting to simulations at points (β_{0},σ_{0},τ_{0}) within N_{10}. At each cancer site t̃_{.95} ≤ t̂_{.95} ≤ t_{.95}.
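Combining fitted-gamma percentiles with sample order statistics into a global estimate can be sketched as follows (Python, with synthetic stand-ins for the samples **LRT***_{i}):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
# Synthetic stand-ins for samples LRT*_i simulated at four null points
samples = [rng.chisquare(3, size=500) for _ in range(4)]

t95_gamma, t95_order = [], []
for s in samples:
    a, _, sc = stats.gamma.fit(s, floc=0.0)
    t95_gamma.append(stats.gamma.ppf(0.95, a, scale=sc))    # 95th pct of fitted gamma
    t95_order.append(np.sort(s)[round(0.95 * s.size) - 1])  # 95th pct of the sample
t95_hat = max(*t95_gamma, *t95_order)   # global estimate: both criteria lie below it
```

Taking the maximum over both criteria and all tested null points is what makes the resulting critical value conservative.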

The neighbourhood N_{10} of (β̂,σ̂,τ̂) is termed "appropriate" if the overall estimate t_{.95} ≤ 10, so that for (β_{0},σ_{0},τ_{0}) outside N_{10}, LRT_{(β_{0},σ_{0},τ_{0})} > t_{.95}. This is not a circular definition; t_{.95} is computed from the behaviour of simulated LRT at points outside as well as inside N_{10} for each site, while for (β_{0},σ_{0},τ_{0}) within N_{10}, and even at (β̂,σ̂,τ̂), simulation may, if N_{10} is inappropriate, give **LRT*** with 95^{th} percentile > 10.

Profile Likelihood Confidence Intervals for ERR_{1,ϕ} and ERR_{0.025,ϕ} are then constructed using all three estimates for the critical value. With ϕ fixed, let U_{ϕ} be the set of those (β_{0},σ_{0},τ_{0}) which cannot be ruled out with 95% confidence, i.e. with LRT_{(β_{0},σ_{0},τ_{0})} < t_{.95} (alternatively < t̂_{.95}, or < t̃_{.95}). Define the 95%CI for ERR_{D,ϕ} as the range of ERR_{D,ϕ} over U_{ϕ}.

When σ_{0} = 0, LRT_{lin(β_{0})} is the Likelihood Ratio Test for the hypothesis H_{0}: β = β_{0} and σ = 0 vs. the alternative β ≠ β_{0} and σ = 0, while LRT_{2p-lin} is the Likelihood Ratio Test for the hypothesis σ = 0 vs. the alternative σ ≠ 0. Note LRT_{2p-lin} + LRT_{lin(β_{0})} = LRT_{(β_{0},0,τ_{0})}. Simulated null distributions of all three LRT's are obtained for points with σ = 0. Conservative, global, and refined estimates T_{.95}, T̂_{.95} and T̃_{.95} of the 95% critical value for LRT_{2p-lin} are then defined in the same fashion as t_{.95}, t̂_{.95} and t̃_{.95} for LRT_{(β_{0},σ_{0},τ_{0})}. For each cancer, T̃_{.95} is the supremum of the 95^{th} percentiles of γ distributions fitted to the simulations of LRT_{2p-lin}, T̂_{.95} is the supremum of the 95^{th} percentiles of the fitted γ distributions and simulations of LRT_{2p-lin}, and T_{.95} is an overall conservative value exceeding T̂_{.95} for all cancer sites considered. Likewise T̃_{.99} and T̃_{.999} are defined from the 99^{th} and 99.9^{th} percentiles of the fitted γ distributions. These various estimates are used as critical values for LRT_{2p-lin} when testing the linear model against the two-phase model, given the observed data.

### Latency

For latency ϕ = 5, 6,... 44 BC_{a} 95%CIs for ERR_{1,ϕ} and ERR_{0.025,ϕ} are computed.

Since the observed data is regarded as a sample from an underlying distribution, the optimal latency ϕ_{m} as inferred from the data is an estimate, whose distribution is again obtained by simulation. The two-phase model is fitted at ϕ_{m} and the fitted model is resampled to provide simulated data, from which a new estimate ϕ*_{m} is determined. To simplify, ϕ*_{m} is restricted to integer values 5, 6,... 44. The process is repeated ×200 for the stomach, liver, lung, pancreas and leukaemia. The resulting set of ϕ*_{m} values is compared with the range of latencies for which BC_{a} 95%CIs are strictly positive, and likewise with the range for which LRT_{2p-lin} exceeds the estimated critical values.

To test whether the distributions of LRT_{2p-con} and LRT_{2p-lin} depend on latency, null distributions at (β_{0},σ_{0},τ_{0}) = (0,0, arbitrary) are simulated (×500) for the stomach with ϕ = 5, 10, 15,... 40 and compared with the null distribution from simulation (×1000) at ϕ_{m}.

### City, dose range, and gender

The models are refitted with "city" as an added control covariate. For each of the 5 cancers, the optimal latency Φ_{m} (restricted to 5, 6,... 44) is determined along with latency ranges over which BC_{a} 95% and 90% CIs for ERR_{1} are strictly positive.

The models including "city" are fitted to the 5 - 500 mSv subcohort. A limited analysis of the null distributions of LRT_{2p-con} and LRT_{2p-lin} including "city" is carried out on the 0 - 20 mSv and 5 - 500 mSv dose ranges. For each cancer at its Φ_{m}, H_{0}: (β_{0},σ_{0},τ_{0}) = (0, 0, arbitrary) is used for simulation of LRT_{2p-con}, and H_{0}: (β_{0},σ_{0},τ_{0}) = (β̂_{lin}, 0, arbitrary), where β̂_{lin} is the fitted parameter for the linear model, is used for simulation of LRT_{2p-lin}. Gamma distributions fitted to the simulated **LRT*** are used to estimate 95% critical values and to assign p-values to the LRT_{2p-con} and LRT_{2p-lin} arising from the observed data.

The models including "city" are fitted separately to the male and female subcohorts in the 0 - 20 mSv dose range for the stomach, liver, and lung.

Finally, the linear model including "city" is fitted to the 0 - 5 mSv, 5 - 20 mSv, 0 - 0.5 mSv, and 0.5 - 20 mSv data for the liver and lung.

The very extensive simulations of LRT were shared across a number of PCs.