Approach (see Additional file 4 for an example of this approach with real data):
1. Summarize percent recoveries for each chemical across analytical batches and flag those chemicals with average recoveries outside of a pre-established acceptable range.
□ We typically apply an acceptable range of 50–150% recovery for most environmental samples, particularly when we are analyzing for new chemicals or combinations of chemicals for which methods are not well established. For well-established methods, a more conservative range of 80–120% recovery is appropriate.
2. Visualize percent recoveries for each chemical across analytical batches to assess consistency.
□ If recoveries for one or more chemicals are consistently out of range (> 150% or < 50%) across multiple batches, we discuss this with the laboratory analyst.
○ If the laboratory analyst agrees that the method was not successful, we drop the chemical(s) from our dataset. We do not report values or include such chemicals in any data analyses.
○ If the laboratory analyst can explain the reason for the consistently high or low recoveries and has confidence in the ranking and relative values of the reported sample data, the reported values can be used for many data analyses, but the absolute levels will be difficult to compare with those from another study.
□ If recoveries from one or a few batches are out of range, we are concerned that results in those batches might be over- or underestimated compared to the rest. One way to investigate this concern is to look for corresponding systematic differences in sample data (see Additional file 4: Figure S3).
○ If field samples have been randomized into batches, we check if the variation in sample results correlates with spiked sample or CRM recoveries by batch. Note: we still go through this step even if we were not able to randomize field samples, but in this case it can be very challenging to distinguish systematic analytical variation from other possible sources of variation in sample results between batches (e.g., if samples in different batches were also collected during different seasons).
▪ If there are systematic differences (e.g., the sample results for a chemical are higher in the batch where the spike or CRM recovery was high or, if there is only one batch, the sample results for a chemical with high spike or CRM recovery are much higher than previously reported levels), we consider dropping that chemical's results in the affected batches from the dataset. If an identical/split reference sample was analyzed in each batch, its results can also help resolve questions about whether and how to use the data in this case.
▪ If there are no obvious systematic differences, we keep the chemical in our dataset, but flag the results for that chemical in the batch with the out-of-range spiked sample or CRM recovery.
□ We note in summary statistics when the average spiked sample or CRM recovery for a particular chemical was out of range.
□ We note whether levels in our study might be systematically over- or under-reported (i.e., because of consistently high or low spiked sample or CRM recoveries). We especially note this when comparing to levels from another study.
□ For chemicals with low/high recoveries in certain batches, we may perform sensitivity analyses – for example, by including lab batch as a covariate in regression analyses, though this can be challenging for small datasets.
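
The summarizing and flagging in steps 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in Additional file 4. The function name `summarize_recoveries`, the record layout (chemical, batch, percent recovery), and the example values are all hypothetical. Only the 50–150% and 80–120% ranges come from the text above.

```python
from collections import defaultdict
from statistics import mean

# Acceptable percent-recovery ranges quoted in the text above.
DEFAULT_RANGE = (50.0, 150.0)      # new or less-established methods
ESTABLISHED_RANGE = (80.0, 120.0)  # well-established methods


def summarize_recoveries(records, acceptable=DEFAULT_RANGE):
    """Summarize spiked-sample or CRM recoveries per chemical.

    records: iterable of (chemical, batch, percent_recovery) tuples.
    This layout is an assumption made for illustration.
    """
    low, high = acceptable

    # Group recovery values by chemical, then by batch.
    by_chemical = defaultdict(lambda: defaultdict(list))
    for chemical, batch, recovery in records:
        by_chemical[chemical][batch].append(recovery)

    summary = {}
    for chemical, batches in by_chemical.items():
        batch_means = {b: mean(vals) for b, vals in batches.items()}
        all_values = [r for vals in batches.values() for r in vals]
        overall = mean(all_values)
        summary[chemical] = {
            "mean_recovery": overall,
            # Step 1: flag chemicals whose average recovery is out of range.
            "mean_out_of_range": not (low <= overall <= high),
            # Step 2: list batches whose mean recovery is out of range.
            "out_of_range_batches": sorted(
                b for b, m in batch_means.items() if not (low <= m <= high)
            ),
            "n_batches": len(batch_means),
        }
    return summary


# Made-up example values, for illustration only (not real data).
records = [
    ("chem_A", "B1", 95.0), ("chem_A", "B2", 105.0), ("chem_A", "B3", 100.0),
    ("chem_B", "B1", 160.0), ("chem_B", "B2", 170.0), ("chem_B", "B3", 180.0),
    ("chem_C", "B1", 90.0), ("chem_C", "B2", 40.0), ("chem_C", "B3", 110.0),
]

summary = summarize_recoveries(records)
for chemical, stats in summary.items():
    print(chemical, stats)
```

In this made-up example, `chem_B` is out of range in every batch, which corresponds to the case where we would consult the laboratory analyst. `chem_C` is out of range in a single batch, which corresponds to the case where we would look for systematic differences in the sample results from that batch. Passing `acceptable=ESTABLISHED_RANGE` applies the narrower 80–120% range instead.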