Mathematically Analytic Distributions Can Explain Radioanalytic Results

 

Allen Brodsky*

Science Applications International Corporation

1410 Spring Hill Road, Mail Stop SH2-1

McLean, VA 22102

 

In a poster at last year’s BAER meeting (Brodsky and Barss 1999) and in other presentations (Brodsky et al. 1999; Brodsky 2000) we reported on the characterization of a distribution of radioanalytic results by the use of a mathematical convolution of probability distribution functions fit to (observed) lognormal distributions of fission tracks from plutonium in population urine samples, and in simulated (reagent) urine blanks.  The same type of mathematical derivation can be used to characterize and explain the shapes of other distributions of data from bioassay or environmental samples. (The difference between two variates, each lognormally distributed, is not lognormally distributed; on the other hand, the difference between two normally distributed variates is normally distributed and the resulting distribution can be immediately written down in functional form.)  The rigorous mathematical derivation of the distribution function of net results explained, in the fission track analysis case, the reasons that a plot of the net urine concentrations on probability vs. log scale coordinates does seem to fit the lognormal straight line above median concentrations, but curves to the left on the logarithmic scale below the median as urine concentrations of plutonium approach those of the blanks.

 

Further examination of the literature shows other distributions, particularly of concentrations of radionuclides in human tissue samples at very low levels, that can be explained by a similar mathematical analysis. Actual distributions of radionuclides in samples that might be truly lognormal  will appear at low levels to deviate from their actual distribution because the subtraction of blank measurements having random fluctuations produces an increased population of small and negative values at low analyte levels.  The net results thus force the line on a probability vs. log concentration plot to curve to the left, usually below the median concentration.

 

Since the logic of the mathematical derivations for the resulting “interpreted” distributions can  be applied to the subtraction of sample and blank distributions represented by any two continuous functions (an algorithm applied to subtraction of two discrete distributions has been presented in Brodsky (1992), the derivation of the interpreted distribution for subtraction of lognormally-distributed blanks from lognormally distributed population urine samples, as applied and verified in fission track analysis (FTA) is presented in detail in the Appendix to this paper.

 

*Also, Adjunct Professor, Health Physics Graduate Program, Georgetown University.  The views expressed in this paper are those of the author alone, and do not necessarily reflect those of any employer.  Also, the data presented are not necessarily representative of any appropriate sample of the population, and do not represent an actual data sample used in any official project.

 

 

 

 

 

 

 

 

 

 

For the well-known (constant) blank subtraction:

 

The cumulative distribution function ("well-known" blank) for the interpreted net plutonium activity in urines from a control population unexposed to plutonium other than from fallout is:

           

                                    w=X+b/m

                  Fx(X) =     òw®0 (1/sg,fyÖ2p)(1/w)exp[- (ln mw  - ln mg,fy)2/2sg,fy2] dw,

 

 

where the parameters for the combined controls are

 

                                     m = 1.073; mg,fy = 32; sg,fy = 0.72732; sg,fy2 = 0.529

 

Here, m is the slope of the calibration plot of blanks and spikes (in tracks/aCi); the remaining parameters are the median, standard deviation of the natural logarithms, and variance of the natural logarithms, of the corrected track (fy) distributions before blank subtraction.

 

For the case where a random, paired, blank is subtracted from each sample in a stable analytic process, the cumulative distribution of the urine activities becomes:

 

              

                                   B         X+(b/m)

                    F(X)  =  ò p(b) {ò pw(w) dw} db,                                                                      

                                   e           e

 

where (as for the first FxX) the value of  e for the lower limit of integration of each integral must be close enough to zero for convergence of the integration (but not = 0), and the upper limit B of the integration over db must be high enough to include the range over which the probability is very close to 1 of obtaining a value of b. 

 

Here, the probability that a plutonium activity, before subtraction of the paired blank activity, will be in the interval dw is:

 

                    pw(w) dw =  (1/sg,fyÖ2p)(1/w)exp[- (ln mw  - ln mg,fy)2/2sg,fy2 dw                

 

The probability that a blank will have the number of tracks b in the interval db is, using the maximum likelihood parameters to fit a lognormal function to the blank distribution,

 

                     p(b)db  =  (1/sbÖ2p)(1/b)exp[- (ln b  - ln mgb)2/2sb2 db                                   

 

The parameter  mgb is the median of the blank activity distribution, and sb is the standard deviation of the distribution of the natural logarithms of the blank activities. The analytically fit cumulative distributions F(X) were found in both cases to closely agree with the interpreted results that were reported as net activity of plutonium in the urine samples (See Appendix, Figures 4 and 5, and Exhibits A and B).

 

Conclusions

 

1.       The derivation of  the probability distributions of “interpreted” or “reported” results, when blanks are subtracted from variable measurements that range down near the blank values, can explain the shapes of underlying distributions of radioactivity in human samples.

2.       Lognormal distributions of activity in human samples are more “normal” than normal distributions, but the true distribution shapes must be inferred from measurements well above the levels of the appropriate blanks.

3.       Comparisons of the data in Figures 1-5 with other data in the literature indicate that the extrapolation of the straight line portions in Figures 4 and 5 would, in the absence of further data, yield a reasonable estimate of the distribution (about 1998) of urinary excretion of plutonium (239) in the general U. S. population exposed only to fallout plutonium. (Errors of measurement average out to some degree in the fitting of this overall distribution function.) This distribution could then be a reasonable assumed prior distribution for Bayesian analyses regarding whether a population is exposed to a given alternative increment of plutonium above that of the general population.

 

APPENDIX – DETAILED DERIVATION OF “INTERPRETED” DISTRIBUTIONS

 

Calibration Data

 

The calibration data in fission track analysis of plutonium (FTA) (Moorthy et al. 1988) consist of a series of measurements of fission tracks on quartz slides.  These slides contain residues from simulated synthetic urine blanks, and residues from “spikes” of these blanks to which known quantities of Pu-239 have been added. Checks are also made on the cleaned slide themselves, which usually produce only a few tracks. The slides are then irradiated with neutrons in a nuclear reactor, to produce tracks of fission fragments from the fission of the Pu-239 in each of the residues (Moorthy et al. 1988).  The chemical processes include detailed procedures for removing, as much as possible, any uranium or other fissionable material that would produce fission tracks interfering with the specific identification of tracks originating from plutonium fission. The blanks are composed of a mixture of chemicals that simulate the constituents of urine; they are also subjected to the same physico-chemical processes and deposited on the quartz slides in the same manner as the chemical residue from human urine samples.  The blanks thus yield residues chemically similar to those deposited from the human urine samples.  The blanks to which known quantities of plutonium have been added will be denoted by the shorter term “spikes”. A known mass quantity of plutonium is also deposited on one or two areas of the slides of a batch irradiation as a neutron flux monitor, to normalize to a total neutron fluence.  The complex physico-chemical processes leading to deposition of each blank and spike are standardized within a homogeneous batch of data; the dates for a given procedure are documented.

 

A simple (non-weighted) regresson line of tracks vs. amount of Pu in attocuries (aCi) was used as the calibration data for interpreting plutonium in human urine samples, since the absolute magnitude of the dispersion of y (track) values about the regression line was about the same vs. amount of Pu added, and were approximately normal (Gaussian) (see Figure 1).

 

The linear regression equation for the calibration line will thus be of the form:

 

                                             Y = mX + b                                                                                                                                       Eq. 1

 

For comparison, the slopes, intercepts and their 95% confidence limits obtained from several sets of data are tabulated below:

 

SAMPLE

 

SLOPE

(with 95% CI in parentheses)

INTERCEPT

(with 95% CI in parentheses)

191 blanks and spikes

m = 1.073 (1.033, 1.114)

b = 19.18 (14.95, 23.42)

172 blanks and spikes

m = 1.079 (1.036, 1.121)

b = 20.04 (15.56, 24.51)

115 spikes alone

m = 1.071 (1.003, 1.139)

b = 19.58 (10.47, 28.70)

 

This table shows that the calibration data provided a relatively stable line over the period of analyses used in this illustration.

 

.  Thus, for interpretation of human samples, the interpreted one-day excretion of plutonium in a given urine sample is obtained as:

 

                                             xj = (yj fj - b)/m                                                                                                                                  Eq. 2

 

where  xj = the amount of plutonium in a one-day excretion of urine, interpreted from

                   a sample of volume Vj  (actually measured by the proportional mass of the residue used for neutron irradiation),

            yj = the number of fission tracks observed on the standardized area of the slide

                   where the reduced sample was deposited,

            fj =  a  scaling factor to normalize the tracks from a human urine sample to a 24-hour volume, and to normalize to a standard

                   neutron fluence,

            b =  the intercept of Equation 1 obtained from regression analysis,

            m = the slope of Equation 1 obtained from regression analysis.

 

Derivation of the analytical cumulative distribution function of net plutonium activity (per 24 hour sample) calculated from Equation 2, when best estimates of b and m from prior calibration data are taken to be constants (CONSTANT BLANK CASE)

 

The distribution of x values as interpreted for the control urine samples is, from Equation 2, the same as the distribution of  yf/m - b/m.  First, the density function must be obtained for the distribution of yf.  The cumulative distribution of data values is shown in Figure 2 for the products yf, corrected tracks per 24-hour urines, are presented for a “control population” set of urines taken from persons living in several geographic regions. The distribution of tracks from simulated 24-hour urine blanks is seen in Figure 3 to also of lognormal shape. (Tests were made against other fits.)   Although regional differences are evident, the pooled sample is seen to be fit well by a lognormal cumulative line, which will be used in this example derivation. (Other analytical functions fitting the data can be used in other cases.) (Note: Although the counting of tracks would seem to yield discrete density functions, the functions are converted to continuous variates by the continuously varying correction factors and yields. Algorithms for the subtraction of two random variates, both Poisson distributed, were illustrated in Brodsky (1992); here, it is appropriate to consider the density functions of interest to be continuous functions.)

 

Estimates of the parameters of the lognormal fit can be made from graphs such as Figure 2.

 

Graphic estimate of mg = value on x axis where ordinate is 0.50 or 50%                                                                   Eq. 3

 

 

            Graphic estimate of sg =  x value for 0.8413/x value for 0.50 = x0.8413/x0.50

 

                                          or  =   x0.50/x0.1587 , or to obtain a better estimate by averaging,

 

                                                           =   0.5{(x0.8413/x0.50) + (x0.50/x0.1587)                                                                                Eq. 4

 

            Then, the variance in the natural logarithm of x over the lognormal distribution is

            (Brodsky 1982):

 

                                           s2  =  (ln sg)2                                                                                                                                        Eq. 5

 

The parameter sg has been given different names in the literature (Brodsky 1982); it will be called the “standard geometric deviation” here.  It is a very useful parameter, since it can be used to obtain a particular set of confidence intervals about the median (geometric mean) read on the logarithmic scale of the graphic plot); e.g., (mg/sg, mg x sg) provides a 95% “confidence interval”.  Note:  The symbol s is not here used for the sample estimator of a true s, but is defined above; to avoid a complexity of symbols, all parameters in this derivation should be understood to be derived from a sample of data.

 

The underlying probability density function for the distribution in corrected track readings for all controls, denoted by pfy(u) with u = fy, is now given by (using the formulation in Brodsky 1982):

 

                                 pfy(u) =  (1/sg,fyÖ2p)(1/u)exp[- (ln u  - ln mg,fy)2/2sg,fy2]                                                                                 Eq. 6

 

where the parameters mg,fy = 32, and sg,fy = (0.529)0.5 =  0.727 are obtained from the data fit in Figure 2.  These parameters were obtained as maximum likelhood estimates (geometric mean and variance of ln fy) and were in good agreement with parameters obtained from the graph.

 

The cumulative distribution function giving the probability that a value of fy will be less than or equal to u is then:

                                                 z=u

                                 Ffy(u) =  òz®0 pfy(z) dz                                                                                                                                     Eq. 7

 

Next, the distribution functions for the terms fy/m are obtained as follows:

 

Suppose that variate z has a pdf of pz(z) over the range z1 to z2.  Then, the probability that a value of z will occur in the differential interval dz in this range is pz(z) dz.  Now, consider a change in variate is made by multiplying z by a constant to obtain variates w = cz.  The new pdf for w will then be such that the probability of obtaining a value of cz in the interval d(cz) will be the same as that of obtaining z in the interval dz from which cz was derived.  That is, pw(w) dw = pz(z)dz.

 

Therefore, pw(w) dw = pw(w) d(cz) = c pw(w) dz, which equals pz(z)dz.  Thus, pw(w) = (1/c) pz(z).

 

Therefore, the pdf for (fy/m), letting w = fy/m, is

                                  pfy/m(w) = m pfy(u),

 

and the probability of obtaining a value of w in dw is m pfy(mw) dw.

 

The cumulative distribution function for w = fy/m is obtained as follows:

 

The running variable V  in Equation 7 becomes = mw.  Then,

 

                         when V = 0, w=0, and when V = u, w = u/m.

 

Replacing V  with mw in Equation 7, the probability that a value less than fy/m will be obtained is:

                                                 w=fy/m

                         Ffy/m (fv/m) =  òw®0 (1/sg,fyÖ2p)(1/mw)exp[- (ln mw  - ln mg,fy)2/2sg,fy2] d(mw)   

 

and canceling the m in numerator and denominator,

 

                                 w=fy/m

            Ffy/m (fy/m) =  òw®0 (1/sg,fyÖ2p)(1/w)exp[- (ln mw  - ln mg,fy)2/2sg,fy2] dw                                                                             Eq. 8

 

   

Now, the probability that an interpreted value of x will be less than or equal to some value X can be denoted by

 

                         Prob[ x = fy/m - b/m X],           (CAUTION: Some printers apparently do not print the less than or equals sign

                                                                                                       that belongs to the immediate left of the X in these inequalities.)

                                                                                                       

which is the same as

 

                         Prob[ fy/m X + (b/m)],

 

for a procedure involving a constant subtraction of the same interpreted (“best estimate”) blank b/m for each urine interpretation.

 

This probability is obtained using Equation 8 as:

 

                          Prob[fy/m X + (b/m)]  = Ffy/m (X + (b/m)),                                                                                                           Eq. 9

 

which is now the desired cumulative probability Fx(X) that an interpreted control population urine concentration will be less than or equal to X. 

 

The cumulative distribution function that best fits the combined urine concentrations for the control populations is thus:

                                   

                            w=X+(b/m)

                Fx(X) =  òw®0 (1/sg,fyÖ2p)(1/w)exp[- (ln mw  - ln mg,fy)2/2sg,fy2] dw,                                                                                Eq. 10

 

where the parameters for the combined controls are again

 

                          m = 1.073; mg,fy = 32; sg,fy = 0.72732; sg,fy2 = 0.529

 

Since the lognormal functions of Equations 8-10 are not defined at their lower limits of integration (as a result, e.g., of w in the denominator and non-convergence of the integrand),  the value of the integrand must avoid zero.  The calculations of the cumulative probabilities using  Equation 10 to obtain the cumulative distribution for interpreted control population urine concentrations were carried out using MATHCAD (see Exhibit A) for the case where b and m are maintained constant for each interpretation of urine concentration.  The lower limit of integration was reduced until the integral converged. The resulting cumulative distribution F(x) is seen in Figure 4 to agree very closely with the plot of the cumulative urinary concentration distribution for all controls that was provided as results from the analytical laboratory (BNL). Figure 5 shows the original and clearer hand plot of the results in Figure 4, using log-probability paper (which is no longer available for sale, unfortunately).

 

 

Derivation of the analytical cumulative distribution function of net plutonium activity (per 24 hour sample) in combined controls, when m is the constant slope of the calibration line, but the intercept b is taken as the number of tracks on a paired blank that varies randomly as the lognormal distribution  in Figure 3 (PAIRED BLANK CASE)

 

A convolution integral in which b is a random variable is evaluated in MATHCAD (Exhibit B)  to obtain the desired analytical expression for the cumulative distribution function (CDF) for combined control urines.  As shown in Figures 4and 5, the resulting cumulative distribution function represents the interpreted activities even better than the CDF obtained in the previous section with constant values of b and m.  However, the treatment of b as a random variable, rather than as a constant, does not appreciably increase the spread in the resulting distribution of interpreted values, probably because the distribution of b is narrower than the distribution in control urines, and also because the positive and negative variations of b about the central value would tend to cancel in the data or in the theory describing the resulting urine activity distribution. 

 

In order to illustrate the concepts in deriving the cumulative distribution functions, the convolution integral in Exhibit B is derived below as an example. The convolution is performed assuming the lognormal distribution in b as a paired blank will be that observed in Figure 3, but with m as a constant. The slope m obtained from cumulative calibration data in a well-controlled analytical system varies much less than the number of tracks in a paired blank

 

In order to derive the cumulative distribution function, which is the probability that a control urine value x will be less than some chosen fixed value X, we need to integrate (add up) for each value of b in the (infinitesimal) interval db the probability that the associated (paired) value of fy/m will be such that fy/m X + (b/m); for such an inequality, by subtracting (b/m) from both sides, we see that the interpreted value will be fy/m - b/m X.  Thus, the integral over w for a given value of b must be from infinitesimally close to zero up to X + b/m.  Then, after obtaining the sum for each value of b, we need to add over all possible values of b, from near zero to effectively a high-enough value of b so that the integral of probability over b is essentially the same as if the computer covered a range of b up to infinity.  Convergence can be tested by determining whether further increasing width of the integration intervals produces any further increases in the result.

 

The probability density function (pdf) for the distribution of fy/m is obtained from the integrand of Equation 8, using a notation of pw(w) for convenience of notation in MATHCAD, where w is the dummy variable representing fy/m, and pw is the specific functional form for the pdf of fy/m.  Thus, the probability that a plutonium activity, before subtraction of the paired blank activity, will be in the interval dw is:

 

                                pw(w) dw =  (1/sg,fyÖ2p)(1/w)exp[- (ln mw  - ln mg,fy)2/2sg,fy2 dw                                                            Eq. 11

 

The probability that a blank will have the number of tracks b in the interval db is, using the maximum likelihood parameters of Figure 3  to fit a lognormal function to the blank distribution,

 

                     p(b)db  =  (1/sbÖ2p)(1/b)exp[- (ln b  - ln mgb)2/2sb2 db                                                                                           Eq. 12

 

(Since no further pdfs will be used in this calculation, simplified symbols are used here for p(b), and for the geometric mean and standard deviation of ln b in the fitted distribution.)

 

The cumulative distribution function for interpreted urine values then becomes the integral over dw for each b value, followed by the integral over all b values, giving the probability of pairs of tracks from the urine sample analysis and the paired control such that the interpreted value is less than X:

 

                                    B         X+(b/m)

                     F(X)  =  ò p(b) {ò pw(w) dw} db,                                                                                                                               Eq. 13

                                   e           e

 

where the value of  e  for the lower limit of integration of each integral must be close enough to zero for convergence of the integration, and the upper limit B of the integration over db must be high enough to include the range over which the probability is very close to 1 of obtaining a value of b. The values of the parameters for each distribution are given in Exhibit B.  The resulting values of F(X), for interpreted values of X from 5, 10, 15....100 aCi plutonium per 24-hour sample, are printed in Exhibit B to the lower right of the formula for Fk, which is the representation of F(X) in Equation 13 in the MATHCAD notation required for computation for a range of values of X.  These values are seen to almost exactly overlap the interpreted control urine cumulative distribution as reported and as plotted in Figures 4 and 5.

 

These results now explain why some determinations of what might be natural lognormal distributions in human urine or tissue samples (Schubert et al. 1967; McInroy et al. 1979), curve to the left away from the straight-line representation of a lognormal distribution on a probability vs. log plot.  The abundant tissue samples of McInroy et al. (1979) all present this type of plot, probably because for very low level measurements, the subtraction of a random or fixed blank from varying and low sample measurements tends to provide small or negative results that tend toward negative infinity on a log x-axis.

 

When the systemic burdens were calculated from the data of McInroy by Jackson (2000) using the straight line portions of the lognormal plots, and then scaled back and forward in time using intake retention functions for the system and for urinary excretion of Lessard et al. (1987) for periods of intake and excretion of plutonium from fallout, respectively, the urinary excretion predicted for the U. S. population was in reasonable agreement with the mean excretion of the control population represented in Figures 4 and 5.  Although the limited population utilized to illustrate methods in this paper can not be assumed to be representative of the U. S. population for any particular purpose, these results and the consistent behavior of the FTA data during the analytical period correspond to this data, indicate that FTA analysis is capable under good quality assurance conditions of measuring plutonium in human urine samples. 

 

 The data of Figures 1-3 indicate, using the lognormal shape, that the decision level (0.05 probability of Type I error) would be about 37 aCi  (for this pre-October 1998 data) to decide that any sample measurement is positive with respect to a blank.  An a priori “minimum detectable amount”  (MDA) obtained by the simple formula based on normal statistics as derived by Currie (Brodsky 1986), for a Type II error probability of 0.05 would be about 74 aCi, close to that estimated in  Boecker et al. (1991).  However, any further definition of MDAs must take into account the actual shape of error distributions and the shapes of human sample distributions, and carefully consider  what question is asked: 1) Is a single sample measurement greater  than a blank?; 2) Is a single sample measurement greater than that from an unexposed control population of interest?; or 3) Is a distribution of measurements different (greater in distribution) than a comparison distribution of interest?….

 

References

 

1.        B. Boecker, R. Hall, K. Inn, J. Lawrence, P. Ziemer, G. Eisele, B. Wachholz and W. Burr, Jr., “Current Status of Bioassay Procedures to Detect and Quantify Previous Exposures to Radioactive Materials,” Health Phys. 60, Supplement 1, 45-100, 1991.

2.        A. Brodsky, “Statistical Methods of Data Analysis,” in A. Brodsky, editor, Handbook of Radiation Measurement and Protection, Vol. II, Biological and Mathematical Information, CRC Press, Boca Raton, Florida, 1982, pp. 261-329.

3.        A. Brodsky, “Accuracy and Detection Limits for Bioassay Measurements in Radiation Protection – Statistical Considerations,” NUREG-1156, U. S. Nuclear Regulatory Commission, Washington, DC, 1986.

4.        A. Brodsky and N. M. Barss, “Mathematical Analytic Distributions to Fit Radioanalytic Results,” Poster presented at the 45th Conference on Bioassay, Analytical and Environmental Radiochemistry, National Institute of Standards and Technology, Gaithersburg, Maryland, October 18-22, 1999.

5.        A. Brodsky, D. M. Schaefer, S. O’Toole, E. Kaplan, N. M. Barss, J. Dancz, W. J. Klemm, D. A. Raine, III and J. Stiver, “Statistical Model for Fission Track Analysis of Plutonium in Human Samples,” Health Phys. 76(6), S175, 1999.

6.        A. Brodsky, “Cumulative Lognormal Distributions of Dose-Response vs. Dose,” in the proceedings of the 10th International Congress of the International Radiation Protection Association, Hiroshima, Japan, May 14-19, 2000.

7.        A. Brodsky, “Exact Calculation of Probabilities of False Positives and False Negatives for Low Background Counting,” Health Phys. 63(2), 198-204,1992.

8.        E. Jackson, “Assessing the U. S. Population Dose from Fallout Plutonium Using Physiological  Models of Uptake, Retention and Excretion,” Master of Science Thesis submitted to the Faculty of the Graduate School of Arts and Sciences, Georgetown University, Washington, DC, July 20, 2000.

9.        E. T. Lessard, X. Yihua, K. W. Skrable, G. E. Chabot, C. S. French, T. R. Labone, J. R. Johnson, D. R. Fisher, R. Belanger and J. Landmann-Lipzstein, “Interpretation of Bioassay Measurements,” NUREG/CR-4884, U. S. Nuclear Regulatory Commission, Washington, DC, 1987.

10.     A. R. Moorthy, C. J. Shopfer and S. Banerjee, “Plutonium from Atmospheric Weapons Testing: Fission Track Analysis of Urine Samples,” Anal. Chem. 60, 857A, 1988.

11.     J. F. McInroy, E. E. Campbell, W. D. Moss, G. L. Tietgen, B. C. Eutsler and H. A. Boyd, “Plutonium in Autopsy Tissue,” Health Phys. 37, 1-136, 1979.

12.     J. Schubert, A. Brodsky and S. Tyler, “The Log-Normal Function as a Stochastic Model of the Distribution of Sr-90 and Other Fission Products in Humans,” Health Phys. 13, 1187-1204, 1967.