Original File Name: VanDyke trap area jackknife anova.txt

 04/30/2020

                           OR-DBM MRMC 2.51

                     <beta> Build  20181028 </beta>

              MULTIREADER-MULTICASE ROC ANALYSIS OF VARIANCE
                          TRAPEZOIDAL AREA ANALYSIS
 
 
 |====================================================================|
 |*****                         Credits                          *****|
 |====================================================================|
 |                                                                    |
 | ANOVA Computations & Display:                                      |
 | -----------------------------                                      |
 | Kevin M. Schartz, Stephen L. Hillis, Lorenzo L. Pesce, &           |
 | Kevin S. Berbaum                                                   |
 |                                                                    |
 | Expected Utility Computations:                                     |
 | -----------------------------                                      |
 | Craig K. Abbey                                                     |
 |                                                                    |
 |====================================================================|
 
 
 
 |====================================================================|
 |***************************     NOTE     ***************************|
 |====================================================================|
 | The user agreement for this software stipulates that any           |
 | publications based on research data analyzed using this software   |
 | must cite references 1-5 given below.                              |
 |                                                                    |
 | Example of citing the software:                                    |
 |                                                                    |
 |      Reader performance analysis was performed using the software  |
 | package OR-DBM MRMC 2.51, written by Kevin M. Schartz, Stephen L.  |
 | Hillis, Lorenzo L. Pesce, and Kevin S. Berbaum, and freely         |
 | available at http://perception.radiology.uiowa.edu. This program   |
 | is based on the methods initially proposed by Dorfman, Berbaum,    |
 | and Metz [1] and Obuchowski and Rockette [2] and later unified and |
 | improved by Hillis and colleagues [3-5].                           |
 |====================================================================|
 
 Data file: \\vmware-host\Shared Folders\VanDykeAnalyzed\VanDyke.lrc
 
 2 treatments, 5 readers, 114 cases (69 normal, 45 abnormal)
 
 Curve fitting methodology is TRAPEZOIDAL/WILCOXON
 Dependent variable is AUC
 
 Study Design:  Factorial
 Covariance Estimation Method:  Jackknifing
 
 ===========================================================================
 *****                            Estimates                            *****
 ===========================================================================
 
TREATMENT x READER AUC ESTIMATES

                  TREATMENT
           -----------------------
 READER         1            2
 ------    ----------   ----------   
     1     0.91964573   0.94782609
     2     0.85877617   0.90531401
     3     0.90386473   0.92173913
     4     0.97310789   0.99935588
     5     0.82979066   0.92995169
 

 TREATMENT AUC MEANS (averaged across readers)
 ---------------------------------------------
       1      0.89703704
       2      0.94083736
 
 

 TREATMENT AUC MEAN DIFFERENCES
 ------------------------------
     1 - 2    -0.04380032
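
 Note: The treatment means and their difference can be checked directly from
 the TREATMENT x READER table. The following is a minimal Python sketch; the
 variable names are illustrative assumptions and are not part of OR-DBM MRMC.

```python
# Reader-level trapezoidal AUCs copied from the TREATMENT x READER table.
# Keys are treatment numbers; list positions are readers 1..5.
auc = {
    1: [0.91964573, 0.85877617, 0.90386473, 0.97310789, 0.82979066],
    2: [0.94782609, 0.90531401, 0.92173913, 0.99935588, 0.92995169],
}

# Each treatment mean is the simple average of the reader AUCs.
means = {trt: sum(vals) / len(vals) for trt, vals in auc.items()}
diff = means[1] - means[2]

print(f"treatment 1 mean: {means[1]:.8f}")
print(f"treatment 2 mean: {means[2]:.8f}")
print(f"difference 1 - 2: {diff:.8f}")
```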
 
 
 
 ===========================================================================
 *****            ANOVA Tables (OR analysis of reader AUCs)            *****
 ===========================================================================
 
 TREATMENT X READER ANOVA of AUCs
 (Used for global test of equal treatment AUCs and for treatment differences
  confidence intervals in parts (a) and (b) of the analyses)
 
Source            SS               DF             MS    
------   --------------------    ------   ------------------
     T             0.00479617         1           0.00479617
     R             0.01534480         4           0.00383620
   T*R             0.00220412         4           0.00055103
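
 Note: The sums of squares above follow from an ordinary two-way
 (treatment x reader) ANOVA with one AUC per cell. A minimal Python sketch,
 assuming the AUCs are copied from the estimates table (names are illustrative):

```python
# Reader AUCs from the estimates table: one row per reader,
# columns are treatments 1 and 2.
auc = [
    [0.91964573, 0.94782609],
    [0.85877617, 0.90531401],
    [0.90386473, 0.92173913],
    [0.97310789, 0.99935588],
    [0.82979066, 0.92995169],
]
r, t = len(auc), len(auc[0])
grand = sum(sum(row) for row in auc) / (r * t)
t_mean = [sum(row[i] for row in auc) / r for i in range(t)]
r_mean = [sum(row) / t for row in auc]

# Main-effect and interaction sums of squares.
ss_t = r * sum((m - grand) ** 2 for m in t_mean)
ss_r = t * sum((m - grand) ** 2 for m in r_mean)
ss_tr = sum(
    (auc[j][i] - t_mean[i] - r_mean[j] + grand) ** 2
    for j in range(r)
    for i in range(t)
)

# Mean squares = sums of squares divided by their degrees of freedom.
ms_t = ss_t / (t - 1)
ms_r = ss_r / (r - 1)
ms_tr = ss_tr / ((t - 1) * (r - 1))

print(f"T   {ss_t:.8f} {t - 1} {ms_t:.8f}")
print(f"R   {ss_r:.8f} {r - 1} {ms_r:.8f}")
print(f"T*R {ss_tr:.8f} {(t - 1) * (r - 1)} {ms_tr:.8f}")
```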
 
 
 
 READER ANOVAs of AUCs for each treatment
 (Used for single treatment confidence intervals in part (c) of the analyses)
 

                        Mean Squares
 Source     df   Treatment 1   Treatment 2
 ------    ---   -----------   -----------
      R      4    0.00308263    0.00130460
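
 Note: With a single factor (reader), MS(R) for one treatment is the sample
 variance of that treatment's reader AUCs. A minimal Python sketch, assuming
 the AUCs are copied from the estimates table (names are illustrative):

```python
from statistics import variance

# Reader AUCs for each treatment, copied from the estimates table.
auc_trt1 = [0.91964573, 0.85877617, 0.90386473, 0.97310789, 0.82979066]
auc_trt2 = [0.94782609, 0.90531401, 0.92173913, 0.99935588, 0.92995169]

# statistics.variance uses the (n - 1) divisor, matching df = r - 1 = 4.
ms_r_trt1 = variance(auc_trt1)
ms_r_trt2 = variance(auc_trt2)

print(f"MS(R), treatment 1: {ms_r_trt1:.8f}")
print(f"MS(R), treatment 2: {ms_r_trt2:.8f}")
```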
 
 
 ===========================================================================
 *****        Variance component and error-covariance estimates        *****
 ===========================================================================
 
 Obuchowski-Rockette variance component and covariance estimates
 (for sample size estimation for future studies)
 Note: These are ANOVA estimates which can be negative
 
     OR Component             Estimate         Correlation  
 -----------------------  ----------------  ----------------
 Var(R)                         0.00153500
 Var(T*R)                       0.00020040
 COV1                           0.00034661        0.43203138
 COV2                           0.00034407        0.42886683
 COV3                           0.00023903        0.29793328
 Var(Error)                     0.00080229
 
 
 Corresponding DBM variance component and covariance estimates
 
     DBM Component            Estimate    
 -----------------------  ----------------
 Var(R)                         0.00153500
 Var(C)                         0.02724923
 Var(T*R)                       0.00020040
 Var(T*C)                       0.01197530
 Var(R*C)                       0.01226473
 Var(T*R*C) + Var(Error)        0.03997160
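
 Note: The tabulated correlations equal each covariance divided by Var(Error),
 and the DBM case-related components are reproduced (up to rounding) by
 scaling OR covariance contrasts by the number of cases c = 114. The Python
 sketch below only illustrates these relationships with the rounded values
 printed above; the names are illustrative assumptions.

```python
# OR estimates copied from the table above (rounded to 8 decimals).
var_error = 0.00080229
cov1, cov2, cov3 = 0.00034661, 0.00034407, 0.00023903
c = 114  # total number of cases

# Correlation column: covariance divided by the error variance.
corr = {name: cov / var_error
        for name, cov in (("COV1", cov1), ("COV2", cov2), ("COV3", cov3))}

# DBM case-related components expressed through the OR estimates.
dbm = {
    "Var(C)": c * cov3,
    "Var(T*C)": c * (cov2 - cov3),
    "Var(R*C)": c * (cov1 - cov3),
    "Var(T*R*C) + Var(Error)": c * (var_error - cov1 - cov2 + cov3),
}

for name, value in corr.items():
    print(f"{name:<24s} correlation {value:.6f}")
for name, value in dbm.items():
    print(f"{name:<24s} {value:.6f}")
```

 Small differences in the last printed digits are expected because the inputs
 are rounded.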
 
 
 ===========================================================================
 *****    Analysis 1 (OR Analysis): Random Readers and Random Cases    *****
 ===========================================================================
 (Results apply to the population of readers and cases)


    a) Test for H0: Treatments have the same AUC
 
 Source        DF      Mean Square    F value  Pr > F 
 ----------  ------  ---------------  -------  -------
 Treatment        1       0.00479617     4.46   0.0517
 Error term   15.26       0.00107626
 Error term = MS(T*R) + r*max[Cov2 - Cov3,0]
 
 Conclusion: The treatment AUCs are not significantly different [F(1,15) = 4.46, p = .0517].
 
 Df(error term) = [MS(T*R) + r*max(Cov2 - Cov3,0)]**2/{MS(T*R)**2/[(t-1)(r-1)]}
 Note: "Error term" is the denominator of the F statistic and is a linear
 combination of mean squares, as defined above.  The value of this linear 
 combination is given under the "Mean Square" column
 Note: Df(error term) is called "ddf_H" in Hillis (2007).
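
 Note: The error term, F value, and Df(error term) can be recomputed from the
 formulas above. A minimal Python sketch using the rounded values printed in
 this report (names are illustrative; the p-value is not recomputed because
 it requires the F distribution):

```python
t, r = 2, 5                      # treatments, readers
ms_t = 0.00479617                # MS(T) from the ANOVA table
ms_tr = 0.00055103               # MS(T*R) from the ANOVA table
cov2, cov3 = 0.00034407, 0.00023903

# Error term = MS(T*R) + r*max(Cov2 - Cov3, 0)
error_term = ms_tr + r * max(cov2 - cov3, 0.0)
f_value = ms_t / error_term

# Df(error term) = error_term**2 / {MS(T*R)**2 / [(t-1)(r-1)]}
ddf_h = error_term ** 2 / (ms_tr ** 2 / ((t - 1) * (r - 1)))

print(f"error term = {error_term:.8f}")
print(f"F          = {f_value:.2f}")
print(f"ddf_H      = {ddf_h:.2f}")
```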
 

    b) 95% confidence intervals and hypothesis tests (H0: difference = 0)
       for treatment AUC differences
 
 Treatment
 Comparison  Difference   StdErr      DF      t     Pr >|t|          95% CI       
 ----------  ----------  --------  -------  ------  -------  ---------------------
   1 - 2       -0.04380   0.02075    15.26   -2.11   0.0517  (-0.08796 ,  0.00036)
 
 StdErr = sqrt{(2/r)*[MS(T*R) + r*max(Cov2 - Cov3,0)]}
 Df same as df(error term) from (a)
 95% CI: Difference +- t(.025;df) * StdErr
 

    c) Single-treatment 95% confidence intervals
       (Each analysis is based only on data for the specified treatment, i.e., 
       on the treatment-specific reader ANOVA of AUCs and Cov2 estimates.)
 
  Treatment      AUC      Std Err       DF     95% Confidence Interval      Cov2   
 ----------  ----------  ----------  -------  -------------------------  ----------
          1  0.89703704  0.03317360    12.74  (0.82522360 , 0.96885048)  0.00048396
          2  0.94083736  0.02156637    12.71  (0.89413783 , 0.98753689)  0.00020419
 
 StdErr = sqrt{1/r * [MS(R) + r*max(Cov2,0)]}
 Df = [MS(R) + r*max(Cov2,0)]**2/[MS(R)**2/(r-1)]
 Note: Df is called "ddf_H_single" in Hillis (2007)
 95% CI: AUC +- t(.025;df) * StdErr
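
 Note: The single-treatment standard errors and degrees of freedom can be
 recomputed from MS(R) and Cov2 for each treatment. A minimal Python sketch
 with the rounded values from this report (names are illustrative; the
 confidence limits are omitted because they need a t quantile for fractional
 degrees of freedom):

```python
import math

r = 5  # readers
# Per treatment: (MS(R), Cov2), copied from the tables in this report.
inputs = {1: (0.00308263, 0.00048396), 2: (0.00130460, 0.00020419)}

results = {}
for trt, (ms_r, cov2) in inputs.items():
    term = ms_r + r * max(cov2, 0.0)
    std_err = math.sqrt(term / r)            # sqrt{1/r * [MS(R) + r*max(Cov2,0)]}
    df = term ** 2 / (ms_r ** 2 / (r - 1))   # ddf_H_single
    results[trt] = (std_err, df)
    print(f"treatment {trt}: StdErr = {std_err:.8f}, Df = {df:.2f}")
```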
 
 
 ===========================================================================
 *****    Analysis 2 (OR Analysis): Fixed Readers and Random Cases     *****
 ===========================================================================
 (Results apply to the population of cases but only for the readers used in
 this study. Chi-square or Z tests are used; these are appropriate for 
 moderate or large case sample sizes.)
 
    a) Chi-square test for H0: Treatments have the same AUC
    Note: The chi-square statistic is denoted by X2 or by X2(df), where df is its 
    corresponding degrees of freedom.
 
 
     X2 value       DF    Pr > X2
 ---------------  ------  -------
         5.47595       1   0.0193
 
 Conclusion: The treatment AUCs are not equal [X2(1) = 5.48, p = .0193].
 
 X2 = (t-1)*MS(T)/[Var(error) - Cov1 + (r-1)*max(Cov2 - Cov3,0)]
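
 Note: The chi-square statistic and its p-value can be recomputed from the
 formula above. A minimal Python sketch with the rounded values from this
 report (names are illustrative); the p-value shortcut used here is valid
 only for 1 degree of freedom:

```python
import math

t, r = 2, 5
ms_t = 0.00479617
var_error, cov1 = 0.00080229, 0.00034661
cov2, cov3 = 0.00034407, 0.00023903

# X2 = (t-1)*MS(T) / [Var(error) - Cov1 + (r-1)*max(Cov2 - Cov3, 0)]
denom = var_error - cov1 + (r - 1) * max(cov2 - cov3, 0.0)
x2 = (t - 1) * ms_t / denom

# With df = 1, the chi-square upper tail equals erfc(sqrt(x/2)).
p_value = math.erfc(math.sqrt(x2 / 2))

print(f"X2 = {x2:.3f}, p = {p_value:.4f}")
```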


    b) 95% confidence intervals and hypothesis tests (H0: difference = 0)
       for treatment AUC differences
 
 Treatment
 Comparison  Difference   StdErr     z     Pr >|z|          95% CI       
 ----------  ----------  --------  ------  -------  ---------------------
   1 - 2       -0.04380   0.01872   -2.34   0.0193  (-0.08049 , -0.00711)
 
 StdErr = sqrt{2/r * [Var(error) - Cov1 + (r-1)*max(Cov2 - Cov3,0)]}
 95% CI: difference +- z(.025) * StdErr
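
 Note: The fixed-reader standard error, z statistic, and confidence interval
 can be recomputed as follows. This is a minimal Python sketch with the
 rounded values from this report; the names are illustrative assumptions.

```python
import math
from statistics import NormalDist

r = 5
diff = -0.04380032
var_error, cov1 = 0.00080229, 0.00034661
cov2, cov3 = 0.00034407, 0.00023903

# StdErr = sqrt{2/r * [Var(error) - Cov1 + (r-1)*max(Cov2 - Cov3, 0)]}
std_err = math.sqrt(2 / r * (var_error - cov1 + (r - 1) * max(cov2 - cov3, 0.0)))
z = diff / std_err
p_value = 2 * NormalDist().cdf(-abs(z))   # two-sided normal p-value

z_crit = NormalDist().inv_cdf(0.975)      # z(.025), about 1.96
ci = (diff - z_crit * std_err, diff + z_crit * std_err)

print(f"StdErr = {std_err:.5f}, z = {z:.2f}, p = {p_value:.4f}")
print(f"95% CI = ({ci[0]:.5f}, {ci[1]:.5f})")
```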
 

    c) Single treatment AUC 95% confidence intervals
       (Each analysis is based only on data for the specified treatment, i.e., on
        the specific reader ANOVA of AUCs and error-variance and Cov2 estimates.)
 
  Treatment      AUC      Std Error   95% Confidence Interval 
 ----------  ----------  ----------  -------------------------
          1  0.89703704  0.02428971  (0.84943008 , 0.94464399)
          2  0.94083736  0.01677632  (0.90795637 , 0.97371835)
 
  Treatment  Var(Error)     Cov2   
 ----------  ----------  ----------
          1  0.00101410  0.00048396
          2  0.00059047  0.00020419
 
 StdErr = sqrt{1/r * [Var(error) + (r-1)*max(Cov2,0)]}
 95% CI: AUC +- z(.025) * StdErr
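
 Note: The single-treatment standard errors and z-based intervals can be
 recomputed from the treatment-specific Var(Error) and Cov2. A minimal Python
 sketch with the rounded values from this report (names are illustrative):

```python
import math
from statistics import NormalDist

r = 5
z_crit = NormalDist().inv_cdf(0.975)  # z(.025)

# Per treatment: (AUC, Var(Error), Cov2), copied from the tables above.
inputs = {
    1: (0.89703704, 0.00101410, 0.00048396),
    2: (0.94083736, 0.00059047, 0.00020419),
}

results = {}
for trt, (auc, var_error, cov2) in inputs.items():
    # StdErr = sqrt{1/r * [Var(error) + (r-1)*max(Cov2, 0)]}
    std_err = math.sqrt((var_error + (r - 1) * max(cov2, 0.0)) / r)
    ci = (auc - z_crit * std_err, auc + z_crit * std_err)
    results[trt] = (std_err, ci)
    print(f"treatment {trt}: StdErr = {std_err:.8f}, "
          f"95% CI = ({ci[0]:.6f}, {ci[1]:.6f})")
```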


    d) Single-reader 95% confidence intervals and tests (H0: difference = 0) for 
    treatment AUC differences.
       (Each analysis is based only on data for the specified reader, i.e., on the 
        reader-specific AUC, error-variance and Cov1 estimates.)
 
         Treatment
 Reader  Comparison  Difference  StdErr      z     Pr >|z|          95% CI       
 ------  ----------  ----------  --------  ------  -------  ---------------------
      1    1 - 2       -0.02818   0.02551   -1.10   0.2693  (-0.07818 ,  0.02182)
      2    1 - 2       -0.04654   0.02630   -1.77   0.0768  (-0.09809 ,  0.00501)
      3    1 - 2       -0.01787   0.03121   -0.57   0.5668  (-0.07904 ,  0.04330)
      4    1 - 2       -0.02625   0.01729   -1.52   0.1290  (-0.06014 ,  0.00764)
      5    1 - 2       -0.10016   0.04406   -2.27   0.0230  (-0.18651 , -0.01381)
 
 Reader  Var(Error)     Cov1   
 ------  ----------  ----------
      1  0.00069890  0.00037347
      2  0.00110605  0.00076016
      3  0.00084234  0.00035532
      4  0.00015058  0.00000108
      5  0.00121357  0.00024304
 
 StdErr = sqrt[2*(Var(error) - Cov1)]
 95% CI: Difference +- z(.025) * StdErr
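
 Note: Each single-reader test uses only that reader's AUC difference,
 Var(Error), and Cov1. A minimal Python sketch with values copied from the
 tables in this report (names are illustrative assumptions):

```python
import math
from statistics import NormalDist

# Per reader: (AUC treatment 1, AUC treatment 2, Var(Error), Cov1).
readers = {
    1: (0.91964573, 0.94782609, 0.00069890, 0.00037347),
    2: (0.85877617, 0.90531401, 0.00110605, 0.00076016),
    3: (0.90386473, 0.92173913, 0.00084234, 0.00035532),
    4: (0.97310789, 0.99935588, 0.00015058, 0.00000108),
    5: (0.82979066, 0.92995169, 0.00121357, 0.00024304),
}
z_crit = NormalDist().inv_cdf(0.975)  # z(.025)

results = {}
for reader, (auc1, auc2, var_error, cov1) in readers.items():
    diff = auc1 - auc2
    std_err = math.sqrt(2 * (var_error - cov1))   # sqrt[2*(Var(error) - Cov1)]
    z = diff / std_err
    p_value = 2 * NormalDist().cdf(-abs(z))       # two-sided normal p-value
    ci = (diff - z_crit * std_err, diff + z_crit * std_err)
    results[reader] = (diff, std_err, z, p_value, ci)
    print(f"{reader}  {diff:.5f}  {std_err:.5f}  {z:.2f}  {p_value:.4f}  "
          f"({ci[0]:.5f}, {ci[1]:.5f})")
```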
 
 
 ===========================================================================
 *****    Analysis 3 (OR Analysis): Random Readers and Fixed Cases     *****
 ===========================================================================
 (Results apply to the population of readers but only for the cases used in
 this study)

     These results are obtained from the OR model by treating reader as a random 
 factor and treatment and case as fixed factors.  Because case is treated as a fixed
 factor, it follows that Cov1 = Cov2 = Cov3 = 0; i.e., there is no correlation
 between reader-performance measures (e.g., AUCs) due to reading the same
 cases.  Thus the OR model reduces to a conventional treatment x reader ANOVA
 for the reader-performance outcomes, where reader is a random factor and
 treatment is a fixed factor.  This is the same as a repeated measures ANOVA
 where treatment is the repeated measures factor, i.e., readers provide an
 outcome (e.g., AUC) under each treatment.
     Note that the DBM and OR papers do not discuss this approach; it is
 included here for completeness.

    a) Test for H0: Treatments have the same AUC
 
 Source        DF    Mean Square      F value  Pr > F 
 ----------  ------  ---------------  -------  -------
 Treatment        1       0.00479617     8.70   0.0420
 T*R              4       0.00055103
 
 Conclusion: The treatment AUCs are not equal [F(1,4) = 8.70, p = .0420].
 Note: If there are only 2 treatments, this is equivalent to a paired t-test applied
 to the AUCs


    b) 95% confidence intervals and hypothesis tests (H0: difference = 0)
       for treatment AUC differences
 
 Treatment
 Comparison  Difference   StdErr      DF      t     Pr >|t|          95% CI       
 ----------  ----------  --------  -------  ------  -------  ---------------------
   1 - 2       -0.04380   0.01485        4   -2.95   0.0420  (-0.08502 , -0.00258)
 
 StdErr = sqrt[2/r * MS(T*R)]
 DF = df[MS(T*R)] = (t-1)(r-1)
 95% CI: Difference +- t(.025;df) * StdErr
 Note: If there are only 2 treatments, this is equivalent to a paired t-test applied
 to the AUCs
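
 Note: The paired t-test equivalence can be checked directly from the reader
 AUCs: the paired t statistic matches the t value above, and its square
 matches the F value in part (a). A minimal Python sketch (names are
 illustrative assumptions):

```python
import math
from statistics import mean, stdev

# Reader AUCs copied from the estimates table (readers 1..5).
auc_trt1 = [0.91964573, 0.85877617, 0.90386473, 0.97310789, 0.82979066]
auc_trt2 = [0.94782609, 0.90531401, 0.92173913, 0.99935588, 0.92995169]

# Paired differences, one per reader.
d = [a - b for a, b in zip(auc_trt1, auc_trt2)]
r = len(d)

std_err = stdev(d) / math.sqrt(r)   # equals sqrt[2/r * MS(T*R)] when t = 2
t_stat = mean(d) / std_err          # paired t with r - 1 = 4 df

print(f"StdErr = {std_err:.5f}, t = {t_stat:.2f}, t^2 = {t_stat ** 2:.2f}")
```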
 

    c) Single treatment AUC 95% confidence intervals
       (Each analysis is based only on data for the specified treatment, 
       i.e., on the treatment-specific reader ANOVA of AUCs.)
 
  Treatment      AUC        MS(R)     Std Error     DF     95% Confidence Interval 
 ----------  ----------  ----------  ----------  -------  -------------------------
          1  0.89703704  0.00308263  0.02482994        4  (0.82809808 , 0.96597599)
          2  0.94083736  0.00130460  0.01615303        4  (0.89598936 , 0.98568536)
 
 StdErr = sqrt[1/r * MS(R)]
 DF = df[MS(R)] = r-1
 95% CI: AUC +- t(.025;df) * StdErr
 Note: this is the conventional CI, treating the reader AUCs as a random sample.
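
 Note: The conventional intervals above can be recomputed from MS(R). The
 Python sketch below is illustrative only; it hard-codes the critical value
 t(.025;4) = 2.776445 from standard t tables (an assumption of the sketch,
 since the Python standard library has no t quantile function).

```python
import math

r = 5
T_CRIT_DF4 = 2.776445  # t(.025; df = 4), taken from standard t tables

# Per treatment: (AUC, MS(R)), copied from the table above.
inputs = {1: (0.89703704, 0.00308263), 2: (0.94083736, 0.00130460)}

results = {}
for trt, (auc, ms_r) in inputs.items():
    std_err = math.sqrt(ms_r / r)   # sqrt[1/r * MS(R)]
    ci = (auc - T_CRIT_DF4 * std_err, auc + T_CRIT_DF4 * std_err)
    results[trt] = (std_err, ci)
    print(f"treatment {trt}: StdErr = {std_err:.8f}, "
          f"95% CI = ({ci[0]:.6f}, {ci[1]:.6f})")
```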
 
 
 
 #=> Reference resources are missing. Default references provided. <=#
 


                               REFERENCES

 1.   Dorfman, D.D., Berbaum, K.S., & Metz, C.E. (1992). Receiver operating
 characteristic rating analysis: Generalization to the population of 
 readers and patients with the jackknife method. Investigative Radiology,
 27, 723-731.

 2.   Obuchowski, N.A., & Rockette, H.E. (1995). Hypothesis testing of diagnostic
 accuracy for multiple readers and multiple tests: An ANOVA approach with dependent
 observations. Communications in Statistics-Simulation and Computation, 24, 285-308.

 3.   Hillis, S.L., Obuchowski, N.A., Schartz, K.M., & Berbaum, K.S.
 (2005). A comparison of the Dorfman-Berbaum-Metz and Obuchowski-Rockette
 methods for receiver operating characteristic (ROC) data. 
 Statistics in Medicine, 24, 1579-1607. DOI:10.1002/sim.2024.

 4.   Hillis, S.L. (2007). A comparison of denominator degrees of freedom for
 multiple observer ROC analysis. Statistics in Medicine, 26, 596-619. DOI:10.1002/sim.2532.

 5.   Hillis, S.L., Berbaum, K.S., & Metz, C.E. (2008). Recent developments in the
 Dorfman-Berbaum-Metz procedure for multireader ROC study analysis. Academic Radiology, 15,
 647-661. DOI:10.1016/j.acra.2007.12.015.
