Comparing two or more data sets is a routine step in data analysis. Typically, two buffers are compared to check the stability of the background subtraction, or two frames are checked for radiation damage; multiple frames may also be compared to detect radiation damage prior to averaging. DATCMP calculates the probability that the given input files may be considered similar, i.e. that no single frame differs significantly from any of the others.

Since data sets measured on the same specimen cannot be considered independent (in the statistical sense), a Univariate Type III Analysis of Variance (ANOVA) for repeated measurements is employed. Post hoc pairwise tests are implemented as paired t-tests with a Bonferroni-Holm correction. Please consult your favourite textbook on statistics for an in-depth discussion of the topic.
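To make the notion of a paired t-test concrete, the following is a minimal sketch of the textbook paired t statistic for two data sets sampled at the same points. The function name and the intensity values are made up for illustration; whether DATCMP additionally weights the differences by the experimental errors is not stated here, so this should not be read as the program's implementation.

```python
import math

def paired_t_statistic(first, second):
    """Textbook paired t statistic and degrees of freedom for two equally long series."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two series of equal length with at least two points")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean = sum(diffs) / n
    # Sample variance of the point-wise differences (n - 1 in the denominator).
    # Identical differences give zero variance and make t undefined.
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    t = mean / math.sqrt(variance / n)
    return t, n - 1

# Hypothetical intensities of two frames at the same four points.
frame_a = [1.0, 2.0, 3.0, 4.0]
frame_b = [1.1, 2.3, 2.9, 4.5]
t_value, dof = paired_t_statistic(frame_a, frame_b)
print(f"t = {t_value:.4f}, Df = {dof}")
```

The pairing is what accounts for the dependence between frames: only the point-wise differences enter the statistic, not the two series separately.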

One or more additional data files in any of the supported formats.

Absolute as well as relative paths to data files are accepted. Up to one of the input files may also be given as '-', in this case input is read from stdin instead of a file.

DATCMP recognizes the following command-line options:

Short Option   Long Option         Description
               --test=<NAME>       Test to apply, either ANOVA (default) or CHI-SQUARE. The CHI-SQUARE test is the test implemented in previous versions of the program; it accepts only two input files at a time.
               --post-hoc=[NAME]   Enables post hoc tests. If given without an argument, the default test is used. Supported tests are STUDENT-T (default) and SCHEFFE.
               --scale             Scale the data sets prior to comparison. This may help to avoid issues with incorrect scaling.
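For scripted use, the options above can be assembled programmatically. The following is a minimal sketch, assuming a `datcmp` executable is available on the PATH; the helper names `build_datcmp_command` and `run_datcmp` are hypothetical and not part of the program. Only the options listed above are used, and they are placed after the file names, as in the examples in this manual.

```python
import subprocess

def build_datcmp_command(files, test=None, post_hoc=None, scale=False):
    """Assemble an argument list for datcmp from the documented options."""
    cmd = ["datcmp"] + list(files)
    if test is not None:
        cmd.append(f"--test={test}")          # ANOVA or CHI-SQUARE
    if post_hoc is True:
        cmd.append("--post-hoc")              # default post hoc test
    elif post_hoc:
        cmd.append(f"--post-hoc={post_hoc}")  # STUDENT-T or SCHEFFE
    if scale:
        cmd.append("--scale")
    return cmd

def run_datcmp(files, **options):
    """Run datcmp and return its standard output as text (requires datcmp on the PATH)."""
    result = subprocess.run(build_datcmp_command(files, **options),
                            capture_output=True, text=True, check=True)
    return result.stdout

command = build_datcmp_command(["frame_036_001.dat", "frame_036_002.dat"], post_hoc=True)
print(" ".join(command))
```

Note that `subprocess.run` does not expand shell wildcards such as `frame_036_00*.dat`; in a script, `glob.glob` from the standard library can be used to build the file list first.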

As everywhere in statistical testing, one must set the threshold of what one considers significant prior to running the test. Failing to do so will skew the results. Here we shall assume a significance level of 0.01.

$ datcmp frame_010_00*.dat
Hypothesis: all data sets are similar
Alternative: at least one data set is different
Univariate Type III Repeated-Measures ANOVA
                                 eps  num Df  den Df        F    Pr(>F)  adj Pr(>F)
Assuming Sphericity                        7   15169   1.4974  0.162984
Greenhouse-Geyser Correction  0.2258       1    3425                       0.221163
Huynh-Feldt Correction        0.2260       1    3427                       0.221163

The first two lines of the output restate the hypothesis and its alternative: the hypothesis is that all frames are similar, and the alternative is that at least one frame in the provided set is different.

There are three lines of output of the Univariate Type III Repeated-Measures ANOVA:

Results Assuming Sphericity

Results correcting for violations of Sphericity using the Greenhouse-Geisser Correction

Results correcting for violations of Sphericity using the Huynh-Feldt Correction

Either way, the ANOVA is an F-test with the given degrees of freedom. The value of the test statistic is stated in the F column (here 1.4974). The p-value, i.e. the probability of an F statistic larger than the value computed, is given in the column Pr(>F), here 0.162984. If this p-value is less than the initially stated significance level, here 0.01, then the hypothesis of similar frames can be rejected. This is not the case here, so no frame is significantly different from any other.

That said, if the assumption of Sphericity is violated, the above result is not conclusive, as uncorrected F-test results may be too liberal, i.e. reject the hypothesis too often. Sphericity is assessed by the Greenhouse-Geisser and Huynh-Feldt epsilon values given in the first column. Epsilon lies in the range of 0.0 to 1.0; if the Sphericity assumption holds, the value is close to 1.0. The more Sphericity is violated, the smaller the value becomes. The p-values are adjusted by reducing the degrees of freedom in proportion to the violation of the Sphericity assumption. Adjusted p-values are given in the last column and are usually larger than the uncorrected p-value, here 0.221163 > 0.162984.
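The reduction of the degrees of freedom can be illustrated with the numbers from the output above. This is a minimal sketch of the usual epsilon correction, in which both degrees of freedom are multiplied by epsilon; the function name `corrected_df` is hypothetical. The printed values 1 and 3425 in the first correction row are consistent with the fractional part being dropped, but how DATCMP rounds the values and evaluates the F distribution is an assumption here, not something stated in this manual.

```python
def corrected_df(epsilon, num_df, den_df):
    """Scale both degrees of freedom by the sphericity estimate epsilon."""
    return epsilon * num_df, epsilon * den_df

# Uncorrected degrees of freedom (7 and 15169) and the epsilon of the
# first correction row (0.2258), all taken from the output above.
num_df, den_df = corrected_df(0.2258, 7, 15169)
print(f"corrected num Df = {num_df:.2f}, den Df = {den_df:.2f}")

# Dropping the fractional part gives the integers shown in the output (assumption).
print(int(num_df), int(den_df))
```

Because epsilon is printed with only four digits, the integer values of the other rows cannot always be reproduced exactly from the printed epsilon.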

The test decision should thus always be based on the adjusted p-values. See also the Examples.

For the following examples we shall assume a significance level of 0.01.

Comparing multiple frames to determine radiation damage:

$ datcmp frame_010_00*.dat
Hypothesis: all data sets are similar
Alternative: at least one data set is different
Univariate Type III Repeated-Measures ANOVA
                                 eps  num Df  den Df        F    Pr(>F)  adj Pr(>F)
Assuming Sphericity                        7   15169   1.4974  0.162984
Greenhouse-Geyser Correction  0.2258       1    3425                       0.221163
Huynh-Feldt Correction        0.2260       1    3427                       0.221163

Test decision: the hypothesis of similar frames cannot be rejected (alpha=0.01, p-value of 0.221163); the frames are to be considered similar.

$ datcmp frame_011_00*.dat
Hypothesis: all data sets are similar
Alternative: at least one data set is different
Univariate Type III Repeated-Measures ANOVA
                                 eps  num Df  den Df        F    Pr(>F)  adj Pr(>F)
Assuming Sphericity                        7   15169   4.4145  0.000066
Greenhouse-Geyser Correction  0.3242       2    4918                       0.012148
Huynh-Feldt Correction        0.3246       2    4923                       0.012148

Although the uncorrected F-test shows a highly significant result, the adjusted p-values are still larger than the chosen significance level! Test decision: the hypothesis of similar frames cannot be rejected (alpha=0.01, p-value of 0.012148); the frames are to be considered similar.

$ datcmp frame_036_00*.dat
Hypothesis: all data sets are similar
Alternative: at least one data set is different
Univariate Type III Repeated-Measures ANOVA
                                 eps  num Df  den Df        F    Pr(>F)  adj Pr(>F)
Assuming Sphericity                        7   15169  16.5080  0.000000
Greenhouse-Geyser Correction  0.2506       1    3801                       0.000049
Huynh-Feldt Correction        0.2508       1    3804                       0.000049

Test decision: the hypothesis of similar frames has to be rejected at a significance level of alpha=0.01 with a p-value of 0.000049; the frames are to be considered different.
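The three test decisions above follow the same rule and can be summarised in a short sketch. The p-values are copied from the outputs above; the dictionary keys and the function name `frames_differ` are hypothetical labels chosen for illustration.

```python
ALPHA = 0.01  # significance level, fixed before running any test

# (uncorrected Pr(>F), adjusted Pr(>F)) copied from the three outputs above.
p_values = {
    "frame_010": (0.162984, 0.221163),
    "frame_011": (0.000066, 0.012148),
    "frame_036": (0.000000, 0.000049),
}

def frames_differ(p_value, alpha=ALPHA):
    """Return True if the hypothesis of similar frames is rejected."""
    return p_value < alpha

# The decision is based on the adjusted p-value only.
decisions = {name: frames_differ(adj) for name, (_, adj) in p_values.items()}

# For comparison: what the uncorrected p-value alone would have suggested.
uncorrected = {name: frames_differ(raw) for name, (raw, _) in p_values.items()}

for name in p_values:
    print(name, "different" if decisions[name] else "similar",
          "(uncorrected test would reject)" if uncorrected[name] else "")
```

The second set shows why the adjusted value matters: judged by the uncorrected p-value alone, its frames would have been declared different.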

Now, and only now, one may ask which frames are different. To answer this, re-run the test with the post-hoc t-tests enabled:

$ datcmp frame_036_00*.dat --post-hoc
Hypothesis: all data sets are similar
Alternative: at least one data set is different
Univariate Type III Repeated-Measures ANOVA
                                 eps  num Df  den Df        F    Pr(>F)  adj Pr(>F)
Assuming Sphericity                        7   15169  16.5080  0.000000
Greenhouse-Geyser Correction  0.2506       1    3801                       0.000049
Huynh-Feldt Correction        0.2508       1    3804                       0.000049
Post-hoc paired t-tests with Bonferroni-Holm correction
             Df        t    Pr(>t)  adj Pr(>t)
1 vs. 2    2167   0.9425  0.346047    0.346047
1 vs. 3    2167  -1.0974  0.272573    0.545147
1 vs. 4    2167  -1.9825  0.047545    0.142635
1 vs. 5    2167  -2.6553  0.007983    0.031931
1 vs. 6    2167  -3.1074  0.001912    0.009559
1 vs. 7    2167  -4.8796  0.000001    0.000008
1 vs. 8    2167  -4.3401  0.000015    0.000089
1 frame_036_001.dat
2 frame_036_002.dat
3 frame_036_003.dat
4 frame_036_004.dat
5 frame_036_005.dat
6 frame_036_006.dat
7 frame_036_007.dat
8 frame_036_008.dat

Here, the same rules apply as for the F-test: to hold the family-wise error rate, the p-values need to be corrected for multiple testing. Comparing the last column, adj Pr(>t), with the significance level, here alpha=0.01, one concludes that the last three frames, 6, 7 and 8, suffer from statistically significant radiation damage and should be excluded from further analysis.
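The Bonferroni-Holm correction itself is simple enough to check by hand. The sketch below applies the standard step-down procedure to the raw p-values, Pr(>t), from the table above: the smallest of m p-values is multiplied by m, the next by m-1, and so on, with a running maximum to keep the adjusted values ordered. The function name `holm_adjust` is hypothetical, and the raw p-values are used as printed, i.e. rounded to six digits, so the last digit may differ from the program output. The table lists 0.346047 for "1 vs. 2", which suggests that DATCMP may not apply the running maximum; this is an observation from the output, not documented behaviour, and it does not affect the decisions at alpha=0.01.

```python
def holm_adjust(pvalues):
    """Standard Holm step-down adjustment of a list of p-values."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        value = min(1.0, (m - rank) * pvalues[idx])
        running_max = max(running_max, value)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted

ALPHA = 0.01
# Raw p-values of the comparisons "1 vs. N", keyed by frame number N.
raw = {2: 0.346047, 3: 0.272573, 4: 0.047545, 5: 0.007983,
       6: 0.001912, 7: 0.000001, 8: 0.000015}

adjusted = holm_adjust(list(raw.values()))
different = [frame for frame, p in zip(raw, adjusted) if p < ALPHA]
print(different)
```

With these inputs the frames flagged as different from frame 1 are 6, 7 and 8, in agreement with the conclusion above.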