Quality of X-ray Data
One of the commonly used indicators of the quality of X-ray
diffraction data is the so-called symmetry R-factor (Rsym)
or merging R-factor (Rmerge). It is simply
the sum of the absolute differences between each measurement and the average
value for that reflection, divided by the sum of all measurements
(see equation 1).
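Eq. 1 : $R_{merge} = \frac{\sum_{hkl} \sum_{i} \left| I_i(hkl) - \langle I(hkl) \rangle \right|}{\sum_{hkl} \sum_{i} I_i(hkl)}$
where $I_i(hkl)$ is the i-th measurement of reflection hkl and $\langle I(hkl) \rangle$ is the average of all measurements of that reflection.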
However, the principal
problem with Rmerge as a quality indicator is that it is inherently
dependent on the redundancy (or multiplicity) of the data. The more often a given reflection
is observed, the higher the Rmerge will be,
even though by simple statistical reasoning the average
value of the measurements becomes more precise.
Furthermore, if one compares the equations for Rmerge and the
canonical standard deviation, one can derive that, once a reflection has been
observed often enough, the Rmerge for this reflection is 0.7979 (the square root
of 2/pi) times the standard deviation divided by the mean intensity of this
reflection, provided the errors are normally distributed. In other
words, if I/sigma(I) is about 2.0 (which is commonly used to define the high
resolution limit of a data set), the Rmerge cannot really be
better than 40%, assuming statistical errors only.
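Both statements can be checked numerically. The following is only a minimal sketch, assuming purely Gaussian measurement errors and a made-up true intensity of 100; the function name `simulated_rmerge` and all parameter values are hypothetical and have nothing to do with the RMERGE program itself. It simulates reflections at I/sigma(I) = 2 for several multiplicities N and compares the resulting Rmerge with the limit 0.7979 / 2.

```python
import math
import random


def simulated_rmerge(n_obs, i_over_sigma=2.0, n_reflections=5000, seed=0):
    """Rmerge for simulated reflections, each measured n_obs times.

    Assumption of this sketch: every measurement is the true intensity
    plus Gaussian noise with the same standard deviation.
    """
    rng = random.Random(seed)
    true_intensity = 100.0  # arbitrary, made-up value
    sigma = true_intensity / i_over_sigma
    numerator = 0.0
    denominator = 0.0
    for _ in range(n_reflections):
        obs = [rng.gauss(true_intensity, sigma) for _ in range(n_obs)]
        mean = sum(obs) / n_obs
        # Sum of absolute deviations from the mean of this reflection.
        numerator += sum(abs(x - mean) for x in obs)
        denominator += sum(obs)
    return numerator / denominator


# Expected large-N limit: sqrt(2/pi) * sigma / I with I/sigma = 2.
limit = math.sqrt(2.0 / math.pi) / 2.0
results = {n: simulated_rmerge(n) for n in (2, 4, 8, 32)}
for n, r in results.items():
    print(f"N = {n:3d}  Rmerge = {r:.3f}")
print(f"large-N limit = {limit:.3f}")
```

Under these assumptions the simulated Rmerge rises with N and approaches the limit of about 0.399 from below, although the averaged intensities become more precise as N grows.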
Two other R-factors should be better suited to describing the quality of
diffraction data: the so-called redundancy-independent merging R-factor
(Rr.i.m.) and the precision-indicating merging R-factor (Rp.i.m.).
Rr.i.m. weights each reflection by a factor of sqrt(N/(N-1)), where N is the
redundancy (or multiplicity) of the observed
reflection. It is basically the conventional Rmerge made
independent of how often a given reflection has been observed. For that
reason it gives higher values than Rmerge, especially at
low redundancy. (Maybe that is the reason why it hasn't become popular
... just guessing.)
Rp.i.m. also contains the redundancy N, through a factor of sqrt(1/(N-1)), and indicates how precisely
the average measurement has been determined. Assuming statistical errors only,
this should probably be the one to use, since structures are usually solved
and refined using averaged measurements. The equations for the two R-factors are
given below (equations 2 and 3).
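Eq. 2 : $R_{r.i.m.} = \frac{\sum_{hkl} \sqrt{\frac{N}{N-1}} \sum_{i} \left| I_i(hkl) - \langle I(hkl) \rangle \right|}{\sum_{hkl} \sum_{i} I_i(hkl)}$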
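Eq. 3 : $R_{p.i.m.} = \frac{\sum_{hkl} \sqrt{\frac{1}{N-1}} \sum_{i} \left| I_i(hkl) - \langle I(hkl) \rangle \right|}{\sum_{hkl} \sum_{i} I_i(hkl)}$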
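As an illustration of how the three indicators relate, here is a minimal Python sketch that evaluates Rmerge, Rr.i.m. and Rp.i.m. from scaled but unmerged intensities. It is not the RMERGE program; the function name `merging_r_factors`, the dictionary layout and the intensities are made up for this example, and skipping reflections that were measured only once (where N - 1 = 0) is an assumption of the sketch.

```python
import math


def merging_r_factors(observations):
    """Return (Rmerge, Rrim, Rpim).

    observations maps an (h, k, l) tuple to the list of scaled,
    unmerged intensities measured for that reflection.
    Assumption: reflections with a single measurement are skipped,
    because the factors containing N - 1 are undefined for them.
    """
    num_merge = num_rim = num_pim = 0.0
    denominator = 0.0
    for intensities in observations.values():
        n = len(intensities)
        if n < 2:
            continue
        mean = sum(intensities) / n
        abs_dev = sum(abs(i - mean) for i in intensities)
        num_merge += abs_dev                            # Eq. 1
        num_rim += math.sqrt(n / (n - 1)) * abs_dev     # Eq. 2
        num_pim += math.sqrt(1 / (n - 1)) * abs_dev     # Eq. 3
        denominator += sum(intensities)
    return (num_merge / denominator,
            num_rim / denominator,
            num_pim / denominator)


# Made-up intensities, for illustration only.
example = {
    (1, 0, 0): [100.0, 110.0],
    (1, 1, 0): [50.0, 40.0, 60.0, 50.0],
    (2, 0, 1): [75.0],  # measured once, skipped
}
r_merge, r_rim, r_pim = merging_r_factors(example)
print(f"Rmerge = {r_merge:.4f}  Rrim = {r_rim:.4f}  Rpim = {r_pim:.4f}")
```

For these made-up numbers the sketch gives an Rmerge of about 0.073, an Rr.i.m. of about 0.091 and an Rp.i.m. of about 0.053, so Rr.i.m. is the largest of the three and Rp.i.m. the smallest.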
Unfortunately, some scaling programs such as SCALEPACK do not report these
values. SCALA, however, calculates both Rr.i.m. and Rp.i.m.
(Rr.i.m. is also called Rmeas,
following the paper by Diederichs and Karplus). In order to provide
these numbers, I have written a program in (almost) standard
Fortran 77 that can be downloaded from this site, and together with the
short set of instructions it should be possible to calculate all of them
very easily. The program has been tested extensively on a Silicon
Graphics platform (IRIX 6.2) and it compiles without problems under Digital
UNIX V4.0D. It should also compile on other platforms such as Linux
and Windows using the compilers ifort, gfortran and
g95, but this has not been verified.
Case 1 : SCALEPACK is your favorite scaling program
- Once you have scaled and merged your data using SCALEPACK,
re-run SCALEPACK with the line NO MERGE ORIGINAL INDEX added.
This writes out a reflection file with scaled but unmerged
intensities.
- Run an executable version of the program RMERGE, which you
can download here. You need a Fortran
compiler to build the executable, of course. The input to the
program is pretty much self-explanatory. The output will be
written to the screen and to a file called rmerge.data.
Case 2 : SCALA is your favorite scaling program
You are lucky, because SCALA will do the job for you automatically.
Case 3 : XDS/XSCALE
XDS/XSCALE will provide Rr.i.m. (called Rmeas)
but not Rp.i.m. Thanks to Kay Diederichs, the current
version of RMERGE is now also able to read and process the file XDS_ASCII.HKL.
Just one last thing: I would really appreciate it if you would drop me
a line at msweiss
when you have picked up the program. Also, let me know if there
are any problems.
More information about this can be found in the following papers:
M.S. Weiss (2001). Global indicators of X-ray data quality. J. Appl. Cryst.
34, 130-135.
M.S. Weiss and R. Hilgenfeld (1997). On the use of the merging R factor as a quality indicator
for X-ray data. J. Appl. Cryst. 30, 203-205.
K. Diederichs and P.A. Karplus (1997). Improved R-factors for diffraction data analysis in
macromolecular crystallography. Nature Struct. Biol. 4, 269-275.