This manual describes two special versions
of the rigid body modelling program SASREF:
SASREFCV (formerly known as SASREF7), for fitting SANS contrast variation
series (optionally combined with SAXS data), and SASREFMX, for the structural analysis
of transient complexes and weak oligomers from polydisperse data. In the latter
case, rigid body modelling is coupled with mixture analysis, whereby
the volume fractions of the dissociation products are estimated.
The two approaches have much in common, so this manual covers the
dialog prompts, the required configuration and input files, and the
output produced for both programs.
SASREF CV / MX perform quaternary structure modeling of a complex particle
formed by subunits with known atomic structure, against SAS data from a
contrast variation series (SASREFCV) or from a polydisperse system (SASREFMX).
Multiple data sets can be fitted simultaneously, e.g. curves at different D2O content and/or
perdeuteration (in SASREFCV) or profiles recorded under different conditions
(concentration, temperature, pH, ionic strength) that yield different affinity
of the complex particle (in SASREFMX). Both algorithms can account
for symmetry (which can be subunit- and data-specific).
A simulated annealing protocol is employed to construct an interconnected ensemble
of subunits without steric clashes, while minimizing the discrepancy between the
experimental scattering data and the curves predicted from the appropriate subunit
assemblies. In SASREFMX, the experimental data are fitted by a linear
combination of the profiles calculated from the intact particle and from the
dissociation products.
For further details of the rigid body modelling approach please refer to
the SASREF manual and to the papers cited above.
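The linear combination used for polydisperse data can be illustrated with a two-component sketch. This is not the program's implementation: it assumes exactly two components (intact particle and dissociation products), curves already computed on the same grid and on a common scale, and a closed-form weighted least-squares solution clamped to the range 0 to 1. The function name fit_volume_fraction and the toy intensities are hypothetical.

```python
def fit_volume_fraction(i_exp, sigma, i_intact, i_dissoc):
    """Return v in [0, 1] minimising
    sum(((i_exp - v*i_intact - (1 - v)*i_dissoc) / sigma)**2).

    Assumption: all four sequences share the same angular grid.
    """
    num = 0.0
    den = 0.0
    for ie, s, ia, ib in zip(i_exp, sigma, i_intact, i_dissoc):
        w = 1.0 / (s * s)          # statistical weight of the point
        d = ia - ib                # difference between the two components
        num += w * (ie - ib) * d
        den += w * d * d
    if den == 0.0:
        raise ValueError("component curves are identical; fraction undefined")
    # Clamp to the physically meaningful range of a volume fraction.
    return min(1.0, max(0.0, num / den))


# Toy data (hypothetical): a mixture with 70% intact particle.
i_intact = [4.0, 3.0, 1.5, 0.5]
i_dissoc = [1.0, 0.9, 0.7, 0.4]
i_exp = [0.7 * a + 0.3 * b for a, b in zip(i_intact, i_dissoc)]
v = fit_volume_fraction(i_exp, [0.1] * 4, i_intact, i_dissoc)
print(f"volume fraction of the intact particle: {v:.3f}")
```

With several curves fitted simultaneously, one such fraction would be obtained per curve.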
SASREF CV / MX can only be run in dialog mode; no command line arguments
are accepted. As in MONSA, a significant amount
of the user input is provided through configuration files.
There are two modes, EXPERT and USER. In the former mode, the
user has the option to adjust more parameters. In the latter mode,
fewer questions are asked, as the default values are used for most of the
program parameters. The default settings are the same in both modes.
"Master" (highest order) symmetry. Individual subunits or scattering
profiles may have lower order symmetry. Supported symmetries are:
P1 (no symmetry), P2-P6, P222, P32-P62. The n-fold axis
is typically Z; if there is an additional two-fold axis, it coincides with Y.
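The axis convention can be made concrete with a short sketch that generates the symmetry mates of a single point. It assumes the n-fold axis along Z and the additional two-fold axis along Y, as stated above; the function names are hypothetical and the code is only an illustration of the convention, not the program's own routine.

```python
import math


def rotate_z(point, angle):
    """Rotate a point about the Z axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (x * c - y * s, x * s + y * c, z)


def rotate_y_180(point):
    """Apply a two-fold rotation about the Y axis."""
    x, y, z = point
    return (-x, y, -z)


def symmetry_mates(point, symmetry):
    """Return all symmetry-related copies of `point` for P1, P2-P6,
    P222 and P32-P62 (n-fold axis along Z, two-fold axis along Y)."""
    digits = symmetry.upper().lstrip("P")
    if digits == "222":
        n, dihedral = 2, True
    elif digits in ("32", "42", "52", "62"):
        n, dihedral = int(digits[0]), True
    elif digits in ("1", "2", "3", "4", "5", "6"):
        n, dihedral = int(digits), False
    else:
        raise ValueError(f"unsupported symmetry: {symmetry}")
    mates = [rotate_z(point, 2.0 * math.pi * k / n) for k in range(n)]
    if dihedral:
        mates += [rotate_y_180(m) for m in mates]
    return mates


for name in ("P1", "P4", "P222", "P62"):
    print(name, len(symmetry_mates((1.0, 2.0, 3.0), name)))
```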
If required, SASREF smears the theoretical curves using the resolution
function introduced by J. Skov Pedersen et al. (1990), J. Appl.
Cryst., 23, 321. Smearing is mostly needed for SANS data
but can also be applied to data from a non-point SAXS source. Please refer to
the MONSA manual for an explanation of the file format.
If no file name is provided, no smearing is applied.
Configuration file with the cross-correlation table
between the scattering curves and the contributing subunits.
Cross penalty weight
10.0
N
How much the Cross Penalty shall influence the acceptance or rejection
of a mutation. A value of 0.0 disables the penalty. If unsure, use the
default value. If clashes between the subunits are observed, try increasing
this penalty weight.
Disconnectivity penalty weight
10.0
N
How much the Disconnectivity Penalty shall influence the acceptance or
rejection of a mutation. A value of 0.0 disables the penalty. If unsure,
use the default value. If a disconnected arrangement of the subunits is
observed, try increasing this penalty weight.
If information on the interface between certain subunits is available in terms
of contacting residues, it may be used as a modeling restraint.
The information is provided in a file with a special format.
By default no information is given.
Contacts penalty weight
10.0
N
How much improper contacts shall influence the acceptance or
rejection of a mutation. If unsure,
use the default value. If the desired interfaces are not obtained, try increasing
this penalty weight. This question is only asked if a
contacts conditions file is provided.
If it is known from prior studies that the particle's shape
is either PROLATE or OBLATE, one may use
the anisometry option to impose a penalty on particles that do
not correspond to the expected anisometry. By default,
anisometry is 'UNKNOWN'.
Anisometry penalty weight
1.0
N
How much improper anisometry shall influence the acceptance or
rejection of a mutation. If unsure,
use the default value. This question is skipped if the
Expected particle shape is 'UNKNOWN'.
This question is only asked if the
Expected particle shape is not 'UNKNOWN' and the
symmetry is 'P2'.
The user can specify if the symmetry axis coincides with (ALONG) or
perpendicular to (ACROSS) the anisometry axis.
Shift penalty weight
1.0
N
How much a shift of the entire complex from the origin shall influence
the acceptance or rejection of a mutation. A value of 0.0 disables the
penalty. If unsure, use the default value. This penalty is necessary to keep the
model close to the origin so that the higher order harmonics are not lost
and the scattering is computed accurately.
Spatial step in angstroems
5.0
N
Maximal random shift of a subunit at a single modification of the system
in the course of simulated annealing. This question is asked for each subunit.
Angular step in degrees
20.0
N
Maximal random rotation angle of a subunit at a single modification of
the system in the course of simulated annealing. Setting it to zero may be useful
to keep the mutual orientations of certain subunits, e.g. if NMR RDC data are
available. This question is asked for each subunit.
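The role of the two step sizes can be sketched as a bounded random move of one subunit. How the program actually draws the shift and the rotation is not described in this manual; the sketch below assumes a uniformly distributed shift length along a random direction and a uniformly distributed rotation angle about a random axis. All names are hypothetical.

```python
import math
import random


def random_unit_vector(rng):
    """Draw a random direction (normalised Gaussian triple)."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            return tuple(c / norm for c in v)


def propose_move(position, spatial_step, angular_step, rng=random):
    """Propose one modification of a subunit.

    spatial_step: maximal shift in angstroms.
    angular_step: maximal rotation angle in degrees; 0 keeps the orientation.
    Returns the new position, the rotation axis and the rotation angle.
    """
    direction = random_unit_vector(rng)
    length = rng.uniform(0.0, spatial_step)
    new_position = tuple(p + length * d for p, d in zip(position, direction))
    axis = random_unit_vector(rng)
    angle = rng.uniform(-angular_step, angular_step)
    return new_position, axis, angle


rng = random.Random(0)  # fixed seed so the sketch is reproducible
new_pos, axis, angle = propose_move((0.0, 0.0, 0.0), 5.0, 20.0, rng)
print(new_pos, axis, angle)
```

Passing an angular step of 0.0 yields a pure translation, which corresponds to keeping the mutual orientation of a subunit fixed.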
Stop simulated annealing if fewer than this many successful
mutations are achieved within a single temperature step.
The default value is 50 × the total number of subunits.
At runtime, two lines of output are generated for each
temperature step:
j: 4 T: 0.729E+01 Suc: 1000 Eva: 12497 CPU: 0.208E+03 F:99.4301 Pen: 13.803
The best chi values:11.64871 5.96331
The fields can be interpreted as follows, top-left to bottom-right:
Field                Description
j                    Step number. Starts at 1, increases monotonically.
T                    Temperature measure; starts at an arbitrary high value and
                     decreases each step by the annealing schedule factor.
Suc                  Number of successful mutations in this temperature step,
                     limited by the minimum and maximum number of successes.
                     The number of successes should decrease slowly; the first
                     couple of steps should be terminated by the maximum number
                     of successes criterion. If instead the maximum number of
                     iterations is reached, or the number of successes drops
                     suddenly by a large amount, the system should probably be
                     cooled more slowly.
Eva                  Accumulated number of function evaluations.
CPU                  Elapsed wall-clock time since the annealing procedure was
                     started.
F                    The best target function value obtained so far.
Pen                  Accumulated penalty value of the best target function.
The best chi values  For each curve (out of the total number of curves), the χ
                     value of the best target function is given.
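For monitoring long runs, the two output lines of a temperature step can be parsed with a few lines of code. The sketch below is based only on the sample output shown above; the exact column spacing of the real program output may differ, so the pattern tolerates variable whitespace. The function name parse_step is hypothetical.

```python
import re

# One pattern for the first line of a temperature step.
_STEP_RE = re.compile(
    r"j:\s*(?P<j>\d+)\s+T:\s*(?P<T>\S+)\s+Suc:\s*(?P<Suc>\d+)"
    r"\s+Eva:\s*(?P<Eva>\d+)\s+CPU:\s*(?P<CPU>\S+)"
    r"\s+F:\s*(?P<F>\S+)\s+Pen:\s*(?P<Pen>\S+)"
)


def parse_step(first_line, second_line):
    """Parse the two lines printed per temperature step into a dict."""
    match = _STEP_RE.search(first_line)
    if match is None:
        raise ValueError("not a temperature step line: " + first_line)
    step = {
        "j": int(match["j"]),
        "T": float(match["T"]),
        "Suc": int(match["Suc"]),
        "Eva": int(match["Eva"]),
        "CPU": float(match["CPU"]),
        "F": float(match["F"]),
        "Pen": float(match["Pen"]),
    }
    # The second line lists one chi value per fitted curve after the colon.
    step["chi"] = [float(x) for x in second_line.split(":", 1)[1].split()]
    return step


step = parse_step(
    "j: 4 T: 0.729E+01 Suc: 1000 Eva: 12497 CPU: 0.208E+03 F:99.4301 Pen: 13.803",
    "The best chi values:11.64871 5.96331",
)
print(step)
```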
SASREFMX additionally outputs the volume fraction of the intact construct
for each of the fitted curves:
- The first line contains one integer K (total number of scattering curves
for SASREFCV and total number of scattering curves+1 for SASREFMX)
- K lines, each containing 8 parameters related to the scattering data set:
Field  Acceptable values   Description
1.     N/A                 Name of the file with the experimental data (*.dat)
                           in ASCII format, containing 3 columns:
                           (1) experimental scattering vector, (2) experimental
                           intensity and (3) experimental errors.
2.     [-1.0, 0.0-1.0]     D2O fraction in the solvent, or -1.0 for X-ray
                           scattering data.
3.     [P1-P6, P222-P62]   Symmetry of the given construct at the given
                           conditions (which may differ from the overall
                           symmetry).
4.     [1,2]               Angular units:
                           1 = 4*π*sin(θ)/λ in Å⁻¹,
                           2 = 4*π*sin(θ)/λ in nm⁻¹.
5.     [0.1-1.0]           Fraction of the curve to be fitted.
6.     [0-15]              Setting number: the number of the column in the
                           optional "Resolution file" that contains the
                           smearing information. This value must be 0 for
                           X-ray curves and for neutron curves for which no
                           smearing information is available. Neutron
                           scattering curves with the same non-zero setting
                           number must have the same angular axis and number
                           of experimental points.
7.     [0.0-1.0]           Weight of the curve in the target function.
8.     [Y/N]               If Y, a constant background is adjusted
                           automatically for this curve (useful, for example,
                           to correct for incoherent background in neutron
                           data).
In the case of SASREFMX, the last line describes the dissociation products and
contains dummy values for all fields except the symmetry.
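The file layout described above can be checked before a run with a small reader. The sketch assumes whitespace-separated fields and file names without spaces, neither of which is stated explicitly in this manual; the class Curve and the function parse_curves_con are hypothetical helpers, not part of the program.

```python
from dataclasses import dataclass


@dataclass
class Curve:
    filename: str
    d2o: float         # -1.0 for X-ray data, otherwise 0.0-1.0
    symmetry: str
    units: int         # 1 = inverse angstroms, 2 = inverse nanometres
    fraction: float    # fraction of the curve to be fitted
    setting: int       # column in the optional resolution file, 0 = none
    weight: float
    background: bool   # True if a constant background is to be adjusted


def parse_curves_con(text):
    """Parse the content of a curves configuration file into Curve records."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    k = int(lines[0].split()[0])
    if len(lines) - 1 < k:
        raise ValueError(f"expected {k} curve lines, found {len(lines) - 1}")
    curves = []
    for ln in lines[1:k + 1]:
        f = ln.split()
        if len(f) != 8:
            raise ValueError("expected 8 fields: " + ln)
        c = Curve(f[0], float(f[1]), f[2].upper(), int(f[3]), float(f[4]),
                  int(f[5]), float(f[6]), f[7].upper() == "Y")
        # Range checks taken from the table of acceptable values.
        if not (c.d2o == -1.0 or 0.0 <= c.d2o <= 1.0):
            raise ValueError("bad D2O fraction: " + ln)
        if c.units not in (1, 2):
            raise ValueError("bad angular units: " + ln)
        if not 0.1 <= c.fraction <= 1.0:
            raise ValueError("bad curve fraction: " + ln)
        if not 0 <= c.setting <= 15:
            raise ValueError("bad setting number: " + ln)
        if not 0.0 <= c.weight <= 1.0:
            raise ValueError("bad weight: " + ln)
        curves.append(c)
    return curves


example = """2
weak_tetramer.dat -1.00 P222 1 1.0 0 1.0 y
dummy.dat -1.00 P1 1 1.0 0 1.0 y
"""
curves = parse_curves_con(example)
print(curves[0])
```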
- The first line contains one integer M (total number of subunits)
- M lines, each containing 4 parameters related to the subunit:
Field  Acceptable values   Description
1.     N/A                 PDB file name.
2.     [Y/N]               Whether or not to shift the subunit to the origin
                           at the beginning.
3.     [N/F/X/Y/Z/D]       Movement limitations:
                           N = no limitations;
                           F = subunit will be fixed;
                           X = rotations/translations along the X axis only;
                           Y = rotations/translations along the Y axis only;
                           Z = rotations/translations along the Z axis only;
                           D = rotations/translations along the (1,1,1)
                               vector only.
4.     [P1-P6, P222-P62]   Symmetry applied to the given subunit (may differ
                           from the overall symmetry).
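A subunits file in this layout can be generated and validated with a short helper. The sketch assumes whitespace-separated fields and expands the symmetry range P1-P6, P222-P62 into an explicit list; the function name format_subunits_con is hypothetical.

```python
ALLOWED_MOVES = {"N", "F", "X", "Y", "Z", "D"}
# Explicit expansion of the range "P1-P6, P222-P62" (an interpretation).
ALLOWED_SYMMETRIES = {"P1", "P2", "P3", "P4", "P5", "P6",
                      "P222", "P32", "P42", "P52", "P62"}


def format_subunits_con(subunits):
    """Build the text of a subunits file.

    subunits: list of (pdb_file, shift_to_origin, movement, symmetry) tuples.
    """
    lines = [str(len(subunits))]
    for pdb_file, shift, movement, symmetry in subunits:
        if shift not in ("Y", "N"):
            raise ValueError(f"{pdb_file}: shift flag must be Y or N")
        if movement not in ALLOWED_MOVES:
            raise ValueError(f"{pdb_file}: unknown movement limitation {movement}")
        if symmetry not in ALLOWED_SYMMETRIES:
            raise ValueError(f"{pdb_file}: unsupported symmetry {symmetry}")
        lines.append(f"{pdb_file} {shift} {movement} {symmetry}")
    return "\n".join(lines) + "\n"


text = format_subunits_con([
    ("phsave1.pdb", "Y", "F", "P1"),   # fixed subunit
    ("phsave2.pdb", "Y", "N", "P1"),   # free to move
    ("phsave3.pdb", "Y", "N", "P1"),
])
print(text)
```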
The cross-correlation file (table.con
in the examples) contains a table that defines the
relationship between the subunits and the scattering profiles. The number of its
columns equals the total number of subunits
(M) and the number of its rows equals the
total number of scattering curves (K).
The value in the i-th column and j-th row gives the contribution
of the i-th subunit to the j-th scattering data set. For a SANS curve
this value ([0.0-1.0] or -1.0) is the subunit perdeuteration
(D2O content of the solution in which the protein is expressed), whereby
-1.0 means that the given subunit is not present in the corresponding
construct. For an X-ray scattering curve, 0.0 is to be used if the
subunit is present.
In the case of SASREFMX, the last row describes the dissociation products, which
are mixed into all the curves. Here, -1, 0 or a positive integer
is allowed: -1 means that the subunit is not among the
dissociation products; 0 means that the subunit is part of a
sub-complex which is a dissociation product of a larger assembly (there can be
no more than one such sub-complex); and a positive integer gives the molar ratio
(stoichiometry) of the subunit in the original (fully dissociated) sample.
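The rules for the table entries can be expressed as a short consistency check. The sketch works on a table held in memory as a list of rows and does not assume anything about how the file itself is laid out. The function name validate_table is hypothetical, and the example values are illustrative only; they are not taken from the configuration files distributed with the examples.

```python
def validate_table(table, n_rows, n_subunits, mixture=False):
    """Check a cross-correlation table given as a list of rows.

    n_rows: K, the number of lines in the curves file.
    n_subunits: M, the number of subunits.
    mixture: True for a polydisperse run, where the last row describes
             the dissociation products.
    """
    if len(table) != n_rows:
        raise ValueError(f"expected {n_rows} rows, found {len(table)}")
    for j, row in enumerate(table):
        if len(row) != n_subunits:
            raise ValueError(f"row {j + 1}: expected {n_subunits} columns")
        is_products_row = mixture and j == n_rows - 1
        for value in row:
            if is_products_row:
                # -1 (absent), 0 (part of the sub-complex) or a molar ratio
                ok = float(value).is_integer() and value >= -1
            else:
                # -1.0 (absent) or a perdeuteration between 0.0 and 1.0
                ok = value == -1.0 or 0.0 <= value <= 1.0
            if not ok:
                raise ValueError(f"row {j + 1}: value {value} not allowed")
    return True


# Illustrative values: three X-ray curves of a two-subunit complex plus
# the row of dissociation products with an equimolar ratio.
table = [
    [0.0, 0.0],
    [0.0, 0.0],
    [0.0, 0.0],
    [1, 1],
]
print(validate_table(table, n_rows=4, n_subunits=2, mixture=True))
```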
Distance restraints may be imposed on the model using an optional contacts
conditions file in the following format:
"dist 7.0" means that the minimum distance between the CA atoms of the
residues (or P atoms of the nucleotides) specified in the following lines
should not exceed 7 Å. In each line that does not contain the keyword "dist",
the first and the fourth numbers are the ordinal numbers of the 1st and the 2nd
subunit; the contact is made by any residue/nucleotide of the 1st subunit in the
range from the second number to the third number with any residue of the 2nd
subunit in the range from the fifth number to the sixth number. 0 means the
last residue/nucleotide of the subunit.
If two (or more) alternatives are given after the line with the keyword
"dist", the program compares the better (smaller) distance among them with
the specified one.
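The geometric test behind such a restraint can be sketched as follows. The code only evaluates whether a restraint is met; how the program converts a violation into a penalty is not described here. Coordinates stand for the CA (or P) atoms of the two residue ranges, the function names are hypothetical and the coordinates are made up for illustration.

```python
import math


def min_distance(coords_a, coords_b):
    """Smallest distance between any atom of range A and any atom of range B."""
    return min(math.dist(a, b) for a in coords_a for b in coords_b)


def restraint_satisfied(alternatives, max_dist):
    """Evaluate one restraint block.

    alternatives: list of (coords_a, coords_b) pairs, one per alternative
                  line following the same "dist" keyword.
    Returns (satisfied, best_distance); the best (smallest) distance among
    the alternatives is compared with max_dist.
    """
    best = min(min_distance(a, b) for a, b in alternatives)
    return best <= max_dist, best


# Made-up coordinates in angstroms.
range_a = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
range_b = [(0.0, 0.0, 5.0), (20.0, 0.0, 0.0)]
far_a = [(0.0, 0.0, 0.0)]
far_b = [(30.0, 0.0, 0.0)]

ok, best = restraint_satisfied([(far_a, far_b), (range_a, range_b)], 7.0)
print(ok, best)
```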
Please refer to the SASREF manual for more
details. Important: the numbering of the subunits differs slightly here
because distinct symmetries may be applied to individual
subunits in SASREF CV / MX. First, all subunits in the asymmetric part appear
(as in the conventional SASREF version), followed by all symmetry mates of the first
subunit, then of the second and so on until the last one.
After each simulated annealing step, SASREF CV / MX creates a set of output
files; each filename consists of a customizable prefix
with an extension appended. If a prefix has been used before, existing
files will be overwritten without further notice.
The current model of the entire complex. The REMARK section of
the file contains information about the application used and
about the parameters of the model, e.g. penalties and χ.
Fit of the scattering curve computed from the complex (subcomplex)
versus the corresponding experimental data. i stands for the
construct number. Columns in the output file
are: 's', 'Iexp' and 'Icomp'.
A simulated example of the T7 DNA Polymerase Ternary Complex with DCTP
(PDB entry 1t8e). The PDB files containing the atomic coordinates of the three
subunits are:
phsave1.pdb - Polymerase
phsave2.pdb - DCTP
phsave3.pdb - DNA
The simulated SAS data comprise 17 curves in total: 2 X-ray profiles
(from the entire complex and from the binary construct without DNA) and
15 neutron scattering curves from the complex (5 D2O
contents: 0, 40, 55, 70 and 100% D2O, times
3 perdeuterations of DCTP: 0, 50 and 100%):
x-prot.dat X-ray protein complex
x-compl.dat X-ray, ternary complex
complh_0.dat ternary complex with protonated DCTP in 0%D2O
complh_40.dat in 40%D2O
complh_55.dat in 55%D2O
complh_70.dat in 70%D2O
complh_100.dat in 100%D2O
compl50d_0.dat 50% deuterated DCTP in 0%D2O
compl50d_40.dat in 40%D2O
compl50d_55.dat in 55%D2O
compl50d_70.dat in 70%D2O
compl50d_100.dat in 100%D2O
compl100d_0.dat fully deuterated DCTP in 0%D2O
compl100d_40.dat in 40%D2O
compl100d_55.dat in 55%D2O
compl100d_70.dat in 70%D2O
compl100d_100.dat in 100%D2O
Content of the curves.con file:
17
x-prot.dat -1.00 P1 1 1.0 0 1.0 y
x-compl.dat -1.00 P1 1 1.0 0 1.0 y
complh_0.dat 0.00 P1 1 1.0 0 1.0 y
complh_40.dat 0.40 P1 1 1.0 0 1.0 y
complh_55.dat 0.55 P1 1 1.0 0 1.0 y
complh_70.dat 0.70 P1 1 1.0 0 1.0 y
complh_100.dat 1.00 P1 1 1.0 0 1.0 y
compl50d_0.dat 0.00 P1 1 1.0 0 1.0 y
compl50d_40.dat 0.40 P1 1 1.0 0 1.0 y
compl50d_55.dat 0.55 P1 1 1.0 0 1.0 y
compl50d_70.dat 0.70 P1 1 1.0 0 1.0 y
compl50d_100.dat 1.00 P1 1 1.0 0 1.0 y
compl100d_0.dat 0.00 P1 1 1.0 0 1.0 y
compl100d_40.dat 0.40 P1 1 1.0 0 1.0 y
compl100d_55.dat 0.55 P1 1 1.0 0 1.0 y
compl100d_70.dat 0.70 P1 1 1.0 0 1.0 y
compl100d_100.dat 1.00 P1 1 1.0 0 1.0 y
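The 17 lines above follow a regular pattern, so a file of this kind can be generated rather than typed by hand. The sketch below reproduces the listing from the file-name prefixes and D2O percentages of this example; it assumes single-space separation of the fields, and the function name build_curves_con is hypothetical.

```python
from itertools import product


def build_curves_con():
    """Generate the curves file of the contrast variation example."""
    # Two X-ray profiles: D2O field set to -1.00.
    lines = [
        "x-prot.dat -1.00 P1 1 1.0 0 1.0 y",
        "x-compl.dat -1.00 P1 1 1.0 0 1.0 y",
    ]
    prefixes = ["complh", "compl50d", "compl100d"]  # DCTP perdeuteration
    d2o_percent = [0, 40, 55, 70, 100]              # solvent D2O content
    # Fifteen neutron curves: every perdeuteration at every D2O content.
    for prefix, percent in product(prefixes, d2o_percent):
        lines.append(f"{prefix}_{percent}.dat {percent / 100:.2f} P1 1 1.0 0 1.0 y")
    # The first line holds the total number of curves.
    return "\n".join([str(len(lines))] + lines) + "\n"


content = build_curves_con()
print(content)
```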
Content of the subunits.con file:
3
phsave1.pdb Y F P1
phsave2.pdb Y N P1
phsave3.pdb Y N P1
Let weak_tetramer.dat be a SAXS profile from a polydisperse sample
containing a tetramer with P222 symmetry and its monomer
(i.e. oligomeric equilibrium). The atomic structure of the latter is contained
in monomer.pdb. Content of the curves.con file is then:
2
weak_tetramer.dat -1.00 P222 1 1.0 0 1.0 y
dummy.dat -1.00 P1 1 1.0 0 1.0 y
An equimolar mixture of proteins A and B yields an equilibrium between
the AB complex and its components in the unbound state. A concentration series
with distinct volume fractions of the intact complex can be fitted simultaneously.
The content of the configuration files in this case is as follows.
curves.con:
4
transient_c1.dat -1.00 P1 1 1.0 0 1.0 y
transient_c2.dat -1.00 P1 1 1.0 0 1.0 y
transient_c3.dat -1.00 P1 1 1.0 0 1.0 y
dummy.dat -1.00 P1 1 1.0 0 1.0 y
If an excess of subunit C is needed for the formation of a stable complex ABC,
the resulting sample will be polydisperse. The content of the configuration
files in this case is as follows.
curves.con:
2
mixture.dat -1.00 P1 1 1.0 0 1.0 y
dummy.dat -1.00 P1 1 1.0 0 1.0 y
If subunit C is lacking (e.g. in a stoichiometric study), some amount of the
sub-complex AB might be present in the mixture with the ternary complex ABC. The content
of the configuration files is then as follows.
curves.con:
2
mixture.dat -1.00 P1 1 1.0 0 1.0 y
dummy.dat -1.00 P1 1 1.0 0 1.0 y