MONSA is an extended version of DAMMIN for multiphase bead modelling which can fit multiple curves simultaneously (e.g. from X-ray and/or neutron contrast variation series).
MONSA reads in multiple data sets and information about the contrasts and volume fractions of the phases in a particle.
The program can simultaneously fit data recorded at different instrumental settings and also with different radiations (e.g. X-rays and neutrons). The structure of the input data is therefore somewhat complicated.
The program requires:
a MASTER file (*.mst) containing the general phase information and references to CONTROL file(s);
CONTROL file(s) (*.con) containing the smearing information for the given setting, information about contrasts and references to DATA files (*.dat);
DATA files (*.dat) containing raw experimental data at different contrasts;
a PDB-like file defining the number of phases and the SEARCH VOLUME for the model.
An identifier (up to six characters) used to name all the output files.

Project description (default: N/A)
Text description of the problem.

Master file name (default: N/A)
Name of the master file.

Maximum order of harmonics (default: 14)
The more harmonics, the more accurate the reconstruction becomes, but the slower the process. May be between 5 and 20.

DAM coordinates file name (default: N/A)
Name of the search volume file generated by BODIES.

Symmetry: Pn or Pn2 (n=1,2,3,4,5,6) (default: P1)
Specify the symmetry to enforce on the particle.

Reset (unfix) all atoms [ Y / N ] (default: No)
If 'Y', the phase indices allowed for the atoms in the pdb file are set to.

Atomic radius (default: var)
If the file is prepared by BODIES, the value is read from the file.

Atomic volume (default: var)
This is (4/3) π r³ / 0.74 (volume per sphere for dense packing).

Preference for non-solvent contacts (default: 0.3)
With a value of 0.0, the phase of the atom (solvent or protein) does not influence the looseness penalty weight. When this value is increased, non-solvent contacts are preferred through the calculation of the looseness penalty weight. If unsure, use the default value.

How much the Looseness Penalty shall influence the acceptance or rejection of phase changes. A value of 0.0 disables the penalty. If unsure, use the default value. If sharp edges are observed instead of smooth surfaces, try decreasing this penalty weight.

How much the Discontiguity Penalty shall influence the acceptance or rejection of phase changes. A value of 0.0 disables the penalty. If unsure, use the default value.

Randomize the initial DAM [ Y / N ] (default: Yes)
If 'Y', the starting model is randomized.

Fix the overall scale factor [ Y / N ] (default: No)
If No (recommended), the overall scale factor, as well as the individual relative scale factors for all the data sets, will be determined automatically. If the scale factor is known (data on absolute scale), it may be fixed and entered manually.

Volume fraction penalty weight (default: 50)
How much the Volume Fraction Penalty should influence the acceptance or rejection of phase changes.

Rg penalty weight (default: 0.0)
How much the radius of gyration penalty should influence the acceptance or rejection of phase changes. A value of 0.0 disables the penalty.

Center penalty weight (negative = WeiPer) (default: 0.0)
How much the Center Penalty shall influence the acceptance or rejection of phase changes. A value of 0.0 disables the penalty. If unsure, use the default value.

Initial annealing temperature (default: 10)
If the value is too high, it can take a very long time for the system to cool down. If the value is too low, the system can be trapped in a local minimum. If unsure, use the default value.

Annealing schedule factor (default: 0.9)
Factor by which the temperature is decreased; 0.95 is a good average value. Faster cooling for smaller systems is possible (0.9), but slower cooling (0.99) needs to be applied more often.
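
The atomic volume relation quoted in the parameter descriptions above, (4/3) π r³ / 0.74, can be checked numerically. The following is only a minimal sketch: the function name and the radius used in the demonstration are hypothetical and are not part of MONSA.

```python
import math

# Packing fraction of densely packed spheres, as quoted in the description
# of the "Atomic volume" parameter.
PACKING_FRACTION = 0.74


def atomic_volume(radius):
    """Volume attributed to one dummy atom of the given radius."""
    if radius <= 0:
        raise ValueError("radius must be positive")
    return (4.0 / 3.0) * math.pi * radius ** 3 / PACKING_FRACTION


if __name__ == "__main__":
    r = 8.0  # hypothetical dummy-atom radius, for illustration only
    print(f"radius {r:.1f} -> atomic volume {atomic_volume(r):.1f}")
```

Because the volume scales with the cube of the radius, doubling the dummy-atom radius increases the volume per atom eightfold.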
The fields can be interpreted as follows, top-left to bottom-right:

jAnn: Step number. Starts at 1, increases monotonically.
T: Temperature measure; starts at an arbitrary high value and decreases each step by the temperature schedule factor.
iSuc: Number of successful phase changes in this temperature step. The number of successes should decrease slowly, and the first couple of steps should be terminated by the maximum number of successes criterion. If instead the maximum number of iterations per step is reached, or the number of successes drops suddenly by a large amount, the system should probably be cooled more slowly.
nEva: Accumulated number of function evaluations.
CPU: Elapsed CPU cycles since the annealing procedure was started.
SqfVal: Goodness of the model (fit + penalties).
Rf: Goodness of fit of simulated data versus experimental data; does not take penalties into account.
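
To get a feeling for how the annealing schedule factor affects the run length, the temperature column T can be sketched as a geometric sequence. This is an illustration under two assumptions that the text does not state explicitly: that step 1 is run at the initial temperature, and that T is multiplied by the schedule factor exactly once per step. The function names are hypothetical.

```python
def temperature_at_step(step, t_initial=10.0, factor=0.9):
    """Temperature at annealing step `step` (1-based, like jAnn)."""
    if step < 1:
        raise ValueError("step numbering starts at 1")
    return t_initial * factor ** (step - 1)


def steps_to_reach(t_target, t_initial=10.0, factor=0.9):
    """Smallest step number at which the temperature is <= t_target."""
    if t_target <= 0 or not 0 < factor < 1:
        raise ValueError("need t_target > 0 and 0 < factor < 1")
    step, temperature = 1, t_initial
    while temperature > t_target:
        temperature *= factor
        step += 1
    return step


if __name__ == "__main__":
    for factor in (0.9, 0.95, 0.99):
        n = steps_to_reach(0.01, t_initial=10.0, factor=factor)
        print(f"factor {factor}: {n} steps to cool from 10 to 0.01")
```

The comparison makes the trade-off visible: a factor of 0.99 needs roughly ten times as many temperature steps as 0.9 to reach the same final temperature.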
The master file contains the general phase information: volumes of the different phases, radii of gyration, connectivity etc. It has the following structure:
Line 1   Title (up to 80 characters)
Line 2   Four theoretical volumes of individual phases (required)
Line 3   Four theoretical radii of gyration of individual phases, in Ångström (even if your data are in nm⁻¹) (optional)
Line 4   Connectivity indicators of phases (required): '1' for 'interconnected', '0' for 'disconnected', '-1' for 'symmetry defined'
Line 5   Control file name and Npts for Guinier fit (no fit if the latter is equal to '-1')
... OPTIONAL ...
Line 6   Control file name and Npts for Guinier fit (no fit if the latter is equal to '-1')
...
etc      Erroneous lines skipped; read to the end
The program works with particles of up to four components. If the number of components (phases) is less than four, just put zeroes for the values required for the missing phases.
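
The line layout described above can be assembled programmatically. The sketch below is hypothetical helper code, not part of the distribution; it assumes that values on a line are separated by blanks and that the control file name is written in single quotes, as in the test example shipped with the program. Missing phases are padded with zeroes, as required.

```python
def format_master_file(title, volumes, rgs, connectivity, control_files):
    """Return the text of a master file.

    volumes, rgs, connectivity: up to four values each (one per phase).
    control_files: list of (control_file_name, npts_for_guinier_fit) pairs;
    use -1 as npts to request no Guinier fit.
    """
    def pad4(values, fill):
        values = list(values)
        if len(values) > 4:
            raise ValueError("at most four phases are supported")
        return values + [fill] * (4 - len(values))

    lines = [title[:80]]                                            # line 1
    lines.append(" ".join(f"{v:g}" for v in pad4(volumes, 0.0)))    # line 2
    lines.append(" ".join(f"{v:g}" for v in pad4(rgs, 0.0)))        # line 3
    lines.append(" ".join(str(int(c)) for c in pad4(connectivity, 0)))  # line 4
    for name, npts in control_files:                                # line 5...
        lines.append(f"'{name}' {npts}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(format_master_file(
        "Two-phase example",
        volumes=[3.7e5, 8.7e5],
        rgs=[49.0, 61.0],
        connectivity=[0, 1],
        control_files=[("test.con", 10)],
    ), end="")
```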
The control file contains the smearing information for the given setting, information about contrasts and references to the data file. It has the following structure:
Line 1   Resolution file name, resolution setting number (free format)
Line 2   Output file name for the fits (not used) (free format)
Line 3   Title (character*80)
Line 4   Number of points in the setting (free format); put a negative number to indicate nm⁻¹ as angular units
Line 5   Data file name, contrasts and constants (free format)
etc      Erroneous lines skipped; read to the end
The information about the data sets is given in the format:
Filename Dro1 Dro2 Dro3 Dro4 Mult Const Weight
Filename: File name of the scattering pattern (up to 15 characters).
If required, MONSA smears the theoretical curves using the resolution function introduced by J. Skov Pedersen et al. (1990), J. Appl. Cryst., 23, 321.
Several subroutines for data smearing were provided by J. Skov Pedersen and modified for use in MONSA. The resolution file must have the following format (the numbers describe a setting at the RISOE SANS instrument):
Row  Value   Description
1    0.8     Effective collimation slit diameter in cm.
2    0.35    Effective sample diameter in cm.
3    300     Collimation distance in cm.
4    105     Sample-detector distance in cm.
5    3       λ in Å
6    0.18    δ(λ)/λ
7    1.1     Pixel size in cm.
8    0.0000  Averaging error (accounted for in Pixel size).
If the file is corrupted or does not exist, no smearing is performed.
An example of the resolution file is given below. The resolution setting number is the number of the column in the resolution file.
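
As an illustration of the row and column layout, the following sketch reads one setting from the text of a resolution file. It assumes that the eight rows contain blank-separated numeric columns, one column per setting, and nothing else; the exact syntax accepted by MONSA may differ, and the function and key names are hypothetical.

```python
# Names for the eight rows, in the order listed in the table above.
ROW_NAMES = [
    "collimation_slit_diameter_cm",
    "sample_diameter_cm",
    "collimation_distance_cm",
    "sample_detector_distance_cm",
    "wavelength_angstrom",
    "relative_wavelength_spread",
    "pixel_size_cm",
    "averaging_error",
]


def read_resolution_setting(text, setting_number=1):
    """Return the parameters of one setting (1-based column number)."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    if len(rows) < len(ROW_NAMES):
        raise ValueError("a resolution file needs eight rows")
    values = {}
    for name, fields in zip(ROW_NAMES, rows):
        if not 1 <= setting_number <= len(fields):
            raise ValueError(f"row '{name}' has no column {setting_number}")
        values[name] = float(fields[setting_number - 1])
    return values


if __name__ == "__main__":
    # Values of the example setting from the table above.
    example = "0.8\n0.35\n300\n105\n3\n0.18\n1.1\n0.0000\n"
    for key, value in read_resolution_setting(example).items():
        print(f"{key:30s} {value}")
```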
The input file defining the search volume is a PDB-like file containing
the coordinates of dummy atoms with the extra "phase" information
telling to which phase the atom belongs. The file looks like this:
0         1         2         3         4         5         6         7
01234567890123456789012345678901234567890123456789012345678901234567890123456
ATOM      1  CA  ASP     1     -17.000 -16.957-101.666  1.00 20.00   3 3012
ATOM      2  CA  ASP     1     -17.000   -.957-101.666  1.00 20.00 1 3 3012
ATOM      3  CA  ASP     1     -17.000  15.043-101.666  1.00 20.00 0 1 3012
ATOM      4  CA  ASP     1      -1.000 -16.957-101.666  1.00 20.00   2 3012
ATOM      5  CA  ASP     1      -1.000   -.957-101.666  1.00 20.00   1 202
Characters 1 to 65 in a line are as in a normal PDB file.
Column 67 (iCore): if '1', the phase of this atom is fixed and will not be changed during the search ("core atom"); if iCore is '0' or ' ', the phase of the atom is free to change. The core indicators may be re-computed automatically when loading the model, so that iCore is set to 1 if an atom is surrounded only by atoms of the same phase. In this case, the program will change only interface atoms. This option may be useful if a preliminary model is available.
Column 69 (iPhas): the phase index of the atom (ordinal number in the iAllo array).
Column 71 (nAllo): the number of phases allowed for the given atom.
Columns 72 etc. (iAllo): the indices of the allowed phases, such that iAllo(iPhas) is the phase of the atom.
This system allows one, if required, to select the phases which can occupy any given point. In the above example of a two-phase system:
Atom 1: free atom of phase 2
Atom 2: fixed atom of phase 2
Atom 3: free atom of phase 0 (solvent)
Atom 4: free atom of phase 1
Atom 5: free atom of phase 0 (solvent; could be only solvent or phase 2)
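
The decoding rule iAllo(iPhas) can be made concrete with a short sketch. It assumes that the column numbers quoted above count from 0, as the ruler above the sample records does, and that iPhas is a 1-based ordinal. The helper make_record is hypothetical and only builds test records with the same layout; neither function is part of MONSA or BODIES.

```python
def make_record(serial, x, y, z, i_core, i_phas, i_allo):
    """Build one dummy-atom record (hypothetical helper for the demo)."""
    head = (
        "ATOM".ljust(6)              # record name
        + f"{serial:5d}"             # atom serial number
        + " " + "CA".center(4)       # atom name
        + " " + "ASP" + "  "         # residue name, blank chain identifier
        + f"{1:4d}" + " " * 4        # residue number
        + f"{x:8.3f}{y:8.3f}{z:8.3f}"
        + f"{1.0:6.2f}{20.0:6.2f}"   # occupancy and B-factor fields
    )
    tail = f" {i_core} {i_phas} {len(i_allo)}" + "".join(str(p) for p in i_allo)
    return head + tail


def parse_dummy_atom(line):
    """Decode iCore, iPhas, nAllo and iAllo from one record."""
    line = line.rstrip("\n").ljust(73)
    i_core = line[67]
    i_phas = int(line[69])
    n_allo = int(line[71])
    i_allo = [int(ch) for ch in line[72:72 + n_allo].strip()]
    if len(i_allo) != n_allo or not 1 <= i_phas <= n_allo:
        raise ValueError("inconsistent phase fields")
    return {
        "fixed": i_core == "1",          # core atom: phase never changes
        "phase": i_allo[i_phas - 1],     # iAllo(iPhas), 1-based ordinal
        "allowed": i_allo,
    }


if __name__ == "__main__":
    # Phase fields of the five sample atoms listed above.
    fields = [(" ", 3, [0, 1, 2]), ("1", 3, [0, 1, 2]), ("0", 1, [0, 1, 2]),
              (" ", 2, [0, 1, 2]), (" ", 1, [0, 2])]
    for n, (core, phas, allo) in enumerate(fields, start=1):
        atom = parse_dummy_atom(make_record(n, -17.0, 0.0, -101.666, core, phas, allo))
        state = "fixed" if atom["fixed"] else "free"
        print(f"Atom {n}: {state} atom of phase {atom['phase']}, allowed {atom['allowed']}")
```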
In most cases, however, the user does not need to learn the structure of this file. The program BODIES is available to generate an ellipsoidal (or spherical) search volume for the given number of phases and given number of dummy atoms. In the general case, one can always use a spherical search volume with the diameter equal to Dmax, as in DAMMIN.
MONSA will automatically calculate the number of phases in the search model when reading this file. The number of dummy atoms in the search volume must not exceed 10000!
The distribution package includes an example of a batch file containing the required answers. Typing monsa144 < test.ans will run the structure determination for the supplied example in batch mode (this may take a day of CPU time on a PC!).
The example is taken from the article (a 30S ribosomal subunit-like particle with simulated proteins inside). The model is given in the file model.pdb (phase 1 - proteins, phase 2 - RNA), the initial search volume in the file sph105-2.pdb (a sphere with diameter 210 Å, two-phase system; generated by BODIES).
The scattering curves *.dat are computed from this model (see the example of the test control file) and randomized.
NOTE that for any solution obtained using this method, an enantiomorph would yield the same scattering patterns!
It was also observed (quite seldom) in test examples that one phase was enantiomorphous whereas the others were not.
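
The enantiomorph ambiguity follows from the fact that orientationally averaged scattering depends only on the distances between scatterers, which a mirror operation leaves unchanged. The sketch below illustrates this with the Debye formula for a handful of weighted points; it is a generic illustration with made-up coordinates and weights, not the spherical-harmonics calculation used by MONSA.

```python
import math
from itertools import combinations


def debye_intensity(points, weights, s):
    """Orientationally averaged intensity of weighted point scatterers."""
    total = sum(w * w for w in weights)              # self terms
    for (i, p), (j, q) in combinations(enumerate(points), 2):
        x = s * math.dist(p, q)
        sinc = 1.0 if x == 0 else math.sin(x) / x
        total += 2.0 * weights[i] * weights[j] * sinc
    return total


def mirror(points):
    """Mirror image of the model (reflection in the yz plane)."""
    return [(-x, y, z) for x, y, z in points]


if __name__ == "__main__":
    # Arbitrary chiral arrangement with two different "contrasts".
    model = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 7.0, 0.0), (3.0, 2.0, 5.0)]
    weights = [2.0, 2.0, 4.0, 4.0]
    for s in (0.05, 0.1, 0.2):
        original = debye_intensity(model, weights, s)
        mirrored = debye_intensity(mirror(model), weights, s)
        print(f"s = {s:.2f}: I = {original:.6f}, mirrored I = {mirrored:.6f}")
```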
With each successful run, MONSA creates a set of output files; each filename starts with a customizable prefix to which an extension is appended. If a prefix has been used before, existing files will be overwritten without further notice.

Contains the same information as the screen output and is updated during execution of the program.

-0.pdb
This pdb file contains the beads of the solvent (a.k.a. the search volume).

-1.pdb : -n.pdb
These pdb files contain the beads of each individual phase.

.pdb
This pdb file contains the beads of all the phases and the solvent (a.k.a. the search volume). The beads of the different phases and the solvent are distinguished by their chain number. The header of the file contains information about the application used and about invariants of the particle, e.g. Rg, volume and molecular mass of the particle.

Fit of the simulated scattering curve versus a smoothed-out version of the real data. Columns in the output file are: 's', 'c.Iexp', 'c.ErrIexp' and 'IFIT'.
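
The four columns of the fit file ('s', 'c.Iexp', 'c.ErrIexp', 'IFIT') are enough to evaluate a discrepancy outside the program. The sketch below assumes blank-separated numeric rows and skips every line that does not consist of exactly four numbers, such as a title line; the function names are hypothetical, and the mean squared weighted residual computed here is a common choice that is not necessarily identical to the Rf value reported by MONSA.

```python
def read_fit_rows(text):
    """Return (s, i_exp, err, i_fit) tuples from the text of a fit file."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue                      # not a data row
        try:
            rows.append(tuple(float(f) for f in fields))
        except ValueError:
            continue                      # header or comment line
    return rows


def chi_squared(rows):
    """Mean of ((Iexp - Ifit) / error)^2 over points with positive error."""
    terms = [((i_exp - i_fit) / err) ** 2
             for _, i_exp, err, i_fit in rows if err > 0]
    if not terms:
        raise ValueError("no usable data points")
    return sum(terms) / len(terms)


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    sample = """Example fit
    0.010  100.0  2.0  102.0
    0.020   90.0  3.0   87.0
    """
    print(f"chi^2 = {chi_squared(read_fit_rows(sample)):.3f}")
```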
In previous releases two helper applications, DAMESV and DAMEMB, were included
to generate suitable search volumes for MONSA. This functionality was integrated
into the search-volume mode of BODIES.
Master file for the test example: contrast variation simulated data of a 30S ribosomal subunit-like particle consisting of "RNA" (phase 2, density = 4.0) with some "proteins" inside (phase 1; density = 2.0)
Master file for quazi-30S model randomized data to s=0.2
3.7e5  8.7e5  0.00  0.0     ! Desired Volumes
49.0   61.0   0.00  0.0     ! Desired Rgs
0      1      0     0       ! Connectivity
'test.con' 10               ! Control file name; Rgs will be
                            ! computed from 10 first points
Control file for the test example
'Point collimation' 1                              !! No smearing
'test.fit'                                         !! Output fits
Test for 30S -- use randomized data up to 0.2      !! Title
98                                                 !! Number of points
'0r1.dat'    2.00  4.00  0.00  0.00  1.000  0.0  1.00  0
'2r1.dat'    0.00  2.00  0.00  0.00  1.000  0.0  1.00  0
'4r1.dat'   -2.00  0.00  0.00  0.00  1.000  0.0  1.00  0
'6r1.dat'   -4.00 -2.00  0.00  0.00  1.000  0.0  1.00  0
'infr1.dat'  1.00  1.00  0.00  0.00  1.e-6  0.0  1.00  0
Here, the data sets '?r1.dat' correspond to the scattering patterns from the test body in solvents with density 0.0, 2.0, 4.0, 6.0. The set 'infr1.dat' corresponds to "shape scattering" (infinite contrast). Note that the test would also have worked without the 'infinite contrast' data. Please note:
the filename should be given in quotes (up to 15 characters);
put zeroes as contrasts for phases which are not present;
all files in a setting MUST have the same number of points and the same angular axis; if you have data sets on another angular grid, put them in another setting;
from each data set, a constant "Const" will be subtracted and the result will be multiplied by "Mult";
the data sets will be weighted with the relative weight "Weight" in the total discrepancy; reducing the weight is equivalent to increasing the errors in the data file;
the number of points must not exceed 2048.
Choose the value, so that the maximal s value becomes 2.5 nm-1.
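
The contrasts in the test control file are consistent with taking, for each phase, the phase density minus the solvent density (phase 1 density 2.0, phase 2 density 4.0). The sketch below reproduces the first four data set lines on that assumption and also shows the Const and Mult correction described in the notes. The helper names are hypothetical, the 'infinite contrast' line is not covered, and the unexplained trailing field of the sample lines is left out.

```python
# Densities of phase 1 ("proteins") and phase 2 ("RNA") in the test example.
PHASE_DENSITIES = (2.0, 4.0)


def contrasts(solvent_density, phase_densities=PHASE_DENSITIES):
    """Dro1..Dro4: phase density minus solvent density, zero if absent."""
    values = [d - solvent_density for d in phase_densities]
    return values + [0.0] * (4 - len(values))


def data_set_line(filename, solvent_density, mult=1.0, const=0.0, weight=1.0):
    """Format 'Filename Dro1 Dro2 Dro3 Dro4 Mult Const Weight'."""
    if len(filename) > 15:
        raise ValueError("file name is limited to 15 characters")
    dro = " ".join(f"{c:.2f}" for c in contrasts(solvent_density))
    return f"'{filename}' {dro} {mult:.3f} {const:.1f} {weight:.2f}"


def apply_const_and_mult(intensities, const, mult):
    """Subtract Const from every point, then multiply the result by Mult."""
    return [(value - const) * mult for value in intensities]


if __name__ == "__main__":
    for name, density in (("0r1.dat", 0.0), ("2r1.dat", 2.0),
                          ("4r1.dat", 4.0), ("6r1.dat", 6.0)):
        print(data_set_line(name, density))
```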