EMBL Hamburg Biological
Small Angle Scattering
BioSAXS

DAMAVER manual

damaver

Written by D. Svergun and M. Petoukhov.
Post all your questions about DAMAVER to the ATSAS Forum.

© ATSAS Team, 2001-2012

This is the manual for the program suite DAMAVER, a set of programs to align ab initio models, select the most typical one and build an averaged model. The following sections briefly describe the different programs that are part of DAMAVER, how to run them and what the input and output files are. If you use results from DAMAVER in your publication, please cite:

V. V. Volkov and D. I. Svergun (2003). Uniqueness of ab-initio shape determination in small-angle scattering. J. Appl. Cryst. 36, 860-864.

DAMAVER aligns ab initio low-resolution models (e.g. provided by DAMMIN and/or GASBOR), selects the most typical ("probable") one and builds an averaged model. The program package requires SUPCOMB. In the following, DAMAVER denotes the program suite, whereas damaver refers to the program of that name within the suite. The suite contains the following programs:

  • damsel: compare all models, find the most probable one and outliers (uses SUPCOMB)
  • damsup: align all models with the most probable one (uses SUPCOMB)
  • damaver: average aligned models and compute probability map
  • damfilt: filter the averaged model at a given cut-off volume
  • damstart: generate from the averaged model a DAMMIN input file with a fixed core (for those who want to refine the averaged model)

The most typical use of this suite is to let damaver do all the work in its automatic mode, so in most cases you only need the damaver program and can go straight to its manual. Alternatively, the programs can be run separately; the order in which they are listed above is the order in which they are normally run to obtain an averaged model. The examples shown are all based on the same set of models and therefore together describe a full DAMAVER session. Please refer to the paper cited above for further details about the implemented algorithm.
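The manual order of the programs can be written down as a list of command lines. The following is a minimal sketch, not part of the suite: the helper name damaver_pipeline is hypothetical, and it only relies on the default file names documented in the sections below (damsel.log, damsup.log, damaver.pdb). It builds the commands but does not run them.

```python
def damaver_pipeline(models):
    """Return the command lines of a manual DAMAVER session, in order.

    Hypothetical helper: it only assembles argument lists and relies on
    the default file names described in this manual.
    """
    if len(models) < 2:
        raise ValueError("need at least two models to compare")
    return [
        ["damsel", *models],          # compares all pairs, writes damsel.log
        ["damsup", "damsel.log"],     # aligns models, writes damsup.log
        ["damaver", "damsup.log"],    # averages aligned models, writes damaver.pdb
        ["damfilt", "damaver.pdb"],   # filtered model, writes damfilt.pdb
        ["damstart", "damaver.pdb"],  # DAMMIN start model, writes damstart.pdb
    ]


for argv in damaver_pipeline(["t01.pdb", "t02.pdb", "t03.pdb", "t04.pdb"]):
    print(" ".join(argv))
```

Each list can be passed to subprocess.run(argv, check=True) if the ATSAS programs are on the search path.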

Please note that besides DAMAVER there is also DAMCLUST that allows for clustering of models instead of simply rejecting outliers from the largest group.

Damsel

Manual

Introduction

DAMSEL compares sets of models, finds the most probable one and defines outliers by using the NSD values computed by SUPCOMB.

Given several structures in PDB format, the program superimposes all possible pairs by applying SUPCOMB. The output file provides a cross-correlation table according to NSD and the list of input files with the respective recommendations of inclusion or exclusion for each file. See the description of the output file for details.

NOTE: For averaging DAMMIN or GASBOR models, 10-20 models are recommended. A model whose NSD exceeds the mean over all models by more than two standard deviations is considered an outlier.

Running damsel

Usage:

$ damsel [OPTIONS] <FILES>

Command-Line Arguments and Options

FILES are the models to compare. DAMSEL recognizes the following command-line options. Mandatory arguments to long options are mandatory for short options too.

Short Option Long Option Description
-o --output=<FILE> Log output filename. Default: damsel.log.
-s --symmetry=<SYMMETRY> Specify the symmetry to consider. Any P-n-m symmetry known by other programs is supported (Pn, n=1, ..., 19 and Pn2, n=2, ..., 12). By default, no symmetry is assumed (P1).
-e --enantiomorphs=<YES|NO> Enable/disable the search of enantiomorphs, i.e. either one of a pair of molecules that are mirror images of each other but are not identical. By default, this is enabled.
-q --quiet Suppress progress information. Default: provide log information.
-v --version Print version information and exit.
-h --help Print this help text and exit.

Runtime Output

As described in the introduction, SUPCOMB is called for all pairs of models. Consequently, the output of DAMSEL corresponds to N*(N-1)/2 times the output of SUPCOMB. Please refer to the runtime output of SUPCOMB for details.
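To get a feeling for how the number of SUPCOMB runs grows with the number of models, the pairs can be enumerated explicitly. This is an illustrative sketch only; model_pairs is a hypothetical name and the order in which DAMSEL actually visits the pairs is not documented here.

```python
from itertools import combinations


def model_pairs(models):
    """Return every unordered pair of models, N*(N-1)/2 pairs in total."""
    return list(combinations(models, 2))


pairs = model_pairs(["t01.pdb", "t02.pdb", "t03.pdb", "t04.pdb"])
for first, second in pairs:
    print(first, second)
print(len(pairs), "pairwise comparisons")
```

For the recommended 10-20 models this amounts to between 45 and 190 superpositions.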

damsel input files

Models in PDB format.

damsel output files

DAMSEL output files are organized in sections. The header, which lists the input files, is followed by a symmetric cross-correlation table that lists the NSD value of each model against every other model. For example:

$ damsel t0[1234].pdb

results in

Input files:
    1  t01.pdb
    2  t02.pdb
    3  t03.pdb
    4  t04.pdb

  File Aver      1      2      3      4
     1   1.15   0.00   1.10   1.14   1.21
     2   1.13   1.10   0.00   1.12   1.18
     3   1.13   1.14   1.12   0.00   1.14
     4   1.17   1.21   1.18   1.14   0.00
  Aver   1.15   1.15   1.13   1.13   1.17

Further, the mean value of the differences between the files and the respective standard deviation are listed. The values are used to determine which files to include and exclude:

It is recommend to discard files with nsd > Mean + 2*Standard deviation
 Recommendation       NSD  File
        Include     1.131  t02.pdb
        Include     1.133  t03.pdb
        Include     1.148  t01.pdb
        Include     1.175  t04.pdb

The first file listed, i.e. the one with the lowest NSD, is used as the reference in DAMSUP.
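The selection rule quoted in the output can be reproduced in a few lines. The sketch below is an approximation under stated assumptions: the function name recommend and the label "Discard" are hypothetical, and the population standard deviation is used because the manual does not say which estimator DAMSEL applies.

```python
from statistics import mean, pstdev


def recommend(nsd_by_file, n_sigma=2.0):
    """Rank files by NSD and flag those above mean + n_sigma * std.

    Assumption: population standard deviation; DAMSEL may use a
    different estimator.
    """
    values = list(nsd_by_file.values())
    threshold = mean(values) + n_sigma * pstdev(values)
    ranked = sorted(nsd_by_file.items(), key=lambda item: item[1])
    return [
        ("Include" if nsd <= threshold else "Discard", nsd, name)
        for name, nsd in ranked
    ]


# Mean NSD values taken from the example output above.
table = {"t01.pdb": 1.148, "t02.pdb": 1.131, "t03.pdb": 1.133, "t04.pdb": 1.175}
for label, nsd, name in recommend(table):
    print(f"{label:>15} {nsd:9.3f}  {name}")
```

With the four example values no file exceeds the threshold, and t02.pdb comes first, in line with the listing above.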

Example

Suppose there are multiple pdb files generated by DAMMIF in a directory. Then damsel can be run as:

$ damsel prefix-*-1.pdb -o prefix-damsel.log

Here, prefix stands for a user-defined prefix, which may include an absolute or relative path.

Damsup

Manual

Introduction

DAMSUP may be used to align a set of models to a reference model (Dummy Atom Model SUPerposition).

Running damsup

There are two possible usages:

$ damsup [OPTIONS] [FILE]

Here, FILE shall be a log file as written by DAMSEL. If no file name is specified, DAMSUP assumes damsel.log as input.

$ damsup [OPTIONS] <FILE> <FILE(s)>

Alternatively, a list of PDB files may be provided. The first FILE is the reference model. The second to the N-th FILE(s) will be superimposed onto the reference.

Command-Line Arguments and Options

DAMSUP recognizes the following command-line options. Mandatory arguments to long options are mandatory for short options too.

Short Option Long Option Description
-o --output=<FILE> Log output filename. Default: damsup.log.
-p --prefix=<PREFIX> The PREFIX to prepend to any pdb output filename. By default, no prefix is applied.
-s --symmetry=<SYMMETRY> Specify the symmetry to consider. Any P-n-m symmetry known by other programs is supported (Pn, n=1, ..., 19 and Pn2, n=2, ..., 12). By default, no symmetry is assumed (P1).
-e --enantiomorphs=<YES|NO> Enable/disable the search of enantiomorphs, i.e. either one of a pair of molecules that are mirror images of each other but are not identical. By default, this is enabled.
-q --quiet Suppress progress information. Default: provide log information.
-v --version Print version information and exit.
-h --help Print this help text and exit.

Runtime Output

As described in the introduction, SUPCOMB is called for all superpositions of input files onto the reference. Consequently, the output of DAMSUP corresponds to N-1 times the output of SUPCOMB. Please refer to the runtime output of SUPCOMB for details.

damsup input files

DAMSUP reads either an output log of DAMSEL or the input models directly. If a DAMSEL log file is given as input, the first included file is taken as the reference and all other included files are superimposed onto it. Files marked as discarded by DAMSEL are ignored.

damsup output files

Each superimposed model is written to a file whose name is the input basename with "r" appended, e.g.

$ damsup foo.pdb bar1.pdb bar2.pdb

will write bar1r.pdb and bar2r.pdb.
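When scripting around DAMSUP it can be handy to predict the output names. The following sketch mirrors the naming rule described above; superimposed_name is a hypothetical helper, and placing the prefix directly in front of the file name is an assumption about how the --prefix option is applied.

```python
from pathlib import PurePosixPath


def superimposed_name(model, prefix=""):
    """Predict the DAMSUP output name: basename plus 'r', same extension.

    Assumption: an optional prefix goes directly in front of the file name.
    """
    path = PurePosixPath(model)
    return str(path.with_name(f"{prefix}{path.stem}r{path.suffix}"))


for name in ["bar1.pdb", "bar2.pdb"]:
    print(name, "->", superimposed_name(name))
```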

DAMSUP also writes a log file that lists the superimposed PDB files together with their NSD relative to the reference. For example:

$ damsup t02.pdb t01.pdb t03.pdb t04.pdb

Results in a damsup.log like:

          NSD  Filename
        0.000  t02.pdb
        1.097  t01r.pdb
        1.120  t03r.pdb
        1.176  t04r.pdb

A prefix is prepended to all output files if specified.
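The two-column listing shown above is easy to read back into a script. This is a minimal sketch that assumes the log consists of lines of the form shown in the excerpt (an NSD value followed by a file name without spaces); a real damsup.log may contain additional lines, which this parser simply skips. The name parse_damsup_log is hypothetical.

```python
def parse_damsup_log(text):
    """Return (filename, nsd) tuples from a damsup.log-style listing.

    Assumption: data lines hold exactly two whitespace-separated fields,
    a number and a file name. Everything else is ignored.
    """
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue
        try:
            nsd = float(fields[0])
        except ValueError:
            continue  # e.g. the "NSD  Filename" header line
        entries.append((fields[1], nsd))
    return entries


example = """
          NSD  Filename
        0.000  t02.pdb
        1.097  t01r.pdb
        1.120  t03r.pdb
        1.176  t04r.pdb
"""
for filename, nsd in parse_damsup_log(example):
    print(filename, nsd)
```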

Damaver

Manual

Introduction

DAMAVER averages models aligned by SUPCOMB or DAMSUP and computes a probability map. More specifically, given a list of models in PDB format that have been aligned for best overlap, DAMAVER remaps them onto a grid of densely packed beads in order to compute a frequency map. For each bead, the cross volume with proximal dummy atoms in the input models (occupancy) is computed and saved to the output file. This output file can be processed further by DAMFILT, which produces a filtered model that generally does not fit the data, and by DAMSTART, which creates an input model for DAMMIN that, after refinement, would fit the data.
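The idea of a frequency map can be illustrated with a strongly simplified stand-in. The sketch below is not the DAMAVER algorithm: instead of a cross volume it merely records, for each grid bead, the fraction of models that have at least one dummy atom within a given radius. The function name occupancy_map and all inputs are hypothetical.

```python
import math


def occupancy_map(grid, models, radius):
    """Fraction of models with a dummy atom within `radius` of each bead.

    Simplification: DAMAVER computes a cross volume, whereas this sketch
    only counts models that come close to the bead.
    """
    occupancies = []
    for bead in grid:
        hits = sum(
            any(math.dist(bead, atom) <= radius for atom in model)
            for model in models
        )
        occupancies.append(hits / len(models))
    return occupancies


grid = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
models = [
    [(0.5, 0.0, 0.0)],
    [(0.0, 0.5, 0.0)],
    [(9.8, 0.0, 0.0)],
]
print(occupancy_map(grid, models, radius=1.0))
```

Beads that are occupied in most models receive values close to 1, while beads visited by only a few models stay close to 0.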

Running damaver

There are two possible usages:

$ damaver [OPTIONS] [FILE]

Here, FILE shall be a log file as written by DAMSUP. If no file name is specified, damsup.log is assumed.

$ damaver [OPTIONS] <FILE> <FILE(S)>

Alternatively, a list of PDB files may be provided for averaging. The models are assumed to be aligned already. If the --automatic flag is also given, the models are selected and superimposed first.

Command-Line Arguments and Options

DAMAVER recognizes the following command-line options. Mandatory arguments to long options are mandatory for short options, too.

Short Option Long Option Description
-o --output=<FILE> Output filename of the average. By default, this is damaver.pdb.
-a --automatic[=PREFIX] Automatically run all programs of the package (in batch mode): 1) DAMSEL, 2) DAMSUP, 3) DAMAVER, 4) DAMFILT and 5) DAMSTART.
-s --symmetry=<SYMMETRY> Specify the symmetry to consider. Any P-n-m symmetry known by other programs is supported (Pn, n=1, ..., 19 and Pn2, n=2, ..., 12). By default, no symmetry is assumed (P1).
-e --enantiomorphs=<YES|NO> Enable/disable the search of enantiomorphs, i.e. either one of a pair of molecules that are mirror images of each other but are not identical. By default this is enabled.
--nbeads=<N> Overall number of beads within the resulting DAM (default: 5000)
-q --quiet Suppress progress information. Default: provide log information.
-v --version Print version information and exit.
-h --help Print this help text and exit.

Runtime Output

In the case of automatic mode, i.e., when all of the different programs of the suite are called, please refer to their respective runtime output sections.

damaver Input Files

DAMAVER either reads an output log of DAMSUP, or input models directly. If a DAMSUP log file is given as input, the listed filenames are read and averaged. If no file name is given, damsup.log is read by default.

If model file names are specified at the command line, they are used as given.

damaver Output Files

When averaging is complete, DAMAVER writes its output file, damaver.pdb by default.

Example

An example run may look like this:

$ damaver damsup.log -o taver.pdb
 Read file .............................................. : t02.pdb
 Read file .............................................. : t01r.pdb
 Read file .............................................. : t03r.pdb
 Read file .............................................. : t04r.pdb
 Wrote file ............................................. : taver.pdb
 Number of atoms written ................................ : 925

Damfilt

Manual

Introduction

DAMFILT filters an averaged model created by DAMAVER at a given cut-off volume. Given the frequency map computed by DAMAVER and the value of a cut-off volume, DAMFILT removes low occupancy and loosely connected atoms and writes a compact - most probable - model to the output file. The output volume is selected to be close to the cut-off volume (expected excluded volume of the particle). This value is either read from the input file or specified on the command line.

Please note: the filtered model does generally not fit the data.
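The effect of the cut-off volume can be sketched as a ranking problem. The following is a simplified illustration only: it keeps the highest-occupancy beads until the requested volume is reached and ignores the contact criterion that DAMFILT also applies. The name filter_by_occupancy and the (occupancy, label) representation of a bead are hypothetical.

```python
def filter_by_occupancy(beads, volume_per_bead, cutoff_volume):
    """Keep the highest-occupancy beads whose total volume is near the cut-off.

    beads: list of (occupancy, label) tuples.
    Simplification: loosely connected beads are not removed here.
    """
    n_keep = min(len(beads), round(cutoff_volume / volume_per_bead))
    ranked = sorted(beads, key=lambda bead: bead[0], reverse=True)
    return ranked[:n_keep]


beads = [(0.9, "a"), (0.1, "b"), (0.5, "c"), (0.7, "d")]
kept = filter_by_occupancy(beads, volume_per_bead=10.0, cutoff_volume=20.0)
print(kept)
```

Because the result is chosen purely by occupancy and volume, there is no reason for it to reproduce the scattering data, which is in line with the note above.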

Running damfilt

$ damfilt [OPTIONS] [FILE]

Here, FILE is an input PDB file of the model to be filtered. If no file name is specified, damaver.pdb is assumed. The default cut-off value for filtering is taken from the input file if available, half of the volume of the input model otherwise. A user-defined cut-off can be specified at the command-line.

Command-Line Arguments and Options

DAMFILT recognizes the following command-line options. Mandatory arguments to long options are mandatory for short options, too.

Short Option Long Option Description
--volume=<X> Cut-off volume (default: averaged volume of the models if available, half the input volume otherwise)
--contact-threshold=<N> Contact threshold to discard (default: minimum number of contacts in the model, plus 2)
-o --output=<FILE> Output filename of the filtered model. By default this is damfilt.pdb.
-q --quiet Suppress progress information. Default: provide log information.
-v --version Print version information and exit.
-h --help Print this help text and exit.

Runtime Output

When running damfilt, the program prints on the screen information about the number of atoms read, the center of the dummy atom model, maximum radius, number of phases, atomic radius, excluded volume per atom, average excluded volume read, minimum and maximum number of contacts, number of atoms written and final excluded volume. For an example, see damfilt examples.

damfilt input files

Input for DAMFILT is the pdb file to be filtered. If no input file is specified, damaver.pdb is assumed.

damfilt output files

The output file of DAMFILT is the filtered model written to damfilt.pdb by default. The header of the file includes the filter conditions applied.

damfilt examples

An example run may look like this:

$ damfilt taver.pdb -o tfilt.pdb
 Read file .............................................. : taver.pdb
 Number of atoms ........................................ : 925
 Number of phases ....................................... : 1
 Minimum number of contacts ............................. : 2
 Maximum number of contacts ............................. : 12
 Selected contact threshold ............................. : 4
 Atomic radius .......................................... : 2.750
 Excluded volume per atom ............................... : 117.7
 Maximum radius ......................................... : 64.07
 Average excluded volume ................................ : 0.0
 Selected cut-off volume ................................ : 5.445e+4
 Final contact threshold ................................ : 4
 Final cut-off volume ................................... : 5.445e+4
 Final number of atoms .................................. : 463
 Final volume ........................................... : 5.451e+4
 Wrote file ............................................. : tfilt.pdb
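The default values reported in this log can be checked by hand. The sketch below uses only numbers printed above and the defaults from the option table; it reads the average excluded volume of 0.0 as "not available", which is an interpretation rather than documented behaviour.

```python
# Values copied from the example log above.
n_atoms = 925
volume_per_atom = 117.7  # printed with one decimal, so slightly rounded
min_contacts = 2

# Defaults as described in the option table: half the input volume when no
# averaged volume is available, and the minimum number of contacts plus 2.
input_volume = n_atoms * volume_per_atom
default_cutoff = input_volume / 2
default_contact_threshold = min_contacts + 2

print(f"input volume      : {input_volume:.4g}")
print(f"default cut-off   : {default_cutoff:.4g}")
print(f"contact threshold : {default_contact_threshold}")
```

The cut-off obtained this way agrees with the selected cut-off volume in the log to within the rounding of the printed per-atom volume, and the contact threshold matches the reported value of 4.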

Damstart

Manual

Introduction

Given the frequency map computed by DAMAVER and the value of a cut-off volume, DAMSTART generates a modified version of the DAMAVER model with a fixed core, for use in DAMMIN as an initial approximation. The core indices of high-occupancy atoms with a fair number of contacts are set to 1 so that their phases will not change in a DAMMIN run. DAMSTART writes this model to the file damstart.pdb by default. The core volume is selected to be close to the cut-off volume (one half of the expected excluded volume of the particle). This value is either read from the input file (default file name damaver.pdb, see damaver) or specified by a command-line option. If the expected particle volume is missing from the input PDB file, the cut-off value is taken to be one quarter of the volume of the model.

Running damstart

$ damstart [OPTIONS] [FILE]

Here, FILE is an input PDB file of the model to be filtered for refinement. If no file name is specified, damaver.pdb is assumed.

Command-Line Arguments and Options

DAMSTART recognizes the following command-line options. Mandatory arguments to long options are mandatory for short options, too.

Short Option Long Option Description
--volume=<X> Cut-off volume, half the input volume by default.
--contact-threshold=<N> Contact threshold to discard (default: minimum number of contacts in the model, plus 2)
-o --output=<FILE> Output filename of the filtered model. By default this is damstart.pdb.
-q --quiet Suppress progress information. Default: provide log information.
-v --version Print version information and exit.
-h --help Print this help text and exit.

Runtime Output

When running damstart in batch mode (see Command-Line Arguments and Options), the program prints on the screen the number of atoms read, the number of phases, the minimum and maximum number of contacts, the selected and final contact threshold, the selected and final cut-off volume, the atomic radius, the excluded volume per atom, the maximum radius, the average excluded volume read, the number of atoms written and the final excluded volume of the created model. For an example, see damstart examples.

damstart input files

Input for DAMSTART is the pdb file to be filtered for refinement. If no input file is specified, damaver.pdb is assumed.

damstart output files

The output file of DAMSTART is the filtered model with a fixed core written to damstart.pdb by default. The header of the file includes the filter conditions applied.

damstart examples

An example run may look like this:

$ damstart taver.pdb -o tstart.pdb
 Read file .............................................. : taver.pdb
 Number of atoms ........................................ : 925
 Number of phases ....................................... : 1
 Minimum number of contacts ............................. : 2
 Maximum number of contacts ............................. : 12
 Selected contact threshold ............................. : 4
 Atomic radius .......................................... : 2.750
 Excluded volume per atom ............................... : 117.7
 Maximum radius ......................................... : 64.07
 Average excluded volume ................................ : 0.0
 Selected cut-off volume ................................ : 5.445e+4
 Final contact threshold ................................ : 4
 Final cut-off volume ................................... : 5.445e+4
 Final number of atoms .................................. : 463
 Final volume ........................................... : 5.451e+4
 Wrote file ............................................. : tstart.pdb

  Last modified: April 11, 2013

© BioSAXS group 2013