EMBL Hamburg Biological
Small Angle Scattering

AUTOMERGE manual

AutoMerge

Written by M.J. Gajda.
Post all your questions about AUTOMERGE to the ATSAS Forum.

© ATSAS Team, 2008-2009

Manual

Introduction

AutoMerge extrapolates and averages experimental .dat files. Along the way it assesses their quality and the amount of data available from them, and checks that files in the same data set agree with each other, up to scaling and background subtraction, over a range sufficient to infer their similarity.

The last two files are to be used only when the merging procedure gives the wrong answer in your case. Otherwise you can simply switch off extrapolation and/or averaging if you wish.

Running AutoMerge

Synopsis:

$> automerge filename1.dat filename2.dat ... --output automerged.dat

If you have results.txt or results.xml file, you may also use:

$> automerge --results-txt results.txt filename1.dat filename2.dat --output automerged.dat
$> automerge --results-xml results.xml filename1.dat filename2.dat --output automerged.dat

If you do not like the point where two files were merged, you can pick it yourself:

$> automerge filename1.dat filename2.dat filename3.dat --merging-point 111 --output automerged.dat

where 111 is an integer index of a merging point.
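
For batch processing, the command lines above can be assembled from a script. The following is a minimal sketch, not part of AutoMerge itself: the helper name `build_automerge_command` and its parameters are hypothetical, and it uses only the options shown in the synopsis (`--results-xml`, `--merging-point`, `--output`).

```python
import shlex


def build_automerge_command(inputs, output="automerged.dat",
                            merging_point=None, results_xml=None,
                            executable="automerge"):
    """Return the argument list for one AutoMerge run (hypothetical helper)."""
    if not inputs:
        raise ValueError("at least one input .dat file is required")
    argv = [executable]
    if results_xml is not None:
        argv += ["--results-xml", results_xml]
    argv += list(inputs)
    if merging_point is not None:
        # The merging point is an integer index, as in the synopsis.
        argv += ["--merging-point", str(int(merging_point))]
    argv += ["--output", output]
    return argv


argv = build_automerge_command(
    ["filename1.dat", "filename2.dat", "filename3.dat"], merging_point=111)
# Print the command as it would be typed in a shell.
print(shlex.join(argv))
```

The resulting list can be passed to `subprocess.run(argv, check=True)` on a machine where AutoMerge is installed; the executable name may need to be adjusted (for example to `automerge.py`).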

All input files must have the same X axis (the same number of points at the same s values). This is usually true if they come from the same beamline setup. Otherwise, please use the ATSAS package tools to regrid them.

Command-Line Arguments and Options

Option name Description
--help, -h show this help message and exit
--autorg-dll use AutoRg.dll for Rg determination (not recommended)
--no-autorg do not use AutoRg_cli.exe for Rg determination - take parameters from results.* files.
--autorg-cli use AutoRg_cli.exe for Rg determination (requires Windows or Wine on Linux) - DEFAULT
--rg=Rg force a given Rg (single dataset only; requires also --sigma-rg)
--sigma-rg=sigma_Rg force a given sigma of Rg (single dataset only; requires also --rg)
--bsa=I0 compute mol. mass from I0 for a given I0 of BSA (mol. mass of 66.409kDa)
--bsa-concentration=[mg/ml] BSA concentration for mol. mass determination
--fit-whole fit whole data range (including possible constant region & region affected by beam stop)
--all process all data sets found in results.txt [DEPRECATED]
--results-txt=FILEPATH path to results.txt filename with AutoRG analysis results (for pre-analyzed files) - may be better to use results.xml.
--results-xml=FILEPATH path to results.xml filename with AutoRG analysis results (for pre-analyzed files)
--output=FILENAME, -o FILENAME output filename
--output-extrapolated=FILENAME output filename for extrapolated curve
--extrapolation-diagnostics=FILENAME output filename with comparison of Rg's, I0's with their extrapolations
--output-averaged=FILENAME output filename for averaged curve
--verbose, -v print more run info, can be used multiple times
--quiet, -q print hard errors only
--no-scaling do not scale measurements
--merging-point=N picks specific merging point for extrapolated and averaged curves (instead of guessing one)
--fitting-range=S1-S2 picks specific fitting range (instead of guessing one)
--converging-points=MIN_CONVERGING_POINTS number of points over which we expect two curves to converge - for merging/averaging
--signal-to-noise=SIGNAL_TO_NOISE signal-to-noise ratio threshold for averaging
--no-smoothen do not smoothen data sets (default, PRIMUS behaviour)
--smoothing-window=SMOOTHEN how many neighbours (on each side) to pick for smoothening
--no-extrapolation-zero-concentration do not extrapolate to zero concentration
--p-value=P_VALUE, -p P_VALUE change P-value threshold for successful fit (default P_VALUE=0.01)
--max-s-factor=MAX_S fit is computed up to MAX_S/Rg (default MAX_S=6.5)
--min-rg-quality=Q ignore files with AutoRg's Rg quality of less than Q (default: 5.0)
--correct-errors correct error bars for noise that can be identified from the values alone (PRIMUS-incompatible behavior)
--license show license text
--ignore-I0 do not recover magnitude of intensity after extrapolation
--recover-I0 recover magnitude of intensity after extrapolation (default)
--allowed-parse-errors=MAX_ERRORS number of errors accepted in results.txt before parser gives up (default: 5)
--encoding=ENCODING encoding assumed for input/output files
--primus give concentrations explicitly in the command-line (PRIMUS mode)
--refresh refresh output files only if any input is newer than outputs (only with results.xml or results.txt)
--autorg-exec=FILEPATH point to the correct CLI version of autorg.exe (guessed: wine /home/gajda/Projects/automation/autorg.exe)
--drop-nothing switches filtering off
--first-point=N omits first (N-1) points
--last-point=M omits points after Mth point of data
--version show version
--averaging-ranges-slow old algorithm finding strict averaging ranges (use if you notice bumps)
--averaging-ranges-fast new algorithm finding lax averaging ranges (fast and default)
--average-by-merging-points old averaging algorithm with merging points (alike to PRIMUS)
--average-by-estimates averaging algorithm that weights by relative differences in error estimates (statistically robust)
--html-help print help in HTML format and end
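
The --bsa and --bsa-concentration options refer to molecular mass determination relative to a BSA standard. The manual does not spell out the formula; the sketch below assumes the usual relative calibration, in which the molecular mass scales with I0 divided by concentration. The function name is hypothetical, the numbers in the example are made up, and only the BSA mass of 66.409 kDa comes from the option list.

```python
BSA_MOLECULAR_MASS_KDA = 66.409  # value quoted for --bsa in the option list


def molecular_mass_from_i0(i0_sample, conc_sample, i0_bsa, conc_bsa):
    """Estimate molecular mass (kDa) relative to a BSA standard.

    Assumed formula (not confirmed by the manual):
        MM = MM_BSA * (I0_sample / c_sample) / (I0_BSA / c_BSA)
    Concentrations are in mg/ml and must be positive.
    """
    if conc_sample <= 0 or conc_bsa <= 0:
        raise ValueError("concentrations must be positive")
    if i0_bsa <= 0:
        raise ValueError("I0 of the BSA standard must be positive")
    sample_per_conc = i0_sample / conc_sample
    bsa_per_conc = i0_bsa / conc_bsa
    return BSA_MOLECULAR_MASS_KDA * sample_per_conc / bsa_per_conc


# Made-up numbers: the sample scatters twice as strongly per mg/ml as BSA.
mass = molecular_mass_from_i0(i0_sample=30.0, conc_sample=1.0,
                              i0_bsa=60.0, conc_bsa=4.0)
print(f"estimated molecular mass: {mass:.3f} kDa")
```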

Runtime Output

Since the program is designed for command-line use, you can pass the --verbose option once or twice to increase the amount of logging information provided at runtime.

AutoMerge Input Files

AutoMerge uses only .dat input files containing scattering data, just like other programs from the ATSAS package. The columns are s, I(s) and the error of I(s).

All input files must have the same X axis (the same number of points at the same s values). This is usually true if they come from the same beamline setup. Otherwise, please use the ATSAS package tools to regrid them.
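
Before running AutoMerge it can be useful to verify the common-grid requirement yourself. The sketch below is an illustration only: `read_dat` and `same_s_grid` are hypothetical helper names, the reader assumes that any line not starting with three numbers is a header or metadata line, and the tolerance is an arbitrary choice.

```python
import math


def read_dat(path):
    """Read (s, I(s), error) rows; skip lines that are not numeric rows."""
    rows = []
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) < 3:
                continue
            try:
                s, intensity, error = (float(x) for x in fields[:3])
            except ValueError:
                continue  # assumed to be a header or metadata line
            rows.append((s, intensity, error))
    return rows


def same_s_grid(curves, rel_tol=1e-6):
    """Return True if all curves have the same number of points and s values."""
    reference = [row[0] for row in curves[0]]
    for curve in curves[1:]:
        if len(curve) != len(reference):
            return False
        for row, s_ref in zip(curve, reference):
            if not math.isclose(row[0], s_ref, rel_tol=rel_tol, abs_tol=1e-12):
                return False
    return True
```

A typical use would be `same_s_grid([read_dat(p) for p in paths])`, regridding the files with the ATSAS tools if it returns `False`.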

The files results.txt and results.xml are produced by the AutoPilatus or AutoMar automated data processing systems. See the documentation of the automated pipeline for details.

AutoMerge Output Files

Merged data
    Default filename: automerged.dat
    Description: Data that has been merged from the averaged and extrapolated data.

Extrapolated data
    Related option: --output-extrapolated
    Default filename: extrapolated.dat
    Description: Contains the extrapolated data and two additional columns: the 3rd with a statistical estimate of the error of the extrapolation procedure, and the 4th with a lower bound on the information gained by extrapolation. The extrapolated file is helpful as long as the 3rd curve displayed by the Windows version of SASPLOT is above the 4th curve.

Averaged data
    Related option: --output-averaged
    Default filename: averaged.dat
    Description: Contains the averaged curve, without any extrapolation.

[Figure: Example of averaged, extrapolated and automerged curves.]

Extrapolation diagnostics
    Related option: --extrapolation-diagnostics
    Description: Creates a .dat file that shows the radii of gyration and the intensities extrapolated to zero angle for all input files, and compares them with a linear extrapolation of the respective radii and intensities. If there are clear outliers, the linear dependency is broken and these files should not be used for extrapolation.

[Figures: Example diagnostics of Rg; Example diagnostics of I0]
Here you see that Rg shows a perfect concentration dependency, but I0 is inconsistent. Such a situation may indicate beam stability problems.
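
The idea behind the extrapolation diagnostics, comparing each file's Rg or I0 with a straight line in concentration, can be illustrated with a small sketch. This is not AutoMerge's own algorithm: the function names, the ordinary least-squares fit and the 3-sigma threshold are assumptions made for illustration, and the numbers are invented.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two matching points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all concentrations are identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def flag_outliers(conc, values, sigmas, n_sigma=3.0):
    """Mark points lying more than n_sigma * sigma away from the fitted line.

    A strong outlier also pulls the fitted line towards itself, so this is
    only a rough screen, not a robust test.
    """
    slope, intercept = linear_fit(conc, values)
    return [abs(v - (slope * c + intercept)) > n_sigma * s
            for c, v, s in zip(conc, values, sigmas)]


# Invented example: Rg (nm) measured at four concentrations (mg/ml).
conc = [1.0, 2.0, 4.0, 8.0]
rg = [3.0, 2.9, 2.7, 2.3]
slope, intercept = linear_fit(conc, rg)
# The intercept is the value extrapolated to zero concentration.
print(f"Rg at zero concentration: {intercept:.2f} nm")
```

Files flagged by such a check are candidates for exclusion from extrapolation, in line with the advice above.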

Troubleshooting

  • You see a discontinuity in the output.
    1. Check whether the discontinuity is bigger than the experimental error and the divergence (differences between consecutive points) in the input data. If it is not, then it is insignificant from a numerical point of view. You can also use --verbose mode to get information about Feasible merging points, and pick a different merging point with the --merging-point option. The list of plausible merging points is also visible in the output file under the metainformation key Possible merging points. Usually you want the earliest merging point that gives a smooth curve.
    2. Check whether the input data files have this discontinuity. It is quite common for the detector to have edge cases at some points, where discontinuities may occur. These points usually correspond to error peaks in the input .dat files (at least if errors are properly estimated). You can see error peaks by opening a single input file in the Windows version of SASPLOT.
    3. Check whether there is a warning about merging averaged and extrapolated curves that are too dissimilar. If so, you may want to view both the extrapolated and averaged curves in PRIMUS and pick a different merging point with the --merging-point option (see 1).
  • AutoMerge fails to extrapolate data from a single file.
    • Probably the other files were filtered out as inconsistent or of low quality; check the other entries of this manual for the reasons files get filtered out. You need at least three files for extrapolation. The reasons for filtering out files will be indicated in the merged output.
  • AutoMerge omits my file because Rg, sigma_Rg or Rg quality are not right or are missing.
    1. You may use the --rg and --sigma-rg options to pick Rg by hand, and thus avoid relying on AutoRG.
    2. Or you may prefer to write or modify a results.txt or results.xml file and change the Rg and sigma_Rg values (they are called Rg and ErrRg, respectively). You can then use --no-autorg together with the --results-txt or --results-xml option.

Examples

Example of BSA

Four concentrations of bovine serum albumin have been measured. The simplest way to use AutoMerge is:

$> automerge.py b_0*.dat -o automerged.dat --output-extrapolated extrapolated.dat --output-averaged averaged.dat --extrapolation-diagnostics diag.dat

This gives output message:

SUCCESS with automerged.dat

The result of automerging is shown below:
[Figures: Automerged data; Extrapolated data]
Inspection of the extrapolated data suggests that less of it should be used, only up to s ≤ 0.4 nm⁻¹. (The green curve is the estimated gain from extrapolation, whereas the pink curve is the inherent error of extrapolation.) This can be done with the --merging-point option:

$> automerge.py b_0*.dat -o automerged.dat --merging-point 146

  Last modified: April 11, 2013

© BioSAXS group 2013