DARA manual

Written by A.G. Kikhney, A. Panjkovich and A.V. Sokolova.
Post all your questions about DARA to the ATSAS Forum.

© ATSAS Team, 2003-2016

Manual

The following sections briefly describe the method implemented in DARA, explain how to use the DARA web interface, and detail the required input and the output produced.

If you use results from DARA in your own publication, please cite:

Kikhney, A.G., Panjkovich, A., Sokolova, A.V. & Svergun, D.I. (2016) DARA: a web server for rapid search of structural neighbours using solution small angle X-ray scattering data. Bioinformatics 32(4), 616-618.

Introduction

DARA is a DAtabase for RApid search of structural neighbours using solution small angle X-ray scattering (SAXS) data. DARA is a web server that queries over 150 000 scattering patterns, pre-computed from the high resolution structures of macromolecules and biological assemblies in the Protein Data Bank, to find the nearest neighbours of a given experimental or theoretical SAXS data set. Structures with identical or very similar scattering patterns are grouped into over 85 500 clusters to make the output easier to read. Three types of macromolecules are taken into account: proteins, nucleic acids and protein:nucleic acid complexes. The search combines principal component analysis of the scattering patterns with k-d trees, which allows almost instantaneous identification of similar scattering patterns. Identifying the best scattering equivalents provides a rapid, automatic structural assessment of a macromolecule based on its experimental SAXS profile.
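
The combination of principal component analysis and k-d trees can be illustrated with a small sketch. This is not DARA's implementation: the number of components, the use of log-intensities, the Euclidean distance and the function names `build_index` and `nearest_neighbours` are all assumptions made for this example, and the "database" consists of synthetic Guinier curves rather than patterns computed from PDB structures. The sketch relies on NumPy and SciPy.

```python
import numpy as np
from scipy.spatial import KDTree


def build_index(patterns, n_components=2):
    """Reduce log-intensity patterns by PCA and index them in a k-d tree.

    patterns: array of shape (n_structures, n_points), all on the same s grid.
    """
    logs = np.log(patterns)
    mean = logs.mean(axis=0)
    # PCA via SVD of the mean-centred data; rows of vt are the principal axes
    _, _, vt = np.linalg.svd(logs - mean, full_matrices=False)
    axes = vt[:n_components]
    tree = KDTree((logs - mean) @ axes.T)
    return mean, axes, tree


def nearest_neighbours(query, mean, axes, tree, k=3):
    """Return indices and distances of the k patterns closest to the query."""
    coords = (np.log(query) - mean) @ axes.T
    distances, indices = tree.query(coords, k=k)
    return indices, distances


# Synthetic database: Guinier curves exp(-(s*Rg)^2 / 3) for Rg = 1.0 ... 5.0 nm
s = np.linspace(0.05, 1.0, 50)
rg_values = np.linspace(1.0, 5.0, 41)
database = np.exp(-np.outer(rg_values, s) ** 2 / 3.0)

mean, axes, tree = build_index(database)
query = np.exp(-(s * 2.03) ** 2 / 3.0)   # "unknown" particle with Rg = 2.03 nm
indices, _ = nearest_neighbours(query, mean, axes, tree)
print("closest Rg values:", rg_values[indices])
```

The expensive steps (PCA and tree construction) are done once for the whole database; each query then only needs a projection and a tree lookup, which is what makes this kind of search fast.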

Running DARA

DARA is a web server available at dara.embl-hamburg.de. It is free and open to all users and there is no login requirement.

Arguments and Options

DARA takes as input experimental data in three-column (*.dat) or GNOM (*.out) format, simulated data (*.int), or atomic models (*.pdb).

Option: Description
Angular units: Angular units of the input data, either inverse nanometres (1/nm) or inverse angstroms (1/Å). Angular units are typically not stored in the input file. Once an input file in GNOM format is chosen, an attempt is made to guess the angular units from the Dmax value in the GNOM file: if Dmax > 20 then 1/Å is recommended; 1/nm otherwise.
Macromolecule type: One of:
  • protein;
  • nucleic acid;
  • protein:nucleic acid.
Number of neighbours to show: How many neighbours to show in the output. One of: 1, 5, 10, 25, 50, 100. Default is 10.
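
The angular unit guess for GNOM files is simple enough to write down. The sketch below only illustrates the rule stated for the angular units option and is not DARA's code; the function names `guess_angular_units` and `to_inverse_nm` are made up for this example. The conversion uses 1 Å = 0.1 nm, so a value in 1/Å is ten times larger when expressed in 1/nm.

```python
def guess_angular_units(dmax):
    """Guess the angular units of a GNOM file from its Dmax value.

    Rule from the option description: Dmax > 20 suggests that distances
    are in angstroms, hence s is in 1/Å; otherwise 1/nm is assumed.
    """
    return "1/Å" if dmax > 20 else "1/nm"


def to_inverse_nm(s_values, units):
    """Convert momentum transfer values to 1/nm (1 1/Å = 10 1/nm)."""
    if units == "1/nm":
        return list(s_values)
    if units == "1/Å":
        return [10.0 * s for s in s_values]
    raise ValueError(f"unknown angular units: {units!r}")


# Example: a GNOM file reporting Dmax = 85 is most likely in angstroms
units = guess_angular_units(85.0)
print(units, to_inverse_nm([0.01, 0.2], units))
```

The guess is only a recommendation: a very large particle measured in nanometres would also have Dmax > 20, so the suggested units should always be checked.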

DARA Input Files

It is recommended to submit experimental data in GNOM format (*.out). Both the "old" (version 4.x) and the "new" (version 5.0) formats are supported. To search for neighbours, DARA uses the regularized intensity IREG(s), where s = 4πsin(θ)/λ is the momentum transfer, 2θ is the scattering angle and λ is the wavelength. If the input file contains the results of several GNOM runs, only the last run is used.
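
As a worked illustration of the definition s = 4πsin(θ)/λ, the following sketch converts a scattering angle to momentum transfer. The function name `momentum_transfer` and the numerical values are chosen for this example only.

```python
import math


def momentum_transfer(two_theta_deg, wavelength):
    """Return s = 4*pi*sin(theta)/lambda.

    two_theta_deg is the full scattering angle 2θ in degrees; the result is
    in reciprocal units of `wavelength` (1/nm if the wavelength is in nm).
    """
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength


# Example: 2θ = 1° at a wavelength of 0.15 nm (illustrative values)
s = momentum_transfer(1.0, 0.15)
print(f"s = {s:.3f} 1/nm")
```

Note that the formula takes half of the scattering angle, and that the units of s follow from the units of the wavelength.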

Experimental data in a three-column format (*.dat) can be submitted as well. The first column should contain the momentum transfer s, the second column the background-subtracted experimental intensities I(s), and the third column the experimental errors. In this case GNOM is run automatically to obtain IREG(s) extrapolated to s = 0.
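
Before uploading, it can be useful to check that a file really follows the three-column layout. The sketch below is a minimal reader written for this manual, not part of DARA or ATSAS; the name `read_three_column` is hypothetical, and the decision to silently skip lines that do not start with three numbers (headers, footers, blank lines) is an assumption of the sketch. The sample values are made up.

```python
def read_three_column(text):
    """Parse three-column SAXS data into lists of s, I(s) and errors.

    Lines that do not start with three numbers are skipped.
    """
    s, intensity, error = [], [], []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        try:
            values = [float(f) for f in fields[:3]]
        except ValueError:
            continue  # header or comment line
        s.append(values[0])
        intensity.append(values[1])
        error.append(values[2])
    return s, intensity, error


sample = """Sample: made-up example
0.10  1250.0  12.0
0.12  1198.5  11.5
0.14  1131.2  11.1
"""
s, intensity, error = read_three_column(sample)
print(len(s), "data points, smax =", max(s))
```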

Theoretical data (*.int) can be submitted as well. The first column should contain the momentum transfer s and the second column the theoretical intensities I(s); other columns are ignored.

The input data should extend to a maximum momentum transfer smax > 0.4 nm⁻¹. Wide angle data above s = 10 nm⁻¹ are ignored.
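
The two range limits can be checked locally before submission. The following sketch assumes that s is already expressed in 1/nm; the function name `check_and_trim` and the constant names are hypothetical, and how DARA itself reports a too-short data range is not described here.

```python
S_MAX_REQUIRED = 0.4   # 1/nm, the data must extend beyond this value
S_CUTOFF = 10.0        # 1/nm, points above this value are ignored


def check_and_trim(s, intensity):
    """Validate the s range and drop wide angle points (s in 1/nm)."""
    if not s or max(s) <= S_MAX_REQUIRED:
        raise ValueError("data range too short: smax must exceed 0.4 1/nm")
    kept = [(x, y) for x, y in zip(s, intensity) if x <= S_CUTOFF]
    return [x for x, _ in kept], [y for _, y in kept]


# Example with made-up values: the point at s = 12 1/nm is discarded
s_kept, i_kept = check_and_trim([0.1, 5.0, 12.0], [3.0, 2.0, 1.0])
print(s_kept, i_kept)
```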

Atomic models in PDB format (*.pdb) can be submitted as well. In this case CRYSOL is run automatically to obtain the theoretical intensities. Submitted files should be smaller than 1 MB. For larger models the theoretical intensities should be computed locally with CRYSOL.

DARA Output

Field: Description
Fit: Logarithmic plot of the experimental input data (blue) and the scattering pattern computed from the neighbour structure (red).
χ²: Reduced chi-square goodness of fit, i.e. the mean square weighted deviation between the experimental input data and the scattering pattern computed from the neighbour structure. A value close to 1.0 indicates a good fit (assuming the experimental errors are correct). A value much higher than 1.0 indicates a poor fit (or that the experimental errors are underestimated). A value below 1.0 indicates overestimated errors.
R-factor: For simulated input data (*.int or *.pdb) the R-factor is computed instead of chi-square. A value of zero indicates an identical fit.
PDB ID: ID of the structure and a link to the respective RCSB PDB page. If the selected biological assembly differs from the one shown by default on the RCSB page, this is indicated in the title of the link (hover the mouse pointer over the link to see it).
  Structures with very similar scattering patterns are clustered; if the selected model has neighbours with identical or very similar scattering patterns, this is indicated with a plus sign to the left of the PDB ID. Click the plus sign to expand the list of all PDB IDs.
Download model: A thumbnail image of the selected biological assembly structure (taken from RCSB). Clicking it downloads the particular biological assembly (or, for NMR ensembles, the first model) in *.gz format.
  For proteins and protein:nucleic acid complexes the percentage of alpha helices (α) and beta sheets (β) is shown below.
MW: Molecular weight of the structure in kDa.
Volume: Hydrated volume of the structure in nm³.
Rg: Radius of gyration of the structure in nm, i.e. the root mean square distance from the center of mass, weighted by the scattering length density.
Dmax: Maximum intra-particle distance of the structure in nm.
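
To make the reduced chi-square reported in the output more concrete, the sketch below computes a weighted mean square deviation between an experimental and a computed curve. It is an illustration under stated assumptions, not DARA's code: the function name `reduced_chi_square` is hypothetical, the computed curve is scaled to the data by a least-squares factor, and the sum is divided by N - 1. The exact normalisation used by DARA is not given in this manual.

```python
def reduced_chi_square(i_exp, sigma, i_calc, scale=True):
    """Reduced chi-square between experimental and computed intensities.

    With scale=True the computed curve is first multiplied by the
    least-squares factor c that minimises the weighted deviation.
    Dividing by N - 1 is an assumption of this sketch.
    """
    if not (len(i_exp) == len(sigma) == len(i_calc)) or len(i_exp) < 2:
        raise ValueError("need at least two points and equal-length inputs")
    c = 1.0
    if scale:
        num = sum(e * m / (w * w) for e, m, w in zip(i_exp, i_calc, sigma))
        den = sum(m * m / (w * w) for m, w in zip(i_calc, sigma))
        c = num / den
    total = sum(((e - c * m) / w) ** 2
                for e, m, w in zip(i_exp, i_calc, sigma))
    return total / (len(i_exp) - 1)


# Made-up example: one of three points deviates by exactly one sigma
print(reduced_chi_square([1.0, 2.0, 3.0], [1.0, 1.0, 1.0],
                         [1.0, 2.0, 4.0], scale=False))
```

Because every deviation is divided by the experimental error, doubling all errors lowers the value by a factor of four, which is why values far from 1.0 may point to misestimated errors rather than to a poor model.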

Examples

The DARA web server provides a simple way to try out sample data. To see the sample input data, click "Sample input: SAXS data from protein solutions". A list of six data sets will be shown. Each data set has a button that loads the data automatically and a link to download the data. The following data are available:

  • Bovine serum albumin (BSA) in HEPES
    Experimental data
  • Apoferritin in PBS
    Experimental data
  • Glucose isomerase in PBS
    Experimental data from SASBDB ID: SASDAK6
  • Catalase in PBS
    Experimental data from SASBDB ID: SASDA92
  • SRB2m RNA in HEPES
    Experimental data from SASBDB ID: SASDA54
  • LytTR-comcde (DNA:protein) complex in MES
    Experimental data from SASBDB ID: SASDAC7
  • Ubiquitin
    Theoretical data from PDB ID: 1UBQ
  • Alcohol dehydrogenase (ADH)
    Theoretical data from PDB ID: 4W6Z

  Last modified: October 7, 2016

© BioSAXS group 2016