CREDO
A Package of Four Programs to Add Missing Loops and Domains

to High and Low Resolution Models of Proteins   

 

The package includes programs CREDO, CHADD, GLOOPY and CHARGE.

The programs implement the reconstruction algorithms described by

 

Petoukhov, M.V., N.A.J. Eady, K.A. Brown, and D.I. Svergun,

Addition of missing loops and domains to protein models using X-ray solution scattering.

Biophys. J. 2002 83(6) 3113-3125.

 

The users are referred to this paper for details.

 


 

Last edited : Thursday, 22 July, 2004

 

 

 

Program CREDO is an extension of the original dummy residue (DR) program GASBOR and can be used

when the location of the interface between the known and missing portions of the structure

is unknown. The structure of the missing part is represented by an ensemble of dummy residues

forming a chain-compatible model. The spatial positions of these residues are intended to correspond

approximately to those of the Cα atoms belonging to the missing part of the protein structure.

Thus, this program is best suited for generating low-resolution models of missing domains

without using information about primary and secondary structure.

 

Program CHADD is designed to build chains of dummy residues attached to given point(s)

or residue(s) in the known portion of the structure. In contrast to CREDO,

it is explicitly required that the i-th DR is separated by 0.38 nm from the (i+1)-th.

This program is useful for adding missing loops or terminal portions to high-resolution models

but can also be used to restore missing domains. The variable part forms a Cα chain

attached to the given point(s) in the known structure.
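
The spacing requirement can be illustrated with a short Python sketch. It is not taken from CHADD itself; the function names and the tolerance are assumptions made for this example, and only the 0.38 nm spacing comes from the text above.

```python
import math

CA_SPACING_NM = 0.38  # spacing between consecutive dummy residues quoted in the text


def consecutive_distances(coords):
    """Return the distances between successive points in a list of (x, y, z) tuples."""
    return [math.dist(a, b) for a, b in zip(coords, coords[1:])]


def is_valid_chain(coords, spacing=CA_SPACING_NM, tol=0.01):
    """True if every consecutive pair lies within `tol` nm of `spacing` (tol is an assumption)."""
    return all(abs(d - spacing) <= tol for d in consecutive_distances(coords))


# A straight, fully extended four-residue chain along x (coordinates in nm).
chain = [(i * CA_SPACING_NM, 0.0, 0.0) for i in range(4)]
print(is_valid_chain(chain))
```

A chain in which any neighbouring pair is closer or further apart than the tolerance allows would be rejected by this check.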

 

Program GLOOPY is similar to CHADD but also accounts for the residue-specific

information contained in the primary structure of the model. This allows further

restraints to be applied to generate a native-like configuration of the missing loop or domain.

 

Program CHARGE is aimed at restoring the conformation of missing loops, especially

if information about the secondary structure of the fragment is available. The algorithm

implemented in the program determines the conformation of short fragments such as missing

loops as interconnected polypeptide chains, taking the secondary structure prediction into account.

Thus, if a specific portion of the loop is known to form an α-helix or β-sheet,

an idealized secondary structure template of the appropriate length is inserted.
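
As an illustration of what an idealized helical template can look like at the level of C-alpha positions, the following Python sketch builds an ideal alpha-helix trace. It is not the template used by CHARGE; the textbook helix parameters (1.5 Angstrom rise and 100 degrees twist per residue, C-alpha radius of about 2.3 Angstrom) and the function name are assumptions of this example.

```python
import math

RISE = 1.5      # axial rise per residue, Angstrom (textbook value, assumed)
TWIST = 100.0   # rotation per residue, degrees (3.6 residues per turn)
RADIUS = 2.3    # distance of C-alpha atoms from the helix axis, Angstrom (assumed)


def ideal_helix_ca(n_residues):
    """Return C-alpha coordinates (Angstrom) of an ideal alpha-helix running along Z."""
    coords = []
    for i in range(n_residues):
        phi = math.radians(TWIST * i)
        coords.append((RADIUS * math.cos(phi), RADIUS * math.sin(phi), RISE * i))
    return coords


helix = ideal_helix_ca(11)  # an 11-residue template
spacing = math.dist(helix[0], helix[1])
print(f"{len(helix)} residues, C-alpha spacing {spacing:.2f} Angstrom")
```

With these parameters the distance between neighbouring C-alpha positions comes out at about 3.8 Angstrom, i.e. 0.38 nm.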

 

 

 Note that the symmetry axis (if any) of the initial model should coincide with the Z axis!
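
The role of the Z axis can be sketched in Python for the simple cyclic groups (P2, P3, ...): symmetry mates are obtained by rotating one subunit about Z. This is only an illustration with a hypothetical function name; it does not cover the groups with additional two-fold axes and is not the code used by the programs.

```python
import math


def symmetry_mates(coords, n_fold):
    """Return n_fold copies of `coords`, copy k rotated by k*360/n_fold degrees about Z."""
    copies = []
    for k in range(n_fold):
        angle = 2.0 * math.pi * k / n_fold
        c, s = math.cos(angle), math.sin(angle)
        copies.append([(c * x - s * y, s * x + c * y, z) for x, y, z in coords])
    return copies


# One subunit with two points; P2 generates a second copy rotated by 180 degrees.
subunit = [(1.0, 0.0, 0.5), (2.0, 1.0, -0.5)]
mates = symmetry_mates(subunit, 2)
print(mates[1])
```

If the symmetry axis of the input model pointed in another direction, copies generated in this way would not reproduce the real oligomer.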

 

 

 

 

The use of the programs is similar to that of GASBOR.

Like GASBOR, the programs read the output files of the indirect transformation program GNOM.

When running GNOM, the high-angle portions of the scattering patterns must be included; the programs are able

to fit the data up to a resolution of about 5 Angstrom,

i.e. momentum transfer s = 4*pi*sin(theta)/lambda = 1.2 [1/Angstrom].

Most of the parameters have the same meaning as in GASBOR.

The number of residues in the whole structure should be equal to that in the protein.

The most important difference from GASBOR is that in all programs except CREDO

the ordinal sequence of dummy (CHADD) or individual (GLOOPY, CHARGE) residues

in the model corresponds to the primary sequence of the protein!
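
The relation between momentum transfer and resolution quoted above can be checked with a few lines of Python. The sketch assumes that 2*theta is the scattering angle and uses the common convention d = 2*pi/s for the nominal resolution; that convention is not stated in the text and is an assumption here, as are the function names.

```python
import math


def momentum_transfer(theta_rad, wavelength):
    """s = 4*pi*sin(theta)/lambda, with 2*theta the scattering angle."""
    return 4.0 * math.pi * math.sin(theta_rad) / wavelength


def resolution(s):
    """Nominal resolution d = 2*pi/s (assumed convention), in the length unit of 1/s."""
    return 2.0 * math.pi / s


print(f"{resolution(1.2):.2f}")
```

For s = 1.2 [1/Angstrom] this gives about 5.2 Angstrom, in line with the value of roughly 5 Angstrom given above.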

 

 

CREDO

 

After starting the program in the default USER mode you will need to specify

 

(i)   Log-file name <name>,

(ii)  project description,

(iii) name of the GNOM output file,

(iv)  PDB file containing the CA atoms of the known part of the protein structure,

(v)  if known, point symmetry of the particle:

            Default group is P1 (no symmetry)

            Point groups P2, P22, P3, P32, P4, P42, P5, P52, P6, P62

            are supported

(vi)   the number of residues in one searched subunit

 

and enter default answers to all other questions.

 

Example: artificial fusion protein (187 residues total, 58 in the unknown part).

Enter complex.out, mon1.pdb, P1, 58.
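
Before a run it can be useful to check that the known and missing parts add up to the full protein. The Python sketch below counts CA atoms in PDB-style ATOM records and verifies the total for the numbers of this example (187 residues, 58 missing). The function names are hypothetical and the ATOM lines are synthetic stand-ins generated for the illustration, not the contents of mon1.pdb.

```python
def count_ca_atoms(pdb_lines):
    """Count ATOM records whose atom name (columns 13-16 of the PDB format) is CA."""
    return sum(
        1
        for line in pdb_lines
        if line.startswith("ATOM") and line[12:16].strip() == "CA"
    )


def residue_total_matches(n_known, n_missing, n_protein):
    """True if the known and missing residues add up to the whole protein."""
    return n_known + n_missing == n_protein


# Synthetic known part: 129 residues, each with one N and one CA record.
lines = []
for i in range(1, 130):
    lines.append(f"ATOM  {2 * i - 1:5d}  N   GLY A{i:4d}")
    lines.append(f"ATOM  {2 * i:5d}  CA  GLY A{i:4d}")

n_known = count_ca_atoms(lines)
print(n_known, residue_total_matches(n_known, 58, 187))
```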

 

 

CHADD

 

After starting the program in the default USER mode you will need to specify

 

(i)   Log-file name <name>,

(ii)  project description,

(iii) name of the GNOM output file,

(iv)  PDB file containing the CA atoms of the known part of the protein structure,

(v)  if known, point symmetry of the particle:

            Default group is P1 (no symmetry)

            Point groups P2, P22, P3, P32, P4, P42, P5, P52, P6, P62

            are supported

(vi)   the number of residues in one searched subunit/loop,

(vii)  ordinal numbers (n1 and n2) of the residues in the known part to which the loop is attached

(n2 equal to zero denotes attachment to the terminus at residue n1)

 

and enter default answers to all other questions.

 

Example: artificial fusion protein (187 residues total, 58 in the unknown part).

Enter complex.out; mon1.pdb; P1; 58; 129; 0.

 

 

GLOOPY

 

After starting the program in the default USER mode you will need to specify

 

(i)   Log-file name <name>,

(ii)  project description,

(iii) name of the GNOM output file,

(iv)  PDB file containing the CA atoms of the known part of the protein structure,

(v)  if known, point symmetry of the particle:

            Default group is P1 (no symmetry)

            Point groups P2, P22, P3, P32, P4, P42, P5, P52, P6, P62

            are supported

(vi)  only if the loop is attached to a terminus: the ordinal number of the corresponding residue,

(vii)   the number of residues in one searched loop,

(viii) file *.seq containing the one-letter primary sequence of the whole structure

 

and enter default answers to all other questions.

 

Example: lysozyme (129 residues total; the 16 residues 40-55 are missing).

Enter lyz.out; lyzcut.pdb; P1; 16; lyz.seq.
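
The sequence input can be sanity-checked before running the program. The Python sketch below assumes that the *.seq file is plain text holding one-letter codes, possibly broken over several lines; the exact file format is not specified here, so this is an assumption. The function names are hypothetical, and the sequence is a made-up 129-letter string, not the lysozyme sequence.

```python
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard one-letter codes


def clean_sequence(text):
    """Strip whitespace, upper-case the sequence and reject non-standard letters."""
    seq = "".join(text.split()).upper()
    unknown = sorted(set(seq) - STANDARD_AA)
    if unknown:
        raise ValueError(f"non-standard residue codes: {unknown}")
    return seq


def segment(seq, first, last):
    """Return residues first..last (1-based, inclusive)."""
    return seq[first - 1:last]


# Made-up 129-residue sequence standing in for the contents of lyz.seq.
seq = clean_sequence("ACDEFGHIKLMNPQRSTVWY\n" * 6 + "ACDEFGHIK")
missing = segment(seq, 40, 55)
print(len(seq), len(missing))
```

The length of the full sequence should match the total number of residues, and the length of the selected segment should match the number of residues entered for the missing loop (16 in this example).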

 

 

 

CHARGE

 

After starting the program in the default USER mode you will need to specify

 

(i)   Log-file name <name>,

(ii)  project description,

(iii) name of the GNOM output file,

(iv)  PDB file containing the CA atoms of the known part of the protein structure,

(v)  if known, point symmetry of the particle:

            Default group is P1 (no symmetry)

            Point groups P2, P222, P3, P32, P4, P42, P5, P52, P6, P62

            are supported

(vi)  ordinal number of the residue to which the tail is to be attached,

(vii)   the number of residues in one searched loop,

(viii)  ordinal numbers (n1 and n2) of the residues belonging to secondary structure elements,

(ix) file *.seq containing the one-letter primary sequence of the whole structure

 

and enter default answers to all other questions.

 

Example: lysozyme (129 residues total; the 15 N-terminal residues are missing, and residues 5-15 form an α-helix).

Enter lyz.out; lyzcut2.pdb; P1; 1; 15; 5; 15; lyz.seq.

 

 

 

 

After any of the programs has finished, you will get the following files:

 

<name>.log   : log file

<name>.fit   : fit to the data desmeared and smoothed by GNOM

<name>.fir   : fit to the raw experimental data

<name>.pdb   : resulting model in PDB-like format that can be viewed

               e.g. with RasMol in the 'spacefill' or 'backbone' mode

               or with MASSHA (see the ATSAS package)

               CA-atoms: positions of residues

                H-atoms: positions of dummy bound waters
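
The output files listed above can be handled with a small Python sketch that derives the file names from the log-file name and separates residues from dummy waters in the resulting model. It assumes that the model uses standard PDB ATOM/HETATM records with the atom name (CA or H) in columns 13-16; the function names and the sample records are made up for the illustration.

```python
def output_files(name):
    """Return the output file names expected for the log-file name `name`."""
    return [f"{name}.{ext}" for ext in ("log", "fit", "fir", "pdb")]


def split_model_atoms(pdb_lines):
    """Split ATOM/HETATM records into residue (CA) and dummy-water (H) records."""
    residues, waters = [], []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        atom_name = line[12:16].strip()  # atom-name field of the PDB format
        if atom_name == "CA":
            residues.append(line)
        elif atom_name == "H":
            waters.append(line)
    return residues, waters


# Made-up records: three residues followed by two dummy waters.
model = [f"ATOM  {i:5d}  CA  GLY A{i:4d}" for i in range(1, 4)]
model += [f"ATOM  {i:5d}  H   HOH A{i:4d}" for i in range(4, 6)]

residues, waters = split_model_atoms(model)
print(output_files("run1"), len(residues), len(waters))
```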