ARP/wARP User Guide

Version 7.2
August 21, 2011

Contents

1 General information
 1.1 Introduction
 1.2 Major changes in Version 7.2
 1.3 Latest News, Bug Reports and Troubleshooting
 1.4 Distribution
2 Installing ARP/wARP
 2.1 Intel Mac OSX Installation
 2.2 Command Line Installation on Mac OSX or Linux
3 Using ARP/wARP
 3.1 Automated Model Building
  3.1.1 Running protein model building from the GUI, ARP/wARP Classic
  3.1.2 Command line model building, auto_tracing.sh
  3.1.3 Remote submission of a model building task
   3.1.3.1 Submitting from the GUI
   3.1.3.2 Submitting from a web browser
  3.1.4 Output files, short log file
  3.1.5 Running protein model building from the GUI, ARP/wARP Expert System
  3.1.6 Running flex-wARP from command line
 3.2 Automated Construction of Helical and Beta-Stranded Fragments
  3.2.1 Building secondary structure from the GUI, ARP/wARP Quick Fold
   3.2.1.1 Output files, short log file
  3.2.2 Building secondary structure from the command line, auto_albe.sh
 3.3 Automated Loop Building
  3.3.1 Running loop building from the GUI, ARP/wARP Loops
 3.4 Automated Building of Poly-Nucleotides
  3.4.1 Running nucleotide building from the GUI, ARP/wARP DNA/RNA
   3.4.1.1 Output files, short Log File
  3.4.2 Running nucleotide building from the command line, auto_nuce.sh
 3.5 Automated Ligand Building
  3.5.1 Running ligand building from the GUI, ARP/wARP Ligands
   3.5.1.1 Output files, short Log File
  3.5.2 Running ligand building from the command line, auto_ligand.sh
 3.6 Automated Solvent Building
  3.6.1 Running solvent building from the GUI, ARP/wARP Solvent
   3.6.1.1 Output files, short log file
  3.6.2 Running solvent building from command line, auto_solvent.sh
 3.7 ARP/wARP molecular graphics: ARP Navigator
  3.7.1 Main Menu
  3.7.2 Mouse and Keyboard functions
   3.7.2.1 Rotation
   3.7.2.2 Translation
   3.7.2.3 Scaling
   3.7.2.4 Clip planes
   3.7.2.5 Map contouring
   3.7.2.6 Map extent
   3.7.2.7 Mouse Actions
   3.7.2.8 Keyboard Actions
  3.7.3 Object Buttons
  3.7.4 Quick Actions
4 Additional Remarks
 4.1 Quality of the X-ray Data
5 Citing ARP/wARP
6 Other References
7 Acknowledgements

Chapter 1
General information

1.1 Introduction

ARP/wARP is a software project for automated protein model building and structure refinement. It is based on a unified approach to the structure solution process. It combines electron density interpretation using the concept of the hybrid model, pattern recognition in an electron density map and maximum likelihood model parameter refinement with REFMAC.

The ARP/wARP software is under continuous development. Its present release, version 7.2, can be used in the following ways:

  1. Automated protein chain tracing in the density map and model building (GUI modules ARP/wARP Classic, ARP/wARP Expert System, command line modules auto_tracing.sh and auto_flex_warp.sh). This constructs polypeptide fragments (both main and side chains) for the cases of MR solutions or MAD/M(S)IR(AS) phases. The Classic and the auto_tracing.sh modules use a pre-defined number of cycles for model update, refinement and chain tracing, while Expert System and auto_flex_warp.sh define the sequence of steps on the fly. X-ray data to 2.7 Å resolution or higher are required although partial model building can sometimes be achieved at a resolution of 3.0 Å or lower.
  2. Free atoms density modification (GUI module ARP/wARP Classic). This method provides improvement in a density map by free-atoms update and requires an input PDB and X-ray data to about 2.5 Å resolution or higher.
  3. Automated building of alpha-helical and beta-stranded fragments (GUI module ARP/wARP Quick Fold, command line module auto_albe.sh). This constructs helical and beta-stranded polypeptide fragments (main chain and CB atoms) in low-resolution density maps. Phased X-ray data to 4.5 Å resolution or higher are required. This module is automatically invoked in protein chain tracing (#1 above) when the resolution of the data is 2.7 Å or lower.
  4. Building poorly defined loops in a protein model (GUI module Loops). This will generate a set of candidate loops for a short stretch of missing residues given the anchors and the sequence of the missing residues. Each candidate will have both main chain and side chain atoms. The user can ask the module to choose a single solution or to suggest several loops. A protein model and X-ray data to 3.0 Å resolution or higher are required. This module is automatically invoked in protein chain tracing (#1 above).
  5. Prototype software for building poly-nucleotide fragments, DNA or RNA (GUI module ARP/wARP DNA/RNA, command line module auto_nuce.sh). This will produce a set of poly-nucleotide chains with guessed bases (A or C, i.e. large or small); the nucleotide sequence is not yet used. Phased X-ray data to about 3.5 Å resolution or higher are required.
  6. Building bound ligands (GUI module ARP/wARP Ligands, command line module auto_ligand.sh). This constructs a ligand in a difference electron density map, after the protein model has been completed and refined, given a template search ligand or a list of putative ligands (cocktail screening). X-ray data to 3.0 Å resolution or higher are required.
  7. Building the solvent structure (GUI module ARP/wARP Solvent, command line module auto_solvent.sh). This builds a solvent structure after the protein model has been refined. X-ray data to 2.5 Å resolution or higher are required.
  8. A molecular graphics ARP/wARP front-end, which allows the display of molecules and electron densities (GUI module ARP Navigator, executable program arpnavigator). It is a high-quality 3D molecular viewer and a user-friendly interface to ARP/wARP functionalities, allowing macromolecular models, ligands and solvents to be viewed as they are built.
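
As a rough aid for choosing a module, the resolution requirements listed above can be collected into a small shell function. This is only a sketch: the function name modules_for_resolution is hypothetical and not part of ARP/wARP, the limits are copied from the list above, and several of them are given there as approximate ("about"), so the output is a guide rather than a hard rule.

```shell
# List the ARP/wARP 7.2 modules whose stated resolution requirement is met.
# Limits (in Angstrom) are copied from the list above; a smaller number
# means higher resolution. The function name is hypothetical.
modules_for_resolution() {
    awk -v d="$1" 'BEGIN {
        n = split("2.7:protein chain tracing,2.5:free atoms density modification,4.5:helix/strand building,3.0:loop building,3.5:DNA/RNA building,3.0:ligand building,2.5:solvent building", m, ",")
        for (i = 1; i <= n; i++) {
            split(m[i], f, ":")
            # Data qualify when their resolution limit is numerically
            # no larger than the limit stated for the module.
            if (d + 0 <= f[1] + 0) print f[2]
        }
    }'
}
```

For example, modules_for_resolution 3.2 prints only the helix/strand and DNA/RNA entries.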

1.2 Major changes in Version 7.2

1.3 Latest News, Bug Reports and Troubleshooting

For the latest news and announcements please visit the ARP/wARP page (www.arp-warp.org). Some problems and tips can be found on the Frequently Asked Questions link. The developers will greatly appreciate all bug reports or suggested changes from the users.

1.4 Distribution

The ARP/wARP package (either for download or for remote execution of protein model building) is freely available to academic users provided that they agree to the ARP/wARP license conditions and the applications of ARP/wARP are properly cited. Please consult the ARP/wARP log file for the most relevant citation.

Industrial users are requested to obtain a commercial license via the ARP/wARP web page.

Chapter 2
Installing ARP/wARP

It is recommended that CCP4 is first installed from a dmg file (available from the CCP4 web page at http://www.ccp4.ac.uk). There could be problems installing ARP/wARP onto a copy of CCP4 installed using 64-bit Fink.

CCP4 6.1.13 (and higher) is the recommended version to use with ARP/wARP 7.2, although older versions of CCP4 may be compatible with some of the ARP/wARP modules.

2.1 Intel Mac OSX Installation

  1. Download arpwarp_7.2.dmg from the ARP/wARP web page.
  2. Double click on the downloaded file.
  3. Double click on the ARPwARP installer.
  4. Agree to the ARP/wARP license.
  5. Select a destination drive.
  6. Choose destination directory if the default /Applications is not suitable.
  7. ARP/wARP should install and start automatically.

If there are any problems, we encourage you to save the installation log that is displayed and send it to the ARP/wARP developers using the link on the ARP/wARP homepage. Reporting bugs improves the software and helps other researchers.

2.2 Command Line Installation on Mac OSX or Linux

  1. Download the full ARP/wARP package arp_warp_7.2.tar.gz from the ARP/wARP web page and save it in a location of your choice. Next, type:
    % gunzip arp_warp_7.2.tar.gz  
    % tar xvf arp_warp_7.2.tar

    The distribution will unpack into a directory called arp_warp_7.2, which will contain all the required files and subdirectories. install.sh is an installation script that helps you set the appropriate environment variables. The ‘README’ file will walk you through the installation process. ARP_wARP_CCP4I-v5.tar.gz includes everything necessary to run ARP/wARP from the CCP4i interface of version 5; ARP_wARP_CCP4I-v6.tar.gz includes everything necessary to run tasks from the CCP4i interface of version 6.

  2. Go to the directory arp_warp_7.2 and run the install.sh script there by typing:
    % ./install.sh



Figure 2.1: The CCP4i Model Building menu after ARP/wARP installation


Unless you are already an experienced ARP/wARP user, you should get started with the test files provided in the directory arp_warp_7.2/examples. These include data for protein chain tracing (also with NCS), helix/strand search, and nucleotide, ligand and solvent building. A README file is included that explains in more detail which data are to be used for which task.

If things do not work as expected please consult your more experienced colleagues, system manager or the ARP/wARP developers.

Chapter 3
Using ARP/wARP

3.1 Automated Model Building

3.1.1 Running protein model building from the GUI, ARP/wARP Classic



Figure 3.1: Protein model building using ARP/wARP Classic from the CCP4 GUI


This module of ARP/wARP provides the execution of the following tasks:

  1. automated protein building starting from experimental phases
  2. automated protein building starting from existing model
  3. improvement of maps by atoms update and refinement

Applications 1 and 2 (the so-called warpNtrace protocol) start with input experimental / density modified phases or an available (preliminarily refined or partially autotraced) model. The warpNtrace protocol aims to deliver an essentially complete model and an improved map by using the idea of the hybrid model, in which protein and free atoms can co-exist. warpNtrace keeps whatever was recognised as protein (in the form of polypeptide fragments) and the rest as free atoms, and refines this hybrid model during a ‘big’ cycle consisting of several (default is 5) ARP/REFMAC update/refinement cycles. At the end of each ‘big’ cycle the map is interpreted anew: a completely new polypeptide model is constructed, ideally with more residues in fewer fragments. This whole procedure is iterated (default is 10 times).

The output of warpNtrace is a set of refined polypeptide fragments. If the sequence is available, the traced fragments will be docked in sequence and side chains will be built during the iterative refinement procedure. Loops will be built during the procedure, if possible. After the last building cycle the fragments will be arranged to form a globular structure (or, for a case of NCS, several NCS-related structures). The remainder of the structure (cis-prolines, poorly ordered loops and terminal residues for each fragment) will have to be completed by the user manually. Since the output model is refined, its accuracy is comparable to that of the final refined structure. Mis-tracing (incorrect tracing of polypeptide fragments) is not impossible but should normally not exceed 1% of the whole structure with X-ray data to about 2.5 Å resolution or higher. A rough guess of the correctness of the model is printed after every model building cycle, e.g.:

% Chains 12, Residues 434, Estimated correctness of the model 99.1 %
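
When many building cycles have been run, it can be convenient to pull these numbers out of the log automatically. The following is a minimal sketch, assuming the summary line has exactly the form shown above; the function name parse_build_summary and the log file name in the usage note are hypothetical.

```shell
# Read log text on stdin and print "chains residues correctness" for every
# line of the form
#   Chains 12, Residues 434, Estimated correctness of the model 99.1 %
# The function name is hypothetical; the line format is assumed from the
# example in this guide.
parse_build_summary() {
    sed -n 's/.*Chains *\([0-9][0-9]*\), *Residues *\([0-9][0-9]*\), *Estimated correctness of the model *\([0-9.][0-9.]*\) *%.*/\1 \2 \3/p'
}
```

For instance, parse_build_summary < my_job.log | tail -n 1 would print the values from the last building cycle, where my_job.log stands for whichever log file your job produced.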

Application 3 includes no model building but still may provide improvement of the density map. The map is first interpreted as a pseudo protein model, consisting of unconnected free atoms. This model is then refined and updated with iterative cycles of ARP/REFMAC.

Application 1 is described in detail below; the input to applications 2 and 3 is very similar and should be straightforward to work out.

There are a number of additional parameters that you normally should not worry about. A brief description is given below.

3.1.2 Command line model building, auto_tracing.sh

The script auto_tracing.sh in the $warpbin directory allows running the automated model building from the command line without the use of the GUI. The use of auto_tracing.sh is fairly simple. If invoked without arguments the script will print help information.

Usage:  
auto_tracing.sh                                                                     \  
     datafile {mtzfile}                                                             \  
     [residues {number_of_residues_in_AU}]                                          \  
     [workdir {FULLPATH_WORKING_DIRECTORY}]                                         \  
     [fp {fp_label}] [sigfp {sigfp_label}] [freelabin {freer_label}]                \  
     [fbest {weighted_amplitude_label}] [phibest {phibest_label}] [fom {fom_label}] \  
     [modelin {input_PDB_file_to_use_as_initial_model}]                             \  
     [seqin {sequence_file_for_one_NCS_copy}]                                       \  
     [cgr {number_of_NCS_copies (if seqin is provided, default is 1) }]             \  
     [buildingcycles {the_number_of_autobuilding_cycles (default is 10) }]          \  
     [resol {’rmin rmax’ (default is the full resolution range) }]                  \  
     [albe {1 to_always_invoke_albe, default is 0 for resol < 2.7A, else 1) }]      \  
     [restraints {1 to use conditional restraints, default is 1 }]                  \  
     [twin {1 to try de-twining and twin refinement, default is 0 }]                \  
     [sad {1 to turn on the SAD function refinement,                                \  
       needs also ’wavelength’ and ’heavyin’ on input, default is 0 }]              \  
     [compareto {PDB_file_for_comparison}]                                          \  
     [parfile {parfilename_if_only_parfile_is_to_be_created}]                       \  
 
 - Optional command line arguments are given in square parentheses  
 - Possible combinations of MTZ labels are:  
     For start from phases:  
       fp/sigfp/phibest/fom or fbest/sigfp/phibest to build initial free-atoms model  
       and fp/sigfp to refine the model  
       If ’fbest’ is given, ’fom’ will be ignored  
     For start from a model:  
       fp/sigfp to refine the model  
 
 - All input files are assumed to be located in working directory  
   unless they are given with full path  
 - If workdir is not given, the current directory will be assumed  
 - All output files will be written into workdir/subdirectory  
 
Additional useful tips:  
 - Normally the job runs in a subdirectory called YYYYMMDD_HHMMSS  
   To run the job in the current directory use: auto_tracing.sh jobId ’.’  
 - If you invoke auto_tracing.sh from another script and the keywords with  
   double-word argument are not properly understood, e.g. resol ’20.0 2.5’,  
   try resol 20.0;2.5  
 - If you have a par file from an earlier version of ARP/wARP and would like to  
   re-run that job now, use: auto_tracing.sh defaults OLD_PAR_FILE  
   This will create a par file compatible with the current ARP/wARP version  
   and the keywords, which are new to OLD_PAR_FILE will take their default values  
 - NCS-based chain extension and NCS restraints with Refmac are applied  
   automatically if the resolution of the data is equal to or lower than 2.3 A.  
   Input ’ncsextension 1/0’ to apply / not apply NCS extension regardless of the  
   resolution of the data. Input ’ncsrestraints 1/0’ has similar effect

Required keyword is: datafile (followed by the mtz-file name with the full path).

Optional keywords include: residues (followed by the number of residues), workdir (followed by the absolute path to the working directory), fp (followed by the fp label), sigfp (followed by the sigfp label), freelabin (followed by the Rfree label), fbest (followed by the label for the fom-weighted structure factor amplitudes to be used for initial map calculation), phibest (followed by the best phi label), fom (followed by the figure of merit label), modelin (followed by a starting pdb-file with the full path), seqin (followed by a sequence-file name with the full path), cgr (followed by the number of NCS-related copies), buildingcycles (followed by the number of building cycles), resol (followed by the resolution limits), albe (followed by the flag to enable or disable helix/strands building), and similarly for restraints, twin and sad. There are additional parameters that can be customised, and an experienced user should have no problem in figuring out how to do this. Alternatively, please contact the ARP/wARP developers for advice.

If auto_tracing.sh is called with an option parfile, the script will create a parameter file and a directory in the workdir whose name will be printed. The job can subsequently be launched by:

% $warpbin/warp_tracing.sh NAME_OF_PARFILE

If auto_tracing.sh is called without an option parfile, it will also launch the job. The log files and additional output files as well as the building results can be found in the directory created.
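
The two-step route via a parameter file can be sketched as below. This is an illustration only: the helper names run and trace_in_two_steps, the file names and the MTZ labels FP, SIGFP, PHIB and FOM are placeholders. The sketch assumes that $warpbin is set by the ARP/wARP setup and that the name given to parfile can be passed on to warp_tracing.sh unchanged (otherwise use the name that auto_tracing.sh prints). By default the commands are only echoed, not executed.

```shell
# Echo a command instead of running it unless DRY_RUN is set to 0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Step 1 only writes a parameter file; step 2 launches the job from it.
# Arguments: MTZ file, sequence file, residues in the AU, NCS copies, par file.
# The MTZ labels below are placeholders; use the labels in your own file.
trace_in_two_steps() {
    run "$warpbin/auto_tracing.sh" datafile "$1" seqin "$2" residues "$3" cgr "$4" \
        fp FP sigfp SIGFP phibest PHIB fom FOM parfile "$5"
    run "$warpbin/warp_tracing.sh" "$5"
}
```

Calling trace_in_two_steps /full/path/data.mtz /full/path/protein.pir 100 2 job.par prints the two commands; setting DRY_RUN=0 beforehand executes them instead.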

3.1.3 Remote submission of a model building task

This option offers you the following possibilities:

  1. Your model building will run using external computational facilities, where the CPU performance may be superior to your local installation.
  2. You can be assured that the most recent working executables will be used, should you have a problem with your local installation.
  3. Should the task crash, an automatic notification will be forwarded to the ARP/wARP developers who can then promptly help you (unless you have declared your task to be confidential, see below).
  4. If you wish, you can share the results of the completed task with software developers.

3.1.3.1 Submitting from the GUI

Clicking on the button labelled “Submit the job for remote execution at the Hamburg cluster” within the main ARP/wARP Classic GUI panel allows one to execute an autotracing task remotely. The panel will expand and ask for an email address to be provided. Please also choose one of the options from the drop down menu to indicate how you would like your data to be handled.



Figure 3.2: Submitting a job to the ARP/wARP cluster from the CCP4 GUI


The options are:

  1. The data must be kept confidential and deleted after the job has finished.
  2. The data can be made available to ARP/wARP, Auto-Rickshaw or Refmac developers.
  3. The data can be archived and made available to any software developer that requests them (this is default).

Option 2 restricts data sharing to the ARP/wARP, Auto-Rickshaw and Refmac development teams. Option 3 extends sharing to anyone who requests the data. With option 1, only the short log file, the Wilson/omega log files and the parameter file will be kept by the ARP/wARP developers; all other data (input PDB, PIR and MTZ files) as well as log files will be automatically deleted one week after the job has finished.

Once the job has been submitted for remote execution, the GUI window will indicate that the job has finished. Please inspect the log file from the pull-down menu option “View files from job” for further instructions. An email will be sent to you at the email address that you entered in the GUI window. Please follow the instructions in the email (http link, login and password) to connect to the Hamburg cluster. You can then monitor the log file in your browser window. As soon as the job is finished, you will be provided with a link to the results that you can then download. Keep in mind that once the job is finished, your data will be kept for one week only. Make sure that you download your data within that time.

The remote job submission relies on the curl software installed at your site. Availability of curl is checked while installing ARP/wARP and a warning is given if curl is not available.
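
If you are unsure whether curl is present on your machine, a check along the following lines can be run before submitting. This is a generic shell sketch and not the check performed by the ARP/wARP installer; the function names have_tool and check_curl are hypothetical.

```shell
# Succeed if the named program can be found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report whether curl, which remote submission relies on, is available.
check_curl() {
    if have_tool curl; then
        echo "curl found"
    else
        echo "curl missing: remote submission will not work"
    fi
}
```

Running check_curl prints one of the two messages, depending on your system.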

3.1.3.2 Submitting from a web browser

Navigate your browser to:

or choose model building via the web at:

  1. View the Disclaimer as well as the ARP/wARP and the CCP4 licensing conditions.
  2. Proceed with the remote services to Step One.
  3. Choose the model building protocol (start from experimental phases or existing model).
  4. Enter the Email address to which instructions on how to view the results will be sent.
  5. Provide your MTZ file by using the ‘Browse’ button, the file must have an extension .mtz.
  6. Click ‘Proceed to Step Two’.
  7. Enter starting model (unless you have chosen a protocol to start from experimental phases).
  8. Enter the total number of residues and the number of chemically identical molecules in the asymmetric unit. Please make sure you enter these two numbers right. If, for example, the asymmetric unit contains a dimer with each subunit having 50 residues, then you enter 100 and 2, respectively.
  9. Enter MTZ labels. FP and SIGFP are compulsory for model building starting from the existing model. PHI is additionally needed (and FOM is optional) for start from experimental phases.
  10. Click on ‘I agree to cite the required references and would like to proceed with ARP/wARP remote services’. This uploads the files to the cluster in Hamburg, launches the job and, after a few minutes' delay, sends you an Email with instructions for viewing.
  11. Please follow the instructions in the email (http link, login and password) to connect to the Hamburg cluster. You can then monitor the log file in your browser window. As soon as the job is finished, you will be provided with a link to the results that you can then download.

Keep in mind that once the job is finished, your data will be kept for one week only. Make sure that you download your data within that time.
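
The two numbers requested in step 8 are related by simple arithmetic: the total number of residues is the length of one molecule multiplied by the number of copies. A minimal sketch of that check is given below; the function names total_residues and check_au_content are hypothetical, and the sketch assumes that all copies are chemically identical, as in the dimer example.

```shell
# Total residues in the asymmetric unit = residues per copy x number of copies.
total_residues() {
    echo $(( $1 * $2 ))
}

# Compare the total you intend to enter with length x copies.
# Arguments: residues per copy, number of copies, intended total.
check_au_content() {
    expected=$(total_residues "$1" "$2")
    if [ "$3" -eq "$expected" ]; then
        echo "ok"
    else
        echo "mismatch: expected $expected, got $3"
    fi
}
```

For the dimer in step 8, total_residues 50 2 gives 100, which is the first number to enter.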

3.1.4 Output files, short log file

The following information could be useful when interpreting the log messages that are produced when running ARP/wARP.

Checking the estimated content
Should the solvent content be too high or too low (e.g. you have mis-typed the total number of residues expected in the AU), ARP/wARP will re-set it to approximately 50%. The target number of residues will be reset accordingly.
Checking the provided sequence file
Should the sequence length, the number of molecules in the AU and the total number of residues in the AU not match each other, the number of molecules in the AU will be reset accordingly. Should the sequence file not be interpretable (e.g. contain unexpected characters), an error message will be given.
Input MTZ file
We have observed that MTZ files sometimes do not have proper headers, e.g. a non-standard space group name or a zero space group number. ARP/wARP always runs the CAD program to fix the header, so the MTZ file may have the extension .mtz.cad.
Space group number
ARP/wARP version 7.2 supports all standard non-centrosymmetric space groups, P1bar and several non-standard space groups (e.g. 1017 or 2017). The space group is figured out solely from the symmetry operators stored in the MTZ file header.
Input files
The ASCII files (sequence, input PDB or input file with heavy atoms) are always converted to Unix line endings (line feed), so they carry the extension _lf.
Checking whether input PDB contains ligands
This check comes up if the initial model is available. Should the model contain ligands unknown to the Refmac library, they are renamed to the free DUM atoms. This should not affect the model building performance, but the warning is printed.
R factor after Refmac before model building
If the initial model is available, a number of restrained refinement cycles with Refmac is carried out until R factor convergence.
Building cycle zero
Normally one should expect a considerable part of the structure built already at the starting building cycle. If this is not the case, observe the situation for a few further building cycles. If, however, there is essentially nothing autotraced for further building cycles, please inspect whether the initial phases are sufficiently good.
Search for helices and strands
The module for building helical and beta-stranded fragments is invoked if requested or by default with data at 2.7 Å resolution or lower. The number of built helical/stranded residues and chain fragments is printed.
Rounds within building cycle
Each cycle of the main chain tracing is carried out in several rounds. Normally each successive round should result in more residues and in fewer fragments. The maximum length of the traced fragment and the score of the model building are also printed for information.
Chains, residues and estimated correctness of the model
The output from the best tracing round is processed further. Fragments of 4 residues or shorter are converted to free atoms. In addition, the terminal residues of the fragments are removed. The rest is kept and used to provide restraints for subsequent ARP/REFMAC cycles. The value of the estimated correctness of the model should steadily approach 100% if the tracing is successful.
Residues docked into sequence
If the sequence is provided, the autotraced fragments are docked into it and the side chains are built and refined in real space. The results of this are printed out. If the sequence is not provided, side chain guesses only (GLY/ALA/SER/VAL) are built and refined.
Loop building
This is invoked if the sequence is available and if the tracing score is above 0.85. It is also invoked after the last building cycle.
R factor after Refmac during the iterations
The value of the R factor typically oscillates. It goes up after each tracing cycle (because the model is entirely rebuilt) and then decreases during the ARP/REFMAC refinement and update cycles. At the end of the procedure it should reach a value typical for a restrained refinement.
Sequence coverage
If the sequence is provided, the ratio of the number of docked residues to the total number of traced residues is printed. A value higher than 0.8 is deemed as good convergence. All free atoms are then removed from the file and the task is directed into a few cycles of restrained refinement with solvent search. If, however, the value of sequence coverage is lower than 0.8, the free atoms (DUM) are left in the file. You can inspect the density maps, start changing the model on the graphics or, alternatively, submit another model building task using the output of this job.
Job termination
The statement Task completed successfully indicates that the job is finished with no error. An error statement:
  QUITTING ... ARP/wARP module stopped with an error message: name_of_the_program

indicates that one of the modules of the task has terminated with an error message. Please refer to the specified log file.

CPU requirements
Automated protein model building may be time consuming. Using a standard protocol of 10 building cycles interspaced with 5 ARP/REFMAC cycles, one should expect a job for a structure of 500 residues to be completed within about 1 hour (subject to the power of the computer you are using).
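
The sequence coverage criterion described above (docked residues divided by traced residues, compared with 0.8) can be reproduced by hand as in the sketch below. The function name sequence_coverage and the wording of its output are hypothetical; only the ratio and the 0.8 threshold come from this guide.

```shell
# Print the coverage ratio and whether it exceeds the 0.8 threshold.
# Arguments: number of docked residues, total number of traced residues.
sequence_coverage() {
    awk -v docked="$1" -v traced="$2" 'BEGIN {
        if (traced + 0 <= 0) { print "no traced residues"; exit }
        ratio = docked / traced
        # Above 0.8 the free atoms are removed and refinement follows;
        # otherwise the free atoms (DUM) stay in the output file.
        printf "%.2f %s\n", ratio, (ratio > 0.8 ? "converged" : "free atoms kept")
    }'
}
```

For example, 400 docked residues out of 434 traced gives a ratio of about 0.92, which is above the threshold.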

3.1.5 Running protein model building from the GUI, ARP/wARP Expert System



Figure 3.3: Running the ARP/wARP expert system from the CCP4 GUI


This protocol has not changed since version 7.1 except for changes in the underlying model building programs. The Expert System has the same aims as ARP/wARP Classic: to automatically build protein structures, starting either from molecular replacement models or experimental electron density maps.

A main difference is that when a model is ‘more or less complete’, this module uses the typically available second CPU core to start a new job that cleans up the model, adds waters, refines it and makes it available to the user. In parallel, the old job continues to see whether it can find a better solution with more residues, but the user does not need to wait for it to finish.

Another difference concerns the sequence file. If you have hetero-multimers in the asymmetric unit of your crystals, you should add each sequence separately, by clicking the Add Input PIR file button. Then, you can define any stoichiometry for complicated hetero-multimers. For each defined sequence the user can select from a pull-down menu the number of copies in the asymmetric unit. Based on that and the contents of the PIR file the contents of the AU in residues will be calculated automatically.

The input files are identical to those with the ARP/wARP Classic module.

There is a dedicated option to select that the Methionines are Se-Met residues if the dataset comes from a SAD or MAD experiment on a selenium edge; the SAD and TWIN functions are also implemented.

The numbers of refinement and building cycles are not fixed but are determined on the fly from the program's progress. The Decision parameters define these limits. If you leave the mouse over one of the input fields, a help text will appear explaining the use of each decision parameter.

The parameter maximum number of processes in parallel is important. When Expert System decides that it has reached a more-or-less useful model, it will spawn a ‘cleaning up and completion’ process. However, it will continue the iterative building in parallel. If the iterative building results in a better model, a new ‘cleaning up and completion’ process will be requested, possibly before the previous ‘cleaning up and completion’ process has finished. If you have only two processors (typical of dual-core systems), the new process will be ‘queued’ and will start when the previous one has finished.

3.1.6 Running flex-wARP from command line

To get on-line help, type:

 python $pywarpbin/CAutoPyWARP.pyc --help

3.2 Automated Construction of Helical and Beta-Stranded Fragments

3.2.1 Building secondary structure from the GUI, ARP/wARP Quick Fold



Figure 3.4: Running Quick Fold from the CCP4 GUI


The procedure for building secondary structural elements is based on the use of discriminant analysis in a successive filtering scheme taking into account the geometry of alpha-helical and beta-stranded main-chain fragments. The electron density map is first analysed and a suitable threshold is automatically selected. In the next step stereochemical information on the helix and strand geometry is used; sets of overlapping fragments are constructed and filtered based on their geometric likelihood. All fragments that overlap at a particular location of a helix or a strand undergo an ensemble averaging process to provide the best estimate of CA positions. The output fragments are then regularised and the chain direction is chosen on the basis of their fit to the density. Finally the fragments are refined in real space.

The accuracy of the resulting model depends on many parameters. The module should be able to build helices and strands at resolutions as low as 4.5 Å. However, it may not result in complete helical/stranded structure and it may also contain parts that are mis-interpreted. The expected top performance is the correct location of 90% of the helices and 50% of the strands. The procedure is relatively fast and takes only a few minutes for proteins of moderate size (up to 500 residues).

The secondary structure recognition module is optimised to address lower resolution data and hard cases where, e.g. the full model building protocol has not been successful. For a resolution higher than 2.6 Å the module will automatically trim the resolution and Wilson B-factor of the data to approach its design conditions.

There are a number of additional parameters that you normally should not worry about. A brief description is given below:

3.2.1.1 Output files, short log file

The following information could be useful when interpreting the log messages that are produced when running Quick Fold.

Checking the estimated content
Should the solvent content be too high or too low (e.g. you have mis-typed the total number of residues expected in the AU), ARP/wARP will re-set it to approximately 50%. The target number of residues will be reset accordingly.
Residues and chain fragments
The important numbers are highlighted in red/bold in the short log file, indicating the number of residues and the number of fragments into which these residues are arranged. The higher the values of the Connectivity index and the Tracing score, the more complete and reliable the resulting model is. The length of the longest chain is also printed.
Further extension of the model
You may try to feed the PDB output of the module into Classic or flex-wARP. However, subject to the resolution of the data, this may not provide enough seed for subsequent automatic tracing of the full chain.
Job termination
The statement Task completed successfully indicates that the job has finished with no error. An error statement:
QUITTING ... ARP/wARP module stopped with an error message: name_of_the_program  
    

indicates that one of the modules of the task has terminated with an error message. Please refer to the specified log file.
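
The two termination statements above can also be checked from a script. The following is a minimal sketch only: the function name check_arpwarp_log is hypothetical, and the location and name of the short log file depend on your job directory, so the path must be supplied by you.

```shell
#!/usr/bin/env bash
# Sketch: classify an ARP/wARP short log by its termination statement.
# check_arpwarp_log is a hypothetical helper, not part of ARP/wARP.

check_arpwarp_log() {
    # $1 = path to a short log file (location depends on your job directory)
    local log=$1
    if grep -q 'QUITTING' "$log"; then
        # Report the first error statement found in the log.
        echo "error: $(grep -m1 'QUITTING' "$log")"
        return 1
    elif grep -q 'Task completed successfully' "$log"; then
        echo "ok"
        return 0
    fi
    echo "unknown: no termination statement found (job may still be running)"
    return 2
}
```

A return value of 1 means that the module named at the end of the QUITTING line should be looked up in the specified log file.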

3.2.2 Building secondary structure from the command line, auto_albe.sh

The script auto_albe.sh (where albe stands for alpha-beta) in the $warpbin directory allows you to run the secondary structure building as a single-line command without the use of the GUI. The use of auto_albe.sh is fairly simple. The script prints out help information if it is invoked without arguments.

Usage:  
$warpbin/auto_albe.sh                                                     \  
             datafile {mtzfile}                                           \  
             [residues {number_of_residues_in_AU}]                        \  
             [workdir {FULLPATH_WORKING_DIRECTORY}]                       \  
             [helixfileout {output_PDB_file}]                             \  
             [jobId {desired_job_id_used_for_subdirectory_naming}]        \  
             [fp {fp label} sigfp {sigfp label} phib {phi label}]         \  
             [fom {fom label}] (input ’fom none’ if no fom is to be used) \  
             [compareto {PDB_file_for_comparison}]                        \  
             [nostrands {0 or 1, default=0}]                              \  
             [parfile {parfilename_if_only_parfile_is_to_be_created}]  
 
 - Optional command line arguments are given in square parentheses  
 - All input files are assumed to be located in working directory  
   unless they are given with full path  
 - If workdir is not given, the current directory will be assumed  
 - All output files will be written into workdir/subdirectory

Required keyword is: datafile (followed by the mtz-file name with the full path).

Optional keywords include: residues (the expected number of residues in the asymmetric unit), workdir (followed by the full path to the working directory), helixfileout (the name of the PDB file to which the traced helical and stranded fragments will be written), jobId (if you wish the working sub-directory to have a particular name), fp (followed by the fp label), sigfp (followed by the sigfp label), phib (followed by the phibest label) and fom (followed by the fom label). The defaults for the labels are FP, SIGFP, PHI and FOM, respectively. Alternatively, if the mtz file contains only one column for structure factor amplitudes and only one column for their standard deviations, these will be taken. If you wish FOM not to be used, please input fom none. For test purposes, the constructed helices/strands can be compared to known reference models (hand- or pre-fitted). The required keyword is compareto (followed by the full-path name of a PDB file). You can also enable or disable the construction of strands using the keyword nostrands; the default is 0 (build the strands). If auto_albe.sh is called with the option parfile, the script will create a parameter file and a directory in the workdir whose name will be printed. The job can subsequently be launched by:

% $warpbin/warp_albe.sh NAME_OF_PARFILE

If auto_albe.sh is called without an option parfile, it will also launch the job. The log files and additional output files as well as the building results can be found in the directory created.
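
As an illustration of the keywords described above, the sketch below assembles an auto_albe.sh command line and prints it without running anything. The helper build_albe_cmd, the file names, the residue count and the fallback value for $warpbin are all hypothetical placeholders; only the keywords and default labels are taken from the usage text.

```shell
#!/usr/bin/env bash
# Sketch: assemble (but do not run) an auto_albe.sh command line.
# build_albe_cmd and all paths below are hypothetical placeholders.

warpbin="${warpbin:-/path/to/arpwarp/bin}"   # assumed location if $warpbin is unset

build_albe_cmd() {
    # $1 = MTZ file (full path), $2 = residues in the AU, $3 = working directory
    # Labels are the documented defaults; 'fom none' requests that no FOM is used.
    local cmd=("$warpbin/auto_albe.sh" datafile "$1" residues "$2" workdir "$3"
               fp FP sigfp SIGFP phib PHI fom none)
    printf '%s\n' "${cmd[*]}"
}

build_albe_cmd /data/xtal/phased.mtz 250 /data/xtal/albe_run
```

Once the paths point to real files, the printed line can be pasted into a terminal to launch the job.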

3.3 Automated Loop Building

3.3.1 Running loop building from the GUI, ARP/wARP Loops



Figure 3.5: Loop building from the CCP4 GUI


This module tries to find likely loops to connect fragments of a partial protein structure based on the sequence and the density map. It builds the loops in three phases. First, a tree of possible CAs between the fragments is built; next, the unlikely ones are removed and the rest of the main chain atoms are determined; finally, the best loops are selected. The tree can be built either towards the C-terminus or the N-terminus of the protein, or both. The built loops are ranked in descending order according to the density correlation at the main chain atoms (including CB if present), the correlation of the side chains, or a combination of both. If the number of loops exceeds the chosen number, only the best are saved to file.

There are a number of options that can be added. A brief description is given below.

3.4 Automated Building of Poly-Nucleotides

3.4.1 Running nucleotide building from the GUI, ARP/wARP DNA/RNA



Figure 3.6: Building Poly-Nucleotides from the CCP4 GUI


This module builds fragments of DNA or RNA. The input is an MTZ file containing the phases from which the map best describing the nucleotide region can be computed. Thus the map could be a difference map (e.g. after the protein model is completed) or a sigma-weighted map for the whole asymmetric unit. The nucleotide building procedure within ARP/wARP Version 7.2 proceeds in several steps: first it locates putative phosphates in the density map, then uses them in a manner analogous to the CA-candidates for protein chain tracing. After the nucleotide fragments are obtained, a likely base is built and refined in real space. The type of the base is currently limited to A (large) or C (small) and the nucleotide sequence is not yet used.

The produced poly-nucleotides are quite accurate: a typical rmsd for the built backbone atoms is 0.6 Å with X-ray data extending to around 3.0 Å resolution. The method is not sensitive to a particular DNA or RNA conformation. The module is not very CPU efficient and may take about 10 minutes for a 20-nucleotide structure.

There are a number of options that can be added. A brief description is given below.

3.4.1.1 Output files, short Log File

The following information could be useful when interpreting the log messages that are produced when building DNA/RNA.

Checking the estimated content
Should the solvent content be too high or too low (e.g. you have mis-typed the total number of residues expected in the AU), ARP/wARP will re-set it to approximately 50%. The target number of residues will be reset accordingly.
Phosphate candidates
The identified number of phosphate candidates is typically 100 times higher than the number of nucleotides in the structure.
Nucleotides and chain fragments
The important numbers are highlighted in red/bold in the short log file, indicating the number of nucleotides and the number of fragments into which these nucleotides are arranged. The length of the longest chain is also printed.
Job termination
The statement Task completed successfully indicates that the job has finished with no error. An error statement
QUITTING ... ARP/wARP module stopped with an error message: name_of_the_program  
    

indicates that one of the modules of the task has terminated with an error message. Please refer to the specified log file.

3.4.2 Running nucleotide building from the command line, auto_nuce.sh

The script auto_nuce.sh in the $warpbin directory allows you to run the nucleotide building as a single-line command without the use of the GUI. The use of auto_nuce.sh is fairly simple. The script prints out help information if it is invoked without arguments.

Usage:  
$warpbin/auto_nuce.sh                                                         \  
     datafile {mtzfile}                                                       \  
     [residues {number_of_protein_residues_in_AU}]                            \  
     [nucleotides {number_of_nucleotides_in_AU}]                              \  
     [workdir {FULLPATH_WORKING_DIRECTORY}]                                   \  
     [fp {fp_label}] [sigfp {sigfp_label}] [fbest {weighted_amplitude_label}] \  
     [phib {phib_label}] [fom {fom_label}]                                    \  
     [resol {’rmin rmax’ (default is the full resolution range) }]            \  
     [compareto {PDB_file_for_comparison}]                                    \  
     [parfile {parfilename_if_only_parfile_is_to_be_created}]                 \  
 
 - Optional command line arguments are given in square parentheses  
 - Possible combinations of MTZ labels for map calculation are:  
     fp/sigfp/phib/fom or  
     fbest/sigfp/phib if fbest is already fom-weighted.  
 - In the latter case if ’fbest’ is given, ’fom’ will be ignored  
 
 - All input files are assumed to be located in working directory  
   unless they are given with full path  
 - If workdir is not given, the current directory will be assumed  
 - All output files will be written into workdir/subdirectory

Required keyword is: datafile (followed by the mtz-file name with the full path). Unlike the functionality offered from the CCP4 GUI, datafile can also be a density map.

Optional keywords include: residues (the expected number of protein residues in the asymmetric unit), nucleotides (the expected number of nucleotides in the asymmetric unit), workdir (followed by the full path to the working directory), fp (followed by the fp label), sigfp (followed by the sigfp label), phib (followed by the phibest label) and fom (followed by the fom label). The defaults for the labels are FP, SIGFP, PHI and FOM, respectively. Alternatively, if the mtz file contains only one column for structure factor amplitudes and only one column for their standard deviations, these will be taken. If you wish FOM not to be used, please supply fbest (followed by the label of an amplitude that is already fom-weighted); in that case fom is ignored. You can set resol (followed by the resolution limits rmin and rmax; the default is the full resolution range). For test purposes, the constructed model can be compared to a known reference model. The required keyword is compareto (followed by the full-path name of a PDB file).

If auto_nuce.sh is called with an option ‘parfile’, the script will create a parameter file and a directory in the workdir whose name will be printed. The job can subsequently be launched by:

% $warpbin/warp_nuce.sh NAME_OF_PARFILE

If auto_nuce.sh is called without an option ‘parfile’, it will also launch the job. The log files and additional output files as well as the building results can be found in the directory created.
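
The two label combinations listed in the usage notes (fp/sigfp/phib/fom, or fbest/sigfp/phib) can be sketched as follows. This is a dry run that only prints the command; the helper build_nuce_cmd, the file names and the FBEST label are hypothetical, while the remaining labels are the documented defaults.

```shell
#!/usr/bin/env bash
# Sketch: choose between the two documented label combinations for auto_nuce.sh.
# build_nuce_cmd, the paths and the FBEST label are hypothetical placeholders.

warpbin="${warpbin:-/path/to/arpwarp/bin}"   # assumed location if $warpbin is unset

build_nuce_cmd() {
    # $1 = MTZ file, $2 = nucleotides in the AU, $3 = optional fbest label
    local mtz=$1 nnuc=$2 fbest=${3:-}
    local cmd=("$warpbin/auto_nuce.sh" datafile "$mtz" nucleotides "$nnuc")
    if [ -n "$fbest" ]; then
        # fbest is already fom-weighted, so fom is left out (it would be ignored).
        cmd+=(fbest "$fbest" sigfp SIGFP phib PHI)
    else
        cmd+=(fp FP sigfp SIGFP phib PHI fom FOM)
    fi
    printf '%s\n' "${cmd[*]}"
}

build_nuce_cmd /data/xtal/complex.mtz 20
build_nuce_cmd /data/xtal/complex.mtz 20 FBEST
```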

3.5 Automated Ligand Building

3.5.1 Running ligand building from the GUI, ARP/wARP Ligands



Figure 3.7: Building Ligands from the CCP4 GUI


The ligand building procedure within ARP/wARP Version 7.2 proceeds in three steps: first it locates the binding site in the difference density map, then builds there a number of putative ligand models and, finally, selects the best model, which is geometrised and real-space fit into the density.

The binding region is selected automatically by matching the ligand’s shape-related properties to the regions of high density. The chosen region is parameterised by a sparse set of putative positions (grid nodes) for the ligand atoms. Two algorithms are used to construct the ligand into this sparse set. One exploits the combinatorial assignment of the ligand atom identities to the grid nodes (‘label swap’). The other maximises the overlap between the sparse set and the ligand model by a random search in conformational space. The output from both algorithms is merged and then undergoes a last stage of real-space refinement before the final model is selected.

The accuracy of ligand building is mainly dependent on ligand size and the resolution of the X-ray data. As a rough guide, about 75% of well-ordered ligands of a size around 20 to 40 non-hydrogen atoms should be built within r.m.s.d. of 1.0 Å from their correct location. Thus the constructed models should be accurate enough for REFMAC5 to straightforwardly refine the protein-ligand complex. The procedure can be iterated to locate additional ligands, if any are present.

The ARP/wARP ligand building module requires the X-ray data (in MTZ format), the built protein without ligands (in PDB format) and a template model of the ligand to build (in PDB format). Options include the possibility to specify the binding site and the number of starting grids, the ability to compare the run result to some reference ligand(s), and the possibility to build a ligand taken from a list of candidates (‘cocktail’). In the latter case the coordinates of the ligand candidates should be concatenated into a single PDB file. The different ligands must be distinguished by their residue name (columns 18-20), chain identifier (column 22) or residue sequence number (columns 23-26). ARP/wARP will automatically choose the best-matching ligand candidate and will attempt to build it at the binding site, either determined automatically or supplied by the user. However, since this feature is new, the specification of the binding site (see below) is recommended. One can also specify that only well-resolved parts of a partially occupied ligand are built, and this can be done automatically.

There are a number of options that can be added either in the main GUI panel (scrolling bar Build the ligand) or under the Parameters section. You normally should not need to worry about these (unless you want the ligand to be built around a known location or you would like to screen a list of candidates, ‘ligand cocktail’). A brief description is given below.

3.5.1.1 Output files, short Log File

The following information could be useful when interpreting the log messages that are produced when building ligands.

Refinement with refmac
The R factor (and R free if requested) are printed after refinement of the protein part only with Refmac. Check that the value of the R factor is reasonable. A value higher than about 30% may indicate that the computed difference map may be too noisy for location of the ligand. A failure may indicate invalid atom nomenclature in your PDB file.
The ligandbuild program
The mapping of the difference density synthesis parameterised with grid points onto the ligand atoms (ligandbuild and M_ligandbuild) is run as many times as defined by the number of ligand building cycles. A failure may indicate incorrect identification of the binding site. This can be amended by defining the binding site manually prior to the run (see above).
Real space fit
Up to 108 top constructed ligand models undergo a real-space refinement with respect to the difference density map. The best solution is output. If the test and comparison option is selected, the r.m.s.d. to the reference PDB file (XYZREF) is also printed. There will be a warning given if the stereochemistry of the constructed ligand is poor. Also a warning will be given if the constructed ligand molecule has severe steric clashes, which may be a sign of an incorrect ligand building. You may want to inspect the ligand and the density and, if there is a clear part of the ligand that is disordered, try to remove it from the ligand target PDB file and to re-run the job.
Job termination
The statement Task completed successfully indicates that the job has finished with no error. An error statement:
QUITTING ... ARP/wARP module stopped with an error message: name_of_the_program

indicates that one of the modules of the task has terminated with an error message. Please refer to the specified log file.

3.5.2 Running ligand building from the command line, auto_ligand.sh

The script auto_ligand.sh in the $warpbin directory allows you to run the ligand building as a single-line command without the use of the GUI. The use of auto_ligand.sh is fairly simple. The script prints out help information if it is invoked without arguments.

Usage:  
auto_ligand.sh                                                                       \  
           datafile {either mtzfile or mapfile}                                  \  
           protein {starting_PDB_file_without_ligand}                               \  
           ligand {PDB_file_with_ligand_to_fit}                                  \  
               [workdir {FULLPATH_WORKING_DIRECTORY}]                                \  
           [ligandfileout {output_PDB_file}]                                        \  
           [fp {fp_label}] [sigfp {sigfp_label}] [freer {freer_label}]              \  
               [nligandcycles {number_of_ligandbuild_cycles (default is 2)}]         \  
           [search_model {PDB_file_with_model_at_expected_ligand_site}]             \  
           [search_position {X Y Z}]                                                \  
           [search_radius {radius_in_angstroms}]                                    \  
           [reflist {textfile_with_FULLPATHnames_of_fitted_ligands_for_comparison}] \  
           [extralibrary {user_defined_library_for_Refmac5}]                        \  
           [partial {0 for modelling the whole ligand and 4 or higher number to     \  
              model partially occupied ligand (giving 4 would mean to consider      \  
              4-atoms as the smallest ligand fragment)]                             \  
           [parfile {parfilename_if_only_parfile_is_to_be_created}]  
 
 - Optional command line arguments are given in square parentheses  
 - All input files are assumed to be located in working directory  
   unless they are given with full path  
 - If workdir is not given, the current directory will be assumed  
 - All output files will be written into workdir/subdirectory

Required keywords are: datafile (followed by the mtz-file name with the full path), protein (followed by the pdb-file name of the protein model without the ligand with the full path) and ligand (followed by the pdb-file containing the ligand(s) description with the full path). Unlike the functionality offered from the CCP4 GUI, datafile can also be a density map.

Optional keywords include: workdir (followed by the full path to the working directory), fp (followed by the fp label) and sigfp (followed by the sigfp label). The defaults for the labels are FP and SIGFP, respectively. Alternatively, if the mtz file contains only one column for structure factor amplitudes and only one column for their standard deviations, these will be taken. The number of ligand building cycles (default is 2) can be changed with the keyword nligandcycles. The approximate location of the binding site can be supplied by the user either by providing the pdb-file(s) of a ligand (or just a list of atoms) located at the binding site (search_model), or by specifying the (XYZ) coordinates of a point defining the binding region using search_position and search_radius (the default value for the latter is 5 Å). For test purposes, the constructed ligand can be compared to known reference models (hand- or pre-fitted). The required keyword is reflist (followed by the full-path name of a text file, containing a list of pdb-files with the reference ligands and their absolute paths). Building of a partially occupied ligand can be requested using the keyword partial followed by a number 4 or higher. A user-defined ligand library can be input using the keyword extralibrary.

To build the ligand from a list of candidates (‘cocktail’), the coordinates of the ligand candidates should be concatenated into one file specified by the above-mentioned keyword ligand. The different ligands must be distinguished by their residue name (columns 18-20) in the concatenated pdb file (a different chain identifier or residue sequence number will do as well; however, we recommend using different residue names). ARP/wARP will automatically choose the best-matching ligand candidate and will attempt to build it at the binding site, either determined automatically or supplied by the user. Supplying the binding site using search_model or search_position keywords is an alternative to this method.
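
Preparing a cocktail file could look like the sketch below, which concatenates one PDB file per candidate and checks that the residue names in columns 18-20 are distinct. The helper make_cocktail and its messages are hypothetical; the check assumes one ligand per input file and only inspects ATOM and HETATM records.

```shell
#!/usr/bin/env bash
# Sketch: concatenate ligand candidates into one 'cocktail' PDB file and check
# that they can be told apart by residue name (PDB columns 18-20).
# make_cocktail is a hypothetical helper, not part of ARP/wARP.

make_cocktail() {
    # $1 = output PDB file; remaining arguments = one PDB file per candidate
    local out=$1; shift
    cat "$@" > "$out"
    # Count distinct residue names among the coordinate records.
    local n_names
    n_names=$(awk '/^(ATOM|HETATM)/ { print substr($0, 18, 3) }' "$out" \
              | sort -u | wc -l | tr -d ' ')
    if [ "$n_names" -lt "$#" ]; then
        echo "warning: $# files but only $n_names distinct residue names" >&2
        return 1
    fi
    echo "wrote $out with $n_names ligand candidates"
}

# Example (hypothetical file names):
#   make_cocktail cocktail.pdb atp.pdb adp.pdb amp.pdb
```

The resulting file is then the one to pass with the keyword ligand.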

To build the partially occupied ligand enter keyword partial with the appropriate parameter defining the size of the smallest ligand fragment. ARP/wARP will automatically choose the best-matching ligand fragment and will attempt to build it at the binding site, either determined automatically or supplied by the user.

If auto_ligand.sh is called with an option parfile, the script will create a parameter file and a directory in the workdir whose name will be printed. The job can subsequently be launched by:

% $warpbin/warp_ligand.sh NAME_OF_PARFILE

If auto_ligand.sh is called without an option parfile, it will also launch the job. The log files and additional output files as well as the building results can be found in the directory created.
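
Because the three required files are expected with their full paths, a small pre-flight check before launching can save a failed run. The sketch below is one possible convention, not something the script itself requires: check_ligand_inputs is a hypothetical helper, and insisting on absolute paths is stricter than the usage text, which also accepts files located in the working directory.

```shell
#!/usr/bin/env bash
# Sketch: verify the required auto_ligand.sh inputs before launching a job.
# check_ligand_inputs is a hypothetical helper, not part of ARP/wARP.

check_ligand_inputs() {
    # $1 = datafile (MTZ or map), $2 = protein PDB, $3 = ligand PDB
    local f status=0
    for f in "$@"; do
        case $f in
            /*) ;;                                        # full path given
            *)  echo "not a full path: $f"; status=1 ;;
        esac
        [ -r "$f" ] || { echo "not readable: $f"; status=1; }
    done
    return "$status"
}

# Example (hypothetical paths):
#   check_ligand_inputs /data/x.mtz /data/protein.pdb /data/ligand.pdb \
#       && "$warpbin/auto_ligand.sh" datafile /data/x.mtz \
#              protein /data/protein.pdb ligand /data/ligand.pdb
```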

3.6 Automated Solvent Building

3.6.1 Running solvent building from the GUI, ARP/wARP Solvent



Figure 3.8: Solvent Building from the CCP4 GUI


Within the solvent building module, restrained reciprocal space refinement is carried out with REFMAC while ARP/wARP performs automatic adjustment of the solvent structure. The resolution of the data should be 2.5 Å or higher. The output is the protein model with the solvent molecules transformed with symmetry operations to lie around the protein.

The ARP/wARP solvent building module requires the X-ray data (in MTZ format) and the protein model (in PDB format) without solvent or with a partial solvent model.

There are a number of options that can be added. A brief description is given below.

3.6.1.1 Output files, short log file
Refinement with REFMAC
The R factor (and R free if requested) are printed after refinement of the protein with Refmac. Check that the value of the R factor is decreasing upon solvent building.
Job termination
The statement Task completed successfully indicates that the job has finished with no error. An error statement
QUITTING ... ARP/wARP module stopped with an error message: name_of_the_program

indicates that one of the modules of the task has terminated with an error message. Please refer to the specified log file.

3.6.2 Running solvent building from command line, auto_solvent.sh

The script auto_solvent.sh in the $warpbin directory allows you to run the solvent building as a single-line command without the use of the GUI. The use of auto_solvent.sh is fairly simple. The script prints out help information if it is invoked without arguments.

$warpbin/auto_solvent.sh                                               \  
           datafile {mtzfile}                                          \  
           protein {starting_PDB_file}                                 \  
           [workdir {FULLPATH_WORKING_DIRECTORY}]                      \  
           [solventfileout {output_PDB_file}]                          \  
           [fp {fp_label}] [sigfp {sigfp_label}] [freer {freer_label}] \  
           [restrcyc {number_of_cycles (default is 20) }]              \  
           [extralibrary {user_defined_library_for_Refmac5}]           \  
           [tlsin {fixed pre-refined TLS tensors from Refmac5}]        \  
           [parfile {parfilename_if_only_parfile_is_to_be_created}]  
 
 - Optional command line arguments are given in square parentheses  
 - All input files are assumed to be located in working directory  
   unless they are given with full path  
 - If workdir is not given, the current directory will be assumed  
 - All output files will be written into workdir/subdirectory

Required keywords are: datafile (followed by the mtz-file name with the full path) and protein (followed by the pdb-file name of the protein model with the full path).

Optional keywords include: workdir (followed by the full path to the working directory), solventfileout (followed by the name of the PDB file where the output will be written), fp (followed by the fp label), sigfp (followed by the sigfp label) and freer (followed by the Rfree label). The defaults for the first two are FP and SIGFP, respectively. Alternatively, if the mtz file contains only one column for structure factor amplitudes and only one column for their standard deviations, these will be taken. The number of cycles (default is 20) can be changed with keyword restrcyc. The user-defined library and the tls-tensor for Refmac can be supplied by using the keywords extralibrary and tlsin.

If auto_solvent.sh is called with an option parfile, the script will create a parameter file and a directory in the workdir whose name will be printed. The job can subsequently be launched by:

% $warpbin/warp_solvent.sh NAME_OF_PARFILE

If auto_solvent.sh is called without an option parfile, it will also launch the job. The log files and additional output files as well as the building results can be found in the directory created.
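
The two-step route (create a parameter file first, launch later) can be sketched as a dry run that only prints both commands. The helper solvent_two_step, the file names and the parfile name are hypothetical; since the script prints the name of the directory it creates, the parfile location passed to the second command may need to be adjusted accordingly.

```shell
#!/usr/bin/env bash
# Sketch: print the two commands of the parfile workflow without running them.
# solvent_two_step and all file names are hypothetical placeholders.

warpbin="${warpbin:-/path/to/arpwarp/bin}"   # assumed location if $warpbin is unset

solvent_two_step() {
    # $1 = MTZ file, $2 = protein PDB, $3 = parfile name, $4 = cycles (default 20)
    local mtz=$1 pdb=$2 par=$3 cycles=${4:-20}
    # Step 1: only create the parameter file and job directory.
    echo "$warpbin/auto_solvent.sh datafile $mtz protein $pdb restrcyc $cycles parfile $par"
    # Step 2: launch the job from the parameter file.
    echo "$warpbin/warp_solvent.sh $par"
}

solvent_two_step /data/xtal/refined.mtz /data/xtal/protein.pdb solvent.par
```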

3.7 ARP/wARP molecular graphics: ARP Navigator

The graphical front-end to ARP/wARP Version 7.2 is an OpenGL/X-window based graphics program that can be launched by pressing the ARP Navigator button in the CCP4 GUI. The program can also be started from the command line by typing arpnavigator.



Figure 3.9: ARP Navigator


3.7.1 Main Menu

The main menu sits at the top of the ARP Navigator screen.

3.7.2 Mouse and Keyboard functions

3.7.2.1 Rotation
Left mouse button pressed and mouse moved
the scene rotates about the x and y axes (screen plane).
Left mouse button + r-key pressed and mouse moved left-right
the scene rotates about the z axis (perpendicular to screen plane).

3.7.2.2 Translation
Right mouse button pressed and mouse moved
the scene is translated in the xy-plane (screen plane; maps are infinitely repeated).
Left mouse button + t-key pressed and mouse moved
an alternative way to translate the scene in the xy-plane.
Left mouse button + z-key pressed and mouse moved up-down
the scene is translated in z-direction (perpendicular to screen plane).

3.7.2.3 Scaling
Middle mouse button pressed and mouse moved left-right
zooming, the scene is scaled and a scale-o-meter is shown on the right.
Left mouse button + s-key pressed and mouse moved
an alternative way to zoom.

3.7.2.4 Clip planes
Left mouse button + f-key pressed and mouse moved left-right
changes the front clip position.
Left mouse button + b-key pressed and mouse moved left-right
changes the back clip position.
Left mouse button + g-key pressed and mouse moved left-right
changes the front and back clip position together.
Left mouse button + d-key pressed and mouse moved left-right
changes the position of the rotation-center (similar to translation).

3.7.2.5 Map contouring

The mouse wheel is used for changing the contour level of a map. The map must be activated by pressing the corresponding object button at the bottom of the graphics window.

Left mouse button + c-key pressed and mouse moved up-down
An alternative way to change the contour level.

3.7.2.6 Map extent
Left mouse button + e-key pressed and mouse moved
size increases.

3.7.2.7 Mouse Actions
Left mouse button pressed in graphics area
marks atoms or density (switch this in Options menu). Double-click will also centre on atoms.
Right mouse button pressed on top of an object button
opens the Mini menu of the related object (Parameters, close, save, etc.).
Right mouse button pressed in graphics area
opens the Quick actions menu.

3.7.2.8 Keyboard Actions
w
Hide the menu and all attached information as long as pressed
W (=shift-w)
Lock the function of ’w’ and do not show the menu when released. To unlock, press ’w’ or ’shift-w’ again, then the menu will be visible again.
G (=shift-g)
Launch a goto-atom dialog (see ’goto atom’ below).
C (=shift-c)
Center on the last mark set irrespective of whether this was an atom or a density region.
D (=shift-d)
Activate the display of distances between the most recent mark and all other marks set so far.
m
Toggle the control of a detached model: move the model only vs move the crystal frame alone with the model fixed.
k
Toggle the control of a detached model: move the model and the crystal frame together vs move the crystal frame alone.

3.7.3 Object Buttons

When a file is loaded and put on display, there will be small boxes appearing in the bottom left corner representing each of the graphical objects. Only one object can be active at a time.

An object can be made active by clicking on the box with the left mouse button. A little eye symbol shows whether this object is currently on display or if it’s hidden. Clicking with the right mouse button on this box will pull out the mini-menu with actions applied to this object only (see also Mini menu).

3.7.4 Quick Actions

When the right mouse button is pressed with no movement, then a green button box is displayed that contains functionalities to be applied ’ad-hoc’ and with no input dialog.

Goto Atom
This button launches the ’goto-atom’ dialog as ’shift-g’ does.

The goto-atom dialog expects that atoms are specified as e.g. CA/123/A for the CA atom of residue 123 in chain A. Just specifying CA/123 means the first occurrence of CA in residue 123. Specifying /123/ means the first atom in residue 123. Typing //Z will be interpreted as the first atom of chain Z. The program will centre on the atom if found. In case the atom cannot be found, the dialog gets coloured in pink.

Real Space Refine Ligand
The ligand to be refined is a detached molecule and there is one density map on display. The ligand is refined locally against that density map, so its initial position must be within the radius of convergence. The output will replace the detached model. Please note that the refinement is restrained to the ligand stereochemistry, which is derived from the input ligand model. Thus repeatedly taking the ligand out of its density and refining it back in will progressively change the ligand’s stereochemistry.
Find Ligand Binding Site
The ligand to be located is a detached molecule and there is one density map on display. Furthermore, all other models displayed are taken as occupants of space and the binding site cannot intersect with them. In return, a dummy atom model of the located density blob is shown.
Fit Ligand Here
The ligand to be fit is the detached model, there is at least one density map on display that has one of its blobs marked. The output will replace the detached model.
Build Helices
At least one density map must be on display (or activated). Helices are built and side chains are modelled up to C-gamma atoms.
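
The atom specification accepted by the goto-atom dialog (see Goto Atom above) consists of three slash-separated fields: atom name, residue number and chain. The sketch below shows how such a string splits into its fields; parse_atom_spec is a hypothetical helper for illustration and says nothing about how ARP Navigator itself parses the input. Empty fields, which the dialog treats as "first match", are printed as an asterisk.

```shell
#!/usr/bin/env bash
# Sketch: split a goto-atom specification (atom/residue/chain) into its fields.
# parse_atom_spec is a hypothetical helper, not part of ARP Navigator.

parse_atom_spec() {
    # $1 = specification such as CA/123/A, CA/123, /123/ or //Z
    local atom res chain
    IFS=/ read -r atom res chain <<< "$1"
    # An empty field means "first match" and is shown as '*'.
    echo "atom=${atom:-*} residue=${res:-*} chain=${chain:-*}"
}

parse_atom_spec CA/123/A
parse_atom_spec //Z
```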

Chapter 4
Additional Remarks

4.1 Quality of the X-ray Data

The X-ray data should be as complete as possible, especially in the low resolution range (5 Å and worse). If the low resolution strong data are systematically incomplete (e.g. missing or overloaded reflections), the density map, even in the case of a good model, may be discontinuous and inconsistent with the model. Because ARP/wARP involves model building on the basis of density maps, such discontinuity can lead to slower convergence or adversely affect the performance.

ARP/wARP automatically checks the fit of your data to the expected Wilson plot and will report any problems. If it suggests cutting the data from the high resolution side, follow the suggestion. If it suggests cutting the data from the low resolution side, do so, but do not cut to a resolution below 8 or 10 Å. If it suggests ignoring all data, or there are still other complaints after the cut, consider inspecting your data processing. The current version of the ARP/wARP Wilson plot check might be too stringent. Nevertheless, the user is advised to visually inspect the Wilson plot and apply his/her critical judgment as to whether or not the data should be cut. It has sometimes proved beneficial to cut the data which were flagged as poor, though in some cases the presence of these data was crucial for the model building.

Chapter 5
Citing ARP/wARP

Please cite the applications of ARP/wARP that you have used. Please consult the ARP/wARP log file for the most relevant citation.

Chapter 6
Other References

The most recent overview of ARP/wARP can be found in:

Applications are presented in:

For other publications please refer to the references therein or to the ARP/wARP web page.

Chapter 7
Acknowledgements

The current ARP/wARP developers are:

The Hamburg team (European Molecular Biology Laboratory (EMBL) Hamburg Outstation, c/o DESY, Notkestrasse 85, 22603 Hamburg, Germany):

The Amsterdam team (Molecular Carcinogenesis Programme, Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, The Netherlands):

Former members

The authors are especially grateful to:

We would also like to take this opportunity to thank the following for their support of ARP/wARP: the EMBL and the NKI, for hosting the research groups; the EMBL, for hosting the ARP/wARP download servers and remote computational infrastructure; funding agencies, for research and infrastructure grants; and our industrial users, for generating a license income which strengthens our ability to keep to our commitment to free distribution to the academic community.