SHELXL: Frequently Asked Questions


Problems/comments/suggestions: trs@shelx.uni-ac.gwdg.de

Last update: Fri Apr 16 19:00:04 CEST 2004


Q1: Where can I find some information to get started with SHELXL ?

Some useful information for starting protein refinements is in the ACA Workshop notes (file aca2000.doc on the SHELX ftp site).
A Tutorial is available at p1lys.html
A manual is available in a variety of formats. Please go to the SHELX Homepage to find out more about what's available.

Q2: How do I transfer Rfree flags from XPLOR to SHELXL ?

Use the 'y' option in SHELXPRO (Release 97-2). This should work for X-PLOR and CNS with .fob files with and without headers and with any number of lines per reflection.

Q3: How do I produce data-files containing I's or F's with the same reflections marked for Rfree ?

Use the following C-shell script: rfreeh.csh. Please check all numbers very carefully. The script has not been tested for space-group generality, among other things.


Q4: After using SHELXL for refinement using isotropic B's, my rms deviations are too high. What can I do ?

It might well be a good idea to reduce the esd on SIMU slightly. Theoretically you can use Rfree to select the best value, but in practice Rfree may not change enough. You should also look at the distribution of the B-values along typical side-chains. E.g. if the values oscillate widely along a lysine side-chain, then the restraints are too slack, but if they increase steadily - even if the increases are large - it would be chemically fully acceptable.


Q5: How can I move from program X/Y to SHELXL ?


Q6: Why are my R-values much higher after migrating from program X/Y to SHELXL ?

  • You might actually not be using the same model. Not everything that describes the model is in the pdb-file. A typical example is the set of numbers that describe a bulk-solvent correction. Not only are they not documented in the pdb-file, but the bulk-solvent corrections used by different programs can also be very different.
  • The weighting schemes used by different programs differ substantially and different weighting schemes will lead to different models. A particular problem arises if the errors on your data are grossly wrong. This will confuse SHELXL completely as it relies on the standard deviations of the intensities it finds in the hkl-file to determine the weights used for refinement.
  • You might not be using the same data. There are several possibilities to mess up when you convert data from one format to another. Just to name a few: different R-free flagging conventions; numbers corresponding to very high/low intensities are not transferred correctly due to format problems; flagging of unobserved reflections; format for negative intensities; ...
  • R-values might be calculated differently in different programs:
  • To escape from a local minimum reached by another program, it may help to use the STIR (STepwise Increase of Resolution) -command to let the structure go a bit.

    Manfred Weiss told me that, if you use the CCP4 program MTZ2VARIOUS, amplitudes are automatically squared if you give OUTPUT SHELX. Nasty trap ... This problem does not appear if you work with OUTPUT USER
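
To see why this trap matters, it helps to write down what squaring does to a reflection. The following is only a minimal sketch: the function name is made up for illustration, and the sigma formula is plain first-order error propagation, which is an assumption here and not necessarily what MTZ2VARIOUS does internally.

```python
def amplitude_to_intensity(f, sig_f):
    """Convert an amplitude F and its esd into an intensity I = F**2.

    sigma(I) = 2*|F|*sigma(F) is first-order error propagation; this is an
    assumption of the sketch, not a description of any particular program.
    """
    return f * f, 2.0 * abs(f) * sig_f


# An amplitude of 12.0 +/- 0.5 becomes an intensity with a very different scale.
intensity, sig_intensity = amplitude_to_intensity(12.0, 0.5)
print(intensity, sig_intensity)
```

If numbers on the intensity scale are then read as amplitudes (or the other way round), the refinement is run against the wrong data, which is one way to end up with unexpectedly high R-values.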


    Q7: How do I tell SHELXL about a Se-Methionine in my sequence ?

  • tell the program about the Se by adding it to the list of scattering factors, e.g:
    SFAC  C  H  N  O  S  SE
    UNIT  10100 19152 2772 3088 48 96
  • rename the respective MET-residues to, let's say, MSE.
  • add the appropriate restraints to the dictionary, e.g.:
    .
    .
    .
    DFIX_MSE 1.458 N CA
    DFIX_MSE 1.525 C CA
    DFIX_MSE 1.930 CG SE
    DFIX_MSE 1.930 SE CE
    DANG_MSE 2.401 O CA
    DANG_MSE 2.462 C N
    .
    DANG_MSE 2.881 CB SE
    DANG_MSE 2.504 C CB
    ..
    
    There is no warranty for the distances given here - let me know if you find better ones (trs@shelx.uni-ac.gwdg.de)
  • change the scattering factor number of the selenium atoms from 5 (the number normally used for sulphur) to 6.
  • carefully inspect the list of 'Disagreeable restraints' in the .lst-file for any inconsistencies.
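
The renaming step can be done by hand in an editor, but for many residues a small script may be less error-prone. This is a sketch only: the function name is hypothetical, it assumes fixed-column PDB records (atom name in columns 13-16, residue name in columns 18-20), and it renames the SD atom to SE so that the name matches the restraints above. It does not touch the element columns or the record type; check what your downstream programs expect.

```python
def met_to_mse(line):
    """Rename a MET atom record to MSE and its SD atom to SE.

    Assumes fixed-column PDB records. Lines that are not ATOM/HETATM
    records of a MET residue are returned unchanged.
    """
    if not line.startswith(("ATOM", "HETATM")) or line[17:20] != "MET":
        return line
    name = line[12:16]
    if name.strip() == "SD":
        name = "SE  "  # assumption: two-letter element names start in column 13
    # Keep the alternate-location character (column 17) and everything after
    # the residue name exactly as it was.
    return line[:12] + name + line[16:17] + "MSE" + line[20:]


record = "ATOM     10  SD  MET A   5      11.000  12.000  13.000  1.00 20.00"
print(met_to_mse(record))
```

The record length is preserved, so the coordinate columns stay where they were.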

    Q8: Is SHELXL really the best program to refine my structure against my 1.1 A data ?

    Yes. :-)
    People like to use it in particular for:
  • anisotropic refinement against very high resolution data.
  • estimation of parameter esd's
  • refinement against merohedrally twinned data
  • in the presence of severe disorder
  • refinement against Laue-data
  • refinement in presence of strong anomalous scattering
  • refinement of non-standard amino acids, substrates etc.

    Q9: Are there any references I could use to compare to my own refinement ?

    There are very many; here are a few:
  • Current Opinion in Structural Biology 1997, 7, 681-688
  • Current Opinion in Structural Biology 1998, 8, 730-737
  • Acta Cryst. 1998, B54, 443-449
  • Acta Cryst. 1999, D55, 1158-1167 and 1773-1784
  • JMB 1998, 282, 1043-1059
  • Nature 1998, 392, 206-209
  • JBIC 1999, 4, 162-165
  • Structure 1995, 3, 1159-1169
  • Structure 1996, 4, 1509-1515
  • Structure 1999, 7, 55-63

    Q10: Is SHELXL using standard pdb-format ?

    Yes it is. One possible source of confusion is the fact that SHELXL likes to call the oxygen atoms in the C-terminal carboxylate group OT1 and OT2 (pdb format wants O and OXT). But this nomenclature is only used for refinement purposes. In fact, at high resolution, the bonds between the C-terminal carbon and the carboxylate oxygens have different restraints from a 'normal' carbonyl group plus another oxygen. Depending on the pH, the bond lengths are the same or different - and this can and should be included in the restraints used. When the refinement is finished, SHELXPRO can be used to convert OT1 and OT2 back to the 'proper' names O and OXT.

    Q11: From the CCP4-BB (Anthony Addlagatta): 'Is there anybody who knows how to convert the SHELX map files to view in QUANTA?'

    Lisa Edberg: 'I wrote the O map file, and then converted it to ccp4 format using xdlmapman, then converted it in quanta. It's driving around the block a few times, but it works.'

    Q12: How do I print R-value versus resolution ?

    Use SHELXPRO, 'S' or 'L' option.

    Q13: SHELXPRO does not want to calculate my map and tells me: "** Memory too small for Fourier calculation **". What can I do ?

    George: You may be able to make the map fit in memory by reducing the number of grid points, but otherwise you will have to recompile SHELXPRO. For Linux I recommend the Portland FORTRAN compiler (http://www.pgroup.com) but it might be possible with g77 (unfortunately the O/TurboFrodo map file format forces me to use some non-standard FORTRAN and I'm not sure that g77 will like this). You will need to increase LA and the dimension of array A (to the same number). Be careful to set NOS to 4 for Linux or 3 for SGI in order to obtain the correct binary file format.

    To recompile shelxpro on most systems: f77 shelxpro.f -o shelxpro. Depending on the computer you may wish to optimize (e.g. -O2) but it is not really necessary (and if you overoptimize the results may be wrong).


    Q14: I have a very nicely refined model that I use for molecular replacement for a mutant structure that should be very similar. Using STIR 3.0 0.01 to a resolution of 2.0 A blows up. What can I do ?

  • do a rigid body refinement at 3.0 A before getting started.
  • start STIR at, let's say, 2.5 A. SHELXL restraints may be too weak to keep things together at 3 A.
  • remove all waters. Straying waters may confuse the minimizer. It is easy to put waters back using SHELXWAT. This is also a bit safer in terms of bias.
  • localize trouble-makers by looking at the atoms giving the maximum shifts (in .lst-file).

    Q15: I have two molecules in the a.u. each containing a heme group and data to 1.1 A. What shall I do ?

  • divide your data into a work and a test set. As there is NCS, the division should be done in thin shells.
  • forget about NCS-restraints at this resolution.
  • use the HEME-parameters from the Cytochrome C6 structure by Frazao et al. (1ctj.pdb).
  • first refine isotropically against all data. Maybe leave out suspicious lowres reflections (SHEL or OMIT commands). Do not refine any multiple conformations at this stage.
  • use PLAN 200 -1 0.1 to get all Fo-Fc peaks. I like to go through these (higher than 4.5-5.0 sigma) to find interesting places in my model.
  • check whether you find any signs of anisotropy in your best isotropic maps, i.e. doughnuts around sulphurs or irons. This is fun and shows that anisotropic refinement is necessary.
  • put in double conformations as described elsewhere in this FAQ.
  • try to find a computer with a lot of memory to do a full matrix inversion at the end of the refinement.
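
The 'thin shells' division from the first point can be sketched in a few lines. This is an illustration under stated assumptions, not a replacement for whatever your data-processing software offers: the function name and defaults are made up, reflections are represented only by their resolution, and shells are built with equal numbers of reflections. The idea is that whole shells go into the test set, so that reflections at nearly the same resolution are not split between work and test set.

```python
import random


def thin_shell_test_set(d_spacings, n_shells=100, fraction=0.05, seed=0):
    """Return indices of reflections flagged for Rfree, chosen as whole shells.

    d_spacings: resolution (in A) of each reflection, in file order.
    n_shells, fraction and seed are illustrative defaults, not recommendations.
    """
    # Sort reflection indices from low to high resolution.
    order = sorted(range(len(d_spacings)), key=lambda i: d_spacings[i], reverse=True)
    size = max(1, len(order) // n_shells)
    shells = [order[i:i + size] for i in range(0, len(order), size)]
    # Pick whole shells at random until roughly `fraction` of them are used.
    rng = random.Random(seed)
    n_pick = max(1, round(fraction * len(shells)))
    picked = rng.sample(range(len(shells)), n_pick)
    return sorted(i for s in picked for i in shells[s])


d = [1.1 + 0.001 * i for i in range(2000)]
test_set = thin_shell_test_set(d)
print(len(test_set), "of", len(d), "reflections flagged")
```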

    Q16: Should I refine occupancies of water molecules ?

    We usually keep the water occupancies fixed at 1.0. With high resolution data we use occupancies fixed to 1.0 or 0.5 (put 11.0 or 10.5 in the .ins-file), and allow the U-values to refine. In cases (with very high resolution data) where we were able to compare this procedure with refining U's and occupancies simultaneously, Rfree was usually lower for the 1.0/0.5 model. For this reason SHELXWAT can also apply the 1.0/0.5 model but not refine the occupancies freely.
    However, non-fully occupied waters are only 'allowed' if:
  • they are coupled to a disordered part of the protein (then you can refine the occupancy together with the occupancy of the respective part of the protein using PART and FVAR), or
  • they form a cluster of at least 3 or 4 half waters.

    Q17: How do I make sure that the correct scattering factors are used ?

    Depending on the wavelength you provide on the CELL card, SHELXL will select the radiation as CuKa, MoKa or AgKa, whichever is nearest. If you are not happy with this, use a DISP instruction to specify the correct values for the wavelength you are using, or use the full form of the SFAC instruction. Both instructions are described in the manual !

    Q18: When I try to refine my protein as a rigid body using AFIX -statements, I get 'NON POSITIVE DEFINITE' messages. What is going on ?

    You probably forgot to put a BLOC 1 statement to fix the B-factors and only refine coordinates.

    Q19: Can you send us the correct scattering factors for ions (Ca2+ and Mg2+) ?

    George: We NEVER use ionic sfacs, either for macromolecules or small molecules. Extensive tests have shown that the neutral atom scattering factors are just as good in all cases. Any differences get mopped up by the thermal parameters anyway. Also see page 7.2 of the manual.

    Q20: When I use OMIT_* $H, I get an error-message: ** CB_306 CANNOT RIDE **. What is wrong ?

    For an OMIT-job, SHELXL only sets the scattering factor of the omitted atoms to 0. These atoms nevertheless stay in the model, so the AFIX-statements have to be in order. In the case shown above, a closing AFIX 0 statement is probably missing just before the CB-atom.

    Q21: What about the HOPE command ?

    HOPE MUST NOT be used simultaneously with anisotropic B factor refinement. This does not make sense ! If you want to use it, refine the HOPE parameters with an isotropic model and then fix them (HOPE -1).

    Q22: When I include my high resolution data, SHELXL produces nan's in some places and then crashes. What is going on ?

    If you have very pronounced anisotropy, some of the parameters in the HOPE correction will go negative (like atoms going Non Positive Definite in ADP refinement). This will cause all kinds of problems in the rest of the program. It is also more likely to happen for the high resolution reflections, as these are the ones most affected by anisotropy.

    There is a special version of SHELXL that circumvents this problem. Let us know if you need to use this version.


    Q23: How much memory do I need for the matrix inversion ?

    A typical example is the full matrix inversion for the structure of mersacidin. Matrix Inversion for 7438 parameters against 33449 observables took 112 MByte of memory and took about 6 hrs. on a Pentium II PC running at 450 MHz. If SHELXL complains about too small arrays, use SHELXH ('shelx huge') to run the job. If that doesn't help, you have to recompile (which is easier than most people think ...).
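
A rough way to judge beforehand whether a job will fit is to count matrix elements. The sketch below assumes that one triangle of the symmetric least-squares matrix is held in memory as 4-byte reals; this storage scheme is an assumption made for the estimate, not a statement about SHELXL's internals, and the function name is made up. For the mersacidin example it lands close to the figure quoted above.

```python
def triangle_megabytes(n_params, bytes_per_element=4):
    """Memory for one triangle of a symmetric n x n matrix, in MByte (10**6 bytes).

    bytes_per_element=4 (single precision) is an assumption of this estimate.
    """
    n_elements = n_params * (n_params + 1) // 2
    return n_elements * bytes_per_element / 1e6


print(round(triangle_megabytes(7438), 1))  # 110.7
```

Note that the requirement grows with the square of the number of parameters, so doubling the model size needs about four times the memory.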

    Q24: How do I set up the matrix inversion job ?

  • remove all restraints (SIMU,DELU,ISOR,DFIX,DANG,CHIV,BUMP,SUMP,FLAT). Make sure that you also catch the ones written in lower-case letters. If all your restraints are written in capital letters you can run the following script to produce the necessary res-file: prep_ls.csh (NO WARRANTY !) Let me know if something doesn't work (trs@shelx.uni-ac.gwdg.de).
  • Remove CGLS whatever
  • Put L.S. 1
  • Put DAMP 0 0
  • Before you start the job, make sure that you understand the number of parameters fully (see next question). If this is not the case, you may need to rerun it because the results are not consistent.
  • Make sure that all restraints were REALLY turned off. ... obviously they were not if you still get a list of 'Disagreeable restraints' ... You may still have a restraint left in polar space groups - this is used for fixing the origin.
  • If a job inverting the matrix for all parameters is too big for your computer or takes too long to finish, invert the matrix only for those parameters you are most interested in, e.g. coordinates. Use the BLOC command to do this.

    Q25: How do I make sense of the numbers of parameters ?

    To be sure about what you are doing, it is sometimes useful to try to understand the number of parameters used in refinement as given by SHELXL.
  • If you are doing isotropic B factor refinement, each atomic site corresponds to 3 (coordinates) plus 1 (B factor) parameters.
  • If you are doing anisotropic B factor refinement, each atom corresponds to 3 (coordinates) plus 6 (the 6 independent elements of the displacement parameter matrix) parameters.
  • The SHELXL bulk solvent correction corresponds to 2 parameters
  • There is always one extra parameter for the overall scaling of the observed to the calculated intensities ("overall scale factor", OSF).
  • If you have treated parts of the protein as rigid groups, each group will contribute only 6 parameters for coordinate refinement. E.g. forgotten AFIX 66 statements (to fix the geometry of aromatic rings) will lead to an unexpectedly low number of parameters.
  • Certain hydrogen mounting statements (HFIX mn) create parameters, e.g. for refining the torsion angle of a methyl group.
  • Using the twin card will introduce additional parameters depending on how many twin components you are dealing with.
  • If you are confused, use MORE 3 to inflate the lst-file. With MORE 3, the resulting file will contain a list of every single parameter, so that you can check.
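
The bookkeeping in the list above can be written down as a small calculation. This is a sketch only: the function and argument names are made up, and everything that does not follow the simple per-atom rules (rigid groups, HFIX torsion parameters, twin components, free variables) has to be supplied by hand through the `extra` argument, which may be negative.

```python
def count_parameters(n_iso, n_aniso, bulk_solvent=True, extra=0):
    """Estimate the number of refined parameters.

    n_iso:   atoms with isotropic B (3 coordinates + 1 B each)
    n_aniso: atoms with anisotropic B (3 coordinates + 6 Uij each)
    extra:   hand-counted correction for rigid groups, HFIX torsions,
             twin components, free variables etc. (may be negative)
    """
    total = 4 * n_iso + 9 * n_aniso
    total += 1  # overall scale factor (OSF)
    if bulk_solvent:
        total += 2  # the two bulk-solvent parameters
    return total + extra


print(count_parameters(n_iso=150, n_aniso=800))
```

If the number printed by SHELXL differs from such an estimate, the difference usually points to one of the special cases in the list.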

    Q26: What is the number below the atom name after running a least-squares matrix inversion ?

    The number under the atom name is the 'radial atomic positional esd'. It characterizes the error in the coordinates of an atom taking into account the variance of the coordinates along the crystallographic axes and the covariances (i.e. the off-diagonal elements of the inverse of the Least-Squares matrix) between them. The exact formula for how this is done is only documented in the source code. If you want to calculate an error along a specific direction you should divide this number by sqrt(3), as in such a case the errors perpendicular to that direction do not matter.
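
As a worked example of the sqrt(3) rule: the sketch below converts a radial esd into an esd along one direction and, as a further step that is an assumption of this sketch (errors of the two atoms taken as uncorrelated and isotropic), combines two such values into a rough esd for the distance between two atoms. The function names are made up; for proper bond esds use the values SHELXL derives from the full covariance matrix.

```python
import math


def directional_esd(radial_esd):
    """Esd along one specific direction, from the radial positional esd."""
    return radial_esd / math.sqrt(3.0)


def rough_distance_esd(radial_esd_1, radial_esd_2):
    """Rough esd of an interatomic distance.

    Assumes uncorrelated, isotropic errors on both atoms; this ignores the
    covariances that a full-matrix calculation would take into account.
    """
    return math.hypot(directional_esd(radial_esd_1), directional_esd(radial_esd_2))


print(round(directional_esd(0.03), 4))
print(round(rough_distance_esd(0.03, 0.04), 4))
```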

    Q27: Some Waters are moving away from the centre of density toward the corner of the density. What is going on ?

    George: We have sometimes seen atoms move to the edge of density in a high resolution refinement. We attribute this to (a) Fourier series termination errors because of missing reflections (including Rfree reflections), and (b) the weights for refinement are of necessity different to those used to make a Fourier map (which theoretically requires unit weights). In either case these refined positions are the 'correct' ones; the map is misleading. However, anti-bumping (BUMP) and other geometric restraints can also move atoms out of density; in this case you need to check whether you need to include partial occupancies. BUMP is not applied if the occupancy sum is less than 1.1.

    Q28: My substrate is pushed away by anti-bumping restraints that do not make sense to me. What is happening ?

    Make sure that you have chosen the correct scattering factor number for all atoms in the substrate.

    Short distances may be for real if the substrate is not fully occupied. You can model this situation using multiple conformation (see other questions in the FAQ).


    Q29: Our data/parameter ratio is about five, can we get rid of some restraints ?

    George: We always retain restraints for the protein. For those parts of the structure that are well determined from the data, the data will carry a much higher weight than the restraints. Leaving them in handles the less well determined parts of the structure (e.g. Glu sidechains with high B-values and disordered regions in general) without needing to take special individual action for each of them. We usually leave the restraints off for metal ions but not for organic substrates; the latter may have high B-values or partial occupancies and so be in need of restraints. If the structure of the substrate is in the CSD (or can be determined by doing a small-molecule structure), you can use SHELXPRO to generate the restraints automatically.

    Q30: What should be the final R-values of a protein structure at atomic resolution ?

    Depends on the problem ... Given good data (high completeness, reasonable redundancy, I/sig(I) approx 2 in outer shell (with sigmas estimated correctly)), Rwork should be below 12% with Rfree not much higher, i.e. delta(Rwork,Rfree) < 3.0 percent. There are a few structures with Rfree lower than 10% and many with Rfree larger than 15%.
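
For readers who want to check such numbers on their own lists of amplitudes, here is a minimal sketch of the conventional residual R = sum(||Fo| - |Fc||) / sum(|Fo|) and of the gap between test and work set. The function names are made up and the amplitudes are toy values; which reflections a given program includes in its printed R-values (e.g. sigma cut-offs) is not covered here.

```python
def r_factor(f_obs, f_calc):
    """Conventional residual: sum(||Fo| - |Fc||) / sum(|Fo|)."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    return numerator / sum(abs(o) for o in f_obs)


def r_gap_percent(r_work, r_free):
    """Difference Rfree - Rwork, in percent."""
    return 100.0 * (r_free - r_work)


# Toy amplitudes, for illustration only.
r_work = r_factor([10.0, 20.0, 30.0], [9.0, 22.0, 30.0])
r_free = r_factor([15.0, 25.0], [14.0, 27.0])
print(round(r_work, 3), round(r_free, 3), round(r_gap_percent(r_work, r_free), 1))
```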

    Q31: Is it possible to refine a structure on two or more data sets measured at two or more wavelengths?

    Yes, using the LAUE and HKLF 2 instructions it is possible to have a different wavelength for each reflection.

    Q32: SHELXL-93 gave esd's for bond angles involving riding hydrogen atoms, SHELXL-97 doesn't - is this a bug or a feature ?

    George: The program SHELXL-93 calculated esds involving riding H-atom in a way that was mathematically correct, assuming that the magnitude and direction (but not position) of the X-H vector were exactly fixed; the resulting esds thus depended on the combined scattering power of H and X rather than just X. This gave rather small H-atom esds that confused people, so I made the 97 version output zero esds for riding hydrogens. If you want realistic esds for hydrogens you have to refine them freely (not recommended for proteins)!

    Q33: How do I include the anomalous signal of Se-atoms into my refinement ?

    Normally, macromolecules are refined with MERG 4, i.e. all symmetry equivalents and Friedel-pairs are merged and all f" values are set to 0.0. If there is a strong anomalous scatterer, like Se in a seleno-methionine substituted protein, it may be better to keep the Friedel mates apart and explicitly take the anomalous signal into account. To properly treat the anomalous signal you have to provide f' and f" values to SHELXL by using the DISP command.
    It may be worth playing with different f' and f" values to get the best results. I am interested in this myself; so if you make any interesting observations in this respect, please let me know: trs@shelx.uni-ac.gwdg.de

    When you refine against anomalous data and you are using Rfree, make sure that for all Friedel pairs either none or both reflections are in the test set (this is a case similar to NCS, where correlated reflections should be all in the test set to avoid artificially low values for Rfree).
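
One way to keep Friedel mates together is to assign the Rfree flag per pair rather than per reflection. The following is a sketch under stated assumptions: the function names are made up, reflections are plain (h, k, l) tuples, and only the Friedel mate (-h, -k, -l) is considered; symmetry equivalents of the mates depend on the space group and are not handled here.

```python
import random


def friedel_key(hkl):
    """Return one canonical index shared by a reflection and its Friedel mate."""
    h, k, l = hkl
    return max((h, k, l), (-h, -k, -l))


def flag_friedel_pairs(reflections, fraction=0.05, seed=0):
    """Map each reflection to True (test set) or False (work set).

    Both members of a Friedel pair always receive the same flag.
    fraction and seed are illustrative defaults.
    """
    rng = random.Random(seed)
    keys = sorted({friedel_key(r) for r in reflections})
    n_test = max(1, round(fraction * len(keys)))
    test_keys = set(rng.sample(keys, n_test))
    return {r: friedel_key(r) in test_keys for r in reflections}


refl = [(1, 2, 3), (-1, -2, -3), (2, 0, 1), (-2, 0, -1), (0, 4, 0)]
flags = flag_friedel_pairs(refl, fraction=0.4)
print(flags)
```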


    Q34: I have been warned that SHELXL was not really written to refine at 2.4 A resolution. Can I do it anyway ... ?

    Maybe ... you will need to decrease the sigma on some restraints to keep the model in order. For example, for SIMU restraints the default of 0.1 is probably too loose and should be set to something between, let's say, 0.05 and 0.02. This will probably lead to an increase in the R-value, but may be beneficial to your free R-value (and your model :-)).

    Q35: After a few small changes, my refinement jobs against 0.83 A data all of a sudden always blow up with an error message ** REFINEMENT UNSTABLE **. How can I fix this ?

    In many cases, unstable refinements are difficult to fix. Sometimes removing the atoms giving rise to the maximum shifts will improve things. If this does not work, I normally use a STIR instruction of, let's say, STIR 1.1 0.05. This has helped in most cases.

    Q36: SHELXL complains that I have too many atoms on a single SIMU/DELU instruction. What can I do ?

    George: This limit is more or less impossible to increase, but you can simply split the offending DELU (and possibly SIMU) instructions, e.g.

    DELU N_1001> OT2_9999 ! One chain per DELU, identify by first and last atoms
    DELU O_1 > O_999 ! Waters, cations etc. - can split further if required

    When I originally wrote the program for small molecules, I thought that this limit would be fairly safe, but it never pays to hardwire such limits in, however much easier it makes writing the code.


    Q37: I have to do a refinement of a protein containing some unusual cofactors and ligands. What is the best approach to create a parameter file ?

    You can use the 'J' option in SHELXPRO to create SHELXL restraints for any fragment for which you have coordinates in a file in PDB, CSD or SHELXL ins-format.
    If you want, I'd be happy to put your parameter set on our web site.

    Q38: How do I restrain an SCN molecule to be straight ?

    It is difficult to restrain a linear molecule to be straight. A very small sigma on the DFIX and DANG restraints involved will make the molecule almost straight. But as the restraint is not very sensitive to small deviations from linearity, there will always be small distortions. Making the sigma extremely small (i.e. < 0.001 or so) will not have any effect, as SHELXL has an internal cut-off for small sigmas to avoid problems during minimization.

    The best solution probably is to put SCN into the refinement as a rigid group with ideal geometry (from CSD or other source) and only refine the translational and rotational parameters of that group (AFIX 6, manual page 7-15). Refinement with AFIX 9 (page 7-15) would allow the bondlengths to shrink or expand uniformly - this would give the model a chance to take librational effects etc into account (not really, but better than nothing ...).


    Q39: Any suggestions for restraints for glycerol ?

    Here is what I (TRS) use:
    DFIX_ALK  1.417 CA OA CC OC             ! Engh&Huber CH2E-OH1
    DFIX_ALK  1.530 CA CB CB CC             ! Engh&Huber CH1E-CH2E
    DFIX_ALK  1.433 CB OB                   ! Engh&Huber CH1E OH1
    DFIX_ALK  2.431 CB OA CB OC CA OB CC OB ! shelxl DFIX_SER CA OG
    
    ... no guarantees, as always ...

    Q40: How do I straighten out an azide molecule ?

    George: There is no easy way to restrain it effectively so I recommend making the esds of the DFIX and DANG instructions for the azide very small (e.g. 0.001). For the record, there is a difficult way involving 9 extra free variables and three SUMP restraints. FLAT doesn't help - all groups of three atoms are coplanar - so it can be removed.

    Q41: How do I generate a restraint across a symmetry element ?

    George: To apply a restraint across a symmetry element you will need to specify an EQIV instruction and then refer to the symmetry generated atom(s) using _$1 etc. For example, if a disulfide bond involving SG of Cys29 and its symmetry equivalent is bisected by a crystallographic twofold axis at 0.5, y, 0.5 you need to specify:

    EQIV $1 1-x, y, 1-z
    DFIX_29 2.031 SG SG_$1
    DANG_29 3.035 CB SG_$1 SG CB_$1

    For the disorder you should specify PART numbers and you should use a free variable for the occupancies (starting value on the FVAR instruction, e.g. 21 and -21 instead of 11 for occupancy) so that you can constrain the sum of occupancies to one. This is all fully explained in the documentation!
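
The numbers 11, 21 and -21 follow the general SHELXL convention of writing a parameter as 10*m + p. As far as I understand the manual, m = 0 means 'refine freely starting at p', m = 1 means 'fix at p', m > 1 means 'p times free variable m' and m < -1 means 'p times (free variable |m| minus 1)'. The sketch below decodes such values; the function name is made up, and the manual remains the authority on the details.

```python
def decode_parameter(code, fvar=None):
    """Interpret a SHELXL-style parameter written as 10*m + p with |p| < 5.

    fvar: dict mapping free-variable number to its current value
          (only needed for tied parameters).
    Returns a tuple (kind, value).
    """
    m = round(code / 10.0)
    p = code - 10.0 * m
    if m == 0:
        return "refined", p
    if m == 1:
        return "fixed", p
    if m > 1:
        return "tied", p * fvar[m]
    if m < -1:
        return "tied", p * (fvar[-m] - 1.0)
    raise ValueError("m = -1 has no meaning in this sketch")


free_variables = {2: 0.66}  # free variable 2, e.g. the occupancy of PART 1
print(decode_parameter(11.0))
print(decode_parameter(21.0, free_variables))
print(decode_parameter(-21.0, free_variables))
```

With free variable 2 at 0.66, the codes 21 and -21 decode to 0.66 and 0.34, so the two occupancies always add up to one.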


    Q42: How do I generate restraints for a linear arrangement of atoms, i.e. a cyano-group ?

    There is no standard way to generate such restraints in SHELXL. Here is the recommendation from George: To make a cyano group into a rigid group, put it immediately after the atom to which it is attached, put an AFIX 6 instruction before that atom and AFIX 0 after the N. If there are hydrogens attached to the first atom, the AFIX instructions should be combined using 5 as the code for a subsequent atom in a rigid group, e.g.
    AFIX 6
    C12 ...
    AFIX 25
    H12A ...
    H12B ...
    AFIX 5
    C ...
    N ...
    AFIX 0
    
    The initial geometry of the C12-C-N group will be retained, so it is important to start from an accurate geometry, obtained e.g. by a rigid group fit of an accurately determined small molecule structure (either using FRAG...FEND etc. in SHELXL or by some other program).

    The 'standard restraints' are unlikely to be adequate for most ligands; they will often leave parts of the ligand unconstrained and it will fly apart. The recommended procedure is to search the Cambridge database for a suitable small molecule structure (or if there isn't one, actually grow crystals of the ligand and determine its structure - SHELX works for small molecules too!) then use the J option in SHELXPRO to generate restraints from the atom coordinates. You may still need to add some FLAT and CHIV 0 restraints by hand, but usually it is obvious where.


    Q43: When should I put Hydrogen atoms ?

    Don't put hydrogens before the very end - they do not improve the phases much, but cost a lot of computer time. Do not put the hydrogens you want to see (e.g. the ones on histidines). It does not count to first put them, remove them and then proudly present Fo-Fc density for hydrogens. There is a bias-problem, even for these small atoms.

    Q44: Can I put Hydrogens if I have good data to 1.6 A ?

    George: If you have good 1.6A data you can expect to 'see' some of the hydrogens. In general these will be the N-H, CH and CH2 groups in which the N or C atom has a low temperature factor. There is no harm in putting them in if they reduce Rfree, because they do not add any extra parameters.

    Q45: Will putting Hydrogens introduce more parameters into my refinement ?

    George: Adding hydrogens using SHELX improves the model and does not add any extra parameters, so the only reason not to do so is that it costs computer time. For this reason we tend to add them late in the refinement, after making the atoms anisotropic (if justified) and modelling disorder. This has the advantage that the program usually adds the right hydrogens automatically (using HFIX) even for disordered residues.
    However it is better NOT to add the hydrogens on -OH groups (Tyr, Ser and Thr) because (a) it is rare that they can be seen in difference maps, (b) they have little effect on the R-values etc., (c) it is difficult for the program to predict their positions accurately, (d) if the program accidentally assigns two hydrogens to the same H-bond and then they are refined using the 'riding model' (AFIX 83), and antibumping restraints are switched on (BUMP), the repulsion between the hydrogens can introduce mechanical distortions into the structure.
    Adding hydrogens to proteins usually reduces R1 and R1(free) by the same amount, typically 0.5 to 1%. The effect can be bigger for accurate data from small molecules, recently we had an example where adding the hydrogens reduced R1 from 10.6% to 3.7%!

    Q46: How do I attach hydrogens to waters ?

    George: If you can see them in a difference map then you can insert the peaks and refine them with DFIX or SADI restraints for O-H and H...H. If you can't see them I doubt if it is worth trying to guess the positions, someone might believe them.

    Q47: I have 0.95 A data and I can see hydrogens in 2Fo-Fc maps. How do I make hydrogens anisotropic ?

    You don't ... :-). Even for small molecules, we NEVER refine hydrogens anisotropically.

    Q48: When I add hydrogens, SHELXL shows a warning "** BAD AFIX CONNECTIVITY: N_1 BONDS TO CA_1 CD_1 **". I did specify the N terminus when I ran shelxpro in the first round of refinement and I checked that I had the HFIX 33 N_1 statement before all other HFIX instructions. What else should I do?

    Your first residue is probably a proline, for which HFIX 33 does not make sense. To find the appropriate HFIX statements, try to find something sensible in some other place in the ins-file and use it for the Proline.

    Q49: The pdb file generated using the 'G' option (from .res) in shelxpro doesn't have a distinction for e.g. the three gamma hydrogens of Ile (they are all named 'HG2'). Bug or Feature ?

    George: 'It is a problem with the PDB format, because only three characters may be used for the name (an extra character is reserved for 2-letter element names). Partly for this reason, we never deposit hydrogens, most programs that need them recalculate their positions anyway.'

    Q50: After adding hydrogens, SHELXL complains: "** BAD AFIX CONNECTIVITY: N_1 BONDS TO CA_1 **". What is happening ?

    The program is trying to make the terminal N into an amide instead of NH3+, which doesn't work because it only bonds to one atom. Maybe you didn't specify the N terminus when you ran SHELXPRO? The quick solution is to include: HFIX 33 N_1 before any other HFIX instructions. The program always applies the first HFIX that is appropriate to a given atom.

    Q51: The hydrogens in my pdb-file have B-values of -94.75 A^2 - what is going wrong here ?

    When refined as 'riding hydrogens', the number in the B-factor field of the .res-file is -1.2. If this value is not interpreted correctly (i.e. as constraining the B-value of the hydrogen to be 1.2 times the B-value of the parent non-hydrogen atom), you get these strange numbers: -1.2 x 8 x pi^2 = -94.75. The best is to simply ignore such B-values.
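
The arithmetic is easy to reproduce. The sketch below shows both the naive conversion B = 8 pi^2 U applied to the code -1.2 and the intended reading of that code as a multiplier on the parent atom's B-value. The function names are made up for illustration.

```python
import math


def u_to_b(u):
    """Convert a displacement parameter U (A^2) to B = 8 * pi^2 * U."""
    return 8.0 * math.pi ** 2 * u


def riding_hydrogen_b(parent_b, code=-1.2):
    """Intended meaning of a negative riding code: |code| times the parent B."""
    return abs(code) * parent_b


print(round(u_to_b(-1.2), 2))           # -94.75, the 'strange' value
print(round(riding_hydrogen_b(20.0), 2))
```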

    Q52: When should I start modelling disordered atoms ?


    Q53: How can I find disordered parts of my protein ?

    Look in the diagnostics tables produced by SHELXL for:

    Q54: How do I get a disordered residue under control ?

    to model a disordered sidechain:
    1. Group everything from CB onwards by using a PART instruction.
    2. Reset B's to isotropic.
    3. Set occupancies to 0.66 - the first conformation you find is usually the major one. Therefore 0.66 is a good starting point.
    4. Refine for a few cycles starting with a few cycles of isotropic refinement then anisotropic (use the second parameter of the CGLS statement for this). If you are using the second parameter of CGLS for Rfree, forget about the isotropic refinement or do it in a separate job if necessary.
    5. Try to pick up the second conformer in 1fo-1fc maps.
    6. Put in the second conformer and couple its occupancy to the first by using free variables (FVAR). The 'u'-option in SHELXPRO is useful for this.
    7. If you use the correct nomenclature and PART instructions, SHELXL will put in the necessary restraints automatically.

    Q55: What do I do with a sidechain for which the electron density is a complete mess ?

    We generally try to model all disorder, but if we really can't see anything we truncate the sidechain.
    SHELXL usually doesn't mind this, except that it may not be able to place hydrogen atoms, e.g. if you cut at CB, the program cannot make an idealized CH2 out of it. The remedy is either to change the name of the residue to ALA (very confusing !) or to override HFIX with say HFIX_resnum 0 CB (or HFIX_resnum 3 CB), which must come before the original HFIX. The latter is the better.

    Q56: The most important atom in my structure is the OG of a Serine. It is not really in one position, but no matter what I do I cannot find any density for a second conformer. I would like to locally release the SIMU restraint to let the OG loose a bit. But it does not work ...

    This is a bit tricky. If there are two SIMU restraints addressing the same pair of atoms, the program keeps whichever allows the smallest deviation. There is a trick to get around this. Let's say your Ser has residue number 5. Then you should: This will give the following local restraint:
    0.6000   CB_5 - OG_5
    
    The 0.6 is used here because OG is a terminal atom.

    Q57: I have a sidechain in two alternate positions. A water molecule is consistent with position one, but will clash with position two. What can I do ?

    You need to couple the water and the tyrosine disorders using PART numbers and use a free variable for the occupancy.

    Q58: I have a disulfide that looks like it might only be partially in the reduced state. How can I model it ?

    Thomas Pape in our lab had the same problem. He used PART 1 for the unbroken part, PART 2 (for which restraints are automatically generated) for the part that was broken but still in the disulphide position, and PART 3 for the part of the disulphide bridge that was broken and had moved away. This way, atoms in PART 3 are not affected by the disulphide restraints.

    You may need to create some extra restraints for PART 3 but the rest should work automatically.

    Hope this is not too confusing ...


    Q59: I have a Lysine in my structure that sometimes forms a Schiff base and sometimes not. How can I model this ?

    This is one of the rare cases where you need to apply different restraints to different conformations of the same side chain (normally, if you have to do this, something is fishy ...).

    As there is currently no way to put restraints specifically for PART identifiers, you will have to rename the two conformations into artificial residues, let's say, ALYS and BLYS, for which you can then specify all the restraints you need. Having ALYS for the non-Schiff base Lysine will also allow you to use special non-standard Lysine restraints for that particular amino acid.


    Q60: I managed to partially photoactivate my crystal, can I somehow model the ground state and the activated state simultaneously ?

    George: In principle you can make the whole of the ground state PART 1 and the whole of the excited state PART 2. The occupancies can be refined in the usual way with free variables. Either PART can be fixed or refined freely using AFIX instructions. It becomes appreciably more complicated (but still possible) if you wish to model disordered side-chains as well (you need to use extra PART numbers, plus some BIND instructions) or when you include hydrogens (you need to modify the AFIX instructions [that are generated from HFIX] by hand if you wish to keep parts of the structure fixed).

    Although this all works - we recently refined a 20 amino-acid peptide for which the whole molecule was in two separate conformations on top of one another - it is not trivial and could result in an awful mess.


    Q61: How can I interface with O ?


    Q62: How are alternative conformations marked in the pdb-files written by SHELXL ?

    SHELXL-97 strictly obeys pdb-rules for alternative conformations, i.e. a disordered serine would look like:

    ATOM    749  N   SER    98     -47.798  18.206  40.432 1.000 10.05
    ATOM    750  CA  SER    98     -47.369  18.674  39.124 1.000 10.02
    ATOM    751  C   SER    98     -46.129  17.884  38.664 1.000  8.53
    ATOM    752  O   SER    98     -45.304  17.445  39.458 1.000 10.25
    ATOM    753  CB ASER    98     -46.887  20.145  39.241 0.456 10.19
    ATOM    754  OG ASER    98     -47.931  20.977  39.669 0.456 10.51
    ATOM    755  CB BSER    98     -46.911  20.164  39.247 0.544 10.53
    ATOM    756  OG BSER    98     -46.430  20.638  37.992 0.544 12.85
    
    the identifiers (" ",A,B,...) can be mapped 1-to-1 to the PART instruction used in input files for refinement.
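
    As a rough illustration, the following Python sketch reads the alternate-conformation identifier from column 17 of an ATOM record and turns it into a PART number. The mapping used here (blank = PART 0, A = PART 1, B = PART 2, ...) is our assumption of what the 1-to-1 correspondence looks like, and the function name altloc_to_part is invented for this example.

```python
def altloc_to_part(pdb_line):
    """Return the assumed PART number for one ATOM/HETATM record."""
    altloc = pdb_line[16]          # column 17 holds the identifier
    if altloc == " ":
        return 0                   # ordered atom, assumed PART 0
    return ord(altloc.upper()) - ord("A") + 1


records = [
    "ATOM    752  O   SER    98     -45.304  17.445  39.458 1.000 10.25",
    "ATOM    753  CB ASER    98     -46.887  20.145  39.241 0.456 10.19",
    "ATOM    755  CB BSER    98     -46.911  20.164  39.247 0.544 10.53",
]
for rec in records:
    print(rec[12:16].strip(), "PART", altloc_to_part(rec))
```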

    Q63: How can I work with alternative conformations in O ?

    The easiest way to deal with double-conformations is probably to read in two molecules each corresponding to one conformation, work on them as usual, save both of them, and then do some manual cutting and pasting ...

    Following is a shell-script to make two pdb-files out of one (written for SHELXL-97 output, but easily changeable for other formats).

    #!/bin/csh -f
    #
    # filename: pdb2pdbo.sh
    #
    # convert SHELXL96-pdb 
    #   to something useful for O
    #
    # Thomas Schneider     2-Feb-96
    #########################################
    #
    # some cleanup on pdb-file
    # ------------------------
    grep -v ANIS $1.pdb  \
    | grep -v "W   HOH" \
    | grep -v ^CRYST \
    |  awk '(substr($0,14,1)!="H")&&(substr($0,13,1)!="H") {print}' \
    > tmp1.pdb
    #
    # extracting Cryst1 card
    #
    echo Extracting CRYST1 from pdb-file
    head -10 $1.pdb | grep ^CRYST1 > t_cryst
    cat t_cryst
    #
    # separating multiple conformations
    # -------------------------------
    nawk ' \
      BEGIN {OFS=""}\
      {s1=substr($0,16,1);s2=substr($0,17,1)} \
      (s2=="A") {print substr($0,1,16)," ",substr($0,18,50)} \
      (s2==" ") {print} \
    ' tmp1.pdb > tmp1_A.pdb
    nawk ' \
      BEGIN {OFS=""}\
      {s1=substr($0,16,1);s2=substr($0,17,1)} \
      (s2=="B") {print substr($0,1,16)," ",substr($0,18,50)} \
      (s2==" ") {print} \
    ' tmp1.pdb | grep -vi WAT > tmp1_B.pdb
    foreach v (A B)
      nawk ' \
        BEGIN {OFS=""} \
        {print substr($0,1,21),"'${v}'",substr($0,23,45)} \
      ' tmp1_${v}.pdb > tmp2
      cat t_cryst tmp2 > $1.pdb${v}
    end
    
    BTW: using the undocumented 'BUD' format when writing coordinates will write fractionals :-) .

    Q64: Why can't SHELXL use 0/1, 1/0, 0-999/0-999 to flag reflections belonging to the free/work set ?

    A word from the author of the program: 'One line of the .hkl file contains h,k,l,I,sigma(I) and (optionally) a 'batch number'. The remaining space on the card (!) is reserved for the direction cosines. This format has remained unchanged since the late 1960's (Microsoft and CCP4 please note) and under no circumstances whatsoever will I agree to a change that could introduce an incompatibility (SHELX users appreciate my obstinacy in such matters). The batch number is normally little used (it was originally designed for film data) except for non-merohedral twins (HKLF 5 format), in which case it becomes the component number. The only reasonable and compatible way that I could introduce an Rfree flag was to make the 'batch number' negative for the reference set and positive for the working set. Since the default batch number is 1, this means that the Rfree flag is usually +1 or -1.'
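
    To make the convention concrete, here is a minimal Python sketch that reads and writes one line of an HKLF 4 file and uses the sign of the batch number as the Rfree flag. It assumes the fixed-width layout (3I4,2F8.2,I4) given in the SHELX manual; the function names are invented for this example and are not part of any SHELX program.

```python
def parse_hkl_line(line):
    """Split one HKLF 4 line, assumed to be laid out as 3I4, 2F8.2, I4."""
    h, k, ell = (int(line[i:i + 4]) for i in (0, 4, 8))
    intensity = float(line[12:20])
    sigma = float(line[20:28])
    batch_field = line[28:32].strip()
    batch = int(batch_field) if batch_field else 1   # default batch number
    return h, k, ell, intensity, sigma, batch


def is_reference(batch):
    """A negative batch number marks the reference (free) set."""
    return batch < 0


def format_hkl_line(h, k, ell, intensity, sigma, free):
    """Write a line with batch -1 for the reference set, +1 otherwise."""
    return "%4d%4d%4d%8.2f%8.2f%4d" % (h, k, ell, intensity, sigma,
                                       -1 if free else 1)


line = format_hkl_line(1, 0, -2, 123.45, 5.67, free=True)
print(repr(line))
print(is_reference(parse_hkl_line(line)[5]))
```

    A 0/1 flag from another program would therefore be converted by writing -1 for the free reflections and +1 for all others.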

    Q65: Why does SHELXL not model my low resolution data, even if I use the bulk solvent correction ?

    Possible reasons are:

    Q66: I have excellent data to 1.4 A. R dropped from 19.3 to 16.3 and Rfree from 21.6 to 21.0 on going anisotropic, obs/par went from 4.8 to 2.1 Can I go/stay anisotropic ?

    Whether or not your data allow anisotropic refinement depends not only on the resolution but also on solvent content (which affects the data to parameter ratio), data quality and other factors. The only way to find out whether anisotropic refinement is justified is to do an Rfree-test. Generally, if the drop in Rfree is less than 1 % you should revert to isotropic.
    We recently had a dataset measured (weakly) to 1.2 A that gave a 0.5 % drop in Rfree so we switched back to isotropic. On the other hand we have one 1.4 A dataset (but the crystals would have diffracted further) where Rfree dropped 3.5 % on going anisotropic.
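
    The rule of thumb can be written down as a tiny Python check. This is only a sketch of the 1 % guideline given above, with a made-up function name; it does not replace looking at the data to parameter ratio and the data quality.

```python
def anisotropic_justified(rfree_iso, rfree_aniso, min_drop=1.0):
    """True if Rfree (in %) dropped by at least min_drop on going anisotropic."""
    return (rfree_iso - rfree_aniso) >= min_drop


# Numbers from the question: Rfree went from 21.6 to 21.0 %
print(anisotropic_justified(21.6, 21.0))   # False
```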

    Q67: I have data to 1.35 A for a protein containing a heme. Rwork drops by 3.2 and Rfree by only 1.4 % on going anisotropic (approx 77000 data / 33000 parameters). What shall I do ?

    Try to just refine the Fe and S atoms and perhaps the heme anisotropically. Since this is many fewer atoms than for full anisotropic refinement, a drop of say 0.4 % or better in Rfree would be acceptable. If Rfree drops by even less, then you should go back to the fully isotropic model.

    Q68: I have 1.6 A data, Rfree drops by 0.2% on going anisotropic. Is this significant ?

    In general, one would not regard a drop in Rfree of 0.2% to be significant, since Rfree, which is based on relatively few reflections, has quite a large standard deviation.

    Q69: On going anisotropic Rfree converges after 3 cycles, Rwork only settles after about 10 cycles. Should I worry ?

    Rfree always "converges" faster than R1 in cases where Rfree doesn't drop by the same amount as Rwork. R1 should be allowed to more or less converge, knowing full well that the structure is then "overfitted", in order to have a clear-cut situation.

    Q70: Rfree doesn't go down all that much on going anisotropic, but the maps are much better. Can't I continue with anisotropic refinement ?

    Your maps look 'better' because they look more like your model - that's what overfitting is about.

    Q71: I get loads of 'may be split' messages. What does this mean ?


    Q72: Some of the Uij's in the .res file and in the .pdb file match, although they are in different order, others are completely different. Is this a bug or a feature ?

    It is a feature. The order in the .res file is: 11 22 33 23 13 12 (see SHELXL-manual). The order in the .pdb file is: 11 22 33 12 13 23 (pdb convention). The differences in the numbers arise from the Uij's in the .res file being relative to the crystal axes and the values in the .pdb file being relative to orthogonal axes and multiplied by 10000.
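
    The reordering and scaling can be sketched in Python as follows. Note the loud assumption: the sketch is only valid for a cell with orthogonal axes (all angles 90 degrees), where no transformation between crystal and orthogonal axes is needed; for any other cell the transformation is required and is not shown here. The function name is invented for this example.

```python
def res_to_pdb_anisou(u_res):
    """Reorder Uij from .res order (11 22 33 23 13 12) to .pdb order
    (11 22 33 12 13 23) and scale by 10000.

    Assumes an orthogonal cell, so the numerical values themselves
    do not change between the two axis systems.
    """
    u11, u22, u33, u23, u13, u12 = u_res
    pdb_order = (u11, u22, u33, u12, u13, u23)
    return tuple(int(round(u * 10000)) for u in pdb_order)


print(res_to_pdb_anisou((0.0123, 0.0234, 0.0345, 0.0011, -0.0022, 0.0033)))
```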

    Q73: Upon PARVATI analysis of SHELXL anisotropically refined protein structures, I find the majority of atoms with large anisotropy ( axis ratio < 1:5 ) are from residues with alternate conformations. Could this be a systematic problem due to the refinement of both occupancy and anisotropic thermal parameters together?

    George: I suspect that it is generally true that atoms in disordered residues have higher equivalent Uiso values and are more anisotropic. This is chemically eminently plausible, but might also be influenced by the types of restraint we apply. I note that the author of REFMAC prefers to use an ISOR rather than SIMU type of restraint in such cases, maybe you should try this (and see how it affects Rfree), since it would tend to make the atoms less anisotropic.

    Q74: Is it possible to read in a partial structure factor (F/phi), to be combined with the F/phi calculated from the atomic coordinates ?

    George: SHELXL does not have a way of reading in partial structure factors, and it would be very difficult to modify the code to do this.
    The bulk solvent correction is a weak link in SHELXL and I am trying to find a way of improving it (as of 3-Dec-2000).

    Q75: I get very high R-values especially at low resolution (with Fobs systematically weaker than Fcalc). Upon refining the Extinction coefficient in SHELXL, Rwork and Rfree decrease dramatically. Is this Extinction for real ?

    George: I have never encountered extinction with a protein crystal and (for good theoretical reasons) don't expect to. Even with small molecules, freezing the crystal often eliminates extinction very effectively. Note that since the solvent and extinction affect the data in a very similar way, it is not possible to refine both, so SHELXL is designed to make this logically impossible (I frequently have to prevent users from doing things that don't make sense). There are three common causes of large apparent extinction (in order of frequency): (1) You are reading in F (obtained by e.g. converting from an XPLOR input file) but have specified HKLF 4 (for F-squared) in the .ins file. (2) 'Overloads' have not been removed from the data. (3) Your detector (or the data processing software) is faulty.

    Q76: I am refining a structure against twinned data. Can a structure with a twin-ratio of 0.5 be refined ? Are the F's calculated in the .fcf-file 'detwinned' ? How can I check the progress of the refinement ?

    George: The .fcf file is 'detwinned', so you can make maps etc. in the normal way using SHELXPRO. Generally the quality of such maps is not as good as that of maps from normal not-twinned crystals at the same resolution, but using SHELXL it is entirely possible to complete the structure determination even if the twin factor is exactly 0.5. Some other programs use detwinning methods which are mathematically unstable when the twin factor approaches 0.5. Details of the algorithms used are given in the manual. Exactly the same procedures can be used to monitor the refinement as for untwinned crystals, but care is needed with Rfree because twinning introduces correlations if one of a twin-related pair of reflections is in the working set and the other is in the reference set. You can use the thin-shell method of choosing the reference reflections to get around this (in the V option in SHELXPRO).

    One more thing: when refining against twinned data, it can be very dangerous to use automatic structure extension methods that are based on difference Fouriers (i.e. ARP, SHELXWAT). When used with twinned data, such methods tend to produce artifacts, especially in the early stages of a refinement and if the twin-ratio is close to 0.5.


    Q77: Does SHELXL calculate estimated observed F-values for a single twin component ?

    George:
    For the purpose of calculating a difference electron density map and the final Fo/Fc tables (but not of course for refinement) SHELXL partitions the observed intensity in the ratio of the calculated contributions, i.e.
    Fo^2 (for component 1) = Fo^2 (total) * Fc^2 (component 1) / Fc^2 (sum of all components)
    where these Fc values include the twin scale factors k.
    This is the weakest link in the treatment of twins and if you have a better idea please let me know!
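
    A minimal Python sketch of this partitioning is given below. The function name is hypothetical, and the calculated values passed in are assumed to include the twin scale factors k already, as stated above.

```python
def partition_twin_intensity(fo2_total, fc2_components):
    """Split the observed total Fo^2 in the ratio of the calculated Fc^2.

    fc2_components holds Fc^2 for each twin component, with the
    twin scale factors k already applied.
    """
    fc2_sum = sum(fc2_components)
    if fc2_sum == 0:
        raise ValueError("all calculated contributions are zero")
    return [fo2_total * fc2 / fc2_sum for fc2 in fc2_components]


# Two components contributing 30 and 10 to a calculated total of 40:
print(partition_twin_intensity(100.0, [30.0, 10.0]))  # [75.0, 25.0]
```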

    Q78: Where can I read about twinning ?

  • Regine Herbst-Irmer & George M. Sheldrick, Acta Cryst. B54, 443-449 (1998).
  • Schneider TR, Kärcher J, Pohl E, Lubini P, Sheldrick GM: Ab initio structure determination of the lantibiotic mersacidin Acta Cryst. D56, 705-713 (2000).

    Q79: I have brilliant 1.4 A data on a 40 kDa protein. Is it possible to phase the thing ab initio ?

    At the moment (Jul-99) you still need at least 1.2 A data to have a chance if your protein does not contain any metal. If the molecule has a metal bound and/or lots of sulfurs, you might get away with somewhat lower resolution data. And, 40 kDa is on the big side.

    Q80: How do I get going with a molecular replacement solution ?

    Run 20 cycles of full-matrix least-squares refinement with Rfree: L.S. 20 -1. Start with reflections at 4.0 A and work your way towards, let's say, 3.0 A: STIR 4.0 0.3 and SHEL 10 3.0. Refine only the coordinates of rigid bodies: BLOC 1, with AFIX 6 before the first and AFIX 0 after the last atom of each rigid body. Comment out the BUMP statement so that clashes between molecules do not cause confusion.

    Q81: Where can I find information about SHELXL ?

  • General Information: http://shelx.uni-ac.gwdg.de/SHELX/index.html
  • FAQ http://shelx.uni-ac.gwdg.de/SHELX/index.html
  • Sheldrick GM, Schneider TR: "SHELXL: High Resolution Refinement", Methods in Enzymology (R.M. Sweet and C.W. Carter Jr., eds.), Academic Press; Orlando, Florida, 277:319-343 (1997).
  • Online version of the manual (ps,pdf,doc-format) on the server shelx.uni-ac.gwdg.de.

    Q82: How can I display the pdf-version of the manual ?

    To look at pdf-files you need a program called 'Acrobat Reader'. You can get it for free from the web (www.adobe.com, lower left corner). But it may be on your machine already. Just try 'acroread'. Maybe it starts up.

    Q83: I am trying to understand the number of parameters refined from my number of atoms etc., but it doesn't make sense ... What is the problem ?

    Every atom normally contributes 3 coordinates and 1 or 6 B-factors. Deviations occur if:
  • a coordinate/occupancy/B-factor has been constrained to a particular value (look for '11' and '10' in the .ins file)
  • (very nasty) some 'AFIX 66' have remained or crept into the ins file. This reduces the number of parameters.
  • some atoms that should be anisotropic are accidentally isotropic.
  • something else - please let me know trs@shelx.uni-ac.gwdg.de.
  • in desperation: use MORE 3 to get a list of all parameters used. Do NOT print the resulting .lst file. To convert the list of refined parameters to something more readable, I suggest the following:
    1. paste the list into a new file
    2. vi and ':s/[0-9] /#/g'
    3. run: awk 'BEGIN {RS = "#"} {print $0}' on the resulting file.
    4. look at the output and get depressed.
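
    For a first sanity check, the expected number of parameters can be estimated with a few lines of Python. This is a deliberately simplified sketch with an invented function name: it counts 3 coordinates plus 6 (anisotropic) or 1 (isotropic) displacement parameters per atom, and lumps everything else (scale factors, free variables and the like) into a single n_extra argument that you have to supply yourself. Constrained parameters and riding hydrogens are not handled.

```python
def estimate_parameters(n_aniso, n_iso, n_extra=0):
    """Rough parameter count: 9 per anisotropic atom, 4 per isotropic atom."""
    return n_aniso * (3 + 6) + n_iso * (3 + 1) + n_extra


# 1000 anisotropic protein atoms plus 200 isotropic waters:
print(estimate_parameters(1000, 200))  # 9800
```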

    Q84: I am following exactly what is said in the manual, but SHELXWAT always stops with "** Unsuitable format for file try7.res **". What can I do ?

  • Check that you are really exactly following what is in the manual.
  • Check for straying PART-instructions in the list of waters. Even a PART 0, which actually doesn't do anything, will stop SHELXWAT.
  • It is deadly and difficult to spot if you have a residue called HOH before the first residue in the list that SHELXWAT should work on.

    Q85: I have good 1.9 A data. Things look fine but Rfree is 10 percent higher than Rwork. What can I do ?

    A large difference between Rwork and Rfree indicates that you are probably 'overfitting' your data, i.e. you are refining parameters that are not well determined by the data. There are a number of things to try:
  • Tighten the restraints with smaller values in the DEFS statement
  • Add non-crystallographic symmetry restraints
  • Remove high-B waters

    Q86: My systems administrator accidentally killed my shelxl refinement at atomic resolution on the last cycle of a 10 cycle run. What is the proper way to resume this refinement?

    George: There is a good chance that the .res file, which is rewritten after each cycle, can simply be renamed as .ins and resubmitted. This feature was included in SHELXL to handle such accidents, but if you are unlucky the operating system has held it in a buffer and not written it to the disk. Check that this file is complete; you may have to add HKLF 4 to the end. You can reduce the number of cycles, but because of possible rounding errors I would recommend at least two cycles. You cannot use the .tmp files.

    Q87: I am having problems refining a complex structure with a ligand near a twofold crystallographic axis. After the refinement the geometry of the ligand is wrong.

  • Check that all your restraints are correct - you can use SHELXPRO to automatically generate restraints from a template molecule.
  • If some atoms are close to a special position, SHELXL tries to automatically deal with that by halving their occupancy etc.. This leaves some comments in the .lst-file. If you want to avoid the automatic treatment you can use a negative PART number for the respective part of the structure.
  • If there is a problem with symmetry-related atoms clashing, you can make the first parameter of the BUMP statement negative. This turns anti-bumping restraints between symmetry-related atoms off.

    Q88: My structure contains 2656 amino acids and I'd like to refine it with SHELXL. When starting, I get an error message: "** TOO MANY ATOMS REFERENCED IN A SINGLE DELU INSTRUCTION **" What is happening ?.

    As SHELXL originally was a small molecule program, DELU can only accept 10000 atoms in this version of the program. But there is an updated version (which you should get anyway). This version can handle more atoms. But, if you have more than 65398 atoms you should contact us ...

    Q89: How do I find places where I can improve my model ?

  • The list of 'Disagreeable restraints' in the .lst file gives you very good hints to places in the structure that need some work
  • B-values. plots of these against residue number can be made using SHELXPRO.
  • Systematically check all Fo-Fc difference peaks higher than, let's say, 5 sigma.
  • Clean up your restraints if you get warning messages about missing restraints or restraints for which the respective atoms are missing.

    Q90: What criterion does SHELXL (both 93 and 97) use for defining equivalents as 'inconsistent'?

    George: The test used in the program is as follows. When N reflections are merged to give a single unique intensity, two estimates are made for the esd of this mean intensity:

    S1 = Sum |I-<I>| / [N * Sqrt(N-1)] (esd from agreement of equivalents)

    S2 = 1 / Sqrt[Sum(1/sigma^2)] (esd from combined input esds)

    where I = individual intensity before merging and sigma^2 is the square of its esd, <I> is the mean intensity and |x| means absolute value.


    Q91: When I read an O-map produced by SHELXPRO, O complains: "Couldn't read map file header." Am I doing something wrong ?

    This is probably due to the 'endian', i.e. the byte order, of the map. The byte order is machine dependent, and if the machine on which you produced the map has a different byte order from the machine on which you are trying to display it, it will not work. Try to run SHELXPRO on the very same machine where you run O. If that doesn't work you can still try to put in an Endian code by hand (one of the questions that SHELXPRO asks) - just try all possible codes.

    Q92: I have been trying to run SHELXD on a protein problem to locate the Se from SAS data and I get the error message "** Not enough space to store patterson **".

    George: FORTRAN77 requires me to define the array dimensions in advance and it is difficult to find a memory size that suits everyone. Try PSMF 3 (or even 2.5) instead of the default 4 (the memory used is approximately proportional to the cube of this number); this should be fine for the purpose and require appreciably less memory. Curtailing the resolution with SHEL also uses less memory; 3.5 to 4 A is usually enough (and for weak SAD may even be best) to find the Se.
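
    How much memory this saves follows directly from the cubic dependence. The short Python sketch below (function name invented here) gives the memory relative to the default setting, assuming the proportionality holds exactly.

```python
def relative_memory(psmf, default=4.0):
    """Memory relative to the default PSMF value, assuming a cubic dependence."""
    return (psmf / default) ** 3


print(relative_memory(3.0))   # 0.421875
print(relative_memory(2.5))   # 0.244140625
```

    So PSMF 3 needs roughly 42 % and PSMF 2.5 roughly 24 % of the memory used with the default.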

    Q93: What is the reference for shelxd ?

    Schneider TR, Sheldrick GM: Substructure solution with SHELXD, Acta Cryst D58:1772-1779 (2002)

    Q94: How can I produce a histogram of omega angles ?

    A pragmatic solution is to take the table that starts after "Dihedral angle OMEG" in your last lst-file, paste it into another file and then use some Office-Software to make the histogram. Not very elegant, but works ...
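
    If you prefer a script to Office software, a simple text histogram can be made in Python. The sketch below assumes that you have already extracted the omega values (in degrees) into a list; reading them out of the .lst table is not shown, because the exact column layout is not assumed here. Angles are folded into the range 0 to 360 so that the trans peak around 180 stays in one piece. The function name is made up for this example.

```python
import math
from collections import Counter


def omega_histogram(angles, bin_width=5.0):
    """Return sorted (bin_start, count) pairs for angles folded into 0..360."""
    counts = Counter()
    for angle in angles:
        folded = angle % 360.0
        bin_start = math.floor(folded / bin_width) * bin_width
        counts[bin_start] += 1
    return sorted(counts.items())


omegas = [179.2, -178.5, 176.0, -3.1, 172.4]   # made-up example values
for bin_start, count in omega_histogram(omegas):
    print("%6.1f %s" % (bin_start, "*" * count))
```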

    Q95: I have noticed that SHELX calculates a different solvent content for my crystal than is calculated by using the Matthews method (or using CCP4 RWCONTENTS or CNS). What is the story ?

    George: This is a known BUG in shelxpro. It will be fixed in the next release (ca. 2007?). For the meantime, you will have to edit this line of the final PDB file by hand to put in a better value.
    For the record I now prefer Kevin Cowtan's more intuitive and direct method of estimating the solvent content (assume that an average amino-acid has a volume of 140 cubic Angstroms)
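
    This estimate is easy to do by hand or in Python. The sketch below uses the 140 cubic Angstroms per residue quoted above; the function name, the argument names and the example numbers are our own, and you have to supply the number of protein copies in the unit cell and the cell volume yourself.

```python
def solvent_fraction(n_residues, n_copies_in_cell, cell_volume,
                     residue_volume=140.0):
    """Estimate the solvent fraction from an average residue volume (A^3)."""
    protein_volume = n_residues * n_copies_in_cell * residue_volume
    return 1.0 - protein_volume / cell_volume


# Hypothetical example: 100 residues, 4 copies per cell, cell volume 112000 A^3
print(solvent_fraction(100, 4, 112000.0))  # 0.5
```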
