CHARMM c24 nbonds.doc



File: Nbonds ]-[ Node: Top
Up: (commands.doc) -=- Next: Syntax


                    Generation of Non-bonded Interactions

        Nonbonded interactions (frequently abreviated nbond) refer to
van der Waals terms and the electrostatic terms between
all atom pairs that are not specifically excluded from nonbond calculations
as for example are directly bonded atoms *note nbx: (struct.doc)nbx.
These terms are defined on atom pairs and to a first aproximation would
require the number of atoms squared calculations. To avoid this burden
various truncation and approximation schemes can be employed in the
program, breaking the nonbonded calculation into two parts,
initialization and actual energy calculation.

        The method of approximation, cutoffs, and other relevant
parameters can be entered any time the nbond specification parser is
invoked. See the syntax section for a list of all commands that invoke this
parser.


* Menu:

* Syntax::              Syntax of the nonbond specification
* Defaults::            Defaults used in the nonbond specification
* Function::            Description of the options
* Tables::              Using nonbond lookup tables in place of analytic
                        potential energy functions
* Cubes::               Alternative way to compute the nonbonded
                        contact list--when to use it
* ZTBL::                Compressed-table representation of the
                        exclusion and 1-4 lists--implementation details


File: Nbonds ]-[ Node: Syntax
Up: Top -=- Next: Defaults -=- Previous: Top


[SYNTAX NBONDs]

{ NBONds       }   { [INBFrq integer] nonbond-spec  }
{ UPDAte ...   }   {                                }
{ ENERgy ...   }   {                                }
{ MINImize ... }   {                                }
{ DYNAmics ... }   {                                }

COMPARE  ... NBOND [nonbond-spec] <del>  ...

NOTE: The INBFrq value is remembered. If its value is zero,
no interpretation of [nonbond-spec] will be made, as well as no
modifications of the nonbond lists. It's default value is -1 .

In all cases as many keywords and values as desired may be specified.
The keywords are:

nonbond-spec::= [method-spec] [distances-spec] [misc-specs] [INIT] [RESET]

method-spec::= [ ELEC electrostatics-spec ] [ VDW vdw-spec  ] [ st2-spec ]
               [ NOELectrostatics         ] [ NOVDwaals     ]

                  [BYCUbes]
                  [BYGRoup]

electrostatics-spec::=
[ ATOM                                         ] [ CDIElec ] [ SHIFted  ]
[ GROUp [ EXTEnded [ GRADients ] [ QUADrip ] ] ] [ RDIElec ] [ SWITched ]
[       [          [ NOGRad    ] [ NOQUads ] ] ]             [ FSWItch  ]
[       [ NOEXtended                         ] ]             [ FSHIft   ]


             [ VGROUP [ VSWITched ]        ]
             [                             ]
vdw-spec::=  [ VATOM  [ VSHIfted  ]        ]
             [        [ VSWItched ]        ]
             [        [ VFSWitch  ]        ]

st2-spec::= [ ST2List   ]
            [ ST2Nolist ]

distances-spec::= [general-dist] [warning-dist]

general-dist::=
        [ CUTNB  real ] [ CTONNB real ] [ CTOFNB real ] [ CTEXNB real ]

warning-dist::=
        [ WMIN   real ] [ WRNMXD real ]

misc-specs::= [ EPS real ] [ E14Factor real ] [ NBXM integer]
              [ NBSCale real] [IMSCale real]

                        [ NORXN                ]
                        [ RXNFLD  rxnfld-spec  ]
                        [ RXNNB   rxnfld-spec  ]

rxnfld-spec::= [ EPSEXT real ] [ ORDER integer ] [ SHELL real ]


File: Nbonds ]-[ Node: Defaults
Up: Top -=- Next: Function -=- Previous: Syntax



        The defaults for the nonbond specification reside with the
parameter file.  The defaults are specified at the begining of the van der
Waal section.  These defaults are the recommended options.

        The following command contains all defaults for one of the
older protein parameter files, and is equvalent to the command "NBONDS INIT"
in it usage when this parameter file is present.

NBONDS ELEC ATOM NOEX NOGR NOQU SWIT RDIE  VATOM VDW VSWI VDIS ST2L -
       CUTNB 8.0  CTEXNB 999.0 CTOFNB 7.5 CTONNB 6.5 WMIN 1.5 WRNMXD 0.5 -
       EPS 1.0 NORXN EPSEXT 80.0 ORDER 10 SHELL 2.0

        Values do not change unless explicitly specified, except for
the ON/OFF values which cascade when the cutoff values are changed as;
                CTOFNB=CUTNB-0.5
                CTONNB=CTOFNB-1.0

WARNING::
      These old defaults have been shown to be detrimental to protein behavior.
It is generally better to use the defaults in the parameter sets.


RECOMMENDED:

Presented here is a suggested list of options. Where specifications are
missing, substitute the defaults:

For atom based cutoffs:

NBONDS  ATOM  FSHIFT CDIE  VDW VSHIFT  -
        CUTNB 13.0  CTOFNB 12.0 CTONNB 8.0  WMIN 1.5  EPS 1.0

or (perhaps better for longer cutoff distances, but more expensive)

NBONDS  ATOM  FSWITCH CDIE  VDW VSHIFT  -
        CUTNB 13.0  CTOFNB 12.0 CTONNB 8.0  WMIN 1.5  EPS 1.0

For group based cutoffs (doesn't vectorize well):

NBONDS  GROUP  FSWITCH CDIE  VDW VSWITCH  -
        CUTNB 13.0  CTOFNB 12.0 CTONNB 8.0  WMIN 1.5  EPS 1.0

For extended electrostatics :

NBONDS  GROUP  SWITCH CDIE  VDW VSWI  EXTEND GRAD QUAD -
        CUTNB 13.0  CTOFNB 12.0 CTONNB 8.0  WMIN 1.5  EPS 1.0

For a better description of these methods and how they perform, see:
 P.J. Steinbach, B.R. Brooks: "New Spherical-Cutoff Methods for Long-Range
 Forces in Macromolecular Simulation,"  J. Comp. Chem. 15, 667-683 (1994).



File: Nbonds ]-[ Node: Function
Up: Top -=- Previous: Defaults -=- Next: Tables


NBSCale & IMSCale
=================

   The first time that the primary or image non-bond list is generated, an
estimate is made, based on the number of atoms, of how much memory will be
needed to store the pair list.  If too large an estimate is made, memory will
be wasted.  If too small an estimate is made, a second (and larger) estimate
will be made and the memory allocated on the first attempt is wasted.  NBSCale
is a correction factor to the intial estimate allowing better control of 
memory allocation.  For example NBSCale 1.5 allocated 50% more memory than
CHARMM would usually guess and NBSCale 0.8 allocated 20% less.  IMSCale does
the same thing when the image pair list is generated.  The values of NBSCale
and IMSCale must be determined empirically, but they can generate huge
memory savings on large systems.

These keywords are valid wherever nonbond options may appear, e.g. ENERgy,
DYNAmics, MINImiz, and UPDAte.  Note that NBSCale must be used in the first 
statement which generates a nonbond list; an UPDAte without NBSCale followed 
by DYNAmics with NBSCale is ineffective. 

For a system of about 17,000 atoms, a value of NBSCALE 1.5 was effective in
providing about 25 MB of memory reduction compared to the the default NBSCale
value (1.0); while for a system of about 10,000 atoms, the optimum of 1.4 gave
a reduction of about 12 MB.  For example:

mini abnr nstep 500 nprint 10 tolenr 1.e-6 cutnb 14. ctofnb 12. ctonnb 10. -
 fshift cdie eps 1.0 vshift inbfrq 20 imgfrq 20 cutim 14. nbscale 1.5

Likewise for DNYAmics, ENERgy, or the UPDAte commands; the latter is useful in
doing some trial and error probes to determine the optimum NBSCale value,
with fixed sizes for CUTNB and CUTIM:

update cutnb 14. ctofnb 12. ctonnb 10. fshift cdie eps 1.0 vshift -
  inbfrq 20 imgfrq 20 cutim 14. nbscale @1 -

where the csh or tcsh command line might be something like:

% charmm medium 1:1.5 < tst_nbscal.inp >& nbscal_1.5 &

The above is based on single processor calculations; the same general idea
applies to parallel calculations, but the optimum value for NBSCale will
be less than 1.0, perhaps 0.7 to 0.8 for systems in the 10K to 17K atom
range.  Memory usage can be further reduced using the IMSCale keyword; 
some experimentation will be required depending on the number of atoms
and the cutoffs being used.

INBFrq n
========

   Update frequency for the non-bonded list. Used in the subroutine ENERGY()
to decide whether to update the non-bond list. When set to :

     0 --> no updates of the list will be done.

    +n --> an update is done every time  MOD(ECALLS,n).EQ.0  . This is the old
           frequency-scheme, where an update is done every n steps of dynamics
           or minimization.

    -1 --> heuristic testing is performed every time ENERGY() is called and
           a list update is done if necessary. This is the default, because
           it is both safer and more economical than frequency-updating.

Description of the heuristic testing algorythm :
-----------------------------------------------
   Every time the energy is called, the distance is computed each atom moved
since the last list-update.
   If any atom moved by more than (CUTNB - CTOFNB)/2 since the last list-update
was done, then it is possible that some atom pairs in which the two atoms are
now separated by less than CTOFNB are not in the pairs-list. So a list update
is done.
   If all atoms moved by less than (CUTNB - CTOFNB)/2 , then all atom
pairs within the CTOFNB distance are already accounted for in the non-bond list
and no update is necessary.

Description of the code for the heuristic testing :
--------------------------------------------------
   This section describes how programmers can control the list-updating
behavior when their routines call the ENERGY() subroutine.
   All list-updating decisions, whether they are frequency based or heuristic
based, are made in the subroutine  UPDECI(ECALLS) , which is called from only
one place : at the very beginning of ENERGY().
   UPDECI(ECALLS) can be controled through INBFRQ (via the CONTRL.FCM common
block) and ECALLS (via the ENERGY.FCM common block) as follows :

     If INBFRQ = +n --> non-bond list is performed when MOD(ECALLS,n).EQ.0 .
                        Image and H-bond lists are updated according to IMGFRQ
                        and IHBFRQ.

     If INBFRQ =  0 --> non-bond list update is not performed. Image and H-bond
                        lists are updated according to IMGFRQ and IHBFRQ.

     If INBFRQ = -1 --> all lists are updated when necessary (heuristic test).

     (note that ECALLS is incremented by ICALL every time ENERGY(,,,ICAL) is
      called. In most cases, ICALL=1 )

   The current implementation of UPDECI() will work (without modifications) to
decide whether the image/crystal non-bond lists need updating, provided the
periodicity parameters don't change (i.e. constant Volume).
    UPDECI() is easily adapted to variable Volume dynamics/minimizations. This
is described in comments of the routine itself.

Further computational economy in update-testing :
-------------------------------------------------
   A programmer can sometimes skip the heuristic test itself, making the
decision whether to do list-updating even more economical.
   This option is only available if the size of the step taken since the last
call to ENERGY() is known. For an example of usage, see the subroutine ENERG()
in TRAVEL.




NON-BOND ENERGY TERMS.
======================
        The electrostatic options are separate from the van der Waal
options, though some keywords are shared between them. The following is
a description of all options and keywords.

1) Electrostatics

        The ELEC keyword (default) invokes electrostatics. The NOELec
keyword supresses all electrostatic energy terms and options.
There are two basic methods for electrostatics, GROUp and ATOM.  A 
model based on the GROUp method is the extended electrostatics model 
which approximates the full electrostatic interaction and eliminates
the need for a cutoff function. This model is based on the partitioning
of the electrostatic term into two contributions.  One comes from the
interaction between particles which are spatially close and is treated
by conventional pairwise summation. The second contribution
comes from interactions between particles which are spatially distant from
one  another and is treated by a multipole moment approximation.
[The original model was described in B. R. Brooks, R. E. Bruccoleri,
B. D.  Olafson, D. J. States, S. Swaminathan, M. Karplus.  J. Comp. Chem.,
4, 187, (1983) and more recently in R.H. Stote, D.J. States and M. Karplus,
J. Chimie Physique (1991)]


        A) Functional Forms

Atom electrostatics indicates that interactions are computed on an
atom-atom pair basis. There are two options that specify the radial
energy functional form. The keywords CDIE and RDIE select the basic
functional form. The SWIT and SHIF keywords determine the long-range
truncation option.

CDIE - Constant dielectric. Energy is proportional to 1/R.
RDIE - Distance dielectric. Energy is proportional to 1/(R-squared)

SWIT - Switching function used from CTONNB to CTOFNB values.
SHIF - Shifted potential acting to CTOFNB and zero beyond.

        B) Atom electrostatics

Atom electrostatics indicates that interactions are computed on an
atom-atom pair basis. There are two options that specify the radial
energy functional form. The keywords CDIE and RDIE select the basic
functional form. The SWIT and SHIF keywords determine the long-range
truncation option.

[ ATOM ] [ CDIElec ] [ SHIFted  ]
         [ RDIElec ] [ SWITched ]
                     [ FSWItch  ]
                     [ FSHIft   ]

CDIE - Constant dielectric. Energy is proportional to 1/R.
RDIE - Distance dielectric. Energy is proportional to 1/(R-squared)

SWIT - Switching function used from CTONNB to CTOFNB values.
SHIF - Shifted potential acting to CTOFNB and zero beyond.
FSWI - Switching function acting on force only.  Energy is integral of force.
FSHI - Classical force shift method for CDIE (force has a constant offset)

        C) Group Electrostatics
electrostatics-spec::=
                                                              
[ GROUp [ EXTEnded [ GRADients ] [ QUADrip ] ] ] [ CDIElec ] [ SWITched ]
[       [          [ NOGRad    ] [ NOQUads ] ] ] [ RDIElec ] [ FSWItch  ]
[       [ NOEXtended                         ] ]                         

SWIT - Switching function used from CTONNB to CTOFNB values.
FSWI - Switching function, but QiQj/Rcut is added before switching.
         (FSWI has no effect on neutral groups).


        D) Electrostatic Distances
electrostatic-dist::=
        [ CUTNB  real ] [ CTEXNB real ]        [ CTONNB real ] [ CTOFNB real ]

CTEXNB - defines the cutoff distance beyond which interaction pairs are
         excluded from the Extended Electrostatics calculation.


        E) Extended (group) Electrostatics
electrostatics-spec::=
[ ATOM                                         ] [ CDIElec ] [ SHIFted  ]
[ GROUp [ EXTEnded [ GRADients ] [ QUADrip ] ] ] [ RDIElec ] [ SWITched ]
[       [          [ NOGRad    ] [ NOQUads ] ] ]
[       [ NOEXtended                         ] ]

EXTE - invokes the extended electrostatics command for calculating long
       range electrostatic interactions.
NOEX - suppress the extended calculation.
GRAD - keyword flags the inclusion of the field of the extended gradient in
       calculating the force on each atom,i.e. include first and second
       derivatives.
QUAD - flags the inclusion of the quadrupole in the multipole expansion.


        F) Reaction Fields
misc-specs::= [ EPS real ] [ E14Factor real ] [ NORXN                ]
                                              [ RXNFLD  rxnfld-spec  ]
                                              [ RXNNB   rxnfld-spec  ]

rxnfld-spec::= [ EPSEXT real ] [ ORDER integer ] [ SHELL real ]


2) Van Der Waal Interactions

The VDW keyword (default) invokes the van der waal energy term.
To supress this term, the NOVDw keyword may be used.

        A) Distance specified van der Waal Function
vdw-spec::=  [ VSHIfted  ] [ VDIStance ]
             [ VSWItched ]


3) Miscellaneous options and keywords

        A) Dielectric specification
misc-specs::= [ EPS real ] [ E14Factor real ]


        B) Warning Distance Specifications
warning-dist::=
        [ WMIN   real ] [ WRNMXD real ]

 WRNMXD - keyword defines a warning cutoff for maximum atom displacement from
          the last close contact list update (used in EXTEnded)


        C) Initialization

        D) ST2-ST2 interaction methods
st2-spec::= [ ST2List   ]
            [ ST2Nolist ]

In all cases as many keywords and values as desired may be specified.
The key words, their functions, and defaults are:

        1) The method to be used

        2) Distance cutoff in generating the list of pairs

                CUTNB value (default 8.0)

        3) Distance cut at which the switching function eliminates
                all contributions from a pair in calculating energies.
                Once specified, This value is not reset unless respecified.

                CTOFNB value (default CUTNB-0.5)

        4) Distance cut at which the smoothing function begins to reduce
                a pair's contribution. This value is not used with SHFT.
                Once specified, This value is not reset unless respecified.

                CTONNB value (default CTOFNB-1.0)

        6) Dielectric constant for the extened electrostatics routines
                (RDIE option sets the dielectric equal to r
                times the EPS value)

                EPS value (default 1.0 for r dielectric)
                EPS 0.0  or  NOELec  (zero elecrostatic energy)

        7) Warning cutoff for minimum atom to atom distance. Pairs are
                checked during close contact list compilation.

                WMIN value (default 1.5)

        8) Warning cutoff for maximum atom displacement from the last
                close contact list update (used only in EXTEnded)

                WRNMXD value (default 0.5)

ALGORITHMS

        There are four algorithms used in calulating the nonbonded
energies, each making different approximations in an attempt to speed
the calulation. Electrostatic interactions are the most difficult to
deal with for two reasons. They do not fall off quickly with distance
(so it is inappropriate to simply ignore all interactions beyond some
cutoff), and they depend on odd powers of r necessitating expensive
square root caluculations for each pair evaluated. The approximations
used to make the electrostatics calculation more tractable are setting
the dielectric constant equal to r or using a constant dielectric but
only calculating distant interactions periodically (and storing the
value in between).

        Setting the dielectric constant equal to the atom atom distance
times a constant factor ( determined by the EPS keyword value )
makes the computation easier by eliminating the need to calculate square
roots and by making the calculated contribution fall off more quickly.
It also introduces problems. The force calculated using an r dependent
dielectric will be larger than the force from a constant dielectric at
short distances (5.0 angstroms or less by comparsion to a constant
dielectric of 2.5). In addition, the electrostatic contribution still
falls off relatively slowly and large distance cutoffs are needed. As
the number of atom pairs included will be proportional to the cutoff
cubed, this is a significant disadvantage.

        The SHIFt option is similar to SWITch except, the potential:

        E = (QI*QJ/(EPS*R)) * ( 1 - (R/CTOFNB)**2 )**2

is used when ( R < CTOFNB ) and zero otherwise. This potential
and it first derivative approach zero as R becomes CTOFNB,
without the messy computation of switching functions and steep
forces at large R. 

CDIE uses a constant dielectric everywhere.  This requires a square root
to be calculated in the inner loop of ENBOND, slowing things down a bit,
but it is physically more reasonable and widely employed by other groups
doing empirical energy modelling (ex. ST2 water).  This form allows a
small CUTNB (5.0 angstroms with EPS=2.5) even though the electrostatic
terms are still varying rapidly at that distance.  The short range 
forces are identical to those calculated with the other options, reflecting
the decrease in dielectric shielding at short ranges.

        The constant dielectric routines compile the close contact list
using the same two stage minimum rectangle box search that is described
above. In this way the efficiency of a residue by residue search is
exploited while being certain that all necessary pairs are included. For
close residue pairs an atom by atom search is then performed. Atom pairs
are either included in the list of close contacts or their electrostatic
interactions are calculated and stored.

Description of the Extended Electrostatics method
-------------------------------------------------
For the long range forces there is effectively no cutoff in the
electrostatic energy when using the Extended Electrostatics model.
The Extended Electrostatics model approximates the full electrostatic
interaction by partitioning the electric potential and the resulting forces
at any point ri into a near and extended contribution. The near contribution
arises from the charged particles which are spatially close to ri while the
extended contribution arises from the particles which are spatially distant
from ri. The total electrostatic potential can be written as a sum of the
two.  The near region is defined in terms of a radial distance, CUTNB, for
each atom.  Interactions between atoms separated by a distance greater than
CUTNB are calculated using a time saving multipole approximation when the 
nbond list is updated.  These interactions are stored together with their 
first (NOGRad) or first and second (GRADients) derivatives.  Interactions
between particles within CUTNB are calculated by the conventional pairwise 
additive scheme. (For a more complete development of the model, see
R.H. Stote, D.J. States and M. Karplus, J. Chimie Physique Vol. 11/12, 1991).
The energy is calculated by explicitly evaluating pairs in the list and using 
the stored potentials, fields, and gradients to approximate the distant
pairs. In essence the routines assume that for distant pairs the atom
movements will be small enough that the changes in their electrostatic
interactions can be accurately calculated using local expansions.
In using this model the GROUp method for constructing the nonbond list must
be used.  The interactions between particles within CUTNB are truncated 
rather than having a SHIFt or SWITch function applied.  Additionally, as
one is calculating all electrostatic interactions in the system, the 
dielectric constant should be set to 1.0.

Not Available at this time:
        An option is offered to increase the accuracy of residue residue
interactions by using a multipole expansion of one residue evaluated for
each atom of the other. This cutoff for this treatment is CUTMP. For
residue pairs outside of CUTMP only a single multipole evaluation is
made and second order polynomial expansion is used to extrapolate to
each atom.  Ordinarily this is sufficient and CUTMP is set to 0.0.

IMPLEMENTATION AND DATA STRUCTURES

        The initialization and list compilation is performed by the
subroutine NBONDS. It functions by guessing how much space will be
needed to store the close contact list, allocating that space (and space
for electrostatic potentials and gradients if necessary) on the heap,
and calling the appropriate subroutine to actually compile the nonbonded
list (NBONDG). If sufficient space was not available 1.5 times as
much is allocated and another attempt is made.

        ENBOND evaluates the nonbonded energy, calling EEXEL to evaluate
the stored electric potentials and fields. Double precision is used for
all arithmatic.

        All of the nonbonded cutoffs and lists are stored on the heap.
BNBND is the descriptor array passed through most of the program (in
some of the analysis routines an additional array BNBNDC is used for the
comparision data structure). BNBND holds heap adresses and LNBND holds
the lengths of the elements in the data structure. To actually access
the data it is necessary to include INBND.FCM (an index common
block) and specify HEAP(BNBND(xxx)) where xxx is the desired element
name in INBND.FCM. This is arrangement has the advantage of allowing
dynamic storage allocation and easy modification of the types of
information passed from routine to routine.

        The contents of the nonbonded data structure are described in
INBND.FCM.


FAST VECTOR/SHARED-MEMORY PARALLEL ROUTINES

         If FASTER is greater than or equal to zero; the SPECify FNBL OFF
has not been issued; VATOM and ATOM keywords have been used; groups 
and extended electrostatics have not been invoked and the cubing method
has not been specified; the fast nonbond list routines will be used.




File: Nbonds ]-[ Node: Tables
Up: Top -=- Previous: Function -=- Next: Cubes


                        Nonbond Lookup Tables


        The nonbond energy terms may be specified with a user supplied
binary lookup table. The command;

                READ TABLE UNIT int

will invoke this feature and disable all other energy term options.
The nonbond list specifiers will still be used (cutoff distances...).

        This feature is not designed for casual users, and is not supported
with test cases. Also, in version17, there is an uncorrected bug in
the second derivative determination.

        To use this feature, first read the common file ETABLE.FCM
for a description of variables, and then create a file the the routine
REATBL (consult the source) can read. The sources for this option
are contained in the file ETABLE.FLX.


File: Nbonds ]-[ Node: Cubes
Up: Top -=- Previous: Tables -=- Next: ZTBL


    [Important note to developers:  see the bottom of this node.]

SUMMARY

        CHARMM has two ways of building the nonbonded interaction list.
The old way, which we call "BYGRoups," and the new way, which we call
"BYCUbes," should be equivalent in all ways except efficiency.
Therefore the very best way to decide which option to use is to run a
few steps of dynamics with each method, and to use the faster one.
If you find that the two methods give results that differ even in the
last decimal place, this is a BUG---please report it.  There are some
situations in which the results are expected to differ, but BYCUbes
will warn in those cases.

        Actually, there is a *tiny* difference in behavior:  the
BYGRoups method will warn about any pairs of atoms that are too close
together.  The BYCUbes method does not.

        The remainder of this node describes the key differences
between the two algorithms, and gives hints about when each algorithm
is expected to be preferred.

O(N) VS O(N*N) TIME

        The purpose of using finite cutoffs is to reduce, from O(N*N)
to O(N), the number of nonbonded interaction terms that actually need
to be computed.  Both the BYGRoups and the BYCUbes methods accomplish
this; in fact, both should decide to include exactly the same set of
terms.  The difference in execution time lies in the method used to
determine which terms to include: the BYGRoups method builds the list
on O(N*N) time, while the BYCUbes method uses O(N) time.

PAYOFF THRESHOLD

        Given any pair of functions F1(N) and F2(N), which are O(N)
and O(N*N) respectively, there will always be some constant N0 such
that F2(N) > F1(N) for all N > N0.  Let us call this constant N0 the
"payoff threshold," because it is the system size above which BYCUbes
will always be better than BYGRoups.

        The payoff threshold varies considerably with the machine
type, the composition of the system you are simulating, and the
options you have specified.  Here are a few rules to follow.
Justifications for some of the rules may be found further on, in the
descriptions of the algorithms.

(1) The payoff threshold is smaller on a vector machine than on a
    scalar machine.  That is, BYCUbes is more vectorizable than
    BYGRoups.

(2) Consider your choices for electrostatics-spec and vdw-spec.  If
    you choose ATOM/VATOm, only the atom list will be built.  If you
    choose GROUp/VGROup, only the group list will be built.
    Otherwise, both will be built.  The payoff thresholds for the atom
    list and the group list are different:  the threshold is smaller
    for the group list than for the atom list.  That is, there are
    some system sizes for which BYCUbes will be better for the group
    list, but BYGRoups will be better for the atom list.  Because of
    the way CHARMM is set up, you must either use BYCUbes for both
    lists, or use BYGRoups for both lists.

(3) The shape of your system will not have a big impact on the payoff
    threshold.  BYCUbes operates by drawing a rectangle around the
    system and dividing it into cubes.  You would think that any empty
    cubes are wasted, but in fact on a serial machine the empty cubes
    take negligible time.

(4) The payoff threshold usually grows steeply with the cutoff
    distance---the asymptotic behavior should be sixth-power in CUTNB.
    This is because the BYCUbes algorithm gets its speedup from not
    having to look at pairs that sit in cubes that are too far apart
    to interact.  If the cutoff distance is large, then the cubes are
    large, and not many cube pairs can be eliminated.  Note: Because
    the execution time is sixth-power in CUTNB (for large systems), it
    usually does not make sense to have a buffer region when using the
    BYCUbes option: set CUTNB = CTOFNB.  The update frequency should
    be set to 1.

(5) However, rule (4) is not entirely true on a vector machine.  The
    BYCUbes implementation in CHARMM contains many vectorized loops
    which do not fill a typical vector machine, and so on such a
    machine you may be able to increase your cutoff distance without
    having to pay very much in terms of CPU time.

(6) BYCUbes trades space for time.  You may not be able to run it at
    all if you don't have very much memory.

These guidelines will not tell you which option to choose for your
system without your having to experiment a little bit; however, they
should help guide you as you play with the parameters.

ALGORITHMS

        The BYCUbes algorithm is based on what is traditionally called the
"linked-list" procedure.  (That is kind of a misnomer; the algorithm
does use linked lists, but the O(N) performance comes not from the
linked lists, but from the division of space into cubes.)  Step by
step, here is how it works:

(1) Find a rectangle that bounds the system (with margins), is aligned
    with the Cartesian axes, and has sides that are integral multiples
    of the cutoff distance.  Divide this rectangle into cubes.

(2) Decide which cube each atom belongs in.

(3) For each cube C, loop over each atom A contained in C.  For each
    such A, loop over each atom A' in the 27-cube surrounding region.
    If the A--A' pair falls within the cutoff distance, see whether
    the pair should be included in the list.  The criteria for
    inclusion should be the same as those used in BYGRoups, barring
    bugs.  Criteria include membership in the 1-2, 1-3, and 1-4 lists.

The BYGRoups algorithm is easiest to explain for the case of building
the group list: it simply loops over each pair of groups, and includes
the pair if it meets the distance and other criteria.  In the case of
building the atom list, it begins by building a group list, then looks
only at atom pairs that are within qualifying group pairs.

NOTE ON EFFICIENCY

        If the pure BYCUbes algorithm (as implemented in XDIST, in
NBNDGC.SRC) is compared with the stupid way of building a pair list
(testing every possible particle pair) the payoff threshold should be
so small that you would essentially never want to use the stupid way.
There are a number of factors that make the decision less clear-cut in
the case of CHARMM:

(1) The BYGRoups algorithm is not that stupid.
(2) The internal format of the nonbonded interaction list is very
    different from the format that the BYCUbes algorithm prefers to
    generate.  The conversion is slow and not vectorizable.
(3) Distance is not the sole factor determining inclusion in the list.

	The XDIST function has recently been converted to use the
compressed-table representation (*Note ZTBL::) of the nonbonded
exclusion and 1-4 lists.  This should produce some improvement in
performance. 

	IMPORTANT NOTE to developers:  As presently implemented, the
code in NBNDGC.SRC is about half as fast as it could be.  To double
its speed, remove the Quicksort step, which is clearly marked as being
nearly superfluous.  Unfortunately, removing that step is not an
option at the moment since the rest of CHARMM--in particular, the
energy code--expects to see a sorted nonbonded list.  That situation
should be rectified by someone who has the time.  In addition, since
the BYCUbes option is most efficient when there is no buffer region
and therefore when the update frequency is 1, the energy code could be
altered to avoid re-processing the nonbonded list.



File: Nbonds ]-[ Node: ZTBL
Up: Top -=- Previous: Cubes -=- Next: Top


	This node of concern to you only if you are interested in
implementation details.  It explains the compressed-table
representation of the nonbonded exclusion and 1-4 interacton lists.
Also, if you want the *whole* scoop on that representation, consult
ZTBL.SRC itself for detailed explanation.

	Roughly speaking, an atom-atom pair is placed on the nonbonded
interaction list if (a) the atoms fall within the cutoff distance, and
(b) the atoms are not topologically close to each other.
In CHARMM, to test for the latter condition you normally have to
consult the pairlist IBLO14/INB14, which is constructed in
NBEXCL.SRC:MAKINB.  For a given pair of atoms I and J, where I < J,
you would test to see whether J = ABS(INB14(X)) for some IBLO14(I-1)+1
<= X <= IBLO14(I).  (IBLO14(0) is defined as 0.)  If the sign of
INB14(X) turns out to be positive, then I,J is excluded; if negative,
then I,J is treated as a 1-4 pair.

	Unfortunately, hunting through those possible values of X is
cumbersome, and the operation is probably comparable in magnitude to
the distance test itself.  We would like to replace the hunt by a
small, predetermined number of array accesses and other operations.

	The simplest way to do this would be to store a
two-dimensional matrix TYPE(I,J) such that TYPE(I,J) is

    +1  if (I,J) is an excluded pair;
    -1  if (I,J) should be treated as a 1-4 pair;
     0  otherwise.

Unfortunately, for a medium-sized 10,000-atom system and word-sized
integers, this would require 100 megawords of storage.  

	However, if written as a slanted table SLANT(I,J-I) =
TYPE(I,J), the table contains many duplicated rows and columns.  A
compressed table takes advantage of this duplication.  The resulting
two-dimensional table has a size roughly independent of protein size.
For systems containing BPTI and barnase that had 580 and 10,100 atoms
respectively, the compressed tables were 83 x 33 and 82 x 18,
respectively.  (Yes, the latter compressed table was smaller even
though the barnase system was bigger.)  The compression arrays, which
hold the mapping from the uncompressed indices to the compressed
indices, occupy exactly 2N words, where N is the number of particles. 

	Of course, the preceding remarks apply to the group-group
nonbonded interaction list as well; there, the relevant CHARMM
pairlist is IGLO14/ING14 and the routine that constructs it is
NBEXCL.SRC:MAKGRP.

	Compressed-table representations of the TYPE tables are always
constructed for both atoms and groups after the corresponding CHARMM
pairlists are built.  This way, it is always guaranteed that the two
will return the same answers.  The time required by the conversion is
bounded by O(N^3), but because of an efficient hashing scheme the
expected running time is O(N).  For the barnase test it took under 2
seconds on a Convex.

	The output generated during MAKINB or MAKGRP includes hash
statistics.  If the hash functions operate efficiently, the number of
"hash matches" should nearly match the "actual matches" for a given
class of matches.  If the hash functions turn out to be bad for a
given system, then the number of hash matches will greatly exceed the
number of actual matches for some class.

CHARMM .doc Homepage


Information and HTML Formatting Courtesy of:

NIH/DCRT/Laboratory for Structural Biology
FDA/CBER/OVRR Biophysics Laboratory