.. _all_param:

Parameters
==========

SEED user-modifiable parameters are contained in three main input files.

The file ``seed.inp`` (`input_param`_) contains the most frequently modified
input parameters, as they pertain to a specific SEED run (path and name of
structural input files, list of residues forming the binding pocket, switch
between polar and apolar docking, ...).

The ``seed.par`` file (`par_param`_) contains less frequently modified
input/output options and parameters for docking, energy and clustering.
Modification of most of these parameters is recommended only for advanced
users who wish to fine-tune the energy model.

The ``seed_kw.par`` file (`KW_param`_) contains additional parameters that are
specified in a keyword-based format rather than a sequential one. This allows
more flexibility and easier addition of new parameters. In general, all newly
introduced analysis methods and options are specified in this file.

.. _input_param:

Input Parameters
----------------

Here we define all the parameters of the ``seed.inp`` file.

.. _i1:

**i1**
  | first line: name of parameter file (``seed.par``)
  | second line: name of the keyword-based parameter file (``seed_kw.par``)

.. _i2:

**i2**
  name of coordinate file for the receptor (in SYBYL mol2 format)

.. _i3:

**i3**
  | Binding site residue list.
  | First line: number of residues in the binding site.
  | Following lines: residue indices (one per line).

  Note that residues are renumbered sequentially starting from 1 within SEED
  and the residue index refers to this new numbering; if for example ARG38 is
  the first residue of the protein, its index is 1 and not 38. The SEED
  residue indices can be retrieved from ``seed.out`` after the line
  ``Data for the receptor :``. To avoid ambiguity we recommend renumbering the
  residues starting from 1 in the input file. Binding site metal ions have to
  be in the list as well.

.. _i4:

**i4**
  | List of points (*e.g.* ligand heavy atoms of a known ligand-receptor complex structure) in the binding site used to select polar and apolar receptor vectors which satisfy the angle criterion (see :ref:`angle_criterion`) and discard vectors pointing outside of the binding site.
  | First line: number of points (``0``: no removal of vectors using the angle criterion).
  | Following lines: coordinates of the points (one point per line).

.. _i5:

**i5**
  | Vectors for the metal ions in the binding site.
  | Make sure that the residue number of the metal is in the binding site residue list.
  | First line: total number of coordination points.
  | Following lines: atom number of metal / x y z of coordination point (vector extremity)

.. _i6:

**i6**
  | Spherical cutoff for docking:
  | coordinates of the center and radius of a sphere in which the geometric center of the fragment must lie for the position to be accepted. This filter can be disabled by selecting ``n`` instead of ``y`` as first value.
  | ``y``, ``n`` / sphere center / sphere radius

.. _i7:

**i7**
  | Fragment library specifications
  | **First line**: one character specifying the running mode of SEED: :ref:`dock-runmode` (``d``) or only :ref:`energy-runmode` (``e``).
  | **Second line**: the first column contains the path of the fragment mol2 file and the second column allows the selection of apolar, polar docking or both (``a``, ``p``, ``b``). The fragment position is accepted if the total energy (according to the fast energy model) is smaller than a cutoff given in the third column. The second :ref:`clustering` is applied on the poses for which the binding energy of the cluster representative is smaller than a cutoff value specified in the fourth column. In summary:
  | Fragment library filename - apolar docking, polar docking, or both (``a``, ``p``, ``b``) - energy cutoff in kcal/mol - 2nd clustering cutoff in kcal/mol
  | **Third line**: Reading mode, either ``single`` or ``multi``.

  The reading mode is only relevant when using the MPI parallel version and
  only concerns the way the input mol2 library is read.

  With ``single`` SEED expects a single mol2 input file; molecules are read
  from this file by the master rank, which dispatches them to the first
  available rank, balancing the computational load among the processes. This
  is especially important when running Monte Carlo minimization, as the
  variance of the running time per molecule can be large.

  With the ``multi`` option each rank reads from a separate mol2 file. This
  requires the user to split the library beforehand into a number of parts
  equal to the number of ranks. In order to relieve the possible load
  imbalance, we recommend shuffling the library file before splitting it
  (scripts are provided). The ``multi`` option can be useful when reanalyzing
  SEED output poses, as each rank writes to a separate output mol2 file, or
  when running with a limited number of MPI ranks, as with ``single`` the
  master rank only reads and dispatches molecules without doing any
  computation.

  For the serial version the chosen reading mode is inconsequential, as only
  one process will be started.

Since you do not need to modify every parameter, and the default values give
good results in most cases, we recommend modifying a default template rather
than writing an input file from scratch. You can do this here through the
`par_generator`_.

.. _par_param:

Parameter File
--------------

Here we define all the parameters of the ``seed.par`` file.
As mentioned, newly introduced analysis methods make use of the keyword-based
parameter file ``seed_kw.par`` (`KW_param`_). As the latter keyword-based
format (with meaningful defaults) is more flexible and easier to read and
write, for each parameter in ``seed.par`` we have also defined an equivalent
keyword (specified in brackets, with its default value).
The keyword-based format can be used to write an intermediate parameter file
that can be converted to the corresponding ``seed.par`` and ``seed_kw.par``
files with the utilities in the Python module ``seed_param_module.py`` in the
``scripts/python_scripts`` directory.

.. _p1:

**p1 (prot_diel = 2.0)**
  Dielectric constant of the solute (receptor and fragment)

.. _p2:

**p2 (kept_vec_ratio = 1.0 1.0)**
  Ratio of kept vectors for docking : polar / apolar

.. _p3:

**p3 (write_mol2 = n y)**
  | Output control for structure files (two values on the same line).
  | First value: write \*_clus.mol2 file (y/n)
  | Second value: write \*_best.mol2 file (y/n)

.. _p4:

**p4 (write_energy = n y)**
  | Output control for energy table files (two values on the same line).
  | First value: write \*_clus.dat summary table file (y/n)
  | Second value: write \*_best.dat summary table file (y/n)

.. _p5:

**p5 (max_poses = 5 1)**
  | Maximum number of saved clusters and poses (two values on the same line).
  | First value: maximum number of cluster members saved in \*_clus\* output files. Note that this value determines the maximum number of poses per cluster that go through slow energy evaluation.
  | Second value: maximum number of poses saved in \*_best\* output files.

.. _p6:

**p6 (log_out = ./outputs/seed.out)**
  | Filename for output log file. This is the main SEED output file (``seed.out``).
  | The docked fragments are saved in the directory ``./outputs``

.. _p7:

**p7 (coul_grid = w ./scratch/coulomb.grid)**
  write (w) or read (r) Coulombic grid / grid filename

.. _p8:

**p8 (vdw_grid = w ./scratch/vanderwaals.grid)**
  write (w) or read (r) van der Waals grid / grid filename

.. _p9:

**p9 (desol_grid = w ./scratch/desolvation.grid)**
  write (w) or read (r) receptor desolvation grid / grid filename

.. _p10:

**p10 (bump_check_slow = 2.0 0.89 0.6)**
  | Bump checking: used only for slow energy evaluation (three values)
  | n x atoms = maximum tolerated bumps /
  | scaling factor for interatomic distance /
  | severe overlap factor (beta factor in PROTEINS paper)

.. _p11:

**p11 (bump_check_fast = 1.0)**
  van der Waals energy cutoff (kcal/mol): this is used as bump checking for
  the fast energy model.

.. _p12:

**p12 (hbond_geometry = 50.0 100)**
  Angle (deg) and number of points on the sphere around the ideal hydrogen
  bonding vector direction.

.. _p13:

**p13 (num_rotations = 72)**
  Number of fragment rotations around each axis.

.. _p14:

**p14 (angle_criterion = 70.0 10.0 1.2 0.8)**
  Settings for the reduction of the seeding vectors (four values).

  * angle_rmin if distance <= (multipl_fact_rmin\*minDist)
  * angle_rmax if distance >= (multipl_fact_rmax\*maxDist)
  * linear dependence (range between angle_rmin and angle_rmax) for other distances

.. _p15:

**p15 (vdw_probe_radius = 1.83)**
  Van der Waals probe radius for removal of the receptor polar vectors.

.. _p16:

**p16 (coul_grid_sizes = 1 20.0 0.5)**
  | Settings for the Coulombic term in the fast energy model (three values).
  | ``1`` = distance dependent dielectric / grid margin / grid spacing

.. _p17:

**p17 (vdw_grid_sizes = 20.0 0.3)**
  | Settings for the van der Waals term in the fast energy model (two values).
  | grid margin / grid spacing

.. _p18:

**p18 (slow_energy_vdw_cutoff = 12.0 1.0)**
  | Settings for the van der Waals accurate energy model (two values).
  | nonbonding cutoff / grid spacing
  | Note that the Coulombic cutoff for formal charges is automatically set to 1.3 x van_der_Waals_cutoff

.. _p19:

**p19 (apolar_k = -0.333333)**
  | Multiplicative factor (k) for apolar docking to skip evaluation of electrostatics. The van der Waals energy cutoff is:
  | k x Number of fragment atoms, including hydrogen atoms

.. _p20:

**p20 (solv_grid_sizes = 24.0 0.25)**
  | Settings for the solvation grid (two values):
  | grid margin / grid spacing

.. _p21:

**p21 (water_radius = 1.4** | **point_density_SAS = 500** | **solv_diel = 78.5)**
  | Settings for the solvation term evaluation (three values):
  | water radius for solvation / number of points per sphere to generate SAS / solvent dielectric constant

.. _p22:

**p22 (Hydrophobicity_map = 1.0 1.0 1.4 1.0 1.0)**
  | Settings for the hydrophobicity maps (five values):
  | point density (A^-2) on the SAS for apolar vectors on the receptor / point density (A^-2) on the SAS for apolar vectors on the fragment / probe radius to generate SAS for apolar vectors / scaling factor for desolvation / scaling factor for van der Waals interactions

.. _p23:

**p23 (scaling_factors = 1.0 1.0 1.0 1.0)**
  Scaling factors for fast and also accurate energy evaluation (four values):
  van der Waals / electrostatic interaction / receptor desolvation / fragment
  desolvation

Clustering parameters
^^^^^^^^^^^^^^^^^^^^^

The clustering with GSEAL proceeds in two steps: the first clustering yields
large clusters which contain almost overlapping as well as more distant
fragments; the second clustering is done on each cluster found in the first
clustering to eliminate fragments which are very close in space.

.. _p24:

**p24**
  | Non-default similarity weight factors (150 atom elements) for GSEAL:
  | First line: 0 or number of non-default elements
  | Following lines: list (first element number / second element number / value)

.. _p25:

**p25 (gseal1 = 0.9 0.4)**
  | Parameters for first clustering (overall clustering):
  | GSEAL similarity exponential factor / cutoff factor

.. _p26:

**p26 (gseal2 = 0.9 0.9)**
  | Parameters for second clustering (to discard redundant positions):
  | GSEAL similarity exponential factor / cutoff factor

.. _p27:

**p27 (max_clu_poses = 20)**
  Maximal number of poses to be clustered

.. _p28:

**p28 (print_level = 100 1)**
  | Setting for the amount of information to be written to the output ``seed.out``:
  | Maximum number of lines to be written in the output file for the sorted energies and the two clustering procedures /
  | print level (``0`` = lean, ``1`` = adds sorting before postprocessing, ``2`` = adds 2nd clustering).

Force field parameters
^^^^^^^^^^^^^^^^^^^^^^

.. _p29:

**p29 (vdw_params)**
  | Van der Waals radius and energy minimum (absolute value).
  | First line: number of records
  | Following lines: each record contains five values:
  | sequential index / atom type / element number / van der Waals radius / van der Waals energy minimum

.. _p30:

**p30 (hbond_params)**
  | Hydrogen bond distances between donor and acceptor.
  | First line: Default distance for all atom and element types.
  | First block:

  * First line: number of records
  * Following lines: element number i / element number j / donor-acceptor distance

  | Second block:

  * First line: number of records
  * Following lines: atom type i / atom type j / donor-acceptor distance

.. _p31:

**p31 (atomic_weights)**
  | List of relative atomic weights.
  | First line: number of elements (without element 0)
  | element name / element number / atomic weight

.. _KW_param:

Keyword-based parameter file
----------------------------

In order to allow more flexibility and easier addition of SEED parameters, we
have decided to move from the original sequential format of ``seed.par`` to a
keyword-based format. For legacy reasons this only involves the newly added
settings, so that an older ``seed.par`` can be used as is, without any
modification or rewriting.

The new keyword-based parameters should be specified in the format
``<keyword> = <value>``, as for example:

::

  # Additional parameters
  do_mc = y # activates MCSA sampling
  mc_temp = 500
  mc_max_xyz_step = 0.7 0.1

Comments can be introduced by *#* and will be ignored.
Note that some keywords require multiple values.
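
As an illustration of the format only, the following minimal Python sketch
reads such a file into a dictionary. It is not SEED's own parser: the function
name ``parse_seed_kw`` and the extra ``mc_temp = 300`` line are hypothetical,
and the sketch assumes that values are whitespace-separated, that everything
after ``#`` is a comment, and that a repeated keyword keeps its last value.

.. code-block:: python

   from typing import Dict, List


   def parse_seed_kw(text: str) -> Dict[str, List[str]]:
       """Parse ``keyword = value(s)`` lines into a dict of value tokens.

       Hypothetical helper for illustration; SEED's internal parser may
       behave differently in edge cases.
       """
       params: Dict[str, List[str]] = {}
       for raw_line in text.splitlines():
           # Drop the comment part (everything after the first '#').
           line = raw_line.split("#", 1)[0].strip()
           if not line or "=" not in line:
               continue  # skip blank lines and lines without an assignment
           key, _, value = line.partition("=")
           # A later assignment overwrites an earlier one (last instance wins).
           params[key.strip()] = value.split()
       return params


   EXAMPLE = """\
   # Additional parameters
   do_mc = y # activates MCSA sampling
   mc_temp = 500
   mc_max_xyz_step = 0.7 0.1
   mc_temp = 300
   """

   params = parse_seed_kw(EXAMPLE)
   print(params["mc_temp"])          # ['300']
   print(params["mc_max_xyz_step"])  # ['0.7', '0.1']

Values are kept as strings so that the caller decides how to convert them, for
example ``float(params["mc_temp"][0])``.
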
If the same keyword is repeated multiple times in the file, the last instance
will be used.

The additional keyword-based parameter file, which we refer to as
``seed_kw.par``, should always be present (even if blank) and its path has to
be specified in the second line of `i1`_. If a keyword is not specified in
``seed_kw.par``, its default value will be used. The keywords that can be set
are the following (defaults are given in brackets):

.. _MC_param:

Monte Carlo parameters
^^^^^^^^^^^^^^^^^^^^^^

The following parameters are needed for running a Monte Carlo Simulated
Annealing (MCSA) minimization of the top poses. This option can be enabled by
setting `do_mc`_ to ``y`` (yes) and adding the following related keywords. If
`do_mc`_ is set to ``n`` (no), all the additional MC parameters in this
section play no role. See :ref:`mc_minimization` for further details on MCSA.

.. _do_mc:

**do_mc** (n)
  | Perform MCSA refinement? (``y`` / ``n``)

.. _mc_temp:

**mc_temp** (0.0)
  | Starting temperature of MC run.

.. _mc_max_xyz_step:

**mc_max_xyz_step** (0.0, 0.0)
  | Maximum rigid body translation step (in Angstrom): coarse (1st value)
  | and fine (2nd value) moves.

.. _mc_max_rot_step:

**mc_max_rot_step** (0.0, 0.0)
  | Maximum rigid body rotation step (in degrees): coarse (1st value)
  | and fine (2nd value) moves.

.. _mc_rot_freq:

**mc_rot_freq** (0.5)
  | MC move set frequencies:
  | Frequency :math:`p` of rigid body rotation moves (the frequency of
  | rigid body translation moves will be :math:`q = 1 - p`).

.. _mc_xyz_fine_freq:

**mc_xyz_fine_freq** (0.5)
  | Relative frequency (w.r.t. the number of translation moves) of fine translation moves.

.. _mc_rot_fine_freq:

**mc_rot_fine_freq** (0.5)
  | Relative frequency (w.r.t. the number of rotation moves) of fine rotation moves.

.. _mc_niter:

**mc_niter** (0, 0)
  | Number of steps :math:`N_{out}` of the outer MC chain (1st value).
  | Number of steps :math:`N_{in}` of the inner MC chain (2nd value).

.. _mc_sa_alpha:

**mc_sa_alpha** (1.0)
  | Annealing parameter :math:`\alpha`.

.. _mc_rseed:

**mc_rseed** (-1)
  | Seed for the pseudo-random number generator used by the MC sampler. A value of ``-1`` uses the current CPU time.

.. _SD_param:

Steepest Descent parameters
^^^^^^^^^^^^^^^^^^^^^^^^^^^

The following parameters are needed for running a steepest descent (SD)
minimization of the top poses in rigid-body space. This option can be enabled
by setting `do_sd`_ to ``y`` (yes) and specifying the following relevant
keywords (or using the defaults). If `do_sd`_ is set to ``n`` (no), all the
additional SD parameters in this section are ignored. Note that rigid-body SD
minimization is performed after the MCSA minimization (if the latter is
enabled). See :ref:`steepest_descent` for further details on SD.

.. _do_sd:

**do_sd** (n)
  | Perform SD refinement? (``y`` / ``n``)

.. _do_gradient_check:

**do_gradient_check** (n)
  | Compare analytical and numerical gradients and print them to the ``log`` file. This is mainly useful for troubleshooting and debugging.

.. _sd_max_iter:

**sd_max_iter** (20)
  Maximum number of SD iterations.

.. _sd_eps_grms:

**sd_eps_grms** (0.02)
  Stopping threshold on the minimum value of the gradient
  (:math:`\| \boldsymbol{\alpha} \circ \nabla U(\mathbf{x}_i) \|`).

.. _sd_alpha_xyz:

**sd_alpha_xyz** (0.1)
  Base increment size for rigid-body translations. Expressed in Angstrom.

.. _sd_alpha_rot:

**sd_alpha_rot** (0.01)
  Base increment size for rigid-body rotations. Expressed in degrees.

.. _sd_learning_rate:

**sd_learning_rate** (0.1)
  Starting learning rate :math:`\eta_0` for SD.

.. _par_generator:

Parameter File Generator
------------------------

The parameter file generator helps you prepare the input parameter files for
a SEED run: ``seed.inp``, ``seed.par``, and ``seed_kw.par``.
You can load a template with predefined default values (and CHARMM/CGenFF
parameters), edit the user-specific information and save it.
The template for ``seed_kw.par`` shows example settings for a run with
additional MCSA minimization of the poses.

Here you can edit the file with user-specific information. Fields that you
must edit are marked with ``XXXX``.
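
Before starting a run it can be useful to confirm that no placeholder was left
unedited in a saved file. The following minimal Python sketch lists the lines
that still contain the ``XXXX`` marker. The helper name ``find_placeholders``
and the ``TEMPLATE`` text are hypothetical and do not reproduce the actual
template content.

.. code-block:: python

   from typing import List, Tuple


   def find_placeholders(text: str, marker: str = "XXXX") -> List[Tuple[int, str]]:
       """Return (line number, stripped line) for each line containing the marker."""
       hits: List[Tuple[int, str]] = []
       for number, line in enumerate(text.splitlines(), start=1):
           if marker in line:
               hits.append((number, line.strip()))
       return hits


   # Made-up fragment for illustration, not the real seed.inp template.
   TEMPLATE = """\
   seed.par
   seed_kw.par
   XXXX/receptor.mol2
   """

   for number, line in find_placeholders(TEMPLATE):
       print(f"line {number}: {line}")

An empty result means that every marked field has been replaced.
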