Clusters

class structural.clusters.Atom(index: int, mol: moleculekit.molecule.Molecule)[source]

Class to handle ILV atoms.

Parameters
  • index – the VMD atom index

  • mol – an htmd.htmd.molecule object

get_neighbors(mol: moleculekit.molecule.Molecule)numpy.core.multiarray.array[source]

Provides the indices of all atoms within 6.56 Å of this atom. 6.56 Å is the upper bound on the distance to a possible neighbor: 1.88 (C) + 1.4 + 1.4 + 1.88 (C).
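
The cutoff arithmetic and the neighbor search can be sketched with NumPy. This is a minimal illustration, not the library's implementation: `neighbors_within` and the toy coordinates are hypothetical, and reading the two 1.4 terms as a probe radius is an assumption.

```python
import numpy as np

C_RADIUS = 1.88  # carbon radius quoted in the docstring, in Angstrom
PROBE = 1.4      # assumption: the two 1.4 terms are a probe radius
CUTOFF = C_RADIUS + PROBE + PROBE + C_RADIUS  # upper bound, about 6.56


def neighbors_within(coords, index, cutoff=CUTOFF):
    """Return indices of atoms within `cutoff` of atom `index`, itself excluded."""
    dists = np.linalg.norm(coords - coords[index], axis=1)
    mask = dists <= cutoff
    mask[index] = False
    return np.flatnonzero(mask)


# Toy coordinates (hypothetical): atoms at 3, 6 and 7 Angstrom from atom 0.
coords = np.array([
    [0.0, 0.0, 0.0],
    [3.0, 0.0, 0.0],
    [0.0, 6.0, 0.0],
    [7.0, 0.0, 0.0],
])
near = neighbors_within(coords, 0)
```

In this toy case only the atoms at 3 and 6 Å fall inside the cutoff; the atom at 7 Å does not.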

class structural.clusters.Cluster(area, residues, contacts, ratio_contacts_residue, ratio_area_residue)[source]
property area

Alias for field number 0

property contacts

Alias for field number 2

property ratio_area_residue

Alias for field number 4

property ratio_contacts_residue

Alias for field number 3

property residues

Alias for field number 1
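
The "Alias for field number" entries suggest that Cluster behaves like a named tuple, so each field can be read by name or by position. The sketch below rebuilds an equivalent type with the standard library; the field values and their types are made up for illustration, as is the way the two ratios are computed.

```python
from collections import namedtuple

# Same field order as in the signature above.
Cluster = namedtuple(
    "Cluster",
    ["area", "residues", "contacts", "ratio_contacts_residue", "ratio_area_residue"],
)

# Toy values only; none of them come from a real structure.
c = Cluster(
    area=120.0,
    residues=[10, 14, 57],
    contacts=6,
    ratio_contacts_residue=6 / 3,
    ratio_area_residue=120.0 / 3,
)

# Access by name and by position refer to the same field.
same_area = c.area == c[0]
same_residues = c.residues is c[1]
```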

structural.clusters.add_clusters(mol, g: graph_tool.Graph, components: graph_tool.PropertyArray)[source]
Parameters
  • mol

  • g

  • components

Returns

Molecule representations. A list with Cluster objects.

structural.clusters.create_graph(resid_matrix: numpy.core.multiarray.array, resid_list: numpy.core.multiarray.array, cutoff_area: float = 10.0)graph_tool.Graph[source]
Parameters
  • resid_matrix – the ILVresid x ILVresid area matrix

  • resid_list – the index x index area matrix

Returns

A Graph object where each component is an ILV cluster
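
The idea of thresholding the residue area matrix and reading clusters off the connected components can be sketched without graph-tool. This is a plain NumPy and depth-first-search stand-in, not the library's code: `cluster_components` is a hypothetical helper, and treating an area at or above `cutoff_area` as an edge is an assumption.

```python
import numpy as np


def cluster_components(resid_matrix, resid_list, cutoff_area=10.0):
    """Group residues into connected components of the thresholded area graph."""
    # Assumption: two residues share an edge when their area reaches the cutoff.
    adjacency = np.asarray(resid_matrix) >= cutoff_area
    seen, clusters = set(), []
    for start in range(len(resid_list)):
        if start in seen:
            continue
        seen.add(start)
        stack, members = [start], []
        while stack:  # depth-first search over one component
            node = stack.pop()
            members.append(int(resid_list[node]))
            for other in map(int, np.flatnonzero(adjacency[node])):
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        clusters.append(sorted(members))
    return clusters


# Toy symmetric area matrix for four residues (made-up numbers).
resid_list = np.array([5, 9, 12, 30])
resid_matrix = np.array([
    [0.0, 25.0, 0.0, 0.0],
    [25.0, 0.0, 12.0, 0.0],
    [0.0, 12.0, 0.0, 4.0],
    [0.0, 0.0, 4.0, 0.0],
])
clusters = cluster_components(resid_matrix, resid_list)
```

The sketch also reports isolated residues as single-member components; whether the library keeps or drops them is not stated here.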

structural.clusters.fill_matrices(atom: structural.clusters.Atom, mol: moleculekit.molecule.Molecule, atom_matrix: numpy.core.multiarray.array, resid_matrix: numpy.core.multiarray.array, indices: numpy.core.multiarray.array, resids: numpy.core.multiarray.array)Tuple[numpy.core.multiarray.array, numpy.core.multiarray.array][source]
Parameters
  • atom – An Atom class

  • mol – an htmd.htmd.molecule object

  • atom_matrix – the index x index area matrix

  • resid_matrix – the ILVresid x ILVresid area matrix

  • indices – the indices that belong to ILV sidechain heavy atoms

  • resids – the resids that belong to ILV sidechain heavy atoms

Returns

Updated atom_matrix and resid_matrix

structural.clusters.filter_mol(inputmolfile: str)moleculekit.molecule.Molecule[source]

Loads the molecule, filters it to chain A, and writes the filtered pdb out.

Parameters
  • inputmolfile – path to the pdb file

Returns

The Molecule object

structural.clusters.generate_sphere_points(coords: numpy.core.multiarray.array, n: int = 610, radius: float = 1.88)numpy.core.multiarray.array[source]
Parameters
  • coords – The coordinates of the center of the atom

  • n – number of points to sample

  • radius – the radius of the atom

Returns

An n x 3 array with the point coordinates
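
One common way to place n roughly evenly spaced points on a sphere is a golden-spiral (Fibonacci) lattice. The sketch below uses that scheme with the same defaults as the signature above; it is an assumption that it resembles what the library does, and `sphere_points` is a hypothetical name.

```python
import numpy as np


def sphere_points(center, n=610, radius=1.88):
    """Return an (n, 3) array of points spread over a sphere around `center`."""
    i = np.arange(n) + 0.5
    polar = np.arccos(1.0 - 2.0 * i / n)      # polar angle, uniform in cos(polar)
    azimuth = np.pi * (1.0 + 5.0 ** 0.5) * i  # golden-ratio based increments
    unit = np.column_stack((
        np.cos(azimuth) * np.sin(polar),
        np.sin(azimuth) * np.sin(polar),
        np.cos(polar),
    ))
    return np.asarray(center, dtype=float) + radius * unit


center = np.array([1.0, 2.0, 3.0])
pts = sphere_points(center)
```

Every returned point lies at distance `radius` from `center`, so the array can be used directly as a sampled atomic surface.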

structural.clusters.postprocess_session(inputmolfile: str, outputname: str)None[source]

Modifies the VMD session so that it does not include tmp files.

Parameters
  • inputmolfile – path to the pdb file already processed (filtered and/or protonated)

  • outputname – the VMD session (output file)

Returns

None. Modifies the file in place.

structural.clusters.retrieve_indices(matrix: numpy.core.multiarray.array, coords: numpy.core.multiarray.array, neighborpositions: numpy.core.multiarray.array, radius: float = 1.88)numpy.core.multiarray.array[source]

Computes whether each of the n sphere points penetrates a neighboring sphere.

Parameters
  • matrix – n x m distance matrix, where n is the number of sphere points and m the number of neighbors

  • coords – the coordinates of the atom

  • neighborpositions – coordinates of the neighbors

  • radius – radius of the atom

Returns

The atoms that are closest to each of the n sphere points.
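
The penetration test can be pictured as a nearest-neighbor lookup on the n x m distance matrix. The sketch below is one possible reading, not the library's implementation: `closest_penetrated` is hypothetical, a point is assumed to penetrate a neighbor when it lies closer than `radius` to that neighbor's center, and -1 is an arbitrary marker for points that touch no neighbor.

```python
import numpy as np


def closest_penetrated(sphere_pts, neighbor_pos, radius=1.88):
    """For each sphere point, return the index of the neighbor it penetrates, or -1."""
    # n x m matrix of distances between sphere points and neighbor centers
    dist = np.linalg.norm(sphere_pts[:, None, :] - neighbor_pos[None, :, :], axis=2)
    closest = dist.argmin(axis=1)
    # Assumption: "penetrating" means lying inside the closest neighbor's sphere.
    inside = dist[np.arange(len(sphere_pts)), closest] < radius
    return np.where(inside, closest, -1)


# Two surface points of an atom at the origin and two neighbors (toy values).
sphere_pts = np.array([[1.88, 0.0, 0.0], [-1.88, 0.0, 0.0]])
neighbor_pos = np.array([[3.0, 0.0, 0.0], [0.0, 10.0, 0.0]])
hits = closest_penetrated(sphere_pts, neighbor_pos)
```

Here the first point sits 1.12 Å from neighbor 0 and is counted as buried by it, while the second point is farther than `radius` from both neighbors.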

structural.clusters.retrieve_neighbor_positions(atom: structural.clusters.Atom, mol: moleculekit.molecule.Molecule)Tuple[numpy.core.multiarray.array, Dict[int, int]][source]
Parameters
  • atom – an Atom object

  • mol – an htmd.htmd.molecule object

Returns

A tuple object with the positions of the neighboring atoms. A dictionary indexing column positions to resid positions

structural.clusters.write_clusters(g: graph_tool.Graph, components: graph_tool.PropertyArray, inputmolfile: str, outputname: str)None[source]

Prints the clusters to the terminal and outputs them into a VMD session.

Parameters
  • g – a Graph object

  • inputmolfile – the pdb filename

  • outputname – the file name to output the VMD session to

structural.clusters.write_largest_cluster(g: graph_tool.Graph, inputmolfile: str, outputname: str)None[source]

Prints the largest cluster to the terminal and outputs it into a VMD session.

Parameters
  • g – a Graph object

  • inputmolfile – the pdb filename

  • outputname – the file name to output the VMD session to

HH-networks

structural.hh_networks.add_networks(mol, g: graph_tool.Graph, components: graph_tool.PropertyArray)[source]
Parameters
  • mol – the Chimera where to add the representations to.

  • g – a graph-tool graph object with the networks property

  • components – the components from the graph

Returns

Includes the representations in the Chimera object.

structural.hh_networks.make_graph_hh(hbonds: numpy.core.multiarray.array, trajectory: mdtraj.core.trajectory.Trajectory)graph_tool.Graph[source]
Parameters
  • hbonds – a np.array indicating indexes of atoms that have an h-bond

  • trajectory – an mdtraj trajectory. It can also be a pdb file processed with mdtraj as trajectory

Returns

A graph-tool graph object

structural.hh_networks.protonate_mol(inputmolfile: str)moleculekit.molecule.Molecule[source]

Loads, filters the object to chain A and protonates the selection. It then writes the filtered protonated pdb out. :param inputmolfile: path to the pdb file :return: a Molecule object with the correct protonation state

structural.hh_networks.write_networks(g: graph_tool.Graph, components: graph_tool.PropertyArray, inputmolfile: str, outputname: str)None[source]
Parameters
  • g – A graph-tool graph object

  • components – the components of the graph

  • inputmolfile – the pdb of the input protein

  • outputname – a path where to write the output

Returns

Writes a file named outputname + “hh-networks.txt”

Salt-bridges

structural.salt_bridges.write_salt_bridges(data: numpy.ndarray, mapping: pandas.core.frame.DataFrame, mol: moleculekit.molecule.Molecule, outputname: str)None[source]

Outputs the salt bridges into a VMD session.

Parameters
  • data – a MetricDistance object

  • mapping – a DataFrame object including the index-residue mapping

  • mol – the pdb filename

  • outputname – the file prefix name to output the VMD session to. Example: “protein2”

Returns

A file with the VMD session named outputname + “-salt-bridges.txt”

Further structural analysis

structural.analysis.calc_contact_order(chimera: Optional[protlego.builder.chimera.Chimera] = None, filename: Optional[str] = None, diss_cutoff: int = 8)[source]

The contact order of a protein is a measure of the locality of the inter-amino acid contacts in the native folded state. It is computed as the average sequence distance between residues that form contacts below a threshold in the folded protein, divided by the total length of the protein.

Parameters
  • chimera – a Chimera object with n residues

  • filename – path to a pdb file

  • diss_cutoff – the maximum distance in Angstroms between two residues to be in contact. Default is 8 Angstroms

Returns

The contact order (%)
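
The definition above translates into a few lines of NumPy. This is a sketch under stated assumptions rather than the library's code: `contact_order` is a hypothetical helper, each residue is represented by a single coordinate (for example its alpha carbon), and every residue pair closer than the cutoff counts as a contact, including sequence neighbors.

```python
import numpy as np


def contact_order(res_coords, cutoff=8.0):
    """Relative contact order (%) from an (L, 3) array with one point per residue."""
    n_res = len(res_coords)
    dist = np.linalg.norm(res_coords[:, None, :] - res_coords[None, :, :], axis=2)
    i, j = np.triu_indices(n_res, k=1)  # visit each residue pair once
    in_contact = dist[i, j] < cutoff
    if not in_contact.any():
        return 0.0
    # Average sequence separation of contacting pairs, divided by chain length.
    mean_separation = (j - i)[in_contact].mean()
    return 100.0 * mean_separation / n_res


# Toy chain: four residues on a straight line, 5 Angstrom apart.
line = np.array([[0.0, 0.0, 0.0], [5.0, 0.0, 0.0], [10.0, 0.0, 0.0], [15.0, 0.0, 0.0]])
co = contact_order(line)
```

In the toy chain only sequence neighbors are within 8 Å, so the mean separation is 1 and the contact order is 1 / 4 = 25 %.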

structural.analysis.calc_dist_matrix(chimera: Optional[protlego.builder.chimera.Chimera] = None, filename: Optional[str] = None, selection: str = 'residue', type='contacts', plot=False)[source]

Returns a matrix of C-alpha distances for a given pdb.

Parameters
  • chimera – a Chimera object with n residues

  • filename – path to a pdb file

  • selection – how to compute the distance: ‘residue’ (the closest two atoms between two residues) or ‘alpha’ (distance of the alpha carbons)

  • type – between contacts (contact map when distances are below 8 Angstroms) or distances

  • plot – whether to plot the distance matrix. Default is False

Returns

matrix: np.array. An n by n distance matrix.
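
The difference between a distance matrix and a contact map can be shown on toy alpha-carbon coordinates. The sketch covers only the ‘alpha’ style of selection and is not the library's implementation: `alpha_distance_matrix` is hypothetical, `kind` stands in for the `type` argument to avoid shadowing the Python builtin, and the "distances" value name is an assumption.

```python
import numpy as np


def alpha_distance_matrix(ca_coords, kind="contacts", cutoff=8.0):
    """Return an n x n distance matrix, or a boolean contact map below `cutoff`."""
    dist = np.linalg.norm(ca_coords[:, None, :] - ca_coords[None, :, :], axis=2)
    if kind == "contacts":
        return dist < cutoff  # contact map: True where residues are close
    return dist               # raw pairwise distances in Angstrom


# Three alpha carbons on a line, 5 Angstrom apart (toy values).
ca = np.array([[0.0, 0.0, 0.0], [5.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
dmat = alpha_distance_matrix(ca, kind="distances")
cmap = alpha_distance_matrix(ca, kind="contacts")
```

Both outputs are symmetric n by n arrays; the contact map simply thresholds the distances at 8 Å, so residues 0 and 2 (10 Å apart) are not in contact.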

structural.analysis.calc_dssp(chimera: Optional[protlego.builder.chimera.Chimera] = None, filename: Optional[str] = None, simplified: bool = True)[source]

Computes Dictionary of Protein Secondary Structure (DSSP) secondary structure assignments. This function uses the MDTraj compute_dssp implementation as a basis.

Parameters
  • chimera – a Chimera object

  • filename – path to a pdb file

  • simplified – use the simplified 3-category assignment scheme. Otherwise the original 8-category scheme is used

Returns

assignments: np.ndarray. The secondary structure assignment for each residue.

structural.analysis.calc_sasa(chimera: Optional[protlego.builder.chimera.Chimera] = None, filename: Optional[str] = None, probe_radius: float = 0.14, n_sphere_points: int = 960, sasa_type='total')[source]

Computes the Solvent Accessible Surface Area of the protein. This function uses the MDTraj shrake_rupley implementation as a basis.

Parameters
  • chimera – a Chimera object

  • filename – path to a pdb file

  • probe_radius – the radius of the probe, in nm

  • n_sphere_points – the number of points representing the surface of each atom. Higher values lead to more accuracy

  • sasa_type – type of calculation to perform. To select from polar, apolar, or total

Returns

areas: np.array containing the area of the chimera in Angstrom^2

structural.analysis.hhbond_plot(chimera: Optional[protlego.builder.chimera.Chimera] = None, filename: Optional[str] = None)[source]

Computes the hhbond plot of a Chimera object or a file. One of the two inputs must be provided.

Parameters
  • chimera – the chimera from which to compute the hydrogen bond plot

  • filename – a path where to find the pdb file

Returns

A contact map with hydrogen bond as metric