TEMPy.math package

Submodules

TEMPy.math.cluster module

class Cluster

Bases: object

A class for clustering an ensemble of Structure Instances.

RMSD_ensemble(rank_fit_ensemble, ensemble_list, CA=True)

Calculates the pairwise RMSD matrix for all Structure Instances in the ensemble.

Arguments:
rank_fit_ensemble

Ensemble of Structure Instances ranked using cluster.rank_fit_ensemble.

ensemble_list

Input list of Structure Instances

CA

Set to True if only the C-alpha RMSD is needed. Default is True.

Return:

A numpy array
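As a rough illustration of what a pairwise RMSD matrix contains, the sketch below computes one for toy coordinate lists. It is not TEMPy code: the function names are hypothetical, models are plain lists of (x, y, z) tuples rather than Structure Instances, no superposition is performed, and the result is a nested list rather than a numpy array.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    total = sum((a - b) ** 2
                for point_a, point_b in zip(coords_a, coords_b)
                for a, b in zip(point_a, point_b))
    return math.sqrt(total / len(coords_a))

def pairwise_rmsd_matrix(ensemble_coords):
    """Symmetric matrix (nested lists) of RMSDs between all pairs of models."""
    n = len(ensemble_coords)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            value = rmsd(ensemble_coords[i], ensemble_coords[j])
            matrix[i][j] = matrix[j][i] = value
    return matrix

# Three toy "models", each with two C-alpha positions (hypothetical data).
models = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)],
    [(0.0, 4.0, 0.0), (1.0, 4.0, 0.0)],
]
rmsd_matrix = pairwise_rmsd_matrix(models)
```

The diagonal is zero and the matrix is symmetric, so only the upper triangle needs to be computed.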

cluster_fit_ensemble_top_fit(ensemble_list, score, rms_cutoff, res_target_map, sigma_coeff, number_top_mod=0, write=False, targetMap=False)

RMSD clustering of multiple “fits”, starting from the best-scoring model according to a chosen score. The fits are clustered based on C-alpha RMSD.

Arguments:
ensemble_list

Input list of Structure Instances.

targetMap

Target Map Instance.

score

Scoring function to use. See the ScoringFunctions class for a list of the available scoring functions. E.g. set score='CCC' to use the cross-correlation coefficient.

Score options are:

i 'CCC' - Cross-correlation coefficient;

ii 'LAP' - Laplacian-filtered cross-correlation coefficient: useful for maps with resolutions worse than 10-15 A;

iii 'MI' - Mutual information score: a good and robust score but relatively slow to calculate;

iv 'ENV' - Envelope score: the fastest score to calculate due to binarisation of the map;

v-vii 'NV', 'NV_Sobel', 'NV_Laplace' - Normal vector score: a vector-based surface superimposition score with or without Sobel/Laplace filter;

viii 'CD' - Chamfer Distance: a score used in computer vision algorithms as a fast similarity metric.

rms_cutoff

Float. The C-alpha RMSD cutoff used to cluster the solutions, e.g. 3.5 (for 3.5 A).

res_target_map

The resolution, in Angstroms, of the target map.

sigma_coeff

The sigma value (multiplied by the resolution) that controls the width of the Gaussian.

Default value is 0.356. Other values used:

0.187R, corresponding to the Gaussian width of the Fourier transform falling to half the maximum at 1/resolution, as used in Situs (Wriggers et al, 1999);

0.225R, which makes the Fourier transform of the distribution fall to 1/e of its maximum value at wavenumber 1/resolution, the default in Chimera (Pettersen et al, 2004);

0.356R, corresponding to the Gaussian width at 1/e maximum height equaling the resolution, an option in Chimera (Pettersen et al, 2004);

0.425R, the full width at half maximum being equal to the resolution, as used by FlexEM (Topf et al, 2008);

0.5R, the distance between the two inflection points being the same length as the resolution, an option in Chimera (Pettersen et al, 2004);

1R, where the sigma value is simply equal to the resolution, as used by NMFF (Tama et al, 2004).

number_top_mod

Number of fits to cluster. The default, 0, clusters all fits.

write

If True, write out a file listing the Structure Instances that represent the different fits, scored and clustered. Note that the lrms column is the C-alpha RMSD of each fit from the first fit in its class.
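The clustering described above (start from the best-scoring fit, group everything within the RMSD cutoff, repeat with the next unassigned fit) can be sketched as follows. This is a minimal illustration under assumptions, not the TEMPy implementation: the function name is hypothetical, higher scores are assumed to be better, and the RMSD matrix is passed in as nested lists.

```python
def cluster_from_top_fit(scores, rmsd_matrix, rms_cutoff):
    """Greedy clustering: each cluster is seeded by the best unassigned fit.

    Returns a list of clusters; each cluster is a list of
    (fit_index, rmsd_to_cluster_seed) pairs, seed first.
    Assumption: a higher score means a better fit.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    assigned = set()
    clusters = []
    for seed in order:
        if seed in assigned:
            continue
        assigned.add(seed)
        members = [(seed, 0.0)]
        for other in order:
            if other not in assigned and rmsd_matrix[seed][other] <= rms_cutoff:
                assigned.add(other)
                members.append((other, rmsd_matrix[seed][other]))
        clusters.append(members)
    return clusters

# Hypothetical scores and pairwise C-alpha RMSDs (in Angstroms) for four fits.
scores = [0.9, 0.8, 0.7, 0.6]
rmsds = [
    [0.0, 1.0, 8.0, 9.0],
    [1.0, 0.0, 8.5, 9.5],
    [8.0, 8.5, 0.0, 2.0],
    [9.0, 9.5, 2.0, 0.0],
]
clusters = cluster_from_top_fit(scores, rmsds, rms_cutoff=3.5)
```

The second element of each pair plays the role of the lrms column: the RMSD of a fit from the first fit in its class.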

rank_fit_ensemble(ensemble_list, score, res_target_map, sigma_coeff, number_top_mod=0, write=False, targetMap=False, cont_targetMap=None)

RMSD clustering of multiple “fits” according to a chosen score. The fits are clustered based on C-alpha RMSD, starting from the best-scoring model.

Arguments:
ensemble_list

Input list of Structure Instances.

targetMap

Target Map Instance.

score

Scoring function to use. See the ScoringFunctions class for a list of the available scoring functions. E.g. set score='CCC' to use the cross-correlation coefficient.

Score options are:

i 'CCC' - Cross-correlation coefficient;

ii 'LAP' - Laplacian-filtered cross-correlation coefficient: useful for maps with resolutions worse than 10-15 A;

iii 'MI' - Mutual information score: a good and robust score but relatively slow to calculate;

iv 'ENV' - Envelope score: the fastest score to calculate due to binarisation of the map;

v-vii 'NV', 'NV_Sobel', 'NV_Laplace' - Normal vector score: a vector-based surface superimposition score with or without Sobel/Laplace filter;

viii 'CD' - Chamfer Distance: a score used in computer vision algorithms as a fast similarity metric.

rms_cutoff

Float. The C-alpha RMSD cutoff used to cluster the solutions, e.g. 3.5 (for 3.5 A).

res_target_map

The resolution, in Angstroms, of the target map.

sigma_coeff

The sigma value (multiplied by the resolution) that controls the width of the Gaussian. Default value is 0.356.

Other values used:

0.187R, corresponding to the Gaussian width of the Fourier transform falling to half the maximum at 1/resolution, as used in Situs (Wriggers et al, 1999);

0.225R, which makes the Fourier transform of the distribution fall to 1/e of its maximum value at wavenumber 1/resolution, the default in Chimera (Pettersen et al, 2004);

0.356R, corresponding to the Gaussian width at 1/e maximum height equaling the resolution, an option in Chimera (Pettersen et al, 2004);

0.425R, the full width at half maximum being equal to the resolution, as used by FlexEM (Topf et al, 2008);

0.5R, the distance between the two inflection points being the same length as the resolution, an option in Chimera (Pettersen et al, 2004);

1R, where the sigma value is simply equal to the resolution, as used by NMFF (Tama et al, 2004).

number_top_mod

Number of fits to cluster. The default, 0, clusters all fits.

write

If True, write out a file listing the Structure Instances that represent the different fits, scored and clustered. Note that the lrms column is the C-alpha RMSD of each fit from the first fit in its class.
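Several of the sigma_coeff values listed for this method follow directly from the Gaussian definitions they quote. The sketch below reproduces the ones whose derivation is unambiguous and shows how a coefficient turns into a sigma in Angstroms. The names SIGMA_COEFFS and gaussian_sigma are made up for this illustration and are not part of TEMPy.

```python
import math

# For a Gaussian exp(-x^2 / (2 sigma^2)), the Fourier transform is proportional
# to exp(-2 pi^2 sigma^2 k^2). Solving each stated condition for sigma / R:
SIGMA_COEFFS = {
    # Fourier transform falls to 1/2 of its maximum at k = 1/R
    "fourier_half_max": math.sqrt(math.log(2) / 2) / math.pi,
    # Fourier transform falls to 1/e of its maximum at k = 1/R
    "fourier_one_over_e": 1 / (math.pi * math.sqrt(2)),
    # Full width at half maximum (2 sqrt(2 ln 2) sigma) equals R
    "fwhm": 1 / (2 * math.sqrt(2 * math.log(2))),
    # Distance between the inflection points (2 sigma) equals R
    "inflection_points": 0.5,
}

def gaussian_sigma(resolution, sigma_coeff=0.356):
    """Sigma of the Gaussian, in the same units as the resolution."""
    return sigma_coeff * resolution

# Sigma for a 10 A map with the 0.356 coefficient quoted as the default.
sigma_10A = gaussian_sigma(10.0)
```

Rounded to three decimals, the first three entries give 0.187, 0.225 and 0.425, matching the listed values.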

TEMPy.math.consensus module

class Consensus

Bases: object

A class for clustering an ensemble of Structure Instances.

vote(ensemble_list, score_list, res_target_map, sigma_coeff, number_top_mod=0, write=False, targetMap=False)

Borda consensus scoring calculation between multiple “fits” using a user-defined set of scores. The Borda count is a single-winner election method in which voters rank candidates in order of preference.

Arguments:
ensemble_list

Input list of Structure Instances.

score_list

Input list of scoring functions to use.

See the ScoringFunctions class for a list of the available scoring functions. E.g. set score='CCC' to use the cross-correlation coefficient.

Score options are:

i 'CCC' - Cross-correlation coefficient;

ii 'LAP' - Laplacian-filtered cross-correlation coefficient: useful for maps with resolutions worse than 10-15 A;

iii 'MI' - Mutual information score: a good and robust score but relatively slow to calculate;

iv 'ENV' - Envelope score: the fastest score to calculate due to binarisation of the map;

v-vii 'NV', 'NV_Sobel', 'NV_Laplace' - Normal vector score: a vector-based surface superimposition score with or without Sobel/Laplace filter;

viii 'CD' - Chamfer Distance: a score used in computer vision algorithms as a fast similarity metric.

res_target_map

The resolution, in Angstroms, of the target map.

sigma_coeff

The sigma value (multiplied by the resolution) that controls the width of the Gaussian.

Default value is 0.356.

Other values used:

0.187R, corresponding to the Gaussian width of the Fourier transform falling to half the maximum at 1/resolution, as used in Situs (Wriggers et al, 1999);

0.225R, which makes the Fourier transform of the distribution fall to 1/e of its maximum value at wavenumber 1/resolution, the default in Chimera (Pettersen et al, 2004);

0.356R, corresponding to the Gaussian width at 1/e maximum height equaling the resolution, an option in Chimera (Pettersen et al, 2004);

0.425R, the full width at half maximum being equal to the resolution, as used by FlexEM (Topf et al, 2008);

0.5R, the distance between the two inflection points being the same length as the resolution, an option in Chimera (Pettersen et al, 2004);

1R, where the sigma value is simply equal to the resolution, as used by NMFF (Tama et al, 2004).

number_top_mod

Number of fits to cluster. The default, 0, clusters all fits.

write

If True, write out a file listing the Structure Instances that represent the different fits, scored and clustered. Note that the lrms column is the C-alpha RMSD of each fit from the first fit in its class.

targetMap

Target Map Instance.

vote_list(score_lists)

Borda consensus scoring calculation between multiple “fits” using a user-defined set of scores. The Borda count is a single-winner election method in which voters rank candidates in order of preference.

Arguments:
ensemble_list

Input list of Structure Instances.

score_list

Input list of lists. Each inner list is a list of Structure Instances associated with a score.
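A Borda count over several ranked lists can be sketched as below. This is an illustration only: the function name is hypothetical, fits are represented by string labels rather than Structure Instances, each inner list is assumed to be ordered best first, and the points scheme (n - 1 for first place down to 0 for last) is the textbook one, which may differ from what TEMPy uses.

```python
def borda_count(rankings):
    """Combine several best-first rankings into one consensus ranking.

    A candidate in position p (0-based) of a ranking of length n earns
    n - 1 - p points; points are summed over all rankings.
    Returns (candidate, points) pairs, highest total first.
    """
    totals = {}
    for ranking in rankings:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            totals[candidate] = totals.get(candidate, 0) + (n - 1 - position)
    # Ties are broken alphabetically so the output is deterministic.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical rankings of three fits under three different scores.
rankings = [
    ["fit_a", "fit_b", "fit_c"],
    ["fit_b", "fit_a", "fit_c"],
    ["fit_b", "fit_c", "fit_a"],
]
consensus = borda_count(rankings)
```

Here fit_b wins with 5 points even though it is not ranked first by every score, which is the point of a consensus vote.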

vote_mode(ensemble_list, score_list, res_target_map, sigma_coeff, number_top_mod=0, write=False, targetMap=False)

Mode consensus scoring calculation between multiple “fits” using a user-defined set of scores.

Arguments:
ensemble_list

Input list of Structure Instances.

score_list

Input list of scoring functions to use.

See the ScoringFunctions class for a list of the available scoring functions. E.g. set score='CCC' to use the cross-correlation coefficient.

Score options are:

i 'CCC' - Cross-correlation coefficient;

ii 'LAP' - Laplacian-filtered cross-correlation coefficient: useful for maps with resolutions worse than 10-15 A;

iii 'MI' - Mutual information score: a good and robust score but relatively slow to calculate;

iv 'ENV' - Envelope score: the fastest score to calculate due to binarisation of the map;

v-vii 'NV', 'NV_Sobel', 'NV_Laplace' - Normal vector score: a vector-based surface superimposition score with or without Sobel/Laplace filter;

viii 'CD' - Chamfer Distance: a score used in computer vision algorithms as a fast similarity metric.

res_target_map

The resolution, in Angstroms, of the target map.

sigma_coeff

The sigma value (multiplied by the resolution) that controls the width of the Gaussian.

Default value is 0.356.

Other values used:

0.187R, corresponding to the Gaussian width of the Fourier transform falling to half the maximum at 1/resolution, as used in Situs (Wriggers et al, 1999);

0.225R, which makes the Fourier transform of the distribution fall to 1/e of its maximum value at wavenumber 1/resolution, the default in Chimera (Pettersen et al, 2004);

0.356R, corresponding to the Gaussian width at 1/e maximum height equaling the resolution, an option in Chimera (Pettersen et al, 2004);

0.425R, the full width at half maximum being equal to the resolution, as used by FlexEM (Topf et al, 2008);

0.5R, the distance between the two inflection points being the same length as the resolution, an option in Chimera (Pettersen et al, 2004);

1R, where the sigma value is simply equal to the resolution, as used by NMFF (Tama et al, 2004).

number_top_mod

Number of fits to cluster. The default, 0, clusters all fits.

write

If True, write out a file listing the Structure Instances that represent the different fits, scored and clustered. Note that the lrms column is the C-alpha RMSD of each fit from the first fit in its class.

targetMap

Target Map Instance.
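The description of vote_mode does not spell out how the mode is taken. One plausible reading, sketched here purely as an assumption, is that each fit receives the rank it is given most often across the scores. The function name, the string labels and the tie-breaking rule (prefer the better rank) are all hypothetical.

```python
from collections import Counter

def modal_rank_consensus(rankings):
    """Assign each candidate its most frequent rank across best-first rankings.

    Ranks are 1-based. If several ranks are equally frequent for a candidate,
    the smallest (best) one is used. Returns (candidate, modal_rank) pairs,
    best modal rank first.
    """
    ranks = {}
    for ranking in rankings:
        for position, candidate in enumerate(ranking, start=1):
            ranks.setdefault(candidate, []).append(position)
    modal = {}
    for candidate, positions in ranks.items():
        counts = Counter(positions)
        highest = max(counts.values())
        modal[candidate] = min(r for r, c in counts.items() if c == highest)
    return sorted(modal.items(), key=lambda item: (item[1], item[0]))

# Hypothetical rankings of three fits under three different scores.
rankings_by_score = [
    ["fit_a", "fit_b", "fit_c"],
    ["fit_a", "fit_b", "fit_c"],
    ["fit_b", "fit_a", "fit_c"],
]
mode_consensus = modal_rank_consensus(rankings_by_score)
```

Unlike a Borda count, a single outlying ranking does not move a fit here as long as the majority of scores agree on its position.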

TEMPy.math.quaternion module

class Quaternion(q_list)

Bases: object

A class representing quaternions.

conjuate(q)

Return a conjugate of a quaternion.

Arguments:
q

A list of type [w,x,y,z] to represent a quaternion vector. NOTE: The argument seems to be unused. It will have to be removed and the method retested.

copy()

Return an instance of the Quaternion object.

mag()

Return the magnitude of the quaternion.

multiply_3(obj1, obj2, obj3)

Return a quaternion object which is the product of three quaternions.

Arguments:
obj1

A list of type [w,x,y,z] to represent a quaternion vector.

obj2

A list of type [w,x,y,z] to represent a quaternion vector.

obj3

A list of type [w,x,y,z] to represent a quaternion vector.
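For reference, the product of three quaternions in [w,x,y,z] list form can be sketched with the Hamilton product. This is not the TEMPy code: the helper names are hypothetical, and the left-to-right order obj1 * obj2 * obj3 is an assumption (quaternion multiplication is associative but not commutative, so the order matters while the grouping does not).

```python
def quat_multiply(q1, q2):
    """Hamilton product of two quaternions given as [w, x, y, z]."""
    w1, x1, y1, z1 = q1
    w2, x2, y2, z2 = q2
    return [
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    ]

def quat_multiply_3(q1, q2, q3):
    """Product q1 * q2 * q3, evaluated left to right."""
    return quat_multiply(quat_multiply(q1, q2), q3)

# The basis quaternions i, j, k satisfy i * j = k and i * j * k = -1.
i = [0, 1, 0, 0]
j = [0, 0, 1, 0]
k = [0, 0, 0, 1]
ij = quat_multiply(i, j)
ijk = quat_multiply_3(i, j, k)
```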

normalise(tolerance=1e-05)

Return a normalised quaternion vector.

to_rotation_matrix()

Convert the quaternion vector to a rotation matrix and return the rotation matrix.
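The standard conversion from a quaternion [w,x,y,z] to a 3x3 rotation matrix is sketched below. The function name is hypothetical, the quaternion is normalised first, and the convention (right-handed rotation applied to column vectors) is an assumption that may not match TEMPy's.

```python
import math

def quat_to_rotation_matrix(q):
    """3x3 rotation matrix (nested lists) for a quaternion [w, x, y, z]."""
    w, x, y, z = q
    norm = math.sqrt(w * w + x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("cannot convert a zero quaternion")
    w, x, y, z = w / norm, x / norm, y / norm, z / norm
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

# A 90 degree rotation about the z axis: w = cos(45 deg), z = sin(45 deg).
half_angle = math.radians(90.0) / 2
rot_z_90 = quat_to_rotation_matrix([math.cos(half_angle), 0.0, 0.0, math.sin(half_angle)])
```

Under the stated convention this matrix sends the x axis to the y axis.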

unit_quat()

Return a unit quaternion.

TEMPy.math.transform_parser module

class TransformParser

Bases: object

A class to read and save transformation matrices

load_matrix(matrixname, mmap_mode=None)

Load one or more arrays from a .npy or .npz file.

Arguments:
matrixname:

.npy matrix. If the filename extension is .gz, the file is first decompressed (see numpy.load for more information).

mmap_mode:

Default None. If not None, the file is memory-mapped using the given mode: 'r', 'r+', 'w+' or 'c', as in numpy.load (see numpy.memmap for a detailed description of the modes). The file is opened in this mode:

'r' Open existing file for reading only.

'r+' Open existing file for reading and writing.

'w+' Create or overwrite existing file for reading and writing.

'c' Copy-on-write: assignments affect data in memory, but changes are not saved to disk. The file on disk is read-only.

A memory-mapped array is kept on disk. However, it can be accessed and sliced like any ndarray. Memory mapping is especially useful for accessing small fragments of large files without reading the entire file into memory.

save_npy_matrix(file, arr)

Save an array to a binary file in NumPy .npy format.

Arguments:
file

File or filename to which the data is saved. If file is a file-object, then the filename is unchanged. If file is a string, a .npy extension will be appended to the file name if it does not already have one.

arr

array_like. Array data to be saved.

save_npz_matrix(file)

Save several arrays into a single file in uncompressed .npz format. (See numpy.savez for more information)
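The descriptions in this module refer to numpy.load and numpy.savez, and the save_npy_matrix description matches numpy.save. The sketch below calls those NumPy functions directly rather than TransformParser, whose exact behaviour is not shown here. The file names and the keyword names rot and trans are made up for the example.

```python
import os
import tempfile

import numpy as np

# A hypothetical 3 x 4 transformation matrix (rotation plus translation column).
transform = np.array([
    [0.0, -1.0, 0.0, 5.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, -2.5],
])

with tempfile.TemporaryDirectory() as tmpdir:
    base = os.path.join(tmpdir, "fit_transform")
    np.save(base, transform)  # ".npy" is appended to the file name
    npy_path = base + ".npy"

    loaded = np.load(npy_path)                 # read fully into memory
    mapped = np.load(npy_path, mmap_mode="r")  # read-only memory map
    is_memmap = isinstance(mapped, np.memmap)
    first_row = np.array(mapped[0])            # copy one row out of the map
    del mapped                                 # release the map before cleanup

    # Several arrays in one uncompressed .npz archive, stored by keyword name.
    npz_base = os.path.join(tmpdir, "fit_parts")
    np.savez(npz_base, rot=transform[:, :3], trans=transform[:, 3])
    with np.load(npz_base + ".npz") as archive:
        names = sorted(archive.files)
        trans = archive["trans"]
```

With mmap_mode set, only the slices that are actually accessed are read from disk, which is why it suits large matrices.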

TEMPy.math.vector module

class Vector(x, y, z)

Bases: object

A class representing Cartesian 3-dimensional vectors.

arg(vector)

Return the argument (angle) between this and another vector, in radians.

copy()
Return:

A copy of Vector instance

cross(vector)
Return:

A Vector instance of the cross product of this and another vector specified as input parameter

dist(vector)
Return:

The distance between this and another vector specified as input parameter.

dot(vector)
Return:

The dot product of this and another vector specified as input parameter.

matrix_transform(rot_mat)

Transform the vector using a transformation matrix.

Arguments:
rot_mat

a 3x3 Python matrix instance.

Return:

A vector instance

mod()
Return:

The modulus (length) of the vector.

reverse()

Flip the direction of a Vector instance.

Return:

A Vector instance

times(factor)

Multiplies a Vector instance by a scalar factor.

Return:

A Vector instance

to_atom()

Create an Atom instance based on Vector instance.

Return:

Atom instance

translate(x, y, z)

Translate a Vector instance.

Arguments:
x, y, z

distance in Angstroms in respective Cartesian directions to translate vector.

Return:

Vector instance.

unit()
Return:

Vector instance of a unit vector.
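The basic operations documented for Vector (dot, cross, mod, unit, dist and arg) correspond to the usual formulas, sketched here on plain 3-tuples. These free functions are hypothetical stand-ins for illustration and do not reproduce the Vector class itself.

```python
import math

def dot(a, b):
    """Dot product of two 3-tuples."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    """Cross product a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def mod(a):
    """Modulus (length) of a vector."""
    return math.sqrt(dot(a, a))

def unit(a):
    """Vector of length 1 pointing in the same direction as a."""
    length = mod(a)
    return tuple(component / length for component in a)

def dist(a, b):
    """Euclidean distance between the points a and b."""
    return mod(tuple(x - y for x, y in zip(a, b)))

def arg(a, b):
    """Angle between a and b in radians."""
    cosine = dot(a, b) / (mod(a) * mod(b))
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain.
    return math.acos(max(-1.0, min(1.0, cosine)))

x_axis = (1.0, 0.0, 0.0)
y_axis = (0.0, 1.0, 0.0)
right_angle = arg(x_axis, y_axis)
z_axis = cross(x_axis, y_axis)
```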

align_2seqs(seq1, seq2)

altTorsion(a, b, c)

An alternate and better way to find the torsion angle between planes ab and bc.

Arguments:
a,b,c

Vector instances.

Return:

The torsion angle (radians)

axis_angle_to_euler(x, y, z, turn, rad=False)

Converts the axis angle rotation to an Euler form.

Arguments:
x, y, z

axis of rotation (does not need to be normalised).

turn

angle of rotation, in radians if rad=True, else in degrees.

Returns:

A 3-tuple (x,y,z) containing the Euler angles.

axis_angle_to_matrix(x, y, z, turn, rad=False)

Converts the axis angle rotation to a matrix form.

Arguments:
x, y, z

axis of rotation (does not need to be normalised).

turn

angle of rotation, in radians if rad=True, else in degrees.

Return:

A 3X3 transformation matrix.
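An axis-angle rotation can be turned into a 3x3 matrix with Rodrigues' rotation formula, sketched below. The function name is hypothetical, and the sign convention (right-handed rotation applied to column vectors) is an assumption. As in the description above, the axis is normalised internally and the angle is read as degrees unless rad=True.

```python
import math

def axis_angle_matrix(x, y, z, turn, rad=False):
    """3x3 rotation matrix (nested lists) for a rotation of `turn` about (x, y, z)."""
    if not rad:
        turn = math.radians(turn)
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("rotation axis must be non-zero")
    x, y, z = x / norm, y / norm, z / norm
    c = math.cos(turn)
    s = math.sin(turn)
    t = 1.0 - c
    return [
        [t * x * x + c, t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, t * y * y + c, t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, t * z * z + c],
    ]

# 90 degrees about an unnormalised z axis.
rot = axis_angle_matrix(0.0, 0.0, 2.0, 90.0)
# The same rotation with the angle given in radians.
rot_rad = axis_angle_matrix(0.0, 0.0, 1.0, math.pi / 2, rad=True)
```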

calcMtrx(arr)

Calculate a 3 x 4 transformation matrix from Euler angles and offset.

Arguments:
arr

[psi,theta,phi,offsetx,offsety,offsetz].

Returns:

3 x 4 transformation matrix

cps(mat_1, mat_2)

Find the rotation and translation difference between two transformations.

Arguments:
mat_1,mat_2

Transformation matrices.

Returns:

The translation and rotation differences

euler_to_matrix(x_turn, y_turn, z_turn, rad=False)

Converts an Euler rotation to a matrix form.

Arguments:
x_turn, y_turn, z_turn

Rotation angles around the respective axes, in radians if rad=True, else in degrees.

Return:

A 3X3 transformation matrix.

random_vector(min_v, max_v)

Generate a random vector. The values for the vector component x, y, and z are randomly sampled between minimum and maximum values specified.

Arguments:
min_v, max_v

minimum and maximum value

Return:

A Vector instance.

random_vector2(ul_list)

torsion(a, b, c)

Find the torsion angle between planes ab and bc.

Arguments:
a,b,c

Vector instances.

Returns:

The torsion angle in radians
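A common way to compute a signed torsion angle from three vectors is the atan2 form of the dihedral formula, sketched here. Whether torsion or altTorsion uses this exact formula is not stated above, so treat it as an assumption; the sketch also assumes a, b and c are consecutive bond vectors given as 3-tuples, and the function name is hypothetical.

```python
import math

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def torsion_angle(a, b, c):
    """Signed angle (radians) between the plane of a, b and the plane of b, c."""
    n1 = _cross(a, b)  # normal of the plane spanned by a and b
    n2 = _cross(b, c)  # normal of the plane spanned by b and c
    b_length = math.sqrt(_dot(b, b))
    return math.atan2(b_length * _dot(a, n2), _dot(n1, n2))

# Perpendicular planes give a quarter turn; c folding back over a gives zero.
quarter_turn = torsion_angle((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
cis = torsion_angle((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, 0.0, 0.0))
trans = torsion_angle((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 0.0, 0.0))
```

Using atan2 keeps the sign of the angle and avoids the loss of precision that acos suffers near 0 and pi.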

TEMPy.math.vq module

VQ(D, n, epochs, alpha0=0.5, lam0=False)

Clusters a set of vectors (D) into a number (n) of codebook vectors.
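To give a feel for what clustering vectors into codebook vectors means, here is a minimal online vector quantisation sketch. It is an assumption-laden illustration, not the VQ function itself: the name vq_sketch is hypothetical, the learning rate simply decays linearly from alpha0 to zero, and the lam0 parameter is ignored, so this is plain winner-take-all competitive learning.

```python
import random

def vq_sketch(data, n, epochs, alpha0=0.5, seed=0):
    """Move n codebook vectors towards the data points they are closest to."""
    rng = random.Random(seed)
    # Start from n distinct data points.
    codebook = [list(point) for point in rng.sample(data, n)]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for index in order:
            point = data[index]
            alpha = alpha0 * (1.0 - step / total_steps)  # decaying learning rate
            # The winner is the codebook vector nearest to the current point.
            winner = min(
                codebook,
                key=lambda c: sum((ci - pi) ** 2 for ci, pi in zip(c, point)),
            )
            for axis in range(len(winner)):
                winner[axis] += alpha * (point[axis] - winner[axis])
            step += 1
    return codebook

# Two tight, well-separated toy clusters around (0, 0, 0) and (10, 10, 10).
offsets = [(dx, dy) for dx in (-0.2, 0.2) for dy in (-0.2, 0.2)]
points = [(dx, dy, 0.0) for dx, dy in offsets]
points += [(10.0 + dx, 10.0 + dy, 10.0) for dx, dy in offsets]
codebook = sorted(vq_sketch(points, n=2, epochs=20))
```

With two codebook vectors, one settles inside each cluster, so the pair summarises the eight input points.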

get_VQ_points(emmap, threshold, noOfPoints, epochs, output_file=None, lap_fil=True)
Arguments:
emmap

Map instance to be clustered.

threshold

Voxels with density above this value are used in the VQ run.

noOfPoints

Number of VQ points to output.

epochs

Number of iterations to run the algorithm.

output_file

File to write the output to, in PDB format.

lap_fil

True to Laplacian-filter the map first, False otherwise. Note that filtering the map changes its density values, which is relevant for the threshold parameter.

map_points(emmap, threshold)

write_to_pdb(vq, output_file=None)

Module contents