TEMPy.assembly package

Submodules

TEMPy.assembly.assembly module

class Assembly(struct_list)

Bases: object

A class to represent a multi-subunit complex and its corresponding density maps.

build_maps(resolution: float, template_map, sig_coeff: float = 0.356) → None

Build list of maps corresponding to the protein components in the structList.

Arguments:
resolution

Desired resolution of the density map in Angstrom units.

template_map

A map object that will be used as the template to build the maps for the individual components. Usually the input map used for the assembly fitting.

sig_coeff

The sigma value (multiplied by the resolution) that controls the width of the Gaussian. Default value is 0.356.
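As a quick illustration of how the Gaussian width follows from the resolution, the sketch below simply multiplies the two values as described above. The helper name gaussian_sigma is hypothetical and is not part of TEMPy.

```python
def gaussian_sigma(resolution: float, sig_coeff: float = 0.356) -> float:
    """Return the Gaussian sigma (in Angstroms) for a target resolution.

    Hypothetical helper: it only mirrors the documented relationship
    sigma = sig_coeff * resolution.
    """
    if resolution <= 0:
        raise ValueError("resolution must be positive")
    return sig_coeff * resolution


# With the default coefficient, a 10 Angstrom map uses sigma = 0.356 * 10.
sigma_10 = gaussian_sigma(10.0)
```

A larger sig_coeff gives a broader Gaussian and therefore a smoother simulated map at the same nominal resolution.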

combine_maps()

Used to combine the list of map objects into a single map object

combine_structs()

Used to combine the list of structure objects into a single structure object

make_vq_points(threshold, no_of_points, lap_fil, epochs: int = 300)

Cluster the density maps in the assembly object into n points using a vector quantisation algorithm.

Arguments:
threshold

Voxels with density above this value are used in the VQ run.

no_of_points

Number of Vector quantisation points to output.

lap_fil

True if you want to Laplacian filter the map first, False otherwise. Note that filtering the map changes its density values, which is relevant for the threshold parameter.

epochs

Number of iterations to run the Vector quantisation algorithm. Default is set to 300

Return:

A list of vector objects containing the vector quantisation points.
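The idea behind the arguments above can be sketched with a small online vector quantisation loop. This is an illustrative stand-in, not TEMPy's implementation: the function name vq_points_sketch, the ((x, y, z), density) input format, and the linearly decaying learning rate are all assumptions made for the example.

```python
import random


def vq_points_sketch(voxels, threshold, no_of_points, epochs=300, seed=0):
    """Cluster above-threshold voxels into no_of_points codebook vectors.

    voxels is assumed to be a list of ((x, y, z), density) pairs; this input
    format is chosen for the sketch and is not TEMPy's map representation.
    """
    rng = random.Random(seed)
    coords = [xyz for xyz, density in voxels if density > threshold]
    if len(coords) < no_of_points:
        raise ValueError("fewer voxels above threshold than requested points")
    # Initialise the codebook from randomly chosen voxels.
    codebook = [list(c) for c in rng.sample(coords, no_of_points)]
    for epoch in range(epochs):
        # The learning rate decays linearly from 0.5 towards zero.
        rate = 0.5 * (1.0 - epoch / epochs)
        for point in rng.sample(coords, len(coords)):
            # Pull the nearest codebook vector towards the current voxel.
            nearest = min(
                codebook,
                key=lambda c: sum((a - b) ** 2 for a, b in zip(c, point)),
            )
            for i in range(3):
                nearest[i] += rate * (point[i] - nearest[i])
    return [tuple(c) for c in codebook]


# Two well-separated blobs plus one voxel below the threshold.
blob_a = [((x, y, z), 1.0) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
blob_b = [((x + 10, y, z), 1.0) for (x, y, z), _ in blob_a]
points = vq_points_sketch(blob_a + blob_b + [((5, 5, 5), 0.0)], 0.5, 2, epochs=50)
```

In this toy input each blob ends up with one codebook vector, and the low-density voxel at (5, 5, 5) is ignored because it falls below the threshold.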

move_map_and_prot_by_aa(index, rx, ry, rz, ra, tx, ty, tz) → None

Translate and rotate the structure and map objects in the assembly around its centre given an axis and angle.

Arguments:
index

Index of the structure and map list.

rx,ry,rz

Axis to rotate about, e.g. rx,ry,rz = 0,0,1 rotates the structure and map about the z-axis, i.e. within the xy-plane.

ra

Angle (in degrees) to rotate map.

tx,ty,tz

Distance in Angstroms to move structure and map in respective x, y, and z directions.
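The axis and angle arguments above correspond to a standard axis-angle rotation. The sketch below builds the equivalent 3x3 matrix with Rodrigues' formula; the function names are hypothetical and the code does not call TEMPy.

```python
import math


def axis_angle_matrix(rx, ry, rz, ra):
    """Return a 3x3 matrix rotating by ra degrees about the axis (rx, ry, rz)."""
    norm = math.sqrt(rx * rx + ry * ry + rz * rz)
    if norm == 0:
        raise ValueError("rotation axis must be non-zero")
    x, y, z = rx / norm, ry / norm, rz / norm
    angle = math.radians(ra)
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [t * x * x + c, t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, t * y * y + c, t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, t * z * z + c],
    ]


def rotate_point(mat, point):
    """Apply a 3x3 matrix to an (x, y, z) point."""
    return tuple(sum(mat[i][j] * point[j] for j in range(3)) for i in range(3))


# A 90 degree turn about the z-axis moves the x unit vector onto the y-axis.
rotated = rotate_point(axis_angle_matrix(0, 0, 1, 90), (1.0, 0.0, 0.0))
```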

move_map_and_prot_by_euler(index, rx, ry, rz, tx, ty, tz) → None

Translate and rotate the structure and map objects in the assembly around its centre using Euler angles.

Arguments:
index

Index of the structure and map list.

rx,ry,rz

Euler angles used to rotate the structure and map.

tx,ty,tz

Distance in Angstroms to move structure and map in respective x, y, and z directions.

move_map_and_prot_by_mat(index, mat, tx, ty, tz) → None

Translate and rotate the structure and map objects around a pivot given by the centre of mass (CoM), using a translation vector and a rotation matrix respectively.

Arguments:
index

Index of the structure and map list.

mat

3x3 matrix used to rotate structure and map objects.

tx,ty,tz

Distance in Angstroms to move structure and map in respective x, y, and z directions.
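Rotating about a pivot means shifting the coordinates so the pivot sits at the origin, applying the matrix, and shifting back. The sketch below assumes an unweighted centroid as a stand-in for the centre of mass and applies the translation after the rotation; both choices, and the function name, are assumptions for illustration.

```python
def move_about_centre(points, mat, tx, ty, tz):
    """Rotate points about their centroid with a 3x3 matrix, then translate."""
    count = len(points)
    centre = [sum(p[i] for p in points) / count for i in range(3)]
    shift = (tx, ty, tz)
    moved = []
    for p in points:
        # Express the point relative to the pivot before rotating.
        rel = [p[i] - centre[i] for i in range(3)]
        rot = [sum(mat[i][j] * rel[j] for j in range(3)) for i in range(3)]
        moved.append(tuple(rot[i] + centre[i] + shift[i] for i in range(3)))
    return moved


# 90 degree rotation about z, followed by a 5 Angstrom shift along z.
quarter_turn_z = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
moved_points = move_about_centre([(0, 0, 0), (2, 0, 0)], quarter_turn_z, 0, 0, 5)
```

Because the rotation is taken about the centroid, the centroid itself only moves by the translation vector.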

move_map_and_prot_by_quat(index, tx, ty, tz, q_param, mat) → None

Translate the structure objects using a translation vector and rotate them using a quaternion object. Translate and rotate the map objects around a pivot given by the CoM using a translation vector and a rotation matrix respectively.

Arguments:
index

Index of the structure and map list.

tx,ty,tz

Distance in Angstroms to move structure and map in respective x, y, and z directions.

q_param

A list of the form [w, x, y, z] representing the quaternion used for rotation.

mat

3x3 matrix used to rotate structure and map objects.

randomise_structs(max_trans, max_rot, v_grain: int = 30, rad: bool = False) → None

Randomise the position and orientation of the protein components in the structList.

Arguments:
max_trans

Maximum translation permitted

max_rot

Maximum rotation permitted (in degrees if rad=False).

v_grain

Graining level for the generation of random vectors (default=30).

randomise_structs_and_maps(max_trans, max_rot, v_grain: int = 30, rad: bool = False) → None

Randomise the position and orientation of the protein components and their corresponding map objects.

Arguments:
max_trans

Maximum translation permitted

max_rot

Maximum rotation permitted (in degrees if rad=False).

v_grain

Graining level for the generation of random vectors (default=30).

reset_all() → None

Reset the map and structure objects to their initial state.

reset_maps() → None

Undo all the transformations applied to the list of map objects and restore them to their original state.

reset_structs() → None

Translate the list of structure objects back to their initial positions.

write_all_to_files(template_name)

Write all the structure and map objects separately to PDB and MRC formatted files respectively.

Arguments:
template_name

A string representing the prefix of the file name

TEMPy.assembly.gamma module

class GA

Bases: object

A class used for implementing the Genetic Algorithm (GA).

get_ga_pool(assembly, pop_size, max_trans, vq_vec_list, emmap)

Method used to generate initial population for running GA.

Arguments:
assembly

Instance of an Assembly object.

pop_size

Number of members in the population.

max_trans

Translation offset range (in Angstroms) applied to the position of each component generated in the initial population pool.

vq_vec_list

List of Vector objects representing the initial point configuration, which is used to generate the initial population of fits.

Return:

Return an instance of a Population object.

run(runs, no_of_gen, pop_size, selection_method, gof, w_gof, w_clash, prot, ncomp, emmap, resolution, logfile, gasnap, vq_vec_list, mrate, crate, moffset, ncpu=1)

Main method to initiate GA cycle.

Arguments:
runs

Number of GA solutions to generate.

no_of_gen

Number of GA generations to run.

pop_size

Number of members in the GA population.

selection_method

Selection method used to pick members in the population for the purpose of generating new population. Currently should be set to 1 for tournament selection.

gof

Option to specify the Goodness-of-fit function to use. Set it to 1 for Mutual information score or 2 for Cross Correlation Coefficient score.

w_gof

Weighting used for Goodness-of-fit score contribution to the GA fitness score.

w_clash

Weighting used for clash penalty score contribution to the GA fitness score.

prot

Instance of a Structure_BioPy object that contains multiple chains, used as input for building the Assembly object.

ncomp

Number of components in the assembly.

emmap

Instance of a map object.

resolution

Resolution of the map.

logfile

Name of the output logfile.

gasnap

Option used to control the PDB files written during the GA run. Set it to 1 to write each individual member of the population (fit) in every GA generation. The default, ‘dummy’, does not write the individual members of the population in every GA generation.

vq_vec_list

List of Vector objects representing the initial point configuration, which is used to generate the initial population of fits.

mrate

Mutation rate for the mutation operator.

crate

Crossover rate for the crossover operator.

moffset

Translation offset range (in Angstroms) applied to the position of each component generated in the initial population pool.

ncpu

Number of CPUs to use in parallel through Parallel Python.

Return:

The function returns the following items: an instance of the Population object of the final generation; a Structure_BioPy object corresponding to the fittest member in the final generation, together with its simulated map object; and a string containing the best fitness score and the Min, Max, Avg, Std and total fitness score of all fits in the final generation.

score_population(pop, pop_size, gof, w_gof, w_clash, cpu_avil, jobserver, scorer, assembly, emmap, refstruct, ncomp, cvol, template_grid, apix)

Method used to score the members of the population.

Arguments:
pop

Instance of the Population object.

pop_size

Number of members in the population.

gof

Option to specify the Goodness-of-fit function to use. Set it to 1 for Mutual information score or 2 for Cross Correlation Coefficient score.

w_gof

Weighting used for Goodness-of-fit score contribution to the GA fitness score.

w_clash

Weighting used for clash penalty score contribution to the GA fitness score.

cpu_avil

Number of CPUs to use in parallel through Parallel Python.

jobserver

Instance of the Jobserver object used by Parallel Python.

scorer

Instance of the ScoringFunction object.

assembly

Instance of an Assembly object.

emmap

Instance of a map object.

ncomp

Number of components in the assembly.

cvol

List containing the volume values of the individual components. Used in the calculation of the clash score.

template_grid

Map instance to be used as a template for the purpose of calculating the clash score.

apix

Voxel size of the grid.

Return:

Return an instance of a Population object in which the fitness of every member has been scored.

class Genotype(gene_types)

Bases: object

A class to store a collection of VectorGene and QuaternionGene that is used to represent the state of the components in the assembly.

breed(otherGenotype, mutationRate, crossRate)

Generate a child genotype by applying crossover and mutation operations to two selected genotypes.

Arguments:
otherGenotype

An instance of Genotype object.

mutationRate

Rate of mutation used.

crossRate

Rate of crossover used.

Return:

Return a child genotype object.

copy()

Make a copy of the genotype.

Return:

Return a copy of the genotype.

get_fitness()

Returns the fitness value of the genotype.

swap()

Randomly select the genes (VectorGene and QuaternionGene) of two components in a genotype and swap them.

Return:

Return an instance of a genotype after swapping.

uniform_crossover(otherGenotype, mutationRate, crossRate)

Apply a uniform crossover operation between two selected genotypes.

Arguments:
otherGenotype

An instance of Genotype object.

mutationRate

Rate of mutation used.

crossRate

Rate of crossover used.

Return:

Return a child genotype object.
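Uniform crossover decides gene by gene which parent contributes to the child. The sketch below works on plain lists and assumes that a gene is taken from the second parent with probability cross_rate; that reading of the crossover rate, and the function name, are assumptions rather than TEMPy's documented behaviour. Mutation is left out.

```python
import random


def uniform_crossover_sketch(parent_a, parent_b, cross_rate, rng=random):
    """Build a child by choosing each gene from one of the two parents.

    Assumption: with probability cross_rate the gene comes from parent_b,
    otherwise it is kept from parent_a.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same number of genes")
    return [
        gene_b if rng.random() < cross_rate else gene_a
        for gene_a, gene_b in zip(parent_a, parent_b)
    ]


# Genes are shown as labels here; in practice they would be gene objects.
child = uniform_crossover_sketch(["a1", "a2", "a3"], ["b1", "b2", "b3"], 0.5)
```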

class Population

Bases: object

A class to store a collection of Genotype objects.

addGenotype(genotype)

Method to append a genotype object to the population.

avg_fitness()

Returns the average fitness score of the genotypes in the population.

breedNewPop(no_of_iters, curr_iter, mutation_const, crossRate, sel_method, sel_par)

Returns a population object after performing a series of breeding operations on the genotypes in the current population.

breed_1child(mutation_const, crossRate, sel_method, sel_par)

Returns a child genotype after applying crossover and mutation operations.

copy()

Returns a copy of the population object.

getBestScores()

Returns a string containing the best fitness value in the population and the corresponding components of the score (gof, clashscore).

max_fitness()

Returns the best fitness score among the genotypes in the population.

min_fitness()

Returns the worst fitness score among the genotypes in the population.

pickBest()

Returns the fittest genotype in the population.

pickSetOfBest(noOfBestGenotypes)

Returns an instance of a population object containing n fittest genotypes.

Arguments:
noOfBestGenotypes

Number of fittest genotypes to pick.

pickSetOfWorst(noOfBestGenotypes)

Returns an instance of a population object containing the n least fit genotypes.

Arguments:
noOfBestGenotypes

Number of least fit genotypes to pick.

pickWorst()

Returns the least fit genotype in the population.

size()

Returns the number of genotypes in the population.

std_fitness()

Returns the standard deviation of the fitness values of the genotypes in the population.

totalFitnessScore()

Returns a string containing the total, average and the standard deviation of the fitness values in the population.
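The summary statistics described above can be computed from a list of fitness values as follows. This is only a sketch: the function name and the output format are hypothetical, and it assumes a population standard deviation, which may differ from what TEMPy uses.

```python
import statistics


def fitness_summary(fitnesses):
    """Return the total, average and standard deviation as one string."""
    if not fitnesses:
        raise ValueError("population is empty")
    total = sum(fitnesses)
    average = total / len(fitnesses)
    # Assumption: population (not sample) standard deviation.
    std = statistics.pstdev(fitnesses)
    return "total=%.3f avg=%.3f std=%.3f" % (total, average, std)


summary = fitness_summary([1.0, 2.0, 3.0])
```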

class QuaternionGene(ul_list, w, x, y, z)

Bases: object

A class used to represent the rotational state of a component in the assembly and to apply mutation and crossover genetic operators on the component.

check_for_mutations(mutationRate)

Method to apply mutation operator on QuaternionGene based on mutationRate.

Arguments:
mutationRate

Rate of mutation used to decide an application of the mutation operation.

copy()

Returns a copy of a QuaternionGene object.

crossover(otherquat, cross_rate)

Returns one of the two QuaternionGene objects, chosen based on the crossover rate.

Arguments:
otherquat

An instance of a QuaternionGene object.

dot_product(q_param)

Performs the dot product between two quaternions and returns the result as a list of the form [w,x,y,z].

Arguments:
q_param

A list of type [w,x,y,z] used to represent a quaternion.

get_interpolated_quat(q2_param)

Return an interpolated QuaternionGene lying between two different QuaternionGene objects.

Arguments:
q2_param

A list of the form [w,x,y,z] used for the purpose of finding an interpolated quaternion between two quaternions.
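One common way to interpolate between two unit quaternions is spherical linear interpolation (slerp). The sketch below is offered as an illustration of the concept only; whether TEMPy uses slerp, and at which interpolation fraction, is not stated in this reference, so the function name and the default t=0.5 are assumptions.

```python
import math


def slerp_sketch(q1, q2, t=0.5):
    """Interpolate between unit quaternions q1 and q2, given as [w, x, y, z]."""
    dot = sum(a * b for a, b in zip(q1, q2))
    if dot < 0.0:
        # q and -q encode the same rotation; flip to follow the shorter arc.
        q2 = [-c for c in q2]
        dot = -dot
    theta = math.acos(min(1.0, dot))
    if theta < 1e-8:
        # The quaternions are (almost) identical.
        return list(q1)
    sin_theta = math.sin(theta)
    w1 = math.sin((1.0 - t) * theta) / sin_theta
    w2 = math.sin(t * theta) / sin_theta
    return [w1 * a + w2 * b for a, b in zip(q1, q2)]


# Halfway between no rotation and 90 degrees about z is 45 degrees about z.
identity_q = [1.0, 0.0, 0.0, 0.0]
quarter_z = [math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4)]
halfway = slerp_sketch(identity_q, quarter_z)
```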

muate()

Mutation operator that sets a new value for self.param by randomly picking from the list of precomputed quaternions.

to_rotation_matrix()

Method to convert a quaternion to a rotation matrix.

Return:

A rotation matrix
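For reference, the standard conversion from a quaternion [w, x, y, z] to a 3x3 rotation matrix is sketched below. The function name is hypothetical, and the input is normalised first so that non-unit quaternions are also handled; the code does not depend on TEMPy.

```python
import math


def quat_to_matrix(w, x, y, z):
    """Convert a quaternion to a 3x3 rotation matrix (list of rows)."""
    norm = math.sqrt(w * w + x * x + y * y + z * z)
    if norm == 0:
        raise ValueError("quaternion must be non-zero")
    w, x, y, z = w / norm, x / norm, y / norm, z / norm
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


# A 90 degree rotation about the z-axis.
half_angle = math.pi / 4
rot_z = quat_to_matrix(math.cos(half_angle), 0.0, 0.0, math.sin(half_angle))
```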

class Selection

Bases: object

A class to implement the selection procedure used to select two genotypes for the purpose of creating a child genotype.

SUS(pop)
local_select(pop)
migration_select(pop, no_of_subs)
roulette_wheel(pop)

Return an instance of a Genotype object selected from the current population using Roulette wheel selection.

Arguments:
pop

Instance of a Population object.
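Roulette wheel selection picks a member with probability proportional to its fitness. The sketch below works on a plain list of fitness values and assumes they are non-negative with a positive sum; the function name is hypothetical and the code is not taken from TEMPy.

```python
import random


def roulette_select(fitnesses, rng=random):
    """Return the index of a member chosen in proportion to its fitness."""
    total = sum(fitnesses)
    if total <= 0:
        raise ValueError("fitness values must have a positive sum")
    # Spin the wheel: a position in [0, total).
    pick = rng.random() * total
    running = 0.0
    for index, fitness in enumerate(fitnesses):
        running += fitness
        if pick < running:
            return index
    # Guard against floating-point rounding at the upper edge.
    return len(fitnesses) - 1


chosen = roulette_select([1.0, 3.0, 6.0], random.Random(1))
```

Members with zero fitness are never selected by this scheme, which is one reason fitness values are often shifted or scaled before the wheel is built.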

tournament(pop, tourn_size=2)

Return an instance of a Genotype object selected from the current population using Tournament selection.

Arguments:
pop

Instance of a Population object.

tourn_size

Size of the tournament selection.
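Tournament selection draws tourn_size members at random and keeps the fittest of them. The sketch below operates on a list of fitness values and assumes that a higher value means a better fit; the function name is hypothetical.

```python
import random


def tournament_select(fitnesses, tourn_size=2, rng=random):
    """Return the index of the winner of a single tournament."""
    if not 1 <= tourn_size <= len(fitnesses):
        raise ValueError("tourn_size must be between 1 and the population size")
    # Draw distinct contestants, then keep the one with the highest fitness.
    contestants = rng.sample(range(len(fitnesses)), tourn_size)
    return max(contestants, key=lambda index: fitnesses[index])


winner = tournament_select([0.2, 0.9, 0.5, 0.1], tourn_size=2)
```

A larger tournament size increases selection pressure, because weak members are less likely to win a tournament.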

class VectorGene(ul_list, x, y, z)

Bases: object

A class used to record the three dimensional position of a component in the assembly and to apply mutation and crossover genetic operators on them.

check_for_mutations(mutationRate)

Method to apply mutation operator on VectorGene based on mutationRate.

Arguments:
mutationRate

Rate of mutation used to decide an application of the mutation operation.

copy()

Returns a copy of a VectorGene object.

crossover(otherGene, crossRate)

Method to apply crossover operator on VectorGene based on crossover rate.

Arguments:
crossRate

Rate of crossover used to decide an application of the crossover operation.

get_gene_list()

Returns a list of x,y,z coordinates of the vector gene.

mutate()

Mutation operator that modifies the x, y, and z coordinates of the VectorGene.

angle_axis_to_quat(angle, axis)
mapfunc(arg_list)

Add the constant arguments needed to use map.

Arguments:
arg_list

The function to be run, followed by its arguments.

Return:

The result of the function call
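The calling convention described above, where a single list carries the function followed by its arguments, can be sketched as follows. The name mapfunc_sketch is hypothetical and the body is an assumption based only on the description.

```python
def mapfunc_sketch(arg_list):
    """Call arg_list[0] with the remaining items as positional arguments."""
    func, *args = arg_list
    return func(*args)


# Each job bundles a function with its own arguments, so map() can run them.
jobs = [[pow, 2, 3], [max, 4, 9]]
results = list(map(mapfunc_sketch, jobs))
```

Packing the call into one list is useful because map passes exactly one item from the iterable to the worker function.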

move_assembly_components(assembly, trans_list, qrot_list)
move_map_prot(assembly, n, trans, qrot)
move_struct_quat(g, assembly)

Method used to apply the translation and rotation operations on the individual components using the information in a given genotype.

Arguments:
g

An instance of a Genotype object.

assembly

An instance of the Assembly object.

optimise_EM_g(assembly, g, refmap, n_iters=20)

Optimise the assembly fit to the reference map iteratively, using a GMM (each CA atom is a Gaussian), sklearn’s EM fit for GMM, and moving each chain as a rigid body.

quaternion_align_vectors(u, v)

Find the quaternion that aligns vector u with vector v.
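A common construction for such a quaternion uses the cross product of the two directions as the rotation axis and their dot product for the angle. The sketch below is a generic version of that construction, not TEMPy's code; the function name and the handling of antiparallel vectors are assumptions.

```python
import math


def align_quaternion(u, v):
    """Return [w, x, y, z] rotating the direction of u onto the direction of v."""
    def unit(vec):
        norm = math.sqrt(sum(c * c for c in vec))
        if norm == 0:
            raise ValueError("vectors must be non-zero")
        return [c / norm for c in vec]

    def cross(a, b):
        return [
            a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0],
        ]

    u, v = unit(u), unit(v)
    dot = sum(a * b for a, b in zip(u, v))
    if dot < -1.0 + 1e-12:
        # Antiparallel: rotate by 180 degrees about any axis perpendicular to u.
        smallest = min(range(3), key=lambda i: abs(u[i]))
        basis = [1.0 if i == smallest else 0.0 for i in range(3)]
        return [0.0] + unit(cross(u, basis))
    # Unnormalised quaternion (1 + u.v, u x v), then scaled to unit length.
    return unit([1.0 + dot] + cross(u, v))


# Aligning the x-axis with the y-axis is a 90 degree rotation about z.
q_xy = align_quaternion([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```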

rand_quat_vec()
random() → x in the interval [0, 1).
restricted_rand_q(ang)

Sample quaternions with rotation angles less than ang.
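One simple way to draw such a quaternion is to pick a random axis and a random angle below the limit. The sketch below assumes ang is given in radians and draws the angle uniformly, which does not give a uniform distribution over rotations; the function name, the units and the sampling scheme are all assumptions.

```python
import math
import random


def restricted_random_quat(ang, rng=random):
    """Return a unit quaternion [w, x, y, z] with rotation angle below ang.

    Assumption: ang is in radians.
    """
    # A random direction from normalised Gaussian samples.
    norm = 0.0
    while norm < 1e-12:
        axis = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in axis))
    axis = [c / norm for c in axis]
    # random() lies in [0, 1), so the angle stays strictly below ang.
    angle = rng.random() * ang
    half = angle / 2.0
    return [math.cos(half)] + [math.sin(half) * c for c in axis]


small_rotation = restricted_random_quat(0.5, random.Random(7))
```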

score_pop_segment_CCC(pop_seg, scorer, assembly, emmap, ncomp, cvol, template_grid, apix, w_gof, w_clash)

Method used to score the genotypes (fits) in a population segment using the Cross Correlation Coefficient score.

Arguments:
pop_seg

An instance of a Population object.

scorer

An instance of the ScoringFunction object.

assembly

An instance of the Assembly object.

emmap

An instance of the map object.

ncomp

Number of components in the assembly.

cvol

List containing the volume values of the individual components. Used in the calculation of the clash score.

template_grid

Map instance to be used as a template for the purpose of calculating the clash score.

apix

Voxel size of the grid.

w_gof

Weighting used for Goodness-of-fit score contribution to the GA fitness score.

w_clash

Weighting used for clash penalty score contribution to the GA fitness score.

Return:

Return an instance of a Population object in which the fitness of every member has been scored.

score_pop_segment_MI(pop_seg, scorer, assembly, emmap, refstruct, ncomp, cvol, template_grid, apix, w_gof, w_clash)

Method used to score the genotypes (fits) in a population segment using the Mutual information score.

Arguments:
pop_seg

An instance of a Population object.

scorer

An instance of the ScoringFunction object.

assembly

An instance of the Assembly object.

emmap

An instance of the map object.

ncomp

Number of components in the assembly.

cvol

List containing the volume values of the individual components. Used in the calculation of the clash score.

template_grid

Map instance to be used as a template for the purpose of calculating the clash score.

apix

Voxel size of the grid.

w_gof

Weighting used for Goodness-of-fit score contribution to the GA fitness score.

w_clash

Weighting used for clash penalty score contribution to the GA fitness score.

Return:

Return an instance of a Population object in which the fitness of every member has been scored.

vprint(t, *args, **kwargs)
which_chain(chain_lens_cumsum, i)

Find which chain particle i belongs to.
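Given cumulative chain lengths, the chain that owns a particle can be found with a binary search. The sketch below assumes 0-based particle indices and a cumulative list such as [3, 5, 9] for chains of length 3, 2 and 4; both conventions and the function name are assumptions.

```python
from bisect import bisect_right


def which_chain_sketch(chain_lens_cumsum, i):
    """Return the 0-based index of the chain containing particle i."""
    if not 0 <= i < chain_lens_cumsum[-1]:
        raise IndexError("particle index out of range")
    # The owning chain is the first one whose cumulative length exceeds i.
    return bisect_right(chain_lens_cumsum, i)


# Chains of length 3, 2 and 4: particles 0-2, 3-4 and 5-8.
owners = [which_chain_sketch([3, 5, 9], i) for i in range(9)]
```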

Module contents