Repeat_list code documentation
The repeat_list submodule documentation.
repeat_list
-
class tral.repeat_list.repeat_list.RepeatList(repeats)
A RepeatList contains a list of repeats that belong to the same sequence or set of sequences.
RepeatList contains methods that act on several tandem repeats. For example, methods to
- detect overlapping tandem repeats
- identify the tandem repeat with the highest statistical significance
- filter a set of tandem repeats according to different filtering procedures.
-
repeats
The list of tandem repeats.
- Type
list of Repeat
-
cluster(overlap_type, *args)
Cluster repeats according to overlap_type.
We assume that overlap of repeats is transitive: if A overlaps with B, and B overlaps with C, then (A, B, C) form one cluster. The attribute cluster is initialized to a dict:
self.cluster = {"overlap_type1": [(0,2),(1)], "overlap_type2": [(0),(1),(2)]}
In this toy example, the first and the third repeat in repeats cluster according to “overlap_type1”, whereas no repeats cluster according to “overlap_type2”.
- Parameters
overlap_type (str) – The name of a local pairwise repeat overlapping method.
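The transitive clustering described for cluster can be sketched as follows. This is a minimal illustration and not TRAL's implementation: cluster_indices, intervals_overlap, and the modelling of repeats as plain (begin, end) tuples are all assumptions made for the example.

```python
from itertools import combinations


def cluster_indices(items, overlaps):
    """Group item indices into clusters, treating overlap as transitive.

    `overlaps` is a hypothetical pairwise predicate: overlaps(a, b) -> bool.
    """
    parent = list(range(len(items)))

    def find(i):
        # Follow parent links to the cluster representative (path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(items)), 2):
        if overlaps(items[i], items[j]):
            # Merge the two clusters that i and j belong to.
            parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(items)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(tuple(c) for c in clusters.values())


def intervals_overlap(a, b):
    # Toy stand-in for an overlap method: closed intervals (begin, end).
    return a[0] <= b[1] and b[0] <= a[1]


# Three toy "repeats": the first and third overlap, the second is separate.
toy = [(0, 10), (20, 30), (8, 15)]
result = cluster_indices(toy, intervals_overlap)
print(result)
```

With a union-find structure, A and C end up in the same cluster whenever both overlap B, even if A and C do not overlap each other directly.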
-
create(input_format)
Read RepeatList from file (currently, only pickle is supported).
- Parameters
input_format (str) – Currently only “pickle”
file (str) – Path to the input file
Todo
Write checks for input_format and file.
-
filter(func_name, *args, **kwargs)
Filter repeats according to func_name.
- Parameters
func_name (str) – The name of a local filtering method.
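Selecting a filtering function by its name can be sketched with a small registry. The names below (FILTERS, filter_by_name, by_min_length, by_prefix) are hypothetical and only illustrate the dispatch pattern. How TRAL resolves func_name internally is not stated here.

```python
def by_min_length(items, min_length):
    # Hypothetical filter: keep items at least `min_length` long.
    return [x for x in items if len(x) >= min_length]


def by_prefix(items, prefix):
    # Hypothetical filter: keep items starting with `prefix`.
    return [x for x in items if x.startswith(prefix)]


# Registry of filter functions addressable by name.
FILTERS = {"by_min_length": by_min_length, "by_prefix": by_prefix}


def filter_by_name(items, func_name, *args, **kwargs):
    """Look up `func_name` and apply it, forwarding the extra arguments."""
    try:
        func = FILTERS[func_name]
    except KeyError:
        raise ValueError(f"Unknown filter: {func_name!r}") from None
    return func(items, *args, **kwargs)


units = ["AC", "ACGT", "GATTACA"]
long_units = filter_by_name(units, "by_min_length", 4)
ac_units = filter_by_name(units, "by_prefix", prefix="AC")
print(long_units, ac_units)
```

Positional and keyword arguments after the function name are passed through unchanged, mirroring the (func_name, *args, **kwargs) shape of the signature.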
-
write(output_format, file=None, return_string=None, *args)
Serialize a RepeatList instance using the stated output_format. If a file is specified, save the string. If return_string is specified, give back the string (not possible for pickles).
- Parameters
output_format (str) – The output format: either “pickle” or “tsv”
file (str) – Path to output file
Todo
Write checks for output_format and file.
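The two output paths (binary pickle to a file, TSV as a file or a returned string) can be sketched with the standard library. This is an illustration under stated assumptions: serialize, COLUMNS, and the dict-based toy records are hypothetical, and the TSV columns TRAL actually writes are not given in this documentation.

```python
import csv
import io
import os
import pickle
import tempfile

# Hypothetical column names; a real Repeat carries different attributes.
COLUMNS = ("begin", "n_units")


def serialize(records, output_format, file=None, return_string=False):
    """Serialize `records` as "pickle" or "tsv" (illustrative sketch)."""
    if output_format == "pickle":
        # Pickles are binary, so they need a file and cannot be returned as text.
        if return_string or file is None:
            raise ValueError("pickle output needs a file and has no string form")
        with open(file, "wb") as fh:
            pickle.dump(records, fh)
        return None
    if output_format == "tsv":
        buffer = io.StringIO()
        writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
        writer.writerow(COLUMNS)
        writer.writerows([r[c] for c in COLUMNS] for r in records)
        text = buffer.getvalue()
        if file is not None:
            with open(file, "w", newline="") as fh:
                fh.write(text)
        return text if return_string else None
    raise ValueError(f"Unsupported output format: {output_format!r}")


toy = [{"begin": 3, "n_units": 4}, {"begin": 40, "n_units": 2}]
tsv_text = serialize(toy, "tsv", return_string=True)

# Round trip through a pickle file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "repeats.pickle")
    serialize(toy, "pickle", file=path)
    with open(path, "rb") as fh:
        restored = pickle.load(fh)
```

Reading the pickle back with pickle.load corresponds to the “pickle” input format described for create.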
-
tral.repeat_list.repeat_list.attribute(rl, attribute, type, threshold)
Returns all repeats in rl with an attribute below (above) a certain threshold.
- Parameters
rl (RepeatList) – An instance of the RepeatList class.
attribute (str) – The attribute of the Repeat instance
type (str) – Either “min” or “max”
threshold (float) – All repeats with an attribute value below (above) this threshold are filtered out.
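Threshold filtering on a named attribute can be sketched with getattr. The convention used here is an assumption, since the description above does not pin it down: "min" treats the threshold as a lower bound and "max" as an upper bound. The function name filter_by_attribute and the attribute n_units are likewise hypothetical.

```python
from types import SimpleNamespace


def filter_by_attribute(items, attribute, kind, threshold):
    """Keep items whose `attribute` respects `threshold`.

    Assumed convention (not confirmed by the documentation):
    kind="min" keeps values >= threshold, kind="max" keeps values <= threshold.
    """
    if kind == "min":
        return [x for x in items if getattr(x, attribute) >= threshold]
    if kind == "max":
        return [x for x in items if getattr(x, attribute) <= threshold]
    raise ValueError('kind must be "min" or "max"')


# Toy objects standing in for Repeat instances.
toy = [SimpleNamespace(n_units=2), SimpleNamespace(n_units=5)]
at_least_three = filter_by_attribute(toy, "n_units", "min", 3)
at_most_three = filter_by_attribute(toy, "n_units", "max", 3)
```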
-
tral.repeat_list.repeat_list.common_ancestry(repeat1, repeat2)
Do two TRs share at least one pair of chars with common ancestry?
Return 1 if the two TRs share at least one pair of chars (amino acids or nucleotides) with common ancestry; else 0.
-
tral.repeat_list.repeat_list.divergence(rl, score, threshold)
Returns all repeats in rl with a divergence below a certain threshold.
- Parameters
rl (RepeatList) – An instance of the RepeatList class.
score (str) – The type of score defines the divergence that is used for filtering
threshold (float) – All repeats with a divergence of type score above this threshold are filtered out.
-
tral.repeat_list.repeat_list.none_overlapping(rl, overlap, l_criterion)
Returns all non-overlapping repeats in rl. Repeats are clustered according to overlap. Of each cluster, only the best repeat is returned according to l_criterion.
- Parameters
rl (RepeatList) – An instance of the RepeatList class.
overlap (tuple) – First element: Name (str) of an overlap method in repeat_list. Second element: **kwargs. All remaining elements are additional arguments for this class.
l_criterion (list) – List of (criterion (str), criterion arguments) tuples. The criteria are applied in order until only one repeat remains in a cluster.
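Applying criteria in order until a single repeat remains in a cluster can be sketched as a successive narrowing of candidates. The sketch is hypothetical: pick_best, the (key function, min/max) form of a criterion, and the dict keys are assumptions made for illustration, not the (criterion, arguments) tuples TRAL expects.

```python
def pick_best(cluster, criteria):
    """Narrow `cluster` by applying `criteria` in order until one item is left.

    Each criterion is a (key_function, prefer) pair, where prefer is min or max.
    """
    candidates = list(cluster)
    for key, prefer in criteria:
        if len(candidates) == 1:
            break
        best_value = prefer(key(c) for c in candidates)
        # Keep only the candidates that tie on the best value.
        candidates = [c for c in candidates if key(c) == best_value]
    # Ties that survive every criterion are broken by original order.
    return candidates[0]


# Toy repeats as dicts; "pvalue" and "divergence" are illustrative keys.
cluster = [
    {"name": "a", "pvalue": 0.01, "divergence": 0.4},
    {"name": "b", "pvalue": 0.01, "divergence": 0.1},
    {"name": "c", "pvalue": 0.05, "divergence": 0.0},
]
criteria = [(lambda r: r["pvalue"], min), (lambda r: r["divergence"], min)]
best = pick_best(cluster, criteria)
```

Here the first criterion leaves "a" and "b" tied, and the second criterion is only consulted to break that tie. Later criteria never override earlier ones.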
-
tral.repeat_list.repeat_list.none_overlapping_fixed_repeats(rl, rl_fixed, overlap_type)
Returns all repeats in rl that do not overlap with rl_fixed according to overlap_type.
- Parameters
rl (RepeatList) – An instance of the RepeatList class.
rl_fixed (RepeatList) – A second instance of the RepeatList class.
overlap_type (list) – First list element: Name (str) of an overlap method in repeat_list. All remaining elements are additional arguments for this class.
-
tral.repeat_list.repeat_list.pvalue(rl, score, threshold)
Returns all repeats in rl with a p-value below a certain threshold.
- Parameters
rl (RepeatList) – An instance of the RepeatList class.
score (str) – The type of score defines the p-value that is used for filtering
threshold (float) – All repeats with a p-value of type score above this threshold are filtered out.
Do two TRs share at least one char?
Return 1 if the two TRs share at least one char (amino acids or nucleotides); else 0.
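A shared-character test of this kind reduces to an interval intersection check. The sketch below assumes each tandem repeat covers a contiguous stretch of the sequence described by a start index and a length. That representation and the name share_position are assumptions, not taken from the documentation.

```python
def share_position(begin1, length1, begin2, length2):
    """Return 1 if the half-open ranges [begin, begin + length) intersect, else 0."""
    end1 = begin1 + length1
    end2 = begin2 + length2
    return 1 if begin1 < end2 and begin2 < end1 else 0


overlapping = share_position(0, 10, 9, 5)   # positions 0..9 and 9..13
adjacent = share_position(0, 10, 10, 5)     # positions 0..9 and 10..14
```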
repeat_list_io
- synopsis
Input/output for the RepeatList class