Repeat_list code documentation
Initial version of the repeat_list submodule documentation.
repeat_list
class tral.repeat_list.repeat_list.RepeatList(repeats)
A RepeatList contains a list of repeats that belong to the same sequence, or set of sequences.
RepeatList contains methods that act on several tandem repeats. For example, methods to
- detect overlapping tandem repeats
- identify the tandem repeat with highest statistical significance
- filter a set of tandem repeats according to different filtering procedures.
repeats (list of Repeat)
The list of tandem repeats.
cluster(overlap_type, *args)
Cluster repeats according to overlap_type. We assume that overlap of repeats is transitive: if A overlaps with B, and B overlaps with C, then (A, B, C) form one cluster. The attribute cluster is initialized to a dict:
self.cluster = {"overlap_type1": [(0,2),(1)], "overlap_type2": [(0),(1),(2)]}
In this toy example, the first and the third repeat in repeats cluster according to “overlap_type1”, whereas no repeats cluster according to “overlap_type2”.
Parameters: overlap_type (str) – The name of a local pairwise repeat overlapping method.
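The transitivity assumption amounts to computing connected components over pairwise overlaps. The following is a minimal sketch of that idea, not TRAL's implementation: repeats are reduced to hypothetical (begin, end) index pairs, and `intervals_overlap` and `cluster_transitively` are illustrative names standing in for a pairwise overlap method and the clustering step.

```python
def intervals_overlap(a, b):
    """Hypothetical pairwise test: closed intervals (begin, end) share a position."""
    return a[0] <= b[1] and b[0] <= a[1]


def cluster_transitively(items, overlaps):
    """Group item indices so that items linked by a chain of overlaps share a cluster."""
    parent = list(range(len(items)))

    def find(i):
        # Follow parent links to the cluster representative, shortening the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if overlaps(items[i], items[j]):
                parent[find(j)] = find(i)  # merge the two clusters

    clusters = {}
    for i in range(len(items)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(tuple(members) for members in clusters.values())


# Toy data: the first and third interval overlap, the second stands alone.
toy_repeats = [(0, 10), (20, 30), (8, 15)]
toy_clusters = cluster_transitively(toy_repeats, intervals_overlap)
```

With this toy input the first and third item end up in one cluster and the second in its own, mirroring the “overlap_type1” entry of the example dict.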
create(file, input_format)
Read RepeatList from file (currently, only pickle is supported).
Parameters:
- input_format (str) – Currently only “pickle”
- file (str) – Path to input file
Todo
Write checks for input_format and file.
filter(func_name, *args, **kwargs)
Filter repeats according to func_name.
Parameters: func_name (str) – The name of a local filtering method.
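Selecting a filter by name can be pictured as a lookup followed by a call that forwards the remaining arguments. The sketch below only illustrates that pattern: `ToyList`, `FILTERS`, and `keep_short` are hypothetical names, and returning a new container is an assumption rather than documented RepeatList behaviour.

```python
def keep_short(toy_list, max_length):
    """Hypothetical filter: keep items no longer than max_length."""
    return [item for item in toy_list.items if len(item) <= max_length]


# Hypothetical registry that maps filter names to functions.
FILTERS = {"keep_short": keep_short}


class ToyList:
    """Stand-in container used for illustration; not the RepeatList class."""

    def __init__(self, items):
        self.items = items

    def filter(self, func_name, *args, **kwargs):
        """Look up the filter by name, apply it, and wrap the result in a new container."""
        return ToyList(FILTERS[func_name](self, *args, **kwargs))


filtered = ToyList(["AC", "ACGT", "ACGTAC"]).filter("keep_short", 4)
```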
write(output_format, file=None, return_string=None, *args)
Serialize and write a RepeatList instance using the stated output_format. If a file is specified, save the string to it. If return_string is specified, return the string (not possible for pickles).
Parameters:
- output_format (str) – The output format: either “pickle” or “tsv”
- file (str) – Path to output file
Todo
Write checks for output_format and file.
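The interplay of output_format, file, and return_string can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `write_rows` is a hypothetical helper, the data are plain tuples rather than Repeat instances, and the column names are made up.

```python
import csv
import io
import pickle


def write_rows(rows, output_format, file=None, return_string=None):
    """Serialize rows (a list of tuples) as "pickle" or "tsv"."""
    if output_format == "pickle":
        # A pickle is binary, so it can only go to a file, never back as a string.
        if file is None:
            raise ValueError("pickle output needs a file path")
        with open(file, "wb") as handle:
            pickle.dump(rows, handle)
        return None
    if output_format == "tsv":
        buffer = io.StringIO()
        csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(rows)
        text = buffer.getvalue()
        if file is not None:
            with open(file, "w") as handle:
                handle.write(text)
        return text if return_string else None
    raise ValueError("unknown output_format: " + output_format)


# Hypothetical columns; nothing is written to disk because file is None.
tsv_text = write_rows([("begin", "length"), (3, 4)], "tsv", return_string=True)
```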
tral.repeat_list.repeat_list.attribute(rl, attribute, type, threshold)
Returns all repeats in rl with an attribute below (above) a certain threshold.
Parameters:
- rl (RepeatList) – An instance of the RepeatList class.
- attribute (str) – The attribute of the Repeat instance
- type (str) – Either “min” or “max”
- threshold (float) – All repeats with an attribute value below (above) this threshold are filtered out.
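A threshold filter of this kind can be sketched with `getattr`. The reading used here is an assumption drawn from the parameter description: with “min” the threshold is a lower bound (values below it are filtered out), with “max” it is an upper bound. `filter_by_attribute`, the inclusive boundaries, and the toy attribute `n` are illustrative choices, not TRAL's code.

```python
from types import SimpleNamespace


def filter_by_attribute(repeats, attribute, type, threshold):
    """Keep repeats whose attribute respects a minimum ("min") or maximum ("max").

    The parameter name `type` mirrors the documented signature and shadows the builtin.
    """
    if type == "min":
        return [r for r in repeats if getattr(r, attribute) >= threshold]
    if type == "max":
        return [r for r in repeats if getattr(r, attribute) <= threshold]
    raise ValueError('type must be "min" or "max"')


# Toy objects with a hypothetical numeric attribute `n`.
toy = [SimpleNamespace(n=2.0), SimpleNamespace(n=3.5), SimpleNamespace(n=5.0)]
at_least_3 = [r.n for r in filter_by_attribute(toy, "n", "min", 3.0)]
at_most_3 = [r.n for r in filter_by_attribute(toy, "n", "max", 3.0)]
```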
tral.repeat_list.repeat_list.common_ancestry(repeat1, repeat2)
Do two TRs share at least one pair of chars with common ancestry?
Return 1 if the two TRs share at least one pair of chars (amino acids or nucleotides) with common ancestry; else 0.
Returns: 1 if the repeats share >= 1 pair of chars with common ancestry, else 0.
Return type: Bool
tral.repeat_list.repeat_list.
divergence
(rl, score, threshold)[source]¶ - Returns all repeats in
rl
with a divergence below a certain - threshold.
Returns all repeats in
rl
with a divergence below a certain threshold.Parameters: - rl (RepeatList) – An instance of the RepeatList class.
- score (str) – The type of score defines the divergence that is used for filtering
- threshold (float) – All repeats with a divergence of type score above this threshold are filtered out.
- Returns all repeats in
tral.repeat_list.repeat_list.none_overlapping(rl, overlap, l_criterion)
Returns all non-overlapping repeats in rl. Repeats are clustered according to overlap. Of each cluster, only the best repeat is returned according to l_criterion.
Parameters:
- rl (RepeatList) – An instance of the RepeatList class.
- overlap (tuple) – First element: Name (str) of an overlap method in repeat_list. Second element: **kwargs. All remaining elements are additional arguments for this class.
- l_criterion (list) – List of (criterion (str), criterion arguments) tuples. The criteria are applied in order until only one repeat remains in a cluster.
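Applying criteria in order until one repeat remains is a successive tie-breaking scheme. The sketch below illustrates it on plain dicts. It is not the library's implementation: `pick_best` is a hypothetical helper, and each criterion is given as a (key function, preference function) pair instead of the documented (name, arguments) tuples.

```python
def pick_best(cluster, criteria):
    """Narrow a cluster to one item by applying (key, prefer) criteria in order."""
    candidates = list(cluster)
    for key, prefer in criteria:
        if len(candidates) == 1:
            break  # already decided, later criteria are not needed
        best = prefer(key(c) for c in candidates)
        candidates = [c for c in candidates if key(c) == best]
    # If ties survive every criterion, the first candidate wins (an arbitrary choice).
    return candidates[0]


# Toy cluster: "a" and "b" tie on p-value, so divergence decides between them.
toy_cluster = [
    {"id": "a", "pvalue": 0.01, "divergence": 0.4},
    {"id": "b", "pvalue": 0.01, "divergence": 0.2},
    {"id": "c", "pvalue": 0.05, "divergence": 0.1},
]
toy_criteria = [
    (lambda r: r["pvalue"], min),
    (lambda r: r["divergence"], min),
]
winner = pick_best(toy_cluster, toy_criteria)
```

Note that "c" has the lowest divergence overall but is eliminated by the first criterion, which shows why the order of the criteria matters.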
tral.repeat_list.repeat_list.none_overlapping_fixed_repeats(rl, rl_fixed, overlap_type)
Returns all repeats in rl that do not overlap with rl_fixed according to overlap_type.
Parameters:
- rl (RepeatList) – An instance of the RepeatList class.
- rl_fixed (RepeatList) – A second instance of the RepeatList class.
- overlap_type (list) – First list element: Name (str) of an overlap method in repeat_list. All remaining elements are additional arguments for this class.
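Filtering against a fixed set reduces to keeping every candidate that overlaps none of the fixed items. A minimal sketch, assuming repeats can be represented as hypothetical (begin, end) pairs; `shares_position` and `non_overlapping_with_fixed` are illustrative names, not functions of the module.

```python
def shares_position(a, b):
    """Hypothetical overlap test: closed intervals (begin, end) share a position."""
    return a[0] <= b[1] and b[0] <= a[1]


def non_overlapping_with_fixed(candidates, fixed, overlaps):
    """Keep each candidate that overlaps with none of the fixed items."""
    return [c for c in candidates if not any(overlaps(c, f) for f in fixed)]


toy_candidates = [(0, 10), (20, 30), (40, 50)]
toy_fixed = [(8, 22)]
kept = non_overlapping_with_fixed(toy_candidates, toy_fixed, shares_position)
```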
tral.repeat_list.repeat_list.pvalue(rl, score, threshold)
Returns all repeats in rl with a p-value below a certain threshold.
Parameters:
- rl (RepeatList) – An instance of the RepeatList class.
- score (str) – The type of score defines the pvalue that is used for filtering
- threshold (float) – All repeats with a pvalue of type score above this threshold are filtered out.
Do two TRs share at least one char?
Return 1 if the two TRs share at least one char (amino acids or nucleotides); else 0.
Returns: 1 if the repeats share >= 1 char, else 0.
Return type: Bool
tral.repeat_list.repeat_list.two_repeats_overlap(overlap_type, repeat1, repeat2)
Helper method to test the overlap of repeat1 and repeat2. The overlap is calculated by the local method overlap_type.
Returns: Forwards the output of the overlap_type method.
repeat_list_io
synopsis: Input/output for the RepeatList class