Repeat_list code documentation
Initial version of the repeat_list submodule documentation.
repeat_list
-
class tral.repeat_list.repeat_list.RepeatList(repeats)
A RepeatList contains a list of repeats that belong to the same sequence or set of sequences.
RepeatList provides methods that act on several tandem repeats, for example methods to
- detect overlapping tandem repeats
- identify the tandem repeat with the highest statistical significance
- filter a set of tandem repeats according to different filtering procedures.
-
repeats (list of Repeat)
The list of tandem repeats.
-
cluster(overlap_type, *args)
Cluster repeats according to overlap_type.
We assume that overlap of repeats is transitive: if A overlaps with B, and B overlaps with C, then (A, B, C) form one cluster. The attribute cluster is initialized to a dict:
self.cluster = {"overlap_type1": [(0, 2), (1)], "overlap_type2": [(0), (1), (2)]}
In this toy example, the first and the third repeat in repeats cluster according to “overlap_type1”, whereas no repeats cluster according to “overlap_type2”.
Parameters:
- overlap_type (str) – The name of a local pairwise repeat overlapping method.
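The transitivity assumption above amounts to finding connected components among the repeats. The following is a minimal sketch of that idea, not TRAL's implementation: `cluster_indices`, `intervals_intersect`, and the use of (start, end) tuples as stand-ins for repeats are all hypothetical.

```python
def cluster_indices(items, overlaps):
    """Group item indices into clusters, treating overlap as transitive."""
    parent = list(range(len(items)))

    def find(i):
        # Follow parent links to the cluster representative (path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the clusters of every overlapping pair.
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if overlaps(items[i], items[j]):
                parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(items)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(tuple(c) for c in clusters.values())


def intervals_intersect(a, b):
    # Hypothetical overlap test on inclusive (start, end) tuples.
    return a[0] <= b[1] and b[0] <= a[1]


toy = [(1, 10), (20, 30), (8, 15)]
print(cluster_indices(toy, intervals_intersect))  # [(0, 2), (1,)]
```

As in the toy example from the text, the first and third items end up in one cluster and the second item stays alone.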
-
create(file, input_format)
Read a RepeatList from file (currently, only pickle is supported).
Parameters:
- input_format (str) – Currently only “pickle”
- file (str) – Path to the input file
Todo
Write checks for input_format and file.
-
filter(func_name, *args, **kwargs)
Filter repeats according to func_name.
Parameters:
- func_name (str) – The name of a local filtering method.
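Selecting a filter by its name can be pictured as a lookup followed by argument forwarding. The sketch below illustrates that pattern only. The registry `FILTERS`, the function `apply_filter`, and both toy filters are hypothetical and operate on plain strings rather than Repeat instances.

```python
def keep_long(items, min_length):
    # Hypothetical filter: keep items with at least min_length characters.
    return [item for item in items if len(item) >= min_length]


def keep_prefix(items, prefix=""):
    # Hypothetical filter: keep items that start with the given prefix.
    return [item for item in items if item.startswith(prefix)]


# Registry of filter functions addressable by name.
FILTERS = {"keep_long": keep_long, "keep_prefix": keep_prefix}


def apply_filter(items, func_name, *args, **kwargs):
    """Look up a filter by name and forward the remaining arguments to it."""
    try:
        func = FILTERS[func_name]
    except KeyError:
        raise ValueError(f"unknown filter: {func_name!r}") from None
    return func(items, *args, **kwargs)


units = ["AC", "ACGT", "ACGTAC", "TTGACA"]
print(apply_filter(units, "keep_long", 4))              # ['ACGT', 'ACGTAC', 'TTGACA']
print(apply_filter(units, "keep_prefix", prefix="AC"))  # ['AC', 'ACGT', 'ACGTAC']
```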
-
write(output_format, file=None, return_string=None, *args)
Serialize and write RepeatList instances.
Serialize the RepeatList instance using the stated output_format. If a file is specified, save the string to it. If return_string is specified, return the string (not possible for pickles).
Parameters:
- output_format (str) – The output format: either “pickle” or “tsv”
- file (str) – Path to the output file
Todo
Write checks for output_format and file.
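The interplay of output_format, file, and return_string can be sketched with the standard library alone. This is an illustration under assumptions, not the TRAL code: `serialize` is a hypothetical function, the rows are plain tuples, and the column names in the usage example are invented.

```python
import csv
import io
import pickle


def serialize(rows, output_format, file=None, return_string=False):
    """Serialize rows (a list of tuples) as "tsv" or "pickle"."""
    if output_format == "tsv":
        buffer = io.StringIO()
        csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(rows)
        text = buffer.getvalue()
        if file is not None:
            with open(file, "w", newline="") as handle:
                handle.write(text)
        return text if return_string else None
    if output_format == "pickle":
        # A pickle is binary, so it is written to a file and never returned.
        if return_string:
            raise ValueError("a pickle cannot be returned as a string")
        if file is None:
            raise ValueError("a pickle needs a target file")
        with open(file, "wb") as handle:
            pickle.dump(rows, handle)
        return None
    raise ValueError(f"unsupported output format: {output_format!r}")


# Invented column names, for illustration only.
rows = [("begin", "length", "copies"), (12, 4, 3.5)]
print(serialize(rows, "tsv", return_string=True))
```

Rejecting unknown formats up front is one way to address the check mentioned in the Todo.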
-
tral.repeat_list.repeat_list.attribute(rl, attribute, type, threshold)
Returns all repeats in rl with an attribute below (above) a certain threshold.
Parameters:
- rl (RepeatList) – An instance of the RepeatList class.
- attribute (str) – The attribute of the Repeat instance
- type (str) – Either “min” or “max”
- threshold (float) – All repeats with an attribute value below (above) this threshold are filtered out.
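A threshold filter of this kind might look like the sketch below. The text does not spell out how “min” and “max” map onto the comparison, so the sketch assumes that “min” treats the threshold as a lower bound and “max” as an upper bound, both inclusive. `filter_by_attribute` and the attribute `n` are hypothetical.

```python
from types import SimpleNamespace


def filter_by_attribute(items, attribute, bound_type, threshold):
    """Keep items whose attribute respects a lower ("min") or upper ("max") bound.

    Assumption: both bounds are inclusive.
    """
    if bound_type == "min":
        return [i for i in items if getattr(i, attribute) >= threshold]
    if bound_type == "max":
        return [i for i in items if getattr(i, attribute) <= threshold]
    raise ValueError('bound_type must be "min" or "max"')


# Stand-ins for Repeat instances with one numeric attribute.
toy = [SimpleNamespace(n=2.0), SimpleNamespace(n=3.5), SimpleNamespace(n=8.0)]
print([r.n for r in filter_by_attribute(toy, "n", "min", 3.0)])  # [3.5, 8.0]
print([r.n for r in filter_by_attribute(toy, "n", "max", 3.5)])  # [2.0, 3.5]
```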
-
tral.repeat_list.repeat_list.common_ancestry(repeat1, repeat2)
Do two TRs share at least one pair of chars with common ancestry?
Return 1 if the two TRs share at least one pair of chars (amino acids or nucleotides) with common ancestry; else 0.
Returns: 1 if the repeats share >= 1 pair of chars with common ancestry, else 0.
Return type: bool
-
tral.repeat_list.repeat_list.divergence(rl, score, threshold)
Returns all repeats in rl with a divergence below a certain threshold.
Parameters:
- rl (RepeatList) – An instance of the RepeatList class.
- score (str) – The type of score defines the divergence that is used for filtering
- threshold (float) – All repeats with a divergence of type score above this threshold are filtered out.
-
tral.repeat_list.repeat_list.none_overlapping(rl, overlap, l_criterion)
Returns all non-overlapping repeats in rl.
Repeats are clustered according to overlap. Of each cluster, only the best repeat is returned according to l_criterion.
Parameters:
- rl (RepeatList) – An instance of the RepeatList class.
- overlap (tuple) – First element: Name (str) of an overlap method in repeat_list. Second element: **kwargs. All remaining elements are additional arguments for that method.
- l_criterion (list) – List of (criterion (str), criterion arguments) tuples. The criteria are applied in order until only one repeat remains in a cluster.
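Applying criteria in order until one repeat remains in a cluster can be sketched as successive narrowing. The sketch is an illustration, not the TRAL implementation. It uses (key function, min or max) pairs instead of the named criteria described above, and `best_of_cluster`, `pick_representatives`, and the dictionary fields are hypothetical.

```python
def best_of_cluster(candidates, criteria):
    """Narrow candidates by applying (key, prefer) criteria in order."""
    remaining = list(candidates)
    for key, prefer in criteria:
        if len(remaining) == 1:
            break
        best = prefer(key(c) for c in remaining)
        remaining = [c for c in remaining if key(c) == best]
    # If candidates still tie after all criteria, the first survivor wins.
    return remaining[0]


def pick_representatives(items, clusters, criteria):
    """Return the best item of each cluster of indices."""
    return [best_of_cluster([items[i] for i in c], criteria) for c in clusters]


items = [
    {"name": "a", "pvalue": 0.01, "length": 12},
    {"name": "b", "pvalue": 0.20, "length": 30},
    {"name": "c", "pvalue": 0.01, "length": 20},
]
clusters = [(0, 2), (1,)]
# First prefer the lowest p-value, then break ties by the greatest length.
criteria = [(lambda r: r["pvalue"], min), (lambda r: r["length"], max)]
best = pick_representatives(items, clusters, criteria)
print([r["name"] for r in best])  # ['c', 'b']
```

In the first cluster both candidates tie on the first criterion, so the second criterion decides.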
-
tral.repeat_list.repeat_list.none_overlapping_fixed_repeats(rl, rl_fixed, overlap_type)
Returns all repeats in rl that do not overlap with rl_fixed according to overlap_type.
Parameters:
- rl (RepeatList) – An instance of the RepeatList class.
- rl_fixed (RepeatList) – A second instance of the RepeatList class.
- overlap_type (list) – First list element: Name (str) of an overlap method in repeat_list. All remaining elements are additional arguments for that method.
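Filtering one set of repeats against a fixed set reduces to keeping each candidate that overlaps with none of the fixed items. A minimal sketch under assumptions: `not_overlapping_fixed` and `spans_touch` are hypothetical names, and inclusive (start, end) tuples stand in for repeats.

```python
def not_overlapping_fixed(candidates, fixed, overlaps):
    """Keep candidates that overlap with none of the fixed items."""
    return [c for c in candidates if not any(overlaps(c, f) for f in fixed)]


def spans_touch(a, b):
    # Hypothetical overlap test on inclusive (start, end) tuples.
    return a[0] <= b[1] and b[0] <= a[1]


candidates = [(1, 10), (20, 30), (40, 50)]
fixed = [(8, 15), (45, 60)]
print(not_overlapping_fixed(candidates, fixed, spans_touch))  # [(20, 30)]
```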
-
tral.repeat_list.repeat_list.pvalue(rl, score, threshold)
Returns all repeats in rl with a p-value below a certain threshold.
Parameters:
- rl (RepeatList) – An instance of the RepeatList class.
- score (str) – The type of score defines the p-value that is used for filtering
- threshold (float) – All repeats with a p-value of type score above this threshold are filtered out.
Do two TRs share at least one char?
Return 1 if the two TRs share at least one char (amino acids or nucleotides); else 0.
Returns: 1 if the repeats share >= 1 char, else 0.
Return type: bool
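If each repeat is described by a start position and a length, the check for sharing at least one char becomes an interval intersection test. That representation is an assumption made for this sketch, and `shares_char` is a hypothetical name.

```python
def shares_char(begin1, length1, begin2, length2):
    """Return True if two half-open spans [begin, begin + length) intersect."""
    return begin1 < begin2 + length2 and begin2 < begin1 + length1


print(shares_char(10, 5, 14, 3))  # True: both spans cover position 14
print(shares_char(10, 5, 15, 3))  # False: the first span ends at position 14
```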
-
tral.repeat_list.repeat_list.two_repeats_overlap(overlap_type, repeat1, repeat2)
Helper method to test the overlap of repeat1 and repeat2.
The overlap is calculated by the local method overlap_type.
Returns: Forwards the output of that method.
repeat_list_io
Synopsis: Input/output for the RepeatList class
