Repeat_list code documentation

Initial version of the repeat_list submodule documentation.

repeat_list

class tral.repeat_list.repeat_list.RepeatList(repeats)

A RepeatList contains a list of repeats that belong to the same sequence or set of sequences.

RepeatList contains methods that act on several tandem repeats, for example methods to

  • detect overlapping tandem repeats
  • identify the tandem repeat with the highest statistical significance
  • filter a set of tandem repeats according to different filtering procedures

repeats

list of Repeat

The list of tandem repeats.

__str__()

Create a string representation of the RepeatList instance.

cluster(overlap_type, *args)

Cluster repeats according to overlap_type.

Overlap of repeats is assumed to be transitive: if A overlaps with B, and B overlaps with C, then (A, B, C) form one cluster. The attribute cluster is initialized to a dict:

self.cluster = {"overlap_type1": [(0, 2), (1)], "overlap_type2": [(0), (1), (2)]}

In this toy example, the first and the third repeat in repeats cluster according to “overlap_type1”, whereas no repeats cluster according to “overlap_type2”.

Parameters:
  • overlap_type (str) – The name of a local pairwise repeat overlapping method.
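The transitive clustering described for cluster can be illustrated with a small union-find sketch. This is not TRAL's implementation: cluster_indices, intervals_overlap, and the (begin, end) tuples are hypothetical stand-ins for the RepeatList method, the pairwise overlap method, and the Repeat instances.

```python
from itertools import combinations


def cluster_indices(items, overlaps):
    """Group item indices into clusters, treating overlap as transitive."""
    parent = list(range(len(items)))

    def find(i):
        # Follow parent links to the cluster representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the clusters of every overlapping pair.
    for i, j in combinations(range(len(items)), 2):
        if overlaps(items[i], items[j]):
            parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(items)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(tuple(members) for members in clusters.values())


def intervals_overlap(a, b):
    """Hypothetical overlap test on (begin, end) tuples, end inclusive."""
    return a[0] <= b[1] and b[0] <= a[1]


repeats = [(1, 10), (20, 30), (8, 15)]
clusters = cluster_indices(repeats, intervals_overlap)
print(clusters)  # [(0, 2), (1,)]
```

As in the toy example, the first and third items end up in one cluster and the second stays on its own.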

create(file, input_format)

Read RepeatList from file. Currently, only pickle is supported.

Parameters:
  • file (str) – Path to the input file
  • input_format (str) – Currently only “pickle”

Todo

Write checks for input_format and file.

filter(func_name, *args, **kwargs)

Filter repeats according to func_name.

Parameters:
  • func_name (str) – The name of a local filtering method.

write(output_format, file=None, return_string=None, *args)

Serialize and write RepeatList instances.

Serialize the RepeatList instance using the stated output_format. If file is specified, save the string to that file. If return_string is specified, return the string (not possible for pickles).

Parameters:
  • output_format (str) – The output format: Either “pickle” or “tsv”
  • file (str) – Path to output file

Todo

Write checks for output_format and file.
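The remark that a string cannot be returned for pickles follows from how Python's pickle module works: it produces bytes rather than text. The sketch below shows this with the standard library only. The list of dicts is a hypothetical stand-in for a RepeatList, and the file name is made up for illustration.

```python
import os
import pickle
import tempfile

# A plain list of dicts stands in for a RepeatList (assumption for illustration).
stand_in = [{"begin": 1, "length": 10}, {"begin": 20, "length": 11}]

# pickle.dumps yields bytes, not str, so there is no text form to hand back.
payload = pickle.dumps(stand_in)
print(type(payload).__name__)  # bytes

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "stand_in.pickle")
    with open(path, "wb") as handle:  # binary mode is required for pickles
        handle.write(payload)
    with open(path, "rb") as handle:
        restored = pickle.load(handle)

print(restored == stand_in)  # True
```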

tral.repeat_list.repeat_list.attribute(rl, attribute, type, threshold)

Returns all repeats in rl whose attribute value satisfies a threshold: with type “min”, repeats below the threshold are filtered out; with type “max”, repeats above the threshold are filtered out.

Parameters:
  • rl (RepeatList) – An instance of the RepeatList class.
  • attribute (str) – The attribute of the Repeat instance
  • type (str) – Either “min” or “max”
  • threshold (float) – With type “min” (“max”), all repeats with an attribute value below (above) this threshold are filtered out.
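A minimal sketch of this kind of threshold filter, assuming that “min” treats the threshold as a lower bound, “max” treats it as an upper bound, and values equal to the threshold are kept. The helper filter_by_attribute, the parameter name bound_type (used here to avoid shadowing the built-in type), and the attribute n are hypothetical.

```python
from types import SimpleNamespace


def filter_by_attribute(repeats, attribute, bound_type, threshold):
    """Keep items whose attribute respects a lower ("min") or upper ("max") bound."""
    if bound_type == "min":
        return [r for r in repeats if getattr(r, attribute) >= threshold]
    if bound_type == "max":
        return [r for r in repeats if getattr(r, attribute) <= threshold]
    raise ValueError(f"bound_type must be 'min' or 'max', got {bound_type!r}")


# SimpleNamespace objects stand in for Repeat instances.
repeats = [SimpleNamespace(n=2), SimpleNamespace(n=5), SimpleNamespace(n=9)]
print([r.n for r in filter_by_attribute(repeats, "n", "min", 5)])  # [5, 9]
print([r.n for r in filter_by_attribute(repeats, "n", "max", 5)])  # [2, 5]
```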

tral.repeat_list.repeat_list.common_ancestry(repeat1, repeat2)

Do two TRs share at least one pair of chars with common ancestry?

Return 1 if the two TRs share at least one pair of chars (amino acids or nucleotides) with common ancestry; else 0.

Parameters:
  • repeat1 (Repeat) – An instance of the Repeat class
  • repeat2 (Repeat) – A second instance of the Repeat class
Returns:

1 if the repeats share >= 1 pair of chars with common ancestry, else 0.

Return type:

Bool

tral.repeat_list.repeat_list.divergence(rl, score, threshold)

Returns all repeats in rl with a divergence below a certain threshold.

Parameters:
  • rl (RepeatList) – An instance of the RepeatList class.
  • score (str) – The type of score defines the divergence that is used for filtering
  • threshold (float) – All repeats with a divergence of type score above this threshold are filtered out.

tral.repeat_list.repeat_list.none_overlapping(rl, overlap, l_criterion)

Returns all non-overlapping repeats in rl.

Repeats are clustered according to overlap. Of each cluster, only the best repeat according to l_criterion is returned.

Parameters:
  • rl (RepeatList) – An instance of the RepeatList class.
  • overlap (tuple) – First element: Name (str) of an overlap method in repeat_list. Second element: **kwargs. All remaining elements are additional arguments for this method.
  • l_criterion (list) – List of (criterion (str), criterion arguments) tuples. The criteria are applied in order until only one repeat remains in a cluster.
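The rule of applying criteria in order until one repeat remains can be sketched as below. This is an illustration under assumptions, not the library's code: pick_best is a hypothetical helper, each criterion is modeled as a (name, key function) pair for which lower values are better, and if a tie survives all criteria the first remaining candidate is returned.

```python
def pick_best(cluster, criteria):
    """Narrow a cluster down by applying criteria in order until one item is left."""
    candidates = list(cluster)
    for _name, key in criteria:
        if len(candidates) == 1:
            break
        # Keep only the candidates that are best under the current criterion.
        best_value = min(key(c) for c in candidates)
        candidates = [c for c in candidates if key(c) == best_value]
    return candidates[0]


# Dicts stand in for Repeat instances; the field names are made up.
cluster = [
    {"name": "a", "pvalue": 0.01, "divergence": 0.5},
    {"name": "b", "pvalue": 0.01, "divergence": 0.2},
    {"name": "c", "pvalue": 0.04, "divergence": 0.1},
]
criteria = [
    ("pvalue", lambda r: r["pvalue"]),
    ("divergence", lambda r: r["divergence"]),
]
best = pick_best(cluster, criteria)
print(best["name"])  # b
```

Here the first criterion leaves a and b tied, and the second criterion breaks the tie in favor of b.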

tral.repeat_list.repeat_list.none_overlapping_fixed_repeats(rl, rl_fixed, overlap_type)

Returns all repeats in rl that do not overlap with any repeat in rl_fixed, according to overlap_type.

Parameters:
  • rl (RepeatList) – An instance of the RepeatList class.
  • rl_fixed (RepeatList) – A second instance of the RepeatList class.
  • overlap_type (list) – First list element: Name (str) of an overlap method in repeat_list. All remaining elements are additional arguments for this method.
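Filtering one list against a fixed list can be sketched as follows. The helper non_overlapping_with_fixed, the overlap predicate, and the (begin, end) tuples are hypothetical stand-ins. The sketch assumes a candidate is dropped as soon as it overlaps with at least one fixed item.

```python
def intervals_overlap(a, b):
    """Hypothetical overlap test on (begin, end) tuples, end inclusive."""
    return a[0] <= b[1] and b[0] <= a[1]


def non_overlapping_with_fixed(candidates, fixed, overlaps):
    """Keep the candidates that overlap with none of the fixed items."""
    return [c for c in candidates if not any(overlaps(c, f) for f in fixed)]


candidates = [(1, 10), (20, 30), (40, 50)]
fixed = [(8, 22)]
kept = non_overlapping_with_fixed(candidates, fixed, intervals_overlap)
print(kept)  # [(40, 50)]
```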

tral.repeat_list.repeat_list.pvalue(rl, score, threshold)

Returns all repeats in rl with a p-value below a certain threshold.

Parameters:
  • rl (RepeatList) – An instance of the RepeatList class.
  • score (str) – The type of score defines the pvalue that is used for filtering
  • threshold (float) – All repeats with a pvalue of type score above this threshold are filtered out.
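A score-keyed threshold filter of this kind might look like the sketch below. How a Repeat stores its p-values is not stated here, so the nested dict layout, the score name score_x, and the helper filter_by_score are assumptions, as is the choice to keep values equal to the threshold. The same pattern would apply to a divergence filter.

```python
def filter_by_score(repeats, score, threshold):
    """Keep stand-in repeats whose p-value for `score` does not exceed threshold."""
    return [r for r in repeats if r["pvalue"][score] <= threshold]


# Dicts stand in for Repeat instances; "score_x" is a made-up score name.
repeats = [
    {"name": "a", "pvalue": {"score_x": 0.001}},
    {"name": "b", "pvalue": {"score_x": 0.2}},
]
kept = filter_by_score(repeats, "score_x", 0.05)
print([r["name"] for r in kept])  # ['a']
```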

tral.repeat_list.repeat_list.shared_char(repeat1, repeat2)

Do two TRs share at least one char?

Return 1 if the two TRs share at least one char (amino acids or nucleotides); else 0.

Parameters:
  • repeat1 (Repeat) – An instance of the Repeat class
  • repeat2 (Repeat) – A second instance of the Repeat class
Returns:

1 if the repeats share >= 1 char, else 0.

Return type:

Bool
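One way to picture this test is as an interval intersection. The sketch assumes that each repeat covers a contiguous region of the same sequence, described by a start position and a length. The function name and its parameters are hypothetical and are not attributes documented here.

```python
def shares_position(begin1, length1, begin2, length2):
    """Return True if two contiguous regions have at least one position in common."""
    end1 = begin1 + length1 - 1  # last position covered by the first region
    end2 = begin2 + length2 - 1  # last position covered by the second region
    return begin1 <= end2 and begin2 <= end1


print(shares_position(1, 10, 10, 5))  # True: both regions cover position 10
print(shares_position(1, 10, 11, 5))  # False: the regions are adjacent but disjoint
```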

tral.repeat_list.repeat_list.two_repeats_overlap(overlap_type, repeat1, repeat2)

Helper method to test the overlap of repeat1 and repeat2.

The overlap is calculated by the local method overlap_type.

Parameters:
  • overlap_type (str) – The name of a local pairwise repeat overlapping method.
  • repeat1 (Repeat) – An instance of the Repeat class
  • repeat2 (Repeat) – A second instance of the Repeat class
Returns:

The output of the method named by overlap_type, forwarded unchanged.
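Dispatching on a method name and forwarding the result can be sketched with a simple registry. How the module actually resolves overlap_type is not shown here, so the OVERLAP_METHODS dict, check_overlap, and share_position are all hypothetical.

```python
def share_position(a, b):
    """Hypothetical overlap method on (begin, end) tuples, end inclusive."""
    return a[0] <= b[1] and b[0] <= a[1]


# Registry mapping method names to callables (an assumption of this sketch).
OVERLAP_METHODS = {"share_position": share_position}


def check_overlap(overlap_type, item1, item2):
    """Look up the method named overlap_type and forward its result."""
    try:
        method = OVERLAP_METHODS[overlap_type]
    except KeyError:
        raise ValueError(f"unknown overlap method: {overlap_type!r}") from None
    return method(item1, item2)


print(check_overlap("share_position", (1, 10), (8, 15)))   # True
print(check_overlap("share_position", (1, 10), (20, 30)))  # False
```

An explicit registry makes an unknown name fail with a clear error instead of an attribute lookup failure.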

repeat_list_io

synopsis: Input/output for the RepeatList class