Repeat_list code documentation

The repeat_list submodule documentation.

repeat_list

class tral.repeat_list.repeat_list.RepeatList(repeats)[source]

A RepeatList contains a list of repeats that belong to the same sequence or set of sequences.

RepeatList contains methods that act on several tandem repeats. For example, methods to

  • detect overlapping tandem repeats

  • identify the tandem repeat with highest statistical significance

  • filter a set of tandem repeats according to different filtering procedures.

repeats

The list of tandem repeats.

Type

list of Repeat

cluster(overlap_type, *args)[source]

Cluster repeats according to overlap_type.

Overlap of repeats is assumed to be transitive: if A overlaps with B, and B overlaps with C, then (A, B, C) form one cluster. The attribute cluster is initialized to a dict that maps each overlap type to a list of tuples of indices into repeats:

self.cluster = {"overlap_type1": [(0, 2), (1)], "overlap_type2": [(0), (1), (2)]}

In this toy example, the first and the third repeat in repeats cluster according to “overlap_type1”, whereas no repeats cluster according to “overlap_type2”.

Parameters

overlap_type (str) – The name of a local pairwise repeat overlapping method.
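The transitive clustering described for cluster can be sketched with a small union-find. This is a minimal illustration, not TRAL's implementation: cluster_indices and intervals_overlap are hypothetical names, and each repeat is modelled as a half-open (begin, end) interval instead of a Repeat instance.

```python
from itertools import combinations


def cluster_indices(items, overlaps):
    """Group item indices into clusters, treating overlap as transitive."""
    parent = list(range(len(items)))

    def find(i):
        # Follow parent links up to the cluster representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Merge the clusters of every overlapping pair.
    for i, j in combinations(range(len(items)), 2):
        if overlaps(items[i], items[j]):
            parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(items)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(tuple(members) for members in clusters.values())


def intervals_overlap(a, b):
    # Hypothetical stand-in for a pairwise overlap method.
    return a[0] < b[1] and b[0] < a[1]


repeats = [(0, 10), (20, 30), (5, 15)]
print(cluster_indices(repeats, intervals_overlap))  # [(0, 2), (1,)]
```

As in the toy example, the first and third entries form one cluster and the second stays alone.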

create(input_format)[source]

Read a RepeatList from file. Currently, only the pickle format is supported.

Parameters
  • input_format (str) – Currently only “pickle”

  • file (str) – Path to the input file

Todo

Write checks for input_format and file.

filter(func_name, *args, **kwargs)[source]

Filter repeats according to func_name.

Parameters

func_name (str) – The name of a local filtering method.
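Selecting a filtering method by name can be sketched as a lookup in a registry of functions. This is an illustration under stated assumptions, not TRAL's code: FILTERS, filter_by_name, by_max_pvalue and by_min_length are hypothetical, and repeats are plain dicts here.

```python
def by_max_pvalue(repeats, threshold):
    # Hypothetical filter: keep repeats whose p-value is below the threshold.
    return [r for r in repeats if r["pvalue"] < threshold]


def by_min_length(repeats, min_length):
    # Hypothetical filter: keep repeats of at least min_length characters.
    return [r for r in repeats if r["length"] >= min_length]


# Registry that maps a filter name to the function implementing it.
FILTERS = {"by_max_pvalue": by_max_pvalue, "by_min_length": by_min_length}


def filter_by_name(repeats, func_name, *args, **kwargs):
    """Apply the filter registered under func_name and return the result."""
    try:
        func = FILTERS[func_name]
    except KeyError:
        raise ValueError(f"unknown filter: {func_name!r}") from None
    return func(repeats, *args, **kwargs)


repeats = [
    {"pvalue": 0.01, "length": 12},
    {"pvalue": 0.20, "length": 30},
]
print(filter_by_name(repeats, "by_max_pvalue", 0.05))
print(filter_by_name(repeats, "by_min_length", min_length=20))
```

Extra positional and keyword arguments are passed straight through to the selected function, mirroring the *args and **kwargs in the signature.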

write(output_format, file=None, return_string=None, *args)[source]

Serialize and write RepeatList instances.

Serialize the RepeatList instance using the stated output_format. If file is specified, the serialized output is saved to that file. If return_string is specified, the serialized string is returned (not possible for pickle output).

Parameters
  • output_format (str) – The output format: Either “pickle” or “tsv”

  • file (str) – Path to output file

Todo

Write checks for output_format and file.
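The interplay of output_format, file and return_string, including the checks mentioned in the Todo, might look roughly like the sketch below. It is not TRAL's implementation: serialize is a hypothetical function, the data are plain row tuples, and the tsv columns are invented for illustration.

```python
import csv
import io
import pickle


def serialize(rows, output_format, file=None, return_string=False):
    """Write rows as pickle or tsv; optionally return the tsv text."""
    if output_format == "pickle":
        if return_string:
            raise ValueError("pickle output cannot be returned as a string")
        if file is None:
            raise ValueError("pickle output requires a file path")
        with open(file, "wb") as handle:
            pickle.dump(rows, handle)
        return None
    if output_format == "tsv":
        buffer = io.StringIO()
        writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
        writer.writerows(rows)
        text = buffer.getvalue()
        if file is not None:
            with open(file, "w", newline="") as handle:
                handle.write(text)
        return text if return_string else None
    raise ValueError(f"unsupported output_format: {output_format!r}")


rows = [("begin", "pvalue"), (3, 0.01)]
print(serialize(rows, "tsv", return_string=True))
```

Rejecting unknown formats and the pickle-plus-string combination up front keeps failures explicit instead of silently producing nothing.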

tral.repeat_list.repeat_list.attribute(rl, attribute, type, threshold)[source]
Returns all repeats in rl with an attribute below (above) a certain threshold.

Parameters
  • rl (RepeatList) – An instance of the RepeatList class.

  • attribute (str) – The attribute of the Repeat instance

  • type (str) – Either “min” or “max”

  • threshold (float) – All repeats with an attribute value below (above) this threshold are filtered out.
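A threshold filter on an arbitrary attribute can be sketched with getattr. The direction of the comparison is an assumption made for this example only: "min" treats the threshold as a lower bound and keeps values at or above it, "max" treats it as an upper bound. filter_by_attribute, bound_type and the score attribute are hypothetical stand-ins.

```python
from types import SimpleNamespace


def filter_by_attribute(repeats, attribute, bound_type, threshold):
    """Keep repeats whose attribute satisfies the given bound.

    Assumption: "min" keeps values >= threshold, "max" keeps values <= threshold.
    """
    if bound_type == "min":
        def keep(value):
            return value >= threshold
    elif bound_type == "max":
        def keep(value):
            return value <= threshold
    else:
        raise ValueError(f"bound_type must be 'min' or 'max', got {bound_type!r}")
    return [r for r in repeats if keep(getattr(r, attribute))]


# Stand-in objects with a single numeric attribute named "score".
repeats = [SimpleNamespace(score=s) for s in (2.0, 4.5, 7.0)]
print([r.score for r in filter_by_attribute(repeats, "score", "min", 4.0)])
print([r.score for r in filter_by_attribute(repeats, "score", "max", 4.0)])
```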

tral.repeat_list.repeat_list.common_ancestry(repeat1, repeat2)[source]

Do two TRs share at least one pair of chars with common ancestry?

Return 1 if the two TRs share at least one pair of chars (amino acids or nucleotides) with common ancestry; else 0.

Parameters
  • repeat1 (Repeat) – An instance of the Repeat class

  • repeat2 (Repeat) – A second instance of the Repeat class

Returns

1 if the repeats share >= 1 pair of chars with common ancestry, else 0.

Return type

Bool

tral.repeat_list.repeat_list.divergence(rl, score, threshold)[source]
Returns all repeats in rl with a divergence below a certain threshold.

Parameters
  • rl (RepeatList) – An instance of the RepeatList class.

  • score (str) – The type of score defines the divergence that is used for filtering

  • threshold (float) – All repeats with a divergence of type score above this threshold are filtered out.

tral.repeat_list.repeat_list.none_overlapping(rl, overlap, l_criterion)[source]

Returns all non-overlapping repeats in rl. Repeats are clustered according to overlap. Of each cluster, only the best repeat according to l_criterion is returned.

Parameters
  • rl (RepeatList) – An instance of the RepeatList class.

  • overlap (tuple) – First element: Name (str) of an overlap method in repeat_list. Second element: **kwargs. All remaining elements are additional arguments for this class.

  • l_criterion (list) – List of (criterion (str), criterion arguments) tuples. The criteria are applied in order until only one repeat remains in a cluster.
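Applying the criteria in order until one repeat remains per cluster can be sketched as follows. This is an illustration, not TRAL's code: best_per_cluster is a hypothetical name, repeats are dicts of per-score values, clusters are given as tuples of indices, and the sketch assumes that a lower value is better for every criterion and that remaining ties go to the first candidate.

```python
def best_per_cluster(repeats, clusters, l_criterion):
    """Return one repeat per cluster, narrowing candidates criterion by criterion."""
    selected = []
    for cluster in clusters:
        candidates = [repeats[i] for i in cluster]
        for name, score in l_criterion:
            if len(candidates) == 1:
                break
            # Assumption: the lowest value wins for every criterion.
            best = min(c[name][score] for c in candidates)
            candidates = [c for c in candidates if c[name][score] == best]
        # Assumption: any remaining tie is resolved by taking the first candidate.
        selected.append(candidates[0])
    return selected


repeats = [
    {"id": "a", "pvalue": {"phylo": 0.01}, "divergence": {"phylo": 0.3}},
    {"id": "b", "pvalue": {"phylo": 0.20}, "divergence": {"phylo": 0.2}},
    {"id": "c", "pvalue": {"phylo": 0.01}, "divergence": {"phylo": 0.1}},
]
clusters = [(0, 2), (1,)]
criteria = [("pvalue", "phylo"), ("divergence", "phylo")]
print([r["id"] for r in best_per_cluster(repeats, clusters, criteria)])  # ['c', 'b']
```

In the first cluster the p-values tie, so the second criterion decides; the second cluster has a single member and is returned unchanged.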

tral.repeat_list.repeat_list.none_overlapping_fixed_repeats(rl, rl_fixed, overlap_type)[source]

Returns all repeats in rl that do not overlap with rl_fixed, according to overlap_type.

Parameters
  • rl (RepeatList) – An instance of the RepeatList class.

  • rl_fixed (RepeatList) – A second instance of the RepeatList class.

  • overlap_type (list) – First list element: Name (str) of an overlap method in repeat_list. All remaining elements are additional arguments for this class.
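Keeping only the repeats in rl that are clear of every repeat in rl_fixed can be sketched with a single comprehension. The names not_overlapping_fixed and intervals_overlap are hypothetical, and repeats are modelled as half-open (begin, end) intervals instead of Repeat instances.

```python
def intervals_overlap(a, b):
    # Hypothetical stand-in for a pairwise overlap method.
    return a[0] < b[1] and b[0] < a[1]


def not_overlapping_fixed(candidates, fixed, overlaps):
    """Keep the candidates that overlap with none of the fixed items."""
    return [c for c in candidates if not any(overlaps(c, f) for f in fixed)]


candidates = [(0, 10), (12, 18), (40, 50)]
fixed = [(8, 14)]
print(not_overlapping_fixed(candidates, fixed, intervals_overlap))  # [(40, 50)]
```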

tral.repeat_list.repeat_list.pvalue(rl, score, threshold)[source]

Returns all repeats in rl with a p-value below a certain threshold.

Parameters
  • rl (RepeatList) – An instance of the RepeatList class.

  • score (str) – The type of score that defines the p-value used for filtering

  • threshold (float) – All repeats with a p-value of type score above this threshold are filtered out.
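Filtering on a p-value that depends on the score type can be sketched with a per-score lookup. This is only an illustration: filter_by_pvalue is a hypothetical name, repeats are dicts that store one p-value per score name, the score names are invented, and a p-value exactly equal to the threshold is dropped by assumption.

```python
def filter_by_pvalue(repeats, score, threshold):
    """Keep repeats whose p-value for the given score is below the threshold."""
    # Assumption: strict comparison, so a p-value equal to the threshold is dropped.
    return [r for r in repeats if r["pvalue"][score] < threshold]


repeats = [
    {"id": "a", "pvalue": {"score_x": 0.001, "score_y": 0.20}},
    {"id": "b", "pvalue": {"score_x": 0.300, "score_y": 0.01}},
]
print([r["id"] for r in filter_by_pvalue(repeats, "score_x", 0.05)])  # ['a']
print([r["id"] for r in filter_by_pvalue(repeats, "score_y", 0.05)])  # ['b']
```

The same list gives different results for different score types, which is why score is a required argument.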

tral.repeat_list.repeat_list.shared_char(repeat1, repeat2)[source]

Do two TRs share at least one char?

Return 1 if the two TRs share at least one char (amino acids or nucleotides); else 0.

Parameters
  • repeat1 (Repeat) – An instance of the Repeat class

  • repeat2 (Repeat) – A second instance of the Repeat class

Returns

1 if the repeats share >= 1 char, else 0.

Return type

Bool
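Whether two tandem repeats share at least one character can be sketched as an interval intersection test on their positions in the sequence. This is a minimal sketch, assuming each repeat is described by a start position and a region length; Region and regions_share_position are hypothetical stand-ins, not TRAL classes or functions.

```python
from collections import namedtuple

# Hypothetical stand-in for a repeat: start position and length of its region.
Region = namedtuple("Region", ["begin", "length"])


def regions_share_position(a, b):
    """Return True if the two regions cover at least one common position."""
    return a.begin < b.begin + b.length and b.begin < a.begin + a.length


first = Region(begin=10, length=20)   # covers positions 10..29
second = Region(begin=29, length=5)   # covers positions 29..33
third = Region(begin=30, length=5)    # covers positions 30..34
print(regions_share_position(first, second))  # True
print(regions_share_position(first, third))   # False
```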

tral.repeat_list.repeat_list.two_repeats_overlap(overlap_type, repeat1, repeat2)[source]

Helper method to test the overlap of repeat1 and repeat2. The overlap is calculated by the local method overlap_type.

Parameters
  • overlap_type (str) – The name of a local pairwise repeat overlapping method.

  • repeat1 (Repeat) – An instance of the Repeat class

  • repeat2 (Repeat) – A second instance of the Repeat class

Returns

The output of the overlap method named by overlap_type.

repeat_list_io

synopsis

Input/output for the RepeatList class