`class` `IM_MSA_Transformer`[source]

IM_MSA_Transformer(iterations=None, p_mask=None, filename=None, num=None, filepath=None)

Class that implements the iterative masking algorithm.

{% endraw %} {% raw %}

`IM_MSA_Transformer.Batch_MSA`[source]

IM_MSA_Transformer.Batch_MSA(use_pdf=False, simplified=False, repetitions=2, sample_all=False, T=1, phylo=False)

Generates a full MSA by repeatedly calling the iterative MSA generator, self.NEW_MSA, each time with a different input MSA.

---> Use this function with simplified=False only if you need the tokens on the GPU (CUDA), e.g. to compute embeddings or contacts; otherwise use simplified=True.

The variable self.iterations must be a numpy array that specifies at which iterations the tokens are saved. Its last element gives the maximum number of iterations to run.
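The role of self.iterations can be illustrated with a minimal sketch. It is not the library's implementation: `run_with_checkpoints` and `toy_step` are hypothetical names, the "model" is a toy function, and counting iterations from 1 is an assumption.

```python
import numpy as np

def run_with_checkpoints(tokens, step, iterations):
    """Apply `step` repeatedly and keep a copy of the tokens at the requested iterations."""
    iterations = np.asarray(iterations)
    max_iters = int(iterations[-1])          # last element = total number of iterations
    save_at = {int(i) for i in iterations}   # iterations at which tokens are stored
    saved = {}
    for it in range(1, max_iters + 1):       # assumption: iterations are counted from 1
        tokens = step(tokens)
        if it in save_at:
            saved[it] = tokens.copy()
    return saved

def toy_step(tokens):
    """Hypothetical stand-in for one masking/prediction pass: increments every token."""
    return tokens + 1

snapshots = run_with_checkpoints(np.zeros(4, dtype=int), toy_step, np.array([2, 5, 10]))
```

Here `snapshots` holds one token array for each of the iterations 2, 5 and 10, and the loop stops after iteration 10.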

repetitions: the number of times self.NEW_MSA() is repeated with a different input MSA.

use_pdf: if True, each token is sampled from the probability distribution defined by the output logits; if False, the argmax of the logits is taken (greedy decoding).

sample_all: if True, all new tokens are obtained from the logits (both masked and non-masked positions); if False, the non-masked tokens are left untouched and only the masked ones are changed.

T: Temperature of sampling from the pdf of output logits.

phylo: if True the start sequences are sampled from phylogeny weights instead of randomly.
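How use_pdf, sample_all and T interact can be sketched with plain NumPy. This is an illustration under stated assumptions, not the class's actual code: `choose_tokens` and `softmax` are hypothetical helpers, logits are assumed to have shape (length, vocabulary), and the temperature is assumed to divide the logits before the softmax.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def choose_tokens(logits, old_tokens, mask, use_pdf=False, sample_all=False, T=1.0, rng=None):
    """Pick new tokens from logits of shape (L, V); `mask` marks the masked positions."""
    rng = np.random.default_rng() if rng is None else rng
    if use_pdf:
        probs = softmax(logits, T)
        # Sample one token id per position from its probability distribution.
        new = np.array([rng.choice(probs.shape[-1], p=p) for p in probs])
    else:
        new = logits.argmax(axis=-1)         # greedy choice
    if sample_all:
        return new                           # replace every position
    return np.where(mask, new, old_tokens)   # replace only the masked positions

logits = np.array([[0.0, 5.0, 0.0], [3.0, 0.0, 0.0], [0.0, 0.0, 4.0]])
old_tokens = np.array([2, 2, 2])
mask = np.array([True, False, True])

greedy_masked = choose_tokens(logits, old_tokens, mask)
greedy_all = choose_tokens(logits, old_tokens, mask, sample_all=True)
sampled = choose_tokens(logits, old_tokens, mask, use_pdf=True, T=0.5,
                        rng=np.random.default_rng(0))
```

With sample_all=False the second position keeps its old token because it is not masked. Lower values of T concentrate the sampling distribution around the argmax, while higher values flatten it.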

{% endraw %} {% raw %}

`IM_MSA_Transformer.Context_MSA`[source]

IM_MSA_Transformer.Context_MSA(depth=None, ancestor=None, context=None, use_pdf=False, simplified=False, sample_all=False, print_all=True, T=1)

Generates a new MSA by context generation: the ancestor (original) sequence is masked iteratively using self.generate_MSA_context, while the sequences in context serve as the unmasked context MSA.

---> Use this function with simplified=False only if you need the tokens on the GPU (CUDA), e.g. to compute embeddings or contacts; otherwise use simplified=True.

ancestor: input sequence to be masked iteratively.

context: context MSA (not masked).

use_pdf: if True, each token is sampled from the probability distribution defined by the output logits; if False, the argmax of the logits is taken (greedy decoding).

sample_all: if True, all new tokens are obtained from the logits (both masked and non-masked positions); if False, the non-masked tokens are left untouched and only the masked ones are changed.

T: Temperature of sampling from the pdf of output logits.

depth: number of generated sequences; if None, the depth equals the number of ancestor sequences.
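The input for one context-generation step can be pictured as follows. This is only a sketch: `build_context_input` and `MASK_ID` are hypothetical, sequences are assumed to be integer-encoded, and placing the masked ancestor in the first row is an assumption rather than something the documentation states.

```python
import numpy as np

MASK_ID = -1  # hypothetical id for the mask token

def build_context_input(ancestor, context, p_mask, rng):
    """Mask the ancestor with probability p_mask per position and stack it on the context."""
    mask = rng.random(ancestor.shape[0]) < p_mask   # positions to mask in the ancestor
    masked_ancestor = np.where(mask, MASK_ID, ancestor)
    # Assumption: row 0 is the masked ancestor, the remaining rows are the context MSA.
    msa = np.vstack([masked_ancestor[None, :], context])
    return msa, mask

rng = np.random.default_rng(0)
ancestor = np.arange(6)                    # toy integer-encoded sequence
context = np.ones((3, 6), dtype=int)       # toy context MSA with 3 sequences
msa, mask = build_context_input(ancestor, context, p_mask=0.5, rng=rng)
```

Only the ancestor row contains mask tokens; the context rows are passed through unchanged, so the model can use them as evidence when predicting the masked positions.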

{% endraw %} {% raw %}

`gen_MSAs`[source]

gen_MSAs(filepath:"Path of the input directory", filename:"Name of the input file(s)", new_dir:"Name of the output directory", pdf:"Should I sample tokens from the pdf ? (bool)", T:"Which is the sampling Temperature from the pdf ? (only when pdf is True)", sample_all:"Should I sample all tokens or just the masked ones ? (True = sample all tokens)", Iters:"Number of total iterations to generate the new tokens", pmask:"Masking probability", num:"Size of the batches MSAs which the MSA-Transformer receives as input", depth:"Number of batches (of size num) that you want to generate", generate:"How should I generate sequences ? False (=Batch generation) or Linear with context (=linear-ran/linear-tot-ran), -ran means that the context MSA is sampled randomly (once) while -tot-ran means that it is sampled randomly each time.", print_all:"Should I print the MSA after each iteration ? (bool)", range_vals:"First and last index of the sequences that you want to use as ancestors", phylo_w:"Should I sample the starting sequences from the phylogeny weights ? (bool)")

Generates a new MSA with either batch generation or context generation. It shuffles the initial MSA and uses different slices of it as batch MSAs.
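The shuffle-and-slice step can be sketched like this. The helper `shuffled_batches` is hypothetical and works on a plain list of sequences; how gen_MSAs handles an MSA that is too short for the requested number of batches is not documented here, so this sketch simply drops incomplete batches.

```python
import numpy as np

def shuffled_batches(msa, num, depth, rng):
    """Shuffle the MSA once and cut it into up to `depth` consecutive slices of size `num`."""
    order = rng.permutation(len(msa))
    shuffled = [msa[i] for i in order]
    batches = []
    for k in range(depth):
        batch = shuffled[k * num:(k + 1) * num]
        if len(batch) < num:   # assumption: incomplete batches are dropped
            break
        batches.append(batch)
    return batches

msa = [f"seq{i}" for i in range(10)]   # toy MSA with 10 sequences
batches = shuffled_batches(msa, num=3, depth=3, rng=np.random.default_rng(0))
```

Each batch is a disjoint slice of the shuffled MSA, so no input sequence is used in two batches.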

{% endraw %}

Build library
