Commit 3229fc55 authored by Caroline Chen, committed by Facebook GitHub Bot

Update CTC decoder docs (#2443)

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2443

Reviewed By: nateanl

Differential Revision: D36909822

Pulled By: carolineechen

fbshipit-source-id: ef3ab2345e7a4666cf29dd02c83d03504e8aa62c
parent 41082eb0
@@ -27,7 +27,7 @@ _PretrainedFiles = namedtuple("PretrainedFiles", ["lexicon", "tokens", "lm"])
 class CTCHypothesis(NamedTuple):
-    r"""Represents hypothesis generated by CTC beam search decoder :py:func`CTCDecoder`.
+    r"""Represents hypothesis generated by CTC beam search decoder :py:func:`CTCDecoder`.
     :ivar torch.LongTensor tokens: Predicted sequence of token IDs. Shape `(L, )`, where
         `L` is the length of the output sequence
@@ -46,15 +46,14 @@ class CTCDecoder:
     """
     .. devices:: CPU
-    Lexically contrained CTC beam search decoder from *Flashlight* [:footcite:`kahn2022flashlight`].
+    CTC beam search decoder from *Flashlight* [:footcite:`kahn2022flashlight`].
     Note:
-        To build the decoder, please use factory function
-        :py:func:`ctc_decoder`.
+        To build the decoder, please use the factory function :py:func:`ctc_decoder`.
     Args:
         nbest (int): number of best decodings to return
-        lexicon (Dict or None): lexicon mapping of words to spellings, or None for lexicon free decoder
+        lexicon (Dict or None): lexicon mapping of words to spellings, or None for lexicon-free decoder
         word_dict (_Dictionary): dictionary of words
         tokens_dict (_Dictionary): dictionary of tokens
         lm (_LM): language model
@@ -211,12 +210,11 @@ def ctc_decoder(
     unk_word: str = "<unk>",
 ) -> CTCDecoder:
     """
-    Builds lexically constrained CTC beam search decoder from
-    *Flashlight* [:footcite:`kahn2022flashlight`].
+    Builds CTC beam search decoder from *Flashlight* [:footcite:`kahn2022flashlight`].
     Args:
         lexicon (str or None): lexicon file containing the possible words and corresponding spellings.
-            Each line consists of a word and its space separated spelling. If `None`, uses lexicon free
+            Each line consists of a word and its space separated spelling. If `None`, uses lexicon-free
             decoding.
         tokens (str or List[str]): file or list containing valid tokens. If using a file, the expected
             format is for tokens mapping to the same index to be on the same line
@@ -224,7 +222,7 @@ def ctc_decoder(
         nbest (int, optional): number of best decodings to return (Default: 1)
         beam_size (int, optional): max number of hypos to hold after each decode step (Default: 50)
         beam_size_token (int, optional): max number of tokens to consider at each decode step.
-            If None, it is set to the total number of tokens (Default: None)
+            If `None`, it is set to the total number of tokens (Default: None)
         beam_threshold (float, optional): threshold for pruning hypothesis (Default: 50)
         lm_weight (float, optional): weight of language model (Default: 2)
         word_score (float, optional): word insertion score (Default: 0)
...
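For intuition about what the documented parameters (`nbest`, `beam_size`, `beam_threshold`) control, here is a toy pure-Python sketch of CTC prefix beam search without a language model or lexicon. It is illustrative only and is NOT the Flashlight decoder that `ctc_decoder` wraps; the function and parameter names below merely mirror the documented ones.

```python
import math
from collections import defaultdict


def logsumexp(*xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(x - m) for x in xs))


def ctc_beam_search(log_probs, blank=0, beam_size=50, beam_threshold=50.0, nbest=1):
    """Toy CTC prefix beam search (no LM, no lexicon) -- illustrative only.

    log_probs: list of frames, each a list of per-token log-probabilities.
    Each beam maps a collapsed token prefix to a pair
    (log prob of paths ending in blank, log prob of paths ending in non-blank).
    """
    beams = {(): (0.0, -math.inf)}
    for frame in log_probs:
        next_beams = defaultdict(lambda: (-math.inf, -math.inf))
        for prefix, (p_b, p_nb) in beams.items():
            for tok, lp in enumerate(frame):
                if tok == blank:
                    # Blank keeps the prefix; path now ends in blank.
                    nb_b, nb_nb = next_beams[prefix]
                    next_beams[prefix] = (logsumexp(nb_b, p_b + lp, p_nb + lp), nb_nb)
                elif prefix and tok == prefix[-1]:
                    # Repeated token: a new emission needs a blank in between...
                    new_prefix = prefix + (tok,)
                    nb_b, nb_nb = next_beams[new_prefix]
                    next_beams[new_prefix] = (nb_b, logsumexp(nb_nb, p_b + lp))
                    # ...otherwise it merely continues the same emission.
                    sb_b, sb_nb = next_beams[prefix]
                    next_beams[prefix] = (sb_b, logsumexp(sb_nb, p_nb + lp))
                else:
                    # A different token always extends the prefix.
                    new_prefix = prefix + (tok,)
                    nb_b, nb_nb = next_beams[new_prefix]
                    next_beams[new_prefix] = (nb_b, logsumexp(nb_nb, p_b + lp, p_nb + lp))
        # Prune: keep at most `beam_size` hypotheses, and only those whose
        # score is within `beam_threshold` of the best one.
        scored = sorted(next_beams.items(), key=lambda kv: -logsumexp(*kv[1]))
        best = logsumexp(*scored[0][1])
        beams = dict(kv for kv in scored[:beam_size]
                     if logsumexp(*kv[1]) >= best - beam_threshold)
    final = sorted(beams.items(), key=lambda kv: -logsumexp(*kv[1]))
    return [(list(prefix), logsumexp(*scores)) for prefix, scores in final[:nbest]]


# Tiny example: 3 frames over tokens {0: blank, 1: 'a', 2: 'b'}, peaked so the
# best collapsed output is [1, 2].
frames = [[math.log(p) for p in row] for row in
          [[0.1, 0.8, 0.1], [0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]]
hyps = ctc_beam_search(frames, blank=0, beam_size=10, nbest=2)
```

The sketch shows the semantics the docstrings describe: `beam_size` caps how many hypotheses survive each decode step, `beam_threshold` additionally prunes hypotheses far below the step's best score, and `nbest` selects how many final decodings are returned. The real decoder additionally folds in `lm_weight`-scaled language-model scores and `word_score` at word boundaries.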