Commit 3229fc55 authored by Caroline Chen, committed by Facebook GitHub Bot

Update CTC decoder docs (#2443)

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2443

Reviewed By: nateanl

Differential Revision: D36909822

Pulled By: carolineechen

fbshipit-source-id: ef3ab2345e7a4666cf29dd02c83d03504e8aa62c
parent 41082eb0
@@ -27,7 +27,7 @@ _PretrainedFiles = namedtuple("PretrainedFiles", ["lexicon", "tokens", "lm"])
class CTCHypothesis(NamedTuple):
-r"""Represents hypothesis generated by CTC beam search decoder :py:func`CTCDecoder`.
+r"""Represents hypothesis generated by CTC beam search decoder :py:func:`CTCDecoder`.
:ivar torch.LongTensor tokens: Predicted sequence of token IDs. Shape `(L, )`, where
`L` is the length of the output sequence
@@ -46,15 +46,14 @@ class CTCDecoder:
"""
.. devices:: CPU
-Lexically contrained CTC beam search decoder from *Flashlight* [:footcite:`kahn2022flashlight`].
+CTC beam search decoder from *Flashlight* [:footcite:`kahn2022flashlight`].
Note:
-To build the decoder, please use factory function
-:py:func:`ctc_decoder`.
+To build the decoder, please use the factory function :py:func:`ctc_decoder`.
Args:
nbest (int): number of best decodings to return
-lexicon (Dict or None): lexicon mapping of words to spellings, or None for lexicon free decoder
+lexicon (Dict or None): lexicon mapping of words to spellings, or None for lexicon-free decoder
word_dict (_Dictionary): dictionary of words
tokens_dict (_Dictionary): dictionary of tokens
lm (_LM): language model
@@ -211,12 +210,11 @@ def ctc_decoder(
unk_word: str = "<unk>",
) -> CTCDecoder:
"""
-Builds lexically constrained CTC beam search decoder from
-*Flashlight* [:footcite:`kahn2022flashlight`].
+Builds CTC beam search decoder from *Flashlight* [:footcite:`kahn2022flashlight`].
Args:
lexicon (str or None): lexicon file containing the possible words and corresponding spellings.
-Each line consists of a word and its space separated spelling. If `None`, uses lexicon free
+Each line consists of a word and its space separated spelling. If `None`, uses lexicon-free
decoding.
tokens (str or List[str]): file or list containing valid tokens. If using a file, the expected
format is for tokens mapping to the same index to be on the same line
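To make the lexicon file format described in this hunk concrete, here is a minimal sketch of a parser for it. It is not the torchaudio or Flashlight loader: the function name `parse_lexicon`, the sample entries, and the choice to collect several spellings per word when a word appears on more than one line are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def parse_lexicon(lines: Iterable[str]) -> Dict[str, List[List[str]]]:
    """Map each word to the list of spellings found for it.

    Hypothetical helper: each input line is "<word> <token> <token> ...",
    i.e. a word followed by its space-separated spelling.
    """
    lexicon: Dict[str, List[List[str]]] = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank lines and words that have no spelling
        word, spelling = parts[0], parts[1:]
        lexicon[word].append(spelling)
    return dict(lexicon)


# Sample entries are invented for this sketch.
sample = [
    "hello h e l l o",
    "world w o r l d",
    "read r e a d",
    "read r e d",
]
lexicon = parse_lexicon(sample)
```

With `lexicon=None`, no such mapping is loaded and decoding is lexicon-free, as the docstring states.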
@@ -224,7 +222,7 @@ def ctc_decoder(
nbest (int, optional): number of best decodings to return (Default: 1)
beam_size (int, optional): max number of hypos to hold after each decode step (Default: 50)
beam_size_token (int, optional): max number of tokens to consider at each decode step.
-If None, it is set to the total number of tokens (Default: None)
+If `None`, it is set to the total number of tokens (Default: None)
beam_threshold (float, optional): threshold for pruning hypothesis (Default: 50)
lm_weight (float, optional): weight of language model (Default: 2)
word_score (float, optional): word insertion score (Default: 0)
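The pruning parameters documented above (`beam_size`, `beam_size_token`, `beam_threshold`) can be illustrated with a small pure-Python sketch of how such limits are commonly applied in a beam search. This is not Flashlight's implementation: the helper names `top_tokens` and `prune`, the `Hypothesis` layout, and the reading of `beam_threshold` as a maximum score gap from the best hypothesis are assumptions made for illustration.

```python
import math
from typing import List, Optional, Tuple

# Hypothetical layout: (token IDs so far, cumulative log score).
Hypothesis = Tuple[Tuple[int, ...], float]


def top_tokens(log_probs: List[float], beam_size_token: Optional[int]) -> List[int]:
    """Return the indices of the best-scoring tokens for one frame.

    When beam_size_token is None, every token is considered.
    """
    k = len(log_probs) if beam_size_token is None else beam_size_token
    order = sorted(range(len(log_probs)), key=lambda i: log_probs[i], reverse=True)
    return order[:k]


def prune(hyps: List[Hypothesis], beam_size: int, beam_threshold: float) -> List[Hypothesis]:
    """Drop hypotheses far below the best one, then keep at most beam_size."""
    if not hyps:
        return []
    ranked = sorted(hyps, key=lambda h: h[1], reverse=True)
    best_score = ranked[0][1]
    # Assumed semantics: keep hypotheses within beam_threshold of the best score.
    kept = [h for h in ranked if best_score - h[1] <= beam_threshold]
    return kept[:beam_size]


frame = [math.log(0.6), math.log(0.3), math.log(0.1)]
all_tokens = top_tokens(frame, None)
two_tokens = top_tokens(frame, 2)

hyps = [((1,), -1.0), ((2,), -3.0), ((3,), -60.0), ((4,), -2.0)]
narrow = prune(hyps, beam_size=2, beam_threshold=50.0)
wide = prune(hyps, beam_size=50, beam_threshold=50.0)
```

In this sketch, smaller `beam_size` and `beam_size_token` values reduce work per step at the cost of possibly discarding the eventual best hypothesis. A smaller `beam_threshold` prunes more aggressively by score.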