    Add NNLM support to CTC Decoder (#2528) · 03a0d68e
    Caroline Chen authored
    Summary:
    Expose flashlight's LM and LMState classes to support decoding with custom language models, including NN LMs.
    
    The `ctc_decoder` API is as follows:
    - To decode with KenLM, pass the path to the KenLM language model as `lm`.
    - To decode with a custom LM, create a Python class that subclasses `CTCDecoderLM` and pass it as `lm`. Additionally, create a file of LM words, one word per line, listed in order of LM index, and pass that file as `lm_path`.
    - To decode without a language model, set `lm` to `None` (the default).
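
A minimal sketch of how the LM words file for a custom LM could be produced, using only the standard library. The vocabulary, the `write_lm_words` helper, and the file name are hypothetical, and the check that indices are contiguous and start at 0 is an assumption for illustration rather than something this commit states.

```python
import os
import tempfile

# Hypothetical vocabulary: maps each word to the index the custom LM uses for it.
word_to_lm_index = {"hello": 2, "<unk>": 0, "world": 1}


def write_lm_words(word_to_index, path):
    """Write one word per line, sorted by LM index, so line i holds word i."""
    indices = sorted(word_to_index.values())
    # Assumption: indices are contiguous and 0-based, so line order matches them.
    if indices != list(range(len(indices))):
        raise ValueError("LM indices must be contiguous and start at 0")
    ordered = sorted(word_to_index, key=word_to_index.get)
    with open(path, "w", encoding="utf-8") as f:
        for word in ordered:
            f.write(word + "\n")
    return ordered


# Write the file to a temporary directory and read it back to confirm the order.
words_path = os.path.join(tempfile.mkdtemp(), "lm_words.txt")
ordered_words = write_lm_words(word_to_lm_index, words_path)

with open(words_path, encoding="utf-8") as f:
    lines = f.read().splitlines()
```

The resulting path would then be supplied alongside the `CTCDecoderLM` subclass, as described in the list above.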
    
    Validated against the fairseq w2l decoder on a sample LibriSpeech dataset and LM. Code for validation can be found [here](https://github.com/facebookresearch/fairseq/compare/main...carolineechen:fairseq:ctc-decoder). Also added unit tests that validate custom implementations of ZeroLM and KenLM, as well as decoding with a biased LM.
    
    Follow ups:
    - Train simple LM on LibriSpeech and demonstrate usage in tutorial or examples directory
    
    cc jacobkahn
    
    Pull Request resolved: https://github.com/pytorch/audio/pull/2528
    
    Reviewed By: mthrok
    
    Differential Revision: D38243802
    
    Pulled By: carolineechen
    
    fbshipit-source-id: 445e78f6c20bda655aabf819fc0f771fe68c73d7