"tests/trainer/test_trainer_distributed.py" did not exist on "4c06893610ee148f6645b7fee21f382de5d53023"
  1. 30 Mar, 2021 1 commit
    • Suraj Patil's avatar
      GPT Neo (#10848) · 86026437
      Suraj Patil authored
      
      
      * lets begin
      
      * boom boom
      
      * fix out proj in attn
      
      * fix attention
      
      * fix local attention
      
      * add tokenizer
      
      * fix imports
      
      * autotokenizer
      
      * fix checkpoint name
      
      * cleanup
      
      * more clean-up
      
      * more cleanup
      
      * output attentions
      
      * fix attn mask creation
      
      * fix imports
      
      * config doc
      
      * add tests
      
      * add slow tests
      
      * quality
      
      * add conversion script
      
      * copyright
      
      * typo
      
      * another bites the dust
      
      * fix attention tests
      
      * doc
      
      * add embed init in convert function
      
      * fix copies
      
      * remove tokenizer
      
      * enable caching
      
      * address review comments
      
      * improve config and create attn layer list internally
      
      * more consistent naming
      
      * init hf config from mesh-tf config json file
      
      * remove neo tokenizer from doc
      
      * handle attention_mask in local attn layer
      
      * attn_layers => attention_layers
      
      * add tokenizer_class in config
      
      * fix docstring
      
      * raise if len of attention_layers is not same as num_layers
      
      * remove tokenizer_class from config
      
      * more consistent naming
      
      * fix doc
      
      * fix checkpoint names
      
      * fp16 compat
      
      * Apply suggestions from code review
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: default avatarLysandre Debut <lysandre@huggingface.co>
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      86026437