1. 03 May, 2022 1 commit
    • [FlaxBert] Add ForCausalLM (#16995) · cd9274d0
      Sanchit Gandhi authored
      * [FlaxBert] Add ForCausalLM
      
      * make style
      
      * fix output attentions
      
      * Add RobertaForCausalLM
      
      * remove comment
      
      * fix fx-to-pt model loading
      
      * remove comment
      
      * add modeling tests
      
      * add enc-dec model tests
      
      * add big_bird
      
      * add electra
      
      * make style
      
      * make repo-consistency
      
      * add to docs
      
      * remove roberta test
      
      * quality
      
      * amend cookiecutter
      
      * fix attention_mask bug in flax bert model tester
      
      * tighten pt-fx thresholds to 1e-5
      
      * add 'copied from' statements
      
      * amend 'copied from' statements
      
      * amend 'copied from' statements
      
      * quality
  2. 01 Apr, 2022 1 commit
  3. 25 Feb, 2022 1 commit
    • Fix tf.concatenate + test past_key_values for TF models (#15774) · 8635407b
      Yih-Dar authored
      
      
      * fix wrong method name tf.concatenate
      
      * add tests related to causal LM / decoder
      
      * make style and quality
      
      * clean-up
      
      * Fix TFBertModel's extended_attention_mask when past_key_values is provided
      
      * Fix tests
      
      * fix copies
      
      * More tf.int8 -> tf.int32 in TF test template
      
      * clean-up
      
      * Update TF test template
      
      * revert the previous commit + update the TF test template
      
      * Fix TF template extended_attention_mask when past_key_values is provided
      
      * Fix some styles manually
      
      * clean-up
      
      * Fix ValueError: too many values to unpack in the test
      
      * Fix more: too many values to unpack in the test
      
      * Add a comment for extended_attention_mask when there is past_key_values
      
      * Fix TFElectra extended_attention_mask when past_key_values is provided
      
      * Add tests to other TF models
      
      * Fix for TF Electra test: add prepare_config_and_inputs_for_decoder
      
      * Fix not passing training arg to lm_head in TFRobertaForCausalLM
      
      * Fix tests (with past) for TF Roberta
      
      * add testing for past_key_values for TFElectra model
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
  4. 23 Feb, 2022 1 commit