1. 23 Mar, 2022 2 commits
  2. 22 Mar, 2022 1 commit
    • Add GLPN (#16199) · 0c55d47c
      NielsRogge authored
      
      
      * First draft
      
      * Fix logits calculation
      
      * Improve tests
      
      * Add copied from statements
      
      * Fix base_model_prefix
      
      * Improve implementation, upload new models
      
      * Update design
      
      * Fix integration test
      
      * Add model to README and toctree
      
      * Add document image
      
      * Apply suggestions from code review
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Add decoder_hidden_size attribute
      
      * Update design of decoder
      
      * Add DepthEstimatorOutput class
      
      * Rename in_index to head_in_index and add feature extractor tests
      
      * Apply suggestions from code review
      
      * Apply suggestions from code review
      
      * Update pretrained model name and add to doc tests
      
      * Remove test.py script
      
      * Update copied from statements and clean up
      Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  3. 19 Mar, 2022 1 commit
  4. 18 Mar, 2022 5 commits
  5. 17 Mar, 2022 6 commits
  6. 16 Mar, 2022 5 commits
  7. 15 Mar, 2022 3 commits
  8. 14 Mar, 2022 4 commits
    • [WIP] Resnet (#15770) · e3008c67
      Francesco Saverio Zuppichini authored
      
      
      * first commit
      
      * ResNet model correctly implemented.
      
      basic modeling + weights conversion is done
      
      removed unused doc
      
      mdx file
      
      doc and conversion script
      
      added feature_extractor to auto
      
      test
      
      minor changes + style + quality
      
      doc
      
      test
      
      Delete process.yml
      
      A leftover from my attempt at running CircleCI locally
      
      * minor changes
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * new test format
      
      * minor changes from conversations
      
      * minor changes from conversations
      
      * make style + quality
      
      * readded the tests
      
      * test + README
      
      * minor changes from conversations
      
      * error in README
      
      * make fix-copies
      
      * removed regression for classification head
      
      * make quality
      
      * fixed loss control flow
      
      * fixed loss control flow
      
      * resolved conversations
      
      * Apply suggestions from code review
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * READMEs
      
      * index.mdx
      
      * minor changes
      
      * updated tests and models
      
      * unused import
      
      * outputs
      
      * Update docs/source/model_doc/resnet.mdx
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * added embeddings_size
      
      * Apply suggestions from code review
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * conversation
      
      * added push to hub
      
      * test
      
      * embedding_size
      
      * make fix-copies
      
      * resolved conversations
      
      * CI
      
      * changed organization
      
      * minor changes
      
      * CI
      
      * minor changes
      
      * conversations
      
      * conversation
      
      * doc
      
      * tests
      
      * removed unused docstring
      
      * conversation
      
      * removed unused outputs
      
      * CI
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
    • Make TF pt-tf equivalence test more aggressive (#15839) · 923c35b5
      Yih-Dar authored
      
      
      * Make TF pt-tf equivalence test more aggressive
      
      * Fix for TFConvNextModelTest and TFTransfoXLModelTest
      
      * fix kwargs for outputs
      
      * clean-up
      
      * Add docstring for check_outputs()
      
      * remove: need to rename encoder-decoder
      
      * clean-up
      
      * send PyTorch things to the correct device
      
      * Add back the accidentally removed test case in test_pt_tf_model_equivalence()
      
      * Fix: change to tuple before calling check_outputs()
      
      * Fix: tfo could be a list
      
      * use to_tuple()
      
      * allow tfo only to be tuple or tensor
      
      * allow tfo to be list or tuple for now + style change
      
      * minor fix
      
      * remove np.copy and update comments
      
      * tfo -> tf_output, same for pt
      
      * Add more detailed comment
      
      * remove the incorrect comment
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    • Fix Loading of Flax(Speech)EncoderDecoderModel kwargs from PreTrained Encoder-Decoder Checkpoints (#16056) · 2de99e6c
      Sanchit Gandhi authored
      
      * Fix Loading of Flax(Speech)EncoderDecoderModel kwargs from PreTrained Encoder-Decoder Checkpoints
      
      * change wording
    • Add TFCamembertForCausalLM and ONNX integration test (#16073) · 6e1e88fd
      lewtun authored
      * Make Camembert great again!
      
      * Add Camembert to TensorFlow ONNX tests
  9. 12 Mar, 2022 1 commit
    • [Deepspeed] add support for bf16 mode (#14569) · 580dd87c
      Stas Bekman authored
      
      
      * [WIP] add support for bf16 mode
      
      * prep for bf16
      
      * prep for bf16
      
      * fix; zero2/bf16 is ok
      
      * check bf16 is available
      
      * test fixes
      
      * enable zero3_bf16
      
      * config files
      
      * docs
      
      * split stage_dtype; merge back to non-dtype-specific config file
      
      * fix doc
      
      * cleanup
      
      * cleanup
      
      * bfloat16 => bf16 to match the PR changes
      
      * s/zero_gather_fp16_weights_on_model_save/zero_gather_16bit_weights_on_model_save/; s/save_fp16_model/save_16bit_model/
      
      * test fixes/skipping
      
      * move
      
      * fix
      
      * Update docs/source/main_classes/deepspeed.mdx
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * backticks
      
      * cleanup
      
      * cleanup
      
      * cleanup
      
      * new version
      
      * add note about grad accum in bf16
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  10. 11 Mar, 2022 2 commits
    • Add soft length regulation for sequence generation (#15245) · 9442b3ce
      Kevin Bondzio authored
      
      
      * add possibility to softly regulate length when using sampling method in model.generate() function
      
      * fix test config, fix formatting
      
      * fix rag integration, fix docstyling
      
      * fix wrong docstring
      
      * change param to tuple, add test
      
      * fix old param in rag_model, remove unused import
      
      * change test according to new param
      
      * fix formatting
      
      * fix test case
      
      * fix doc style
      
      * move start_length calculation to LogitsProcessor
      
      * add possibility to softly regulate length when using sampling method in model.generate() function
      
      * fix rag integration, fix docstyling
      
      * fix test config, fix formatting
      
      * change param to tuple, add test
      
      * fix old param in rag_model, remove unused import
      
      * add possibility to softly regulate length when using sampling method in model.generate() function
      
      * change param to tuple, add test
      
      * fix old param in rag_model, remove unused import
      
      * remove unused import
      
      * fix small errors
      
      * fix test
      
      * add possibility to softly regulate length when using sampling method in model.generate() function
      
      * fix test config, fix formatting
      
      * fix rag integration, fix docstyling
      
      * change param to tuple, add test
      
      * fix old param in rag_model, remove unused import
      
      * change test according to new param
      
      * fix test case
      
      * move start_length calculation to LogitsProcessor
      
      * add possibility to softly regulate length when using sampling method in model.generate() function
      
      * fix rag integration, fix docstyling
      
      * fix test config, fix formatting
      
      * change param to tuple, add test
      
      * fix old param in rag_model, remove unused import
      
      * add possibility to softly regulate length when using sampling method in model.generate() function
      
      * fix test config, fix formatting
      
      * fix rag integration, fix docstyling
      
      * add possibility to softly regulate length when using sampling method in model.generate() function
      
      * fix rag integration, fix docstyling
      
      * change param to tuple, add test
      
      * fix old param in rag_model, remove unused import
      
      * fix small errors
      
      * Update src/transformers/generation_utils.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/generation_utils.py
      
      * Update src/transformers/generation_utils.py
      
      * fix docstring, add type in model rag
      
      * fix docstrings
      
      * introduce seq_length variable for cleaner code
      
      * fix black formatting
      
      * add input_ids_seq_length to modeling_rag
      
      * add input_ids_seq_length to test
      
      * retrigger checks
      
      * retrigger checks
      Co-authored-by: Kevin Bondzio <kev@AIM-LAP-02.local>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Kevin Bondzio <kev@AIM-LAP-02.fritz.box>
    • Fix a TF test name (LayoutLMModelTest) (#16061) · b6bdb943
      Yih-Dar authored
      
      
      * fix name
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
  11. 10 Mar, 2022 6 commits
  12. 09 Mar, 2022 4 commits