1. 23 Mar, 2022 1 commit
  2. 22 Mar, 2022 4 commits
  3. 21 Mar, 2022 9 commits
  4. 18 Mar, 2022 1 commit
  5. 17 Mar, 2022 2 commits
  6. 16 Mar, 2022 1 commit
  7. 15 Mar, 2022 4 commits
  8. 14 Mar, 2022 4 commits
  9. 12 Mar, 2022 1 commit
    • [Deepspeed] add support for bf16 mode (#14569) · 580dd87c
      Stas Bekman authored
      
      
      * [WIP] add support for bf16 mode
      
      * prep for bf16
      
      * prep for bf16
      
      * fix; zero2/bf16 is ok
      
      * check bf16 is available
      
      * test fixes
      
      * enable zero3_bf16
      
      * config files
      
      * docs
      
      * split stage_dtype; merge back to non-dtype-specific config file
      
      * fix doc
      
      * cleanup
      
      * cleanup
      
      * bfloat16 => bf16 to match the PR changes
      
      * s/zero_gather_fp16_weights_on_model_save/zero_gather_16bit_weights_on_model_save/; s/save_fp16_model/save_16bit_model/
      
      * test fixes/skipping
      
      * move
      
      * fix
      
      * Update docs/source/main_classes/deepspeed.mdx
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * backticks
      
      * cleanup
      
      * cleanup
      
      * cleanup
      
      * new version
      
      * add note about grad accum in bf16
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  10. 11 Mar, 2022 3 commits
  11. 10 Mar, 2022 3 commits
  12. 09 Mar, 2022 3 commits
    • Add FlaxBartForCausalLM (#15995) · b256f351
      Sanchit Gandhi authored
      * add causal lm
      
      * add CausalLM tests
      
      * Add FlaxBartForCausalLM
      
      * Add EncoderDecoder model tests
      
      * change docstring
      
      * make repo-consistency
      
      * suggested changes
      
      * remove jax ops
      
      * correction
      
      * rename pre-trained decoder model
    • Add ONNX export for ViT (#15658) · 50dd314d
      lewtun authored
      
      
      * Add ONNX support for ViT
      
      * Refactor to use generic preprocessor
      
      * Add vision dep to tests
      
      * Extend ONNX slow tests to ViT
      
      * Add dummy image generator
      
      * Use model_type to determine modality
      
      * Add deprecation warnings for tokenizer argument
      
      * Add warning when overwriting the preprocessor
      
      * Add optional args to docstrings
      
      * Add minimum PyTorch version to OnnxConfig
      
      * Refactor OnnxConfig class variables from CONSTANT_NAME to snake_case
      
      * Add reasonable value for default atol
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
    • [Doctests] Move doctests to new GPU & Fix bugs (#15969) · c1aaa439
      Patrick von Platen authored
      
      
      * test
      
      * up
      
      * up
      
      * Empty test commit
      
      * up
      
      * update tests
      
      * up
      
      * fix some vision models
      
      * correct
      
      * correct docs
      
      * Trigger notification
      
      * finalize
      
      * check
      
      * correct quicktour
      
      * Apply suggestions from code review
      
      * improve doctests
      
      * Trigger Build
      
      * next try
      
      * next try
      
      * and again
      
      * Output current clone information
      
      * Output current clone information
      
      * Correct path
      
      * add tf round again
      
      * revert to daily job
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
  13. 07 Mar, 2022 1 commit
  14. 04 Mar, 2022 3 commits
    • Constrained Beam Search [*With* Disjunctive Decoding] (#15761) · 5c6f57ee
      Chan Woo Kim authored
      
      
      * added classes to get started with constrained beam search
      
      * in progress, think i can directly force tokens now but not yet with the round robin
      
      * think now i have total control, now need to code the bank selection
      
      * technically works as desired, need to optimize and fix design choices leading to undesirable outputs
      
      * complete PR #1 without disjunctive decoding
      
      * removed incorrect tests
      
      * Delete k.txt
      
      * Delete test.py
      
      * Delete test.sh
      
      * revert changes to test scripts
      
      * genutils
      
      * full implementation with testing, no disjunctive yet
      
      * shifted docs
      
      * passing all tests realistically ran locally
      
      * removing accidentally included print statements
      
      * fixed source of error in initial PR test
      
      * fixing the get_device() vs device trap
      
      * fixed documentation docstrings about constrained_beam_search
      
      * fixed tests failing for Speech2TextModel's floating point inputs
      
      * fix cuda long tensor
      
      * added examples and testing for them and found & fixed a bug in beam_search and constrained_beam_search
      
      * deleted accidentally added test halting code with assert False
      
      * code reformat
      
      * Update tests/test_generation_utils.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update tests/test_generation_utils.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update tests/test_generation_utils.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update tests/test_generation_utils.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update tests/test_generation_utils.py
      
      * fixing based on comments on PR
      
      * took out the testing code that should work but fails without the beam search modification; style changes
      
      * fixing comments issues
      
      * docstrings for ConstraintListState
      
      * typo in PhrasalConstraint docstring
      
      * docstrings improvements
      
      * finished adding what is sort of an opinionated implementation of disjunctive generation, but it revealed errors in inner beam search logic during testing.
      
      * fixed bug found in constrained beam search that used beam_idx that were not global across all the batches
      
      * disjunctive constraint working 100% correctly
      
      * passing all tests
      
      * Accidentally included mlruns
      
      * Update src/transformers/generation_beam_constraints.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/generation_beam_constraints.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * complete overhaul of type complexities and other nits
      
      * strict type checks in generate()
      
      * fixing second round of feedback by narsil
      
      * fixed failing generation test because of type check overhaul
      
      * generation test fail fix
      
      * fixing test fails
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    • Add missing support for Flax XLM-RoBERTa (#15900) · 01485cee
      Javier de la Rosa authored
      
      
      * Adding Flax XLM-RoBERTa
      
      * Add Flax to __init__
      
      * Adding doc and dummy objects
      
      * Add tests
      
      * Add Flax XLM-R models autodoc
      
      * Fix tests
      
      * Add Flax XLM-RoBERTa to TEST_FILES_WITH_NO_COMMON_TESTS
      
      * Update src/transformers/models/xlm_roberta/modeling_flax_xlm_roberta.py
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * Update tests/xlm_roberta/test_modeling_flax_xlm_roberta.py
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * Update tests/xlm_roberta/test_modeling_flax_xlm_roberta.py
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * Remove test on large Flax XLM-RoBERTa
      
      * Add tokenizer to the test
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
    • Making MaskFormerForInstanceSegmentation. (#15934) · 89c7d9cf
      Nicolas Patry authored
      Small adjustments.
      
      Adding in type hint.
      
      Last fix ?
      
      Only include the default dict thing, not the pipelines.