1. 28 Jun, 2022 7 commits
  2. 27 Jun, 2022 6 commits
  3. 25 Jun, 2022 1 commit
  4. 24 Jun, 2022 6 commits
    • kumapo authored
      2ef94ee0
    • Add type hints for gptneox models (#17858) · ef28a402
      willtai authored
      * feat: Add type hints for GPTNeoxForCausalLM and GPTNeoXModel
      
      * fix: removed imported Dict type
      
      * fix: Removed unused List import
    • Suraj Patil authored
    • Add CodeGen model (#17443) · d6b6fb99
      rooa authored
      
      
      * Add CodeGen model
      
      * Add missing key and switch order of super()
      
      * Fix torch.ones init with uint8 instead of bool
      
      * Address comments: copy statements and doc
      
      * update tests
      
      * remove old model parallel
      
      * fix batch gen tests
      
      * fix batch gen test
      
      * update test_gpt2_sample_max_time
      
      * fix codegen test and revert gpt2 test change
      
      * Fix incorrect tie_word_embedding value, typo, URL
      
      * Fix model order in README and styling
      
      * Reorder model list alphabetically
      
      * Set tie_word_embedding to False by default
      
      * Apply suggestions from code review
      
      * Better attn mask name & remove attn masked_bias
      
      * add tokenizer for codegen
      
      * quality
      
      * doc tokenizer
      
      * fix-copies
      
      * add CodeGenTokenizer in converter
      
      * make truncation optional
      
      * add test for truncation
      
      * add copyright
      
      * fix-copies
      
      * fix fast tokenizer decode
      
      * Update src/transformers/models/codegen/tokenization_codegen.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * increase vocab_size in tests
      Co-authored-by: patil-suraj <surajp815@gmail.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
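The "make truncation optional" and "add test for truncation" items in the CodeGen commit concern cutting a generated completion at a stop pattern when decoding. The following is a minimal sketch of that idea in plain Python, not the code from `tokenization_codegen.py`; the function name `truncate_completion` and its `patterns` parameter are hypothetical.

```python
import re
from typing import List, Optional


def truncate_completion(text: str, patterns: Optional[List[str]] = None) -> str:
    """Cut `text` before the earliest match of any regex in `patterns`.

    Hypothetical helper: with `patterns=None` (or an empty list) the text is
    returned unchanged, which is what makes truncation optional.
    """
    if not patterns:
        return text
    cut = len(text)
    for pattern in patterns:
        match = re.search(pattern, text, flags=re.MULTILINE)
        if match:
            # Keep the earliest cut point across all patterns.
            cut = min(cut, match.start())
    return text[:cut]


completion = "    return a + b\n\n\n\n# unrelated trailing text"
print(repr(truncate_completion(completion, [r"^#", r"\n\n\n"])))
print(repr(truncate_completion(completion)))
```

The first call stops at the run of blank lines; the second returns the completion untouched because no patterns were given.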
    • Fix Constrained beam search duplication and weird output issue (#17814) · bc7a6fdc
      NaN authored
      * fix(ConstrainedBeamSearchScorer.step_sentence_constraint): avoid hypothesis duplication between topk and advance
      
      * fix(GenerationMixin.constrained_beam_search): appropriately assign beam scores instead of token scores
    • Improve vision models (#17731) · 09178705
      NielsRogge authored
      
      
      * Improve vision models
      
      * Add a lot of improvements
      
      * Remove to_2tuple from swin tests
      
      * Fix TF Swin
      
      * Fix more tests
      
      * Fix copies
      
      * Improve more models
      
      * Fix ViTMAE test
      
      * Add channel check for TF models
      
      * Add proper channel check for TF models
      
      * Apply suggestion from code review
      
      * Apply suggestions from code review
      
      * Add channel check for Flax models, apply suggestion
      
      * Fix bug
      
      * Add tests for greyscale images
      
      * Add test for interpolation of pos encodings
      Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
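The "channel check" items in the vision-models commit refer to validating that input images carry the number of channels the model was configured for, which matters once greyscale inputs are tested. A minimal sketch of such a check, assuming a channels-first `(batch, channels, height, width)` layout; the function name and error wording are hypothetical and not taken from the commit.

```python
from typing import Sequence


def check_num_channels(pixel_values_shape: Sequence[int], expected_channels: int) -> None:
    """Raise ValueError if a channels-first batch has the wrong channel count."""
    if len(pixel_values_shape) != 4:
        raise ValueError(f"Expected a 4D shape, got {tuple(pixel_values_shape)}")
    num_channels = pixel_values_shape[1]
    if num_channels != expected_channels:
        raise ValueError(
            "Channel dimension of the pixel values does not match the "
            f"configuration: got {num_channels}, expected {expected_channels}."
        )


check_num_channels((2, 3, 224, 224), expected_channels=3)  # RGB batch passes
try:
    check_num_channels((2, 1, 224, 224), expected_channels=3)  # greyscale batch
except ValueError as err:
    print(err)
```

Failing early with an explicit message is usually easier to debug than a shape mismatch raised deep inside the first convolution.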
  5. 23 Jun, 2022 11 commits
  6. 22 Jun, 2022 5 commits
    • Offload fixes (#17810) · df8e6804
      Sylvain Gugger authored
      * Offload fixes
      
      * Add a test
    • CLI: use hub's `create_commit` (#17755) · 0d0c392c
      Joao Gante authored
      * use create_commit
      
      * better commit message and description
      
      * touch setup.py to trigger cache update
      
      * add hub version gating
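The "add hub version gating" item describes using a newer API only when the installed dependency is recent enough. A minimal standard-library sketch of that pattern follows; the threshold `MIN_VERSION`, the helper names, and the returned strings are all hypothetical, since the commit log does not state the actual version or code path.

```python
from typing import Tuple

# Hypothetical threshold, for illustration only.
MIN_VERSION = "1.2.0"


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '0.10.1' into (0, 10, 1). Pre-release suffixes are not handled."""
    return tuple(int(part) for part in version.split("."))


def supports_new_api(installed: str, minimum: str = MIN_VERSION) -> bool:
    """Compare versions numerically, so '1.10.0' counts as newer than '1.2.0'."""
    return parse_version(installed) >= parse_version(minimum)


def choose_upload_path(installed_version: str) -> str:
    if supports_new_api(installed_version):
        return "new API path"
    return "fallback path"


print(choose_upload_path("1.10.0"))
print(choose_upload_path("1.1.9"))
```

Comparing integer tuples avoids the classic string-comparison bug where `"1.10.0" < "1.2.0"`. Production code would more likely rely on `packaging.version.parse`, which also understands pre-release tags.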
    • initial commit (#17818) · 56b83cf0
      Arthur authored
    • Add logits_processor parameter, used by `generate`, to `Seq2SeqTrainer` methods `evaluate` and `predict` (#17805) · 13570381
      Eran Hirsch authored
      
      * Add logits_processor parameter, used by `generate`, to `Seq2SeqTrainer` methods `evaluate` and `predict`
      
      * Add all generate parameters to `Seq2SeqTrainer`, and also to `QuestionAnsweringSeq2SeqTrainer` which overrides it
      
      * Remove `self._num_beams` from trainer classes
      
      * - Run fixup
      - Fix "Constraint" not exposed
      - Fix synced_gpus to actually read from param
      
      * Use kwargs
      
      * Copy kwargs before making changes to it
      
      * Fix style issues unused imports
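The "Copy kwargs before making changes to it" item addresses a common pitfall: filling in defaults on a caller-supplied dict mutates the caller's object. A minimal sketch of the safe pattern, assuming generation options arrive as a plain dict; `prepare_gen_kwargs` and its default parameters are hypothetical names, not the trainer's actual code.

```python
from typing import Any, Dict


def prepare_gen_kwargs(
    gen_kwargs: Dict[str, Any],
    default_max_length: int,
    default_num_beams: int,
) -> Dict[str, Any]:
    """Return generation kwargs with defaults filled in, leaving the input untouched."""
    prepared = dict(gen_kwargs)  # shallow copy, so the caller's dict is not modified
    if prepared.get("max_length") is None:
        prepared["max_length"] = default_max_length
    if prepared.get("num_beams") is None:
        prepared["num_beams"] = default_num_beams
    return prepared


user_kwargs = {"num_beams": 4, "logits_processor": []}
prepared = prepare_gen_kwargs(user_kwargs, default_max_length=20, default_num_beams=1)
print(prepared)
print(user_kwargs)
```

The copy is shallow, so nested objects such as the `logits_processor` list are still shared; that is enough here because only top-level keys are added.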
    • Flax sharded (#17760) · 16c6eb7c
      Arthur authored
  7. 21 Jun, 2022 4 commits