1. 15 Sep, 2021 1 commit
  2. 14 Sep, 2021 2 commits
    • [Flax] Addition of FlaxPegasus (#13420) · c1e47bf4
      Bhadresh Savani authored
      
      
      * added initial files
      
      * fixes pipeline
      
      * fixes style and quality
      
      * fixes doc issue and positional encoding
      
      * fixes layer norm and test
      
      * fixes quality issue
      
      * fixes code quality
      
      * removed extra layer norm
      
      * added layer norm back in encoder and decoder
      
      * added more code copy quality checks
      
      * update tests
      
      * Apply suggestions from code review
      
      * fix import
      
      * fix test
      Co-authored-by: patil-suraj <surajp815@gmail.com>
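A minimal usage sketch for the new Flax Pegasus class; the checkpoint name, input text, and generation call are illustrative assumptions, not taken from the PR:

```python
# Hedged sketch: summarization with the newly added FlaxPegasus classes.
# "google/pegasus-xsum" is an assumed checkpoint choice for illustration.
from transformers import FlaxPegasusForConditionalGeneration, PegasusTokenizer

tokenizer = PegasusTokenizer.from_pretrained("google/pegasus-xsum")
model = FlaxPegasusForConditionalGeneration.from_pretrained("google/pegasus-xsum")

inputs = tokenizer("The tower is 324 metres tall, about the same height as an 81-storey building.", return_tensors="np")
# Flax generate() returns an output object whose `sequences` field holds the ids
summary_ids = model.generate(inputs["input_ids"]).sequences
print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True))
```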
    • Push to hub when saving checkpoints (#13503) · 3081d386
      Sylvain Gugger authored
      * Push to hub when saving checkpoints
      
      * Add model card
      
      * Revert partial model card
      
      * Small fix for checkpoint
      
      * Add tests
      
      * Add documentation
      
      * Fix tests
      
      * Bump huggingface_hub
      
      * Fix test
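A minimal sketch of the training-arguments side of this change, assuming a hypothetical output directory; with `push_to_hub=True`, checkpoints saved during training are now also pushed to the Hub:

```python
# Hedged sketch: "my-finetuned-model" is an illustrative repo/directory name.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="my-finetuned-model",  # also used to name the Hub repo
    save_strategy="epoch",            # a checkpoint is saved every epoch...
    push_to_hub=True,                 # ...and, after this PR, pushed as well
)
```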
  3. 13 Sep, 2021 3 commits
  4. 10 Sep, 2021 3 commits
    • [GPT-Neo] Simplify local attention (#13491) · 010965dc
      Suraj Patil authored
      * simplify local attention
      
      * update tests
      
      * add a comment and use torch.bitwise_xor
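The `torch.bitwise_xor` comment can be illustrated with a standalone toy: XOR-ing a full causal mask with the same mask shifted down by the window size leaves a banded, local causal mask. This is a reconstruction of the idea, not the PR's exact code:

```python
import torch

max_positions, window_size = 8, 3
full = torch.tril(torch.ones(max_positions, max_positions, dtype=torch.uint8))
# XOR removes everything further than `window_size - 1` steps in the past,
# so each query attends to at most `window_size` keys (itself included).
local = torch.bitwise_xor(full, torch.tril(full, -window_size))
print(local)
```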
    • [Wav2Vec2] Fix normalization for non-padded tensors (#13512) · d7b3b709
      Patrick von Platen authored
      * finalize
      
      * Apply suggestions from code review
      
      * finish cleaner implementation
      
      * more tests
      
      * small fix
      
      * finish
      
      * up
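For context, a rough reconstruction of the normalization at issue: zero-mean/unit-variance statistics have to be computed per utterance over real samples only, never over padding. The helper and the epsilon value are illustrative:

```python
import numpy as np

def zero_mean_unit_var(x: np.ndarray) -> np.ndarray:
    # per-utterance statistics; padding must not contribute to mean/var
    return (x - x.mean()) / np.sqrt(x.var() + 1e-7)

utterances = [np.random.randn(16000), np.random.randn(12000)]  # unpadded
normalized = [zero_mean_unit_var(u) for u in utterances]
```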
    • [Large PR] Entire rework of pipelines. (#13308) · c63fcabf
      Nicolas Patry authored
      
      
      * Enabling dataset iteration on pipelines.
      
      Enabling dataset iteration on pipelines.
      
      Unifying parameters under `set_parameters` function.
      
      Small fix.
      
      Last fixes after rebase
      
      Remove print.
      
      Fixing text2text `generate_kwargs`
      
      No more `self.max_length`.
      
      Fixing TF-only conversational.
      
      Consistency in start/stop index over TF/PT.
      
      Speeding up drastically on TF (nasty bug where max_length would increase a ton).
      
      Adding test for support for non-fast tokenizers.
      
      Fixing GPU usage on zero-shot.
      
      Fix working on TF.
      
      Update src/transformers/pipelines/base.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      Update src/transformers/pipelines/base.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      Small cleanup.
      
      Remove all asserts + simple format.
      
      * Fixing audio-classification for large PR.
      
      * Overly explicit null checking.
      
      * Encapsulating GPU/CPU pytorch manipulation directly within `base.py`.
      
      * Removed internal state for parameters of the pipeline.
      
      Instead of implicitly overriding internal state, we moved
      to real named arguments on every `preprocess`, `_forward`,
      `postprocess` function.
      
      Instead, `_sanitize_parameters` will be used to split all kwargs
      of both `__init__` and `__call__` into the 3 kinds of named parameters.
      
      * Move import warnings.
      
      * Small fixes.
      
      * Quality.
      
      * Another small fix, using the CI to debug faster.
      
      * Last fixes.
      
      * Last fix.
      
      * Small cleanup of tensor moving.
      
      * is not None.
      
      * Adding a bunch of docs + an iteration test.
      
      * Fixing doc style.
      
      * KeyDataset = None guard.
      
      * Removing the CUDA test for pipelines (was testing).
      
      * Even more simple iteration test.
      
      * Correct import.
      
      * Long day.
      
      * Fixes in docs.
      
      * [WIP] migrating object detection.
      
      * Fixed the target_size bug.
      
      * Fixup.
      
      * Bad variable name.
      
      * Fixing `ensure_on_device` so it respects the original ModelOutput.
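The `_sanitize_parameters` design can be summarized with a toy custom pipeline; the three returned dicts feed `preprocess`, `_forward`, and `postprocess` respectively. Every name below is illustrative, not a shipped task:

```python
from transformers import Pipeline

class MyPipeline(Pipeline):
    def _sanitize_parameters(self, **kwargs):
        # split __init__/__call__ kwargs into the 3 kinds of named parameters
        preprocess_kwargs = {}
        if "max_length" in kwargs:
            preprocess_kwargs["max_length"] = kwargs["max_length"]
        return preprocess_kwargs, {}, {}

    def preprocess(self, inputs, max_length=128):
        return self.tokenizer(inputs, truncation=True, max_length=max_length,
                              return_tensors=self.framework)

    def _forward(self, model_inputs):
        return self.model(**model_inputs)

    def postprocess(self, model_outputs):
        return model_outputs.logits.argmax(-1).item()
```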
  5. 09 Sep, 2021 4 commits
  6. 08 Sep, 2021 4 commits
  7. 07 Sep, 2021 2 commits
  8. 06 Sep, 2021 5 commits
  9. 02 Sep, 2021 5 commits
    • Add PyTorch image classification example (#13134) · 76c4d8bf
      Nathan Raw authored
      * add pytorch image classification example
      
      * 🔥 remove utils.py
      
      * 💄 fix flake8 style issues
      
      * 🔥 remove unnecessary line
      
      * limit dataset sizes
      
      * 📌 update reqs
      
      * 🎨 restructure - use datasets lib
      
      * 🎨 import transforms directly
      
      * 📝 add comments
      
      * 💄 style
      
      * 🔥 remove flag
      
      * 📌 update requirement warning
      
      * 📝 add vision README.md
      
      * 📝 update README.md
      
      * 📝 update README.md
      
      * 🎨 add image-classification tag to model card
      
      * 🚚 rename vision → image-classification
      
      * 📝 update image-classification README.md
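A hedged sketch of the "use datasets lib" restructuring: torchvision transforms applied lazily to a Hugging Face dataset. The dataset name and column layout are illustrative assumptions:

```python
from datasets import load_dataset
from torchvision.transforms import Compose, Resize, ToTensor

dataset = load_dataset("beans", split="train")  # assumed example dataset
transform = Compose([Resize((224, 224)), ToTensor()])

def preprocess(examples):
    # assumes a PIL-image "image" column
    examples["pixel_values"] = [transform(img.convert("RGB")) for img in examples["image"]]
    return examples

dataset = dataset.with_transform(preprocess)  # applied on the fly, per access
```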
    • up (#13396) · 9bd5d97c
      Patrick von Platen authored
    • fix (#13395) · efa4f5f0
      Patrick von Platen authored
    • Correct order of overflowing_tokens for slow tokenizer (#13179) · b91e65af
      Apoorv Garg authored
      * correct order of overflowing_tokens for slow tokenizer (issue fix #13148)
      
      * python 3.9 requires sentencepiece version 0.1.94 or above
      
      * slicing of ids fixed in truncate_sequences()
      
      * Update setup.py
      
      * Correct order of overflowing tokens for pair of sentences
      
      * code reformatted
      
      * Update tokenization_utils_base.py
      
      * reformatting file
      
      * test to check single_input added
      
      * missing function restored
      
      * test to check pair_input overflowing tokens order
      
      * test to check pair_input overflowing tokens order
      
      * test to check pair_input overflowing tokens order
      
      * added an error message for pair of seq and longest_first strategy
      
      * test for pair_input modified
      
      * variable name corrected
      
      * fixed a typo in error message
      
      * requested changes implemented
      
      * required test added
      
      * Corrected the message to match test message
      
      * added error message for Luke Tokenizer
      
      * lost test recovered
      
      * docstring for truncate_sequences and prepare_for_model updated
      
      * docstring for luke tokenizer updated
      
      * updated ENCODE_PLUS_ADDITIONAL_KWARGS_DOCSTRING
      
      * aligned text and fixed punctuation
      
      * improved style and quality of code
      
      * fixed error_msg in truncate_sequences
      
      * replaced encode_plus method with regular call method
      
      * clean up
      
      * rephrased the docstring
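The behaviour under test, as a small illustration: with a slow (pure-Python) tokenizer, the ids cut off by truncation should come back in their original reading order. The checkpoint and lengths are illustrative:

```python
from transformers import BertTokenizer  # slow tokenizer on purpose

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
enc = tokenizer(
    "a sentence that is clearly too long to fit into eight tokens",
    truncation=True,
    max_length=8,
    return_overflowing_tokens=True,
)
print(enc["overflowing_tokens"])  # truncated tail, now in reading order
```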
    • Enabling automatic loading of tokenizer with `pipeline` for `audio-classification` (#13376) · c9184a2e
      Nicolas Patry authored
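What the change enables, as a minimal sketch; the checkpoint name and audio file path are illustrative assumptions:

```python
from transformers import pipeline

# the tokenizer / feature extractor is now resolved automatically from the model
classifier = pipeline("audio-classification", model="superb/wav2vec2-base-superb-ks")
print(classifier("speech_sample.wav"))  # path to a local audio file
```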
  10. 01 Sep, 2021 8 commits
  11. 31 Aug, 2021 3 commits
    • GPT-J-6B (#13022) · c02cd95c
      Stella Biderman authored
      
      
      * Test GPTJ implementation
      
      * Fixed conflicts
      
      * Update __init__.py
      
      * Update __init__.py
      
      * change GPT_J to GPTJ
      
      * fix missing imports and typos
      
      * use einops for now
      (need to change to torch ops later)
      
      * Use torch ops instead of einsum
      
      * remove einops deps
      
      * Update configuration_auto.py
      
      * Added GPT J
      
      * Update gptj.rst
      
      * Update __init__.py
      
      * Update test_modeling_gptj.py
      
      * Added GPT J
      
      * Changed configs to match GPT2 instead of GPT Neo
      
      * Removed non-existent sequence model
      
      * Update configuration_auto.py
      
      * Update configuration_auto.py
      
      * Update configuration_auto.py
      
      * Update modeling_gptj.py
      
      * Update modeling_gptj.py
      
      * Progress on updating configs to agree with GPT2
      
      * Update modeling_gptj.py
      
      * num_layers -> n_layer
      
      * layer_norm_eps -> layer_norm_epsilon
      
      * attention_layers -> num_hidden_layers
      
      * Update modeling_gptj.py
      
      * attention_pdrop -> attn_pdrop
      
      * hidden_act -> activation_function
      
      * Update configuration_gptj.py
      
      * Update configuration_gptj.py
      
      * Update configuration_gptj.py
      
      * Update configuration_gptj.py
      
      * Update configuration_gptj.py
      
      * Update modeling_gptj.py
      
      * Update modeling_gptj.py
      
      * Update modeling_gptj.py
      
      * Update modeling_gptj.py
      
      * Update modeling_gptj.py
      
      * Update modeling_gptj.py
      
      * fix layernorm and lm_head size
      delete attn_type
      
      * Update docs/source/model_doc/gptj.rst
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * removed claim that GPT J uses local attention
      
      * Removed GPTJForSequenceClassification
      
      * Update src/transformers/models/gptj/configuration_gptj.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Removed unsupported boilerplate
      
      * Update tests/test_modeling_gptj.py
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * Update src/transformers/models/gptj/modeling_gptj.py
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * Update src/transformers/models/gptj/modeling_gptj.py
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * Update src/transformers/models/gptj/modeling_gptj.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Update tests/test_modeling_gptj.py
      Co-authored-by: Eric Hallahan <eric@hallahans.name>
      
      * Update tests/test_modeling_gptj.py
      Co-authored-by: Eric Hallahan <eric@hallahans.name>
      
      * Update tests/test_modeling_gptj.py
      Co-authored-by: Eric Hallahan <eric@hallahans.name>
      
      * Update src/transformers/models/gptj/modeling_gptj.py
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * Update __init__.py
      
      * Update configuration_gptj.py
      
      * Update modeling_gptj.py
      
      * Corrected indentation
      
      * Remove stray backslash
      
      * Delete .DS_Store
      
      * Delete .DS_Store
      
      * Delete .DS_Store
      
      * Delete .DS_Store
      
      * Delete .DS_Store
      
      * Update docs to match
      
      * Remove tf loading
      
      * Remove config.jax
      
      * Remove stray `else:` statement
      
      * Remove references to `load_tf_weights_in_gptj`
      
      * Adapt tests to match output from GPT-J 6B
      
      * Apply suggestions from code review
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * Default `activation_function` to `gelu_new`
      
      - Specify the approximate formulation of GELU to ensure parity with the default setting of `jax.nn.gelu()`
      
      * Fix part of the config documentation
      
      * Revert "Update configuration_auto.py"
      
      This reverts commit e9860e9c043b6ebf57a0e705044e9ec9ba2263bb.
      
      * Revert "Update configuration_auto.py"
      
      This reverts commit cfaaae4c4dc70f1fbe9abd60fc8bd0b863b8c011.
      
      * Revert "Update configuration_auto.py"
      
      This reverts commit 687788954fd0cfbc567fa1202d56a4ff9271944f.
      
      * Revert "Update configuration_auto.py"
      
      This reverts commit 194d024ea87d4fcef0dcb08e57f52c47511a9fc6.
      
      * Hyphenate GPT-J
      
      * Undid sorting of the models alphabetically
      
      * Reverting previous commit
      
      * fix style and quality issues
      
      * Update docs/source/model_doc/gptj.rst
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/__init__.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update tests/test_modeling_gptj.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/gptj/modeling_gptj.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/__init__.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/gptj/modeling_gptj.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/gptj/modeling_gptj.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/gptj/configuration_gptj.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/gptj/configuration_gptj.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/gptj/configuration_gptj.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/gptj/modeling_gptj.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/gptj/modeling_gptj.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/gptj/modeling_gptj.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/gptj/modeling_gptj.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/gptj/modeling_gptj.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Replaced GPTJ-specific code with generic code
      
      * Update src/transformers/models/gptj/modeling_gptj.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Made the code always use rotary positional encodings
      
      * Update index.rst
      
      * Fix documentation
      
      * Combine attention classes
      
      - Condense all attention operations into `GPTJAttention`
      - Replicate GPT-2 and improve code clarity by renaming `GPTJAttention.attn_pdrop` and `GPTJAttention.resid_pdrop` to `GPTJAttention.attn_dropout` and `GPTJAttention.resid_dropout`
      
      * Removed `config.rotary_dim` from tests
      
      * Update test_modeling_gptj.py
      
      * Update test_modeling_gptj.py
      
      * Fix formatting
      
      * Removed deprecated argument `layer_id` to `GPTJAttention`
      
      * Update modeling_gptj.py
      
      * Update modeling_gptj.py
      
      * Fix code quality
      
      * Restore model functionality
      
      * Save `lm_head.weight` in checkpoints
      
      * Fix crashes when loading with reduced precision
      
      * refactor `self._attn(...)` and rename layer weights
      
      * make sure logits are in fp32 for sampling
      
      * improve docs
      
      * Add `GPTJForCausalLM` to `TextGenerationPipeline` whitelist
      
      * Added GPT-J to the README
      
      * Fix doc/readme consistency
      
      * Add rough parallelization support
      
      - Remove unused imports and variables
      - Clean up docstrings
      - Port experimental parallelization code from GPT-2 into GPT-J
      
      * Clean up loose ends
      
      * Fix index.rst
      Co-authored-by: kurumuz <kurumuz1@gmail.com>
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      Co-authored-by: Eric Hallahan <eric@hallahans.name>
      Co-authored-by: Leo Gao <54557097+leogao2@users.noreply.github.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: your_github_username <your_github_email>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
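A hedged usage sketch for the newly added model; the dtype, device, and generation settings are illustrative choices, not prescribed by the PR:

```python
import torch
from transformers import AutoTokenizer, GPTJForCausalLM

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = GPTJForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B", torch_dtype=torch.float16
).to("cuda")

inputs = tokenizer("The meaning of life is", return_tensors="pt").to("cuda")
# per the commit message, logits are kept in fp32 for sampling
output = model.generate(**inputs, do_sample=True, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```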
    • TF/Numpy variants for all DataCollator classes (#13105) · 854260ca
      Matt authored
      
      
      * Adding a TF variant of the DataCollatorForTokenClassification to get feedback
      
      * Added a Numpy variant and a post_init check to fail early if a missing import is found
      
      * Fixed call to Numpy variant
      
      * Added a couple more of the collators
      
      * Update src/transformers/data/data_collator.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Fixes, style pass, finished DataCollatorForSeq2Seq
      
      * Added all the LanguageModeling DataCollators, except SOP and PermutationLanguageModeling
      
      * Adding DataCollatorForPermutationLanguageModeling
      
      * Style pass
      
      * Add missing `__call__` for PLM
      
      * Remove `post_init` checks for frameworks because the imports inside them were making us fail code quality checks
      
      * Remove unused imports
      
      * First attempt at some TF tests
      
      * A second attempt to make any of those tests actually work
      
      * TF tests, round three
      
      * TF tests, round four
      
      * TF tests, round five
      
      * TF tests, all enabled!
      
      * Style pass
      
      * Merging tests into `test_data_collator.py`
      
      * Merging tests into `test_data_collator.py`
      
      * Fixing up test imports
      
      * Fixing up test imports
      
      * Trying shuffling the conditionals around
      
      * Commenting out non-functional old tests
      
      * Completed all tests for all three frameworks
      
      * Style pass
      
      * Fixed test typo
      
      * Style pass
      
      * Move standard `__call__` method to mixin
      
      * Rearranged imports for `test_data_collator`
      
      * Fix data collator typo "torch" -> "pt"
      
      * Fixed the most embarrassingly obvious bug
      
      * Update src/transformers/data/data_collator.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Renaming mixin
      
      * Updating docs
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Dalton Walker <dalton_walker@icloud.com>
      Co-authored-by: Andrew Romans <andrew.romans@hotmail.com>
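A short sketch of the resulting multi-framework API: the same collator can emit PyTorch, TensorFlow, or NumPy batches via `return_tensors`. The checkpoint name is an illustrative assumption:

```python
from transformers import AutoTokenizer, DataCollatorWithPadding

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
features = [tokenizer("short"), tokenizer("a somewhat longer example sentence")]

collator = DataCollatorWithPadding(tokenizer, return_tensors="np")  # or "pt" / "tf"
batch = collator(features)
print(batch["input_ids"].shape)  # padded to the longest sequence in the batch
```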
    • Clean up test file · 74b3344f
      Sylvain Gugger authored