1. 18 Aug, 2022 2 commits
  2. 17 Aug, 2022 2 commits
    • Update feature extractor methods to enable type cast before normalize (#18499) · 49e44b21
      amyeroberts authored
      * Update methods to optionally rescale
      This is necessary to allow for casting our images / videos to numpy arrays within the feature extractors' call. We want to do this to make sure the behaviour is as expected when flags like  are False. If some transformations aren't applied, then the output type can be unexpected, e.g. a list of PIL images instead of numpy arrays.
      
      * Cast images to numpy arrays in call to enable consistent behaviour with different configs
      
      * Remove accidental clip changes
      
      * Update tests to reflect the scaling logic
      We write a generic  function to handle rescaling of our arrays. For the API to be intuitive, we take some factor c and rescale the image values by that. This means the rescaling done in normalize and to_numpy_array is now done with array * (1/255) instead of array / 255. This leads to small differences in the resulting image. When testing, these were on the order of 1e-8, and so were deemed OK.
      49e44b21
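The rescaling note in the commit above (multiplying by 1/255 instead of dividing by 255) can be illustrated with a minimal sketch. The helper name `rescale` and the choice of float32 are assumptions made for illustration, not the code from the pull request; the size of the difference depends on the dtype used.

```python
import numpy as np


def rescale(image: np.ndarray, scale: float) -> np.ndarray:
    """Multiply pixel values by a scale factor (hypothetical helper name)."""
    return image * scale


# Every possible 8-bit pixel value, cast to float32 before scaling.
pixels = np.arange(256, dtype=np.float32)

by_multiplication = rescale(pixels, 1 / 255)  # array * (1/255)
by_division = pixels / 255                    # array / 255

# Largest element-wise gap between the two ways of rescaling.
max_abs_diff = float(np.max(np.abs(by_multiplication - by_division)))
print(max_abs_diff)
```

Any gap comes from floating-point rounding, which is why the tests needed updating even though the two expressions are mathematically equal.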
    • Fix Yolos ONNX export test (#18606) · c99e9846
      Yih-Dar authored
      
      Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      c99e9846
  3. 16 Aug, 2022 1 commit
  4. 12 Aug, 2022 6 commits
    • small change (#18584) · 1ccd2515
      Younes Belkada authored
      1ccd2515
    • Update BLOOM parameter counts (#18531) · 56ef0ba4
      Niklas Muennighoff authored
      * Update BLOOM parameter counts
      
      * Update BLOOM parameter counts
      56ef0ba4
    • Add Donut (#18488) · 2ab790e8
      NielsRogge authored
      
      
      * First draft
      
      * Improve script
      
      * Update script
      
      * Make conversion work
      
      * Add final_layer_norm attribute to Swin's config
      
      * Add DonutProcessor
      
      * Convert more models
      
      * Improve feature extractor and convert base models
      
      * Fix bug
      
      * Improve integration tests
      
      * Improve integration tests and add model to README
      
      * Add doc test
      
      * Add feature extractor to docs
      
      * Fix integration tests
      
      * Remove register_buffer
      
      * Fix toctree and add missing attribute
      
      * Add DonutSwin
      
      * Make conversion script work
      
      * Improve conversion script
      
      * Address comment
      
      * Fix bug
      
      * Fix another bug
      
      * Remove deprecated method from docs
      
      * Make Swin and Swinv2 untouched
      
      * Fix code examples
      
      * Fix processor
      
      * Update model_type to donut-swin
      
      * Add feature extractor tests, add token2json method, improve feature extractor
      
      * Fix failing tests, remove integration test
      
      * Add do_thumbnail for consistency
      
      * Improve code examples
      
      * Add code example for document parsing
      
      * Add DonutSwin to MODEL_NAMES_MAPPING
      
      * Add model to appropriate place in toctree
      
      * Update namespace to appropriate organization
      Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
      2ab790e8
    • Supporting seq2seq models for `bitsandbytes` integration (#18579) · a5ca56ff
      Younes Belkada authored
      * Supporting seq2seq models for `bitsandbytes` integration
      
      - `bitsandbytes` integration now supports seq2seq models
      - check if a model has tied weights as an additional check
      
      * small modification
      
      - tie the weights before looking at tied weights!
      a5ca56ff
    • Generate: validate `model_kwargs` (and catch typos in generate arguments) (#18261) · ed1924e8
      Joao Gante authored
      * validate generate model_kwargs
      
      * generate tests -- not all models have an attn mask
      ed1924e8
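The idea behind validating `model_kwargs` can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation from the pull request: `validate_model_kwargs` and `toy_forward` are hypothetical names, and the check simply compares the supplied keys against the parameters of a forward function.

```python
import inspect


def validate_model_kwargs(forward_fn, model_kwargs):
    """Raise if a keyword is not a parameter of forward_fn (illustrative only)."""
    accepted = set(inspect.signature(forward_fn).parameters)
    unused = [key for key in model_kwargs if key not in accepted]
    if unused:
        raise ValueError(f"Unused model_kwargs: {unused} (check for typos)")


def toy_forward(input_ids, attention_mask=None):
    """Stand-in for a model's forward method."""
    return input_ids


# A correctly spelled argument passes silently.
validate_model_kwargs(toy_forward, {"attention_mask": [1, 1]})

# A misspelled argument is reported instead of being silently ignored.
try:
    validate_model_kwargs(toy_forward, {"attention_maks": [1, 1]})
except ValueError as err:
    error_message = str(err)
    print(error_message)
```

A forward function that accepts `**kwargs` would need extra handling, since any key could then be legitimate.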
    • Load sharded pt to flax (#18419) · bce36ee0
      Arthur authored
      
      
      * initial commit
      
      * add small test
      
      * add cross pt tf flag to test
      
      * fix quality
      
      * style
      
      * update test with new repo
      
      * fix failing test
      
      * update
      
      * fix wrong param ordering
      
      * style
      
      * update based on review
      
      * update related to recent new caching mechanism
      
      * quality
      
      * Update based on review
      Co-authored-by: sgugger <sylvain.gugger@gmail.com>
      
      * quality and style
      
      * Update src/transformers/modeling_flax_utils.py
      Co-authored-by: sgugger <sylvain.gugger@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      bce36ee0
  5. 11 Aug, 2022 2 commits
  6. 10 Aug, 2022 3 commits
    • Adds CLIP to models exportable with ONNX (#18515) · f62cb831
      Dhruv Karan authored
      
      
      * onnx config for clip
      
      * default opset as 14
      
      * changes from the original repo
      
      * input values order fix
      
      * outputs fix
      
      * remove unused import
      
      * ran make fix-copies
      
      * black format
      
      * review comments: forward ref, import fix, model change revert, .to cleanup
      
      * make style
      
      * formatting fixes
      
      * revert groupvit
      
      * comment for cast to int32
      
      * comment fix
      
      * make .T as .t() for onnx conversion
      
      * ran make fix-copies
      
      * remove unneeded comment
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * fix copies
      
      * remove comment
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      f62cb831
    • Use commit hash to look in cache instead of calling head (#18534) · 0d0aada5
      Sylvain Gugger authored
      
      
      * Use commit hash to look in cache instead of calling head
      
      * Add tests
      
      * Add attr for local configs too
      
      * Stupid typos
      
      * Fix tests
      
      * Update src/transformers/utils/hub.py
      Co-authored-by: Julien Chaumond <julien@huggingface.co>
      
      * Address Julien's comments
      Co-authored-by: Julien Chaumond <julien@huggingface.co>
      0d0aada5
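A rough sketch of the idea in the commit above: when the commit hash of a checkpoint is already known, a file can be looked up in a local cache directly instead of asking the server first. The function name `try_cached_file` and the `snapshots/<commit_hash>/<filename>` layout are assumptions made for this illustration, not the code from the pull request.

```python
import tempfile
from pathlib import Path
from typing import Optional


def try_cached_file(cache_dir: Path, commit_hash: Optional[str], filename: str) -> Optional[Path]:
    """Return the cached file for a known commit hash, or None (hypothetical helper)."""
    if commit_hash is None:
        # Without a hash the caller would have to query the server.
        return None
    candidate = cache_dir / "snapshots" / commit_hash / filename
    return candidate if candidate.is_file() else None


with tempfile.TemporaryDirectory() as tmp:
    cache_dir = Path(tmp)
    # Build a tiny fake cache holding one file for one commit.
    snapshot = cache_dir / "snapshots" / "0d0aada5"
    snapshot.mkdir(parents=True)
    (snapshot / "config.json").write_text("{}")

    hit = try_cached_file(cache_dir, "0d0aada5", "config.json")
    hit_name = hit.name if hit is not None else None
    miss = try_cached_file(cache_dir, "0d0aada5", "missing.json")
    unknown = try_cached_file(cache_dir, None, "config.json")
```

Only the `miss` and `unknown` cases would need a network round trip in this sketch.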
    • `bitsandbytes` - `Linear8bitLt` integration into `transformers` models (#17901) · 4a51075a
      Younes Belkada authored
      
      
      * first commit
      
      * correct replace function
      
      * add final changes
      
      - works like a charm!
      - cannot implement tests yet
      - tested
      
      * clean up a bit
      
      * add bitsandbytes dependencies
      
      * working version
      
      - added import function
      - added bitsandbytes utils file
      
      * small fix
      
      * small fix
      
      - fix import issue
      
      * fix import issues
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * refactor a bit
      
      - move bitsandbytes utils to utils
      - change comments on functions
      
      * reformat docstring
      
      - reformat docstring on init_empty_weights_8bit
      
      * Update src/transformers/__init__.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * revert bad formatting
      
      * change to bitsandbytes
      
      * refactor a bit
      
      - remove init8bit since it is useless
      
      * more refactoring
      
      - fixed init empty weights issue
      - added threshold param
      
      * small hack to make it work
      
      * Update src/transformers/modeling_utils.py
      
      * Update src/transformers/modeling_utils.py
      
      * remove the small hack
      
      * modify utils file
      
      * make style + refactor a bit
      
      * create correctly device map
      
      * add correct dtype for device map creation
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * apply suggestions
      
      - remove with torch.grad
      - do not rely on Python bool magic!
      
      * add docstring
      
       - add docstring for new kwargs
      
      * add docstring
      
      - comment `replace_8bit_linear` function
      - fix weird formatting
      
      * - added more documentation
      - added new utility function for memory footprint tracking
      - colab demo to add
      
      * few modifs
      
      - typo doc
      - force cast into float16 when load_in_8bit is enabled
      
      * added colab link
      
      * add test architecture + docstring a bit
      
      * refactor a bit testing class
      
      * make style + refactor a bit
      
      * enhance checks
      
      - add more checks
      - start writing saving test
      
      * clean up a bit
      
      * make style
      
      * add more details on doc
      
      * add more tests
      
      - still needs to fix 2 tests
      
      * replace by "or"
      
      - could not fix it from GitHub GUI
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * refactor a bit testing code + add readme
      
      * make style
      
      * fix import issue
      
      * Update src/transformers/modeling_utils.py
      Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
      
      * add few comments
      
      * add more docstring + make style
      
      * more docstring
      
      * raise error when loaded in 8bit
      
      * make style
      
      * add warning if loaded on CPU
      
      * add small sanity check
      
      * fix small comment
      
      * add bitsandbytes on dockerfile
      
      * Improve documentation
      
      - improve documentation from comments
      
      * add few comments
      
      * slow tests pass on the VM but not on the CI VM
      
      * Fix merge conflict
      
      * make style
      
      * another test should pass on a multi gpu setup
      
      * fix bad import in testing file
      
      * Fix slow tests
      
      - remove dummy batches
      - no more CUDA illegal memory errors
      
      * modify dockerfile
      
      * Update docs/source/en/main_classes/model.mdx
      
      * Update Dockerfile
      
      * Update model.mdx
      
      * Update Dockerfile
      
      * Apply suggestions from code review
      
      * few modifications
      
      - lm head can stay on disk/cpu
      - change model name so that test pass
      
      * change test value
      
      - change test value to the correct output
      - torch bmm changed to baddmm in bloom modeling when merging
      
      * modify installation guidelines
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * replace `n` by `name`
      
      * merge `load_in_8bit` and `low_cpu_mem_usage`
      
      * first try - keep the lm head in full precision
      
      * better check
      
      - check the attribute `base_model_prefix` instead of computing the number of parameters
      
      * added more tests
      
      * Update src/transformers/utils/bitsandbytes.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Merge branch 'integration-8bit' of https://github.com/younesbelkada/transformers into integration-8bit
      
      * improve documentation
      
      - fix typos for installation
      - change title in the documentation
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
      4a51075a
  7. 09 Aug, 2022 3 commits
  8. 08 Aug, 2022 2 commits
    • Clean up hub (#18497) · 377cdded
      Sylvain Gugger authored
      * Clean up utils.hub
      
      * Remove imports
      
      * More fixes
      
      * Last fix
      377cdded
    • [DX fix] Fixing QA pipeline streaming a dataset. (#18516) · a4562552
      Nicolas Patry authored
      * [DX fix] Fixing QA pipeline streaming a dataset.
      
      QuestionAnsweringArgumentHandler would iterate over the whole dataset
      effectively killing all properties of the pipeline.
      This restores nice properties when using `Dataset` or `Generator` since
      those are meant to be consumed lazily.
      
      * Handling TF better.
      a4562552
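The difference between iterating over the whole dataset up front and consuming it lazily, as described in the commit above, can be sketched like this. The names `normalize_eager`, `normalize_lazy`, and `make_stream` are hypothetical and only illustrate the general pattern, not the pipeline's actual argument handler.

```python
from typing import Iterable, Iterator, List


def normalize_eager(examples: Iterable[dict]) -> List[dict]:
    """Materializes every example before returning anything."""
    return [dict(example) for example in examples]


def normalize_lazy(examples: Iterable[dict]) -> Iterator[dict]:
    """Yields examples one at a time, so the source is consumed on demand."""
    for example in examples:
        yield dict(example)


def make_stream(log: list) -> Iterator[dict]:
    """Toy streaming dataset that records how many items were actually read."""
    for i in range(1000):
        log.append(i)
        yield {"question": f"q{i}", "context": "c"}


lazy_log: list = []
first = next(normalize_lazy(make_stream(lazy_log)))

eager_log: list = []
all_items = normalize_eager(make_stream(eager_log))

print(len(lazy_log), len(eager_log))
```

With the lazy version only the items actually requested are read from the source, which is what keeps streaming datasets and generators usable.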
  9. 06 Aug, 2022 1 commit
  10. 05 Aug, 2022 6 commits
  11. 04 Aug, 2022 3 commits
    • Add VideoMAE (#17821) · f9a0008d
      NielsRogge authored
      
      
      * First draft
      
      * Add VideoMAEForVideoClassification
      
      * Improve conversion script
      
      * Add VideoMAEForPreTraining
      
      * Add VideoMAEFeatureExtractor
      
      * Improve VideoMAEFeatureExtractor
      
      * Improve docs
      
      * Add first draft of model tests
      
      * Improve VideoMAEForPreTraining
      
      * Fix base_model_prefix
      
      * Make model take pixel_values of shape (B, T, C, H, W)
      
      * Add loss computation of VideoMAEForPreTraining
      
      * Improve tests
      
      * Improve model tests
      
      * Make all tests pass
      
      * Add VideoMAE to main README
      
      * Add tests for VideoMAEFeatureExtractor
      
      * Add integration test
      
      * Improve conversion script
      
      * Rename patch embedding class
      
      * Remove VideoMAELayer from init
      
      * Update design of patch embeddings
      
      * Improve comments
      
      * Improve conversion script
      
      * Improve conversion script
      
      * Add conversion of pretrained model
      
      * Add loss verification of pretrained model
      
      * Add loss verification of unnormalized targets
      
      * Add integration test for pretraining model
      
      * Apply suggestions from code review
      
      * Fix bug to make feature extractor resize only shorter edge
      
      * Address more comments
      
      * Improve normalization of videos
      
      * Add doc examples
      
      * Move constants to dedicated script
      
      * Remove scripts
      
      * Transfer checkpoints, fix docs
      
      * Update script
      
      * Update image mean and std
      
      * Fix doc tests
      
      * Set return_tensors to NumPy by default
      
      * Revert the previous change
      Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
      f9a0008d
    • change shape to support dynamic batch input in tf.function XLA generate for tf serving (#18372) · fc1d841b
      nlpcat authored
      
      
      * change shape to support dynamic batch input in tf.generate
      
      * add tests
      Co-authored-by: nlpcatcode <nlpcodecat@gmail.com>
      fc1d841b
  12. 03 Aug, 2022 3 commits
  13. 02 Aug, 2022 1 commit
    • Update pipeline word heuristic to work with whitespace in token offsets (#18402) · 042f4203
      David authored
      * Update pipeline word heuristic to work with whitespace in token offsets
      
      This change checks for whitespace in the input string at either the
      character preceding the token or in the first character of the token.
      This works with tokenizers that return offsets excluding whitespace
      between words or with offsets including whitespace.
      
      fixes #18111
      
      starting
      
      * Use smaller model, ensure expected tokenization
      
      * Re-run CI (please squash)
      042f4203
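The whitespace check described in the commit above can be sketched as a small function. `starts_new_word` is a hypothetical name and this is a simplified reading of the commit message, not the pipeline's actual code: a token is treated as starting a new word if there is whitespace either just before its start offset or at its first character.

```python
def starts_new_word(sentence: str, start: int) -> bool:
    """True if the token beginning at `start` opens a new word (illustrative only)."""
    if start == 0:
        return True
    # Whitespace just before the token (offsets exclude the space), or
    # whitespace as the token's first character (offsets include the space).
    return sentence[start - 1].isspace() or sentence[start].isspace()


sentence = "Hello world"

# Offsets that exclude whitespace: "world" spans (6, 11).
print(starts_new_word(sentence, 6))
# Offsets that include whitespace: " world" spans (5, 11).
print(starts_new_word(sentence, 5))
# A subword continuation: "lo" spans (3, 5) inside "Hello".
print(starts_new_word(sentence, 3))
```

Checking both positions is what lets the same heuristic work for tokenizers with either offset convention.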
  14. 01 Aug, 2022 2 commits
  15. 29 Jul, 2022 2 commits
  16. 27 Jul, 2022 1 commit
    • Add swin transformer v2 (#17469) · e87ac9d1
      Ritik Nandwal authored
      
      
      * Add files generated using transformer-cli add-new-model-like command
      
      * Add changes for swinv2 attention and forward method
      
      * Add fixes
      
      * Add modifications for weight conversion and remaining args in swin model
      
      * Add changes for patchmerging
      
      * Add changes for SwinV2selfattention
      
      * Update conversion script
      
      * Add final fixes for the swin_v2 model
      
      * Add changes for conversion script for pretrained window size case
      
      * Add pretrained window size value from config in SwinV2Encoder class
      
      * Make fixup
      
      * Add swinv2 to models_not_in_readme to utils/check_copies.py
      
      * Modify Swinv2v2 to Swin Transformer V2
      
      * Remove copied from, to run make fixup command
      
      * Add updates to swinv2tf from main branch
      
      * Add pretrained_window_size to config, to make tests pass
      
      * Add modified weights from nandwalritik profile for swinv2
      
      * Update model weights from swinv2 from nandwalritik profile
      
      * Add fix for build_pr_documentation CI fix
      
      * Add fixes for weight conversion
      
      * Add change to make input with padding work
      
      * Add fixes for test cases
      
      * Add few changes from swin to swinv2 to pass test cases
      
      * Remove tests for tensorflow as swinv2 for TF is not added yet
      
      * Override test_pt_tf_model_equivalence function as TF implementation for swinv2 is not added yet
      
      * Add modeling_tf_swinv2 to _ignore_modules as test file is removed for this one right now.
      
      * Update docs url for swinv2 in README.md
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Undo changes for check_repo
      
      * Update url in readme.md
      
      * Remove overridden function to test pt_tf_model_equivalence
      
      * Remove TF model imports for Swinv2 as it's not implemented in this PR
      
      * Add changes for index.mdx
      
      * Add swinv2 paper's link, abstract and contributor details
      
      * Rename cpb_mlp to continuous_position_bias_mlp
      
      * Add tips for swinv2 model
      
      * Update src/transformers/models/swinv2/configuration_swinv2.py
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/models/swinv2/configuration_swinv2.py
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Fix indentation for docstring example in src/transformers/models/swinv2/configuration_swinv2.py
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update import order in src/transformers/models/swinv2/configuration_swinv2.py
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Add copyright statements in weights conversion script.
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Remove Swinv2 from models_not_in_readme
      
      * Reformat code
      
      * Remove TF implementation file for swinv2
      
      * Update start docstring.
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Add changes for docstring
      
      * Update orgname for weights to microsoft
      
      * Remove to_2tuple function
      
      * Add copied from statements wherever applicable
      
      * Add copied from to Swinv2ForMaskedImageModelling class
      
      * Reformat code.
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Add unittest.skip(with reason.) for test_inputs_embeds test case.
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Add updates for test_modeling_swinv2.py
      
      * Add @unittest.skip() annotation for clarity to create_and_test_config_common_properties function
      
      * Add continuous_position_bias_mlp parameter to conversion script
      
      * Add test for testing masked_image_modelling for swinv2
      
      * Update Swinv2 to Swin Transformer v2 in docs/source/en/model_doc/swinv2.mdx
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update Swinv2 to Swin Transformer v2 in docs/source/en/model_doc/swinv2.mdx
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update docs/source/en/model_doc/swinv2.mdx
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update docs/source/en/model_doc/swinv2.mdx
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Add suggested changes
      
      * Add copied from to forward methods of Swinv2Stage and Swinv2Encoder
      
      * Add push_to_hub flag to weight conversion script
      
      * Change order of Swinv2DropPath class
      
      * Add id2label mapping for imagenet 21k
      
      * Add updated url for SwinV2 functions and classes used in implementation
      
      * Update input_feature dimensions format, mentioned in comments.
      Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
      
      * Add suggested changes for modeling_swin2.py
      
      * Update docs
      
      * Remove create_and_test_config_common_properties function, as test_model_common_attributes is sufficient.
      
      * Fix indentation.
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Add changes for making Nit objects in code style
      
      * Add suggested changes
      
      * Add suggested changes for test_modelling_swinv2
      
      * make fix-copies
      
      * Update docs/source/en/model_doc/swinv2.mdx
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      e87ac9d1