1. 10 Aug, 2022 5 commits
    • Use commit hash to look in cache instead of calling head (#18534) · 0d0aada5
      Sylvain Gugger authored
      
      
      * Use commit hash to look in cache instead of calling head
      
      * Add tests
      
      * Add attr for local configs too
      
      * Stupid typos
      
      * Fix tests
      
      * Update src/transformers/utils/hub.py
      Co-authored-by: Julien Chaumond <julien@huggingface.co>
      
      * Address Julien's comments
      Co-authored-by: Julien Chaumond <julien@huggingface.co>
      0d0aada5
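      A hedged sketch of the behaviour this change concerns, assuming a transformers version that includes it; the commit SHA below is a placeholder, not a real revision:

      from transformers import AutoConfig

      # Placeholder commit SHA, purely for illustration; use a real revision from the model repo.
      COMMIT_SHA = "0123456789abcdef0123456789abcdef01234567"

      # With this change, from_pretrained calls that pin `revision` to a commit hash can be
      # resolved from the local cache by that hash instead of issuing a HEAD call to the Hub.
      config = AutoConfig.from_pretrained("bert-base-uncased", revision=COMMIT_SHA)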
    • TF Examples Rewrite (#18451) · 6eb51450
      Matt authored
      
      
      * Finished QA example
      
      * Dodge a merge conflict
      
      * Update text classification and LM examples
      
      * Update NER example
      
      * New Keras metrics WIP, fix NER example
      
      * Update NER example
      
      * Update MC, summarization and translation examples
      
      * Add XLA warnings when shapes are variable
      
      * Make sure batch_size is consistently scaled by num_replicas
      
      * Add PushToHubCallback to all models
      
      * Add docs links for KerasMetricCallback
      
      * Add docs links for prepare_tf_dataset and jit_compile
      
      * Correct inferred model names
      
      * Don't assume the dataset has 'lang'
      
      * Don't assume the dataset has 'lang'
      
      * Write metrics in text classification
      
      * Add 'framework' to TrainingArguments and TFTrainingArguments
      
      * Export metrics in all examples and add tests
      
      * Fix training args for Flax
      
      * Update command line args for translation test
      
      * make fixup
      
      * Fix accidentally running other tests in fp16
      
      * Remove do_train/do_eval from run_clm.py
      
      * Remove do_train/do_eval from run_mlm.py
      
      * Add tensorflow tests to circleci
      
      * Fix circleci
      
      * Update examples/tensorflow/language-modeling/run_mlm.py
      Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
      
      * Update examples/tensorflow/test_tensorflow_examples.py
      Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
      
      * Update examples/tensorflow/translation/run_translation.py
      Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
      
      * Update examples/tensorflow/token-classification/run_ner.py
      Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
      
      * Fix save path for tests
      
      * Fix some model card kwargs
      
      * Explain the magical -1000
      
      * Actually enable tests this time
      
      * Skip text classification PR until we fix shape inference
      
      * make fixup
      Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
      6eb51450
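      A minimal Keras training sketch along the lines of the rewritten examples, assuming TensorFlow is installed and `tokenized_train` / `tokenized_eval` are pre-tokenized datasets.Dataset objects with a label column (the checkpoint and output directory names are illustrative):

      import tensorflow as tf
      from transformers import AutoTokenizer, TFAutoModelForSequenceClassification
      from transformers.keras_callbacks import KerasMetricCallback, PushToHubCallback

      tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
      model = TFAutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=2)

      # prepare_tf_dataset handles batching, padding and column selection for Keras.
      train_set = model.prepare_tf_dataset(tokenized_train, batch_size=16, shuffle=True, tokenizer=tokenizer)
      eval_set = model.prepare_tf_dataset(tokenized_eval, batch_size=16, shuffle=False, tokenizer=tokenizer)

      def compute_metrics(eval_predictions):
          logits, labels = eval_predictions
          return {"accuracy": float((logits.argmax(axis=-1) == labels).mean())}

      callbacks = [
          # Runs compute_metrics on eval_set at the end of each epoch and logs the result.
          KerasMetricCallback(metric_fn=compute_metrics, eval_dataset=eval_set),
          # Pushes checkpoints to the Hub; requires `huggingface-cli login`, drop it to train locally.
          PushToHubCallback(output_dir="tf-text-classification", tokenizer=tokenizer),
      ]

      # No explicit loss: the model falls back to its internal loss; jit_compile enables XLA.
      model.compile(optimizer=tf.keras.optimizers.Adam(3e-5), jit_compile=True)
      model.fit(train_set, validation_data=eval_set, epochs=3, callbacks=callbacks)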
    • Preserve hub-related kwargs in AutoModel.from_pretrained (#18545) · d7e2d7b4
      Sylvain Gugger authored
      * Preserve hub-related kwargs in AutoModel.from_pretrained
      
      * Fix tests
      
      * Remove debug statement
      d7e2d7b4
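      A small illustration of the call pattern this fix preserves (the model name and kwargs are examples only): hub-related kwargs passed to the Auto class should reach the concrete model's from_pretrained unchanged.

      from transformers import AutoModel

      # `revision` (and similar hub kwargs such as `use_auth_token`) are forwarded by AutoModel
      # to the resolved model class rather than being dropped during dispatch.
      model = AutoModel.from_pretrained("bert-base-uncased", revision="main")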
    • TF: XLA-trainable DeBERTa v2 (#18546) · 34aad0da
      Joao Gante authored
      * fix deberta issues
      
      * add different code paths for gpu and tpu
      
      * shorter gpu take along axis
      
      * Stable Dropout without tf cond
      
      * variable must be float
      34aad0da
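      A hedged sketch of what "XLA-trainable" means in practice here: training a TF DeBERTa-v2 model with jit_compile=True. A tiny randomly initialized config and random tensors stand in for a real checkpoint and dataset:

      import tensorflow as tf
      from transformers import DebertaV2Config, TFDebertaV2ForSequenceClassification

      # Tiny config purely to exercise XLA training; a real setup would use from_pretrained.
      config = DebertaV2Config(vocab_size=1000, hidden_size=128, num_hidden_layers=2,
                               num_attention_heads=4, intermediate_size=256, num_labels=2)
      model = TFDebertaV2ForSequenceClassification(config)

      input_ids = tf.random.uniform((16, 32), maxval=1000, dtype=tf.int32)
      batch = {"input_ids": input_ids,
               "attention_mask": tf.ones_like(input_ids),
               "labels": tf.random.uniform((16,), maxval=2, dtype=tf.int32)}

      # jit_compile=True runs training under XLA, which this fix makes possible for DeBERTa v2.
      model.compile(optimizer=tf.keras.optimizers.Adam(1e-4), jit_compile=True)
      model.fit(batch, epochs=1, batch_size=8)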
    • `bitsandbytes` - `Linear8bitLt` integration into `transformers` models (#17901) · 4a51075a
      Younes Belkada authored
      
      
      * first commit
      
      * correct replace function
      
      * add final changes
      
      - works like a charm!
      - cannot implement tests yet
      - tested
      
      * clean up a bit
      
      * add bitsandbytes dependencies
      
      * working version
      
      - added import function
      - added bitsandbytes utils file
      
      * small fix
      
      * small fix
      
      - fix import issue
      
      * fix import issues
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * refactor a bit
      
      - move bitsandbytes utils to utils
      - change comments on functions
      
      * reformat docstring
      
      - reformat docstring on init_empty_weights_8bit
      
      * Update src/transformers/__init__.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * revert bad formatting
      
      * change to bitsandbytes
      
      * refactor a bit
      
      - remove init8bit since it is useless
      
      * more refactoring
      
      - fixed init empty weights issue
      - added threshold param
      
      * small hack to make it work
      
      * Update src/transformers/modeling_utils.py
      
      * Update src/transformers/modeling_utils.py
      
      * remove the small hack
      
      * modify utils file
      
      * make style + refactor a bit
      
      * create device map correctly
      
      * add correct dtype for device map creation
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * apply suggestions
      
      - remove with torch.grad
      - do not rely on Python bool magic!
      
      * add docstring
      
       - add docstring for new kwargs
      
      * add docstring
      
      - comment `replace_8bit_linear` function
      - fix weird formatting
      
      * added more documentation
      - added new utility function for memory footprint tracking
      - colab demo to add
      
      * a few modifications
      
      - fix doc typo
      - force cast into float16 when load_in_8bit is enabled
      
      * added colab link
      
      * add test architecture + docstring a bit
      
      * refactor a bit testing class
      
      * make style + refactor a bit
      
      * enhance checks
      
      - add more checks
      - start writing saving test
      
      * clean up a bit
      
      * make style
      
      * add more details on doc
      
      * add more tests
      
      - still needs to fix 2 tests
      
      * replace by "or"
      
      - could not fix it from GitHub GUI
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * refactor a bit testing code + add readme
      
      * make style
      
      * fix import issue
      
      * Update src/transformers/modeling_utils.py
      Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
      
      * add few comments
      
      * add more doctring + make style
      
      * more docstring
      
      * raise error when loaded in 8bit
      
      * make style
      
      * add warning if loaded on CPU
      
      * add small sanity check
      
      * fix small comment
      
      * add bitsandbytes on dockerfile
      
      * Improve documentation
      
      - improve documentation from comments
      
      * add few comments
      
      * slow tests pass on the VM but not on the CI VM
      
      * Fix merge conflict
      
      * make style
      
      * another test should pass on a multi gpu setup
      
      * fix bad import in testing file
      
      * Fix slow tests
      
      - remove dummy batches
      - no more CUDA illegal memory errors
      
      * modify dockerfile
      
      * Update docs/source/en/main_classes/model.mdx
      
      * Update Dockerfile
      
      * Update model.mdx
      
      * Update Dockerfile
      
      * Apply suggestions from code review
      
      * few modifications
      
      - lm head can stay on disk/cpu
      - change model name so that tests pass
      
      * change test value
      
      - change test value to the correct output
      - torch bmm changed to baddbmm in bloom modeling when merging
      
      * modify installation guidelines
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * replace `n` by `name`
      
      * merge `load_in_8bit` and `low_cpu_mem_usage`
      
      * first try - keep the lm head in full precision
      
      * better check
      
      - check the attribute `base_model_prefix` instead of computing the number of parameters
      
      * added more tests
      
      * Update src/transformers/utils/bitsandbytes.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Merge branch 'integration-8bit' of https://github.com/younesbelkada/transformers into integration-8bit
      
      * improve documentation
      
      - fix typos for installation
      - change title in the documentation
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
      4a51075a
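      A hedged usage sketch of the 8-bit loading path added here; it assumes bitsandbytes and accelerate are installed and a CUDA GPU is available (loading on CPU only warns, per the commits above). The checkpoint name is just an example:

      from transformers import AutoModelForCausalLM, AutoTokenizer

      model_name = "bigscience/bloom-560m"  # example checkpoint
      tokenizer = AutoTokenizer.from_pretrained(model_name)

      # load_in_8bit replaces nn.Linear layers with bitsandbytes Linear8bitLt modules;
      # device_map="auto" lets accelerate place the weights (the lm head keeps higher precision).
      model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", load_in_8bit=True)

      # Memory-footprint utility added alongside the integration.
      print(model.get_memory_footprint())

      inputs = tokenizer("Hello, my name is", return_tensors="pt").to("cuda:0")
      print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0], skip_special_tokens=True))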
  2. 09 Aug, 2022 7 commits
  3. 08 Aug, 2022 7 commits
  4. 06 Aug, 2022 1 commit
  5. 05 Aug, 2022 9 commits
  6. 04 Aug, 2022 7 commits
    • Yih-Dar
    • Add VideoMAE (#17821) · f9a0008d
      NielsRogge authored
      
      
      * First draft
      
      * Add VideoMAEForVideoClassification
      
      * Improve conversion script
      
      * Add VideoMAEForPreTraining
      
      * Add VideoMAEFeatureExtractor
      
      * Improve VideoMAEFeatureExtractor
      
      * Improve docs
      
      * Add first draft of model tests
      
      * Improve VideoMAEForPreTraining
      
      * Fix base_model_prefix
      
      * Make model take pixel_values of shape (B, T, C, H, W)
      
      * Add loss computation of VideoMAEForPreTraining
      
      * Improve tests
      
      * Improve model tests
      
      * Make all tests pass
      
      * Add VideoMAE to main README
      
      * Add tests for VideoMAEFeatureExtractor
      
      * Add integration test
      
      * Improve conversion script
      
      * Rename patch embedding class
      
      * Remove VideoMAELayer from init
      
      * Update design of patch embeddings
      
      * Improve comments
      
      * Improve conversion script
      
      * Improve conversion script
      
      * Add conversion of pretrained model
      
      * Add loss verification of pretrained model
      
      * Add loss verification of unnormalized targets
      
      * Add integration test for pretraining model
      
      * Apply suggestions from code review
      
      * Fix bug to make feature extractor resize only shorter edge
      
      * Address more comments
      
      * Improve normalization of videos
      
      * Add doc examples
      
      * Move constants to dedicated script
      
      * Remove scripts
      
      * Transfer checkpoints, fix docs
      
      * Update script
      
      * Update image mean and std
      
      * Fix doc tests
      
      * Set return_tensors to NumPy by default
      
      * Revert the previous change
      Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
      f9a0008d
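      A short hedged sketch of the classification usage implied above, highlighting the (batch, frames, channels, height, width) pixel_values layout; the checkpoint name is an assumption and random pixels stand in for a real video clip:

      import torch
      from transformers import VideoMAEForVideoClassification

      # Checkpoint name assumed for illustration; any VideoMAE classification checkpoint works.
      model = VideoMAEForVideoClassification.from_pretrained("MCG-NJU/videomae-base-finetuned-kinetics")

      # (B, T, C, H, W): one clip of 16 frames at 224x224, as set up in this PR.
      pixel_values = torch.randn(1, 16, 3, 224, 224)

      with torch.no_grad():
          logits = model(pixel_values=pixel_values).logits
      print(logits.argmax(-1))  # predicted class index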
    • Thomas Wang
    • Sylvain Gugger · df28de05
    • HFTracer.trace can now take callables and torch.nn.Module (#18457) · c74befc9
      Michael Benayoun authored
      * Enable HFTracer to trace with custom dummy inputs instead of pre-computed ones
      
      * Add HFTracer.trace docstring, and make it possible to handle callable and torch.nn.Module in general
      
      * Remove pdb comment
      
      * Apply suggestions
      c74befc9
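      A heavily hedged sketch of tracing a plain torch.nn.Module with HFTracer after this change; the `dummy_inputs` keyword follows the commit messages above, and the exact name and signature may differ between versions:

      import torch
      from transformers.utils.fx import HFTracer

      class TinyModule(torch.nn.Module):
          def __init__(self):
              super().__init__()
              self.linear = torch.nn.Linear(4, 4)

          def forward(self, x):
              return torch.relu(self.linear(x))

      module = TinyModule()
      tracer = HFTracer()
      # Per this PR, trace accepts a generic nn.Module (or callable) plus custom dummy inputs
      # instead of only pre-computed inputs for known transformers architectures.
      graph = tracer.trace(module, dummy_inputs={"x": torch.randn(2, 4)})
      torch.fx.GraphModule(module, graph).graph.print_tabular()  # inspect the captured ops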
    • change shape to support dynamic batch input in tf.function XLA generate for tf serving (#18372) · fc1d841b
      nlpcat authored
      
      
      * change shape to support dynamic batch input in tf.generate
      
      * add tests
      Co-authored-by: nlpcatcode <nlpcodecat@gmail.com>
      fc1d841b
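      A hedged sketch of the serving pattern this change targets: wrapping XLA generation in a tf.function whose input signature leaves the batch (and sequence) dimension as None, so a saved model can accept variable batch sizes. Model, lengths and generation settings are illustrative:

      import tensorflow as tf
      from transformers import AutoTokenizer, TFAutoModelForCausalLM

      tokenizer = AutoTokenizer.from_pretrained("gpt2")
      tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
      model = TFAutoModelForCausalLM.from_pretrained("gpt2")

      @tf.function(
          input_signature=[
              tf.TensorSpec((None, None), tf.int32, name="input_ids"),      # dynamic batch + length
              tf.TensorSpec((None, None), tf.int32, name="attention_mask"),
          ],
          jit_compile=True,  # XLA-compiled generation
      )
      def serving_generate(input_ids, attention_mask):
          return model.generate(input_ids=input_ids, attention_mask=attention_mask,
                                max_new_tokens=32, pad_token_id=tokenizer.pad_token_id)

      inputs = tokenizer(["Hello world", "Dynamic batch test"], return_tensors="tf", padding=True)
      print(serving_generate(inputs["input_ids"], inputs["attention_mask"]))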
    • [BLOOM] Clean modeling code (#18344) · b69a62d5
      Thomas Wang authored
      
      
      * Cleanup some code
      
      * Improve signatures
      
      * Try to reduce the number of reshape/copies
      
      * I don't think we actually need the layer_num scaling trick
      
      * No need for duplication
      
      * Try to fix beam_search
      
      * Fix beam search
      
      * Removing layer num normalization seems to be breaking
      
      * Not sure self.layer_number normalization actually matters
      
      * Try and be backward compatible
      
      * Try to fix beam_search
      
      * Revert attempt to be backward compatible
      
      * Improve documentation on past_key_values format
      
      * Optimize the device allocation in case of hidden_states in multiple devices
      
      * No need to manually cast the values to a specific device
      
      * Rename with long version of variables
      
      * Improve type hinting
      
      * Add comment that explains that some methods return views
      
      * Actually I think the attention casting only makes sense when we use torch.float16
      
      * We don't actually need layer_number to be passed anymore
      
      * Fix FX test
      
      * Bypass torch.baddbmm
      
      * Apply suggestions from code review
      
      * Add comment about support for torchScript v1.11
      
      * fix ONNX support for bloom (#18456)
      Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>
      Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>
      b69a62d5
  7. 03 Aug, 2022 4 commits