1. 13 Mar, 2023 1 commit
  2. 07 Mar, 2023 1 commit
    • [Time-Series] informer model (#21099) · 8abe4930
      Eli Simhayev authored
      * added informer to gitignore
      
      * WIP informer2020
      
      * added checking that instantiate works
      
      * added config using gluonTS by kashif
      
      * WIP config
      
      * adding InformerConfig. need to remove FeatureEmbedder
      
      * done InformerConfig, but need to change the names
      
      * Done informer model init. working on enc-dec
      
      * added things to address, after reading again enc-dec in the paper
      
      * done modeling - checking initialization work
      
      * moved enc-dec init to InformerEncoder/Decoder init
      
      * added 'init_std' to config, now model init works!
      
      * WIP conversion script, and added code sources
      
      * WIP conversion script: loading original informer pth works
      
      * WIP conversion script: change defaults in the config
      
      * WIP conversion script: supporting Informer input embedding
      
      * WIP conversion script: added parameters for the informer embed
      
      * WIP conversion script: change dim_feedforward=2048
      
      * WIP conversion script: remove unused args for loading checkpoint
      
      * just cleaning up
      
      * DataEmbedding removed, after thinking with Kashif
      
      * working on forward pass
      
      * WIP forward pass: trying to establish working batch for forward pass
      
      * cleaning and finalizing
      
      * adding HF names and docs
      
      * init after cleaning works
      
      * WIP in tests
      
      * added docs for the informer specific args
      
      * fix style
      
      * undo change
      
      * cleaning informer, now need to work only enc-dec
      
      * initial enc-dec classes
      
      * added encoder and decoder
      
      * added todo
      
      * add todos for conv_layers
      
      * added decoder docs from vanilla
      
      * added encoder docs from vanilla
      
      * remove encoder decoder from the original informer
      
      * removed AttentionLayer from the original paper
      
      * removed TriangularCausalMask, same as decoder_attention_mask
      
      * initial sparse attention
      
      * use conv_layers
      
      * fixed test_config test
      
      * fix parentheses when iterating zip(layers, conv_layers)
      
      * error found in prob attention, added sizes as comments
      
      * fix sizes
      
      * added proposal for q_reduce indexing, and remove unused
      
      * WIP ProbMask, and changed factor=2 for testing
      
      * remove unused libs for this PR for creating the env
      
      * fix checking the attn_weights.size() after bmm
      
      * Q_reduce: changed from torch.gather to simple slicing
      
      * WIP calculate final attn_output
      
      * finish adding v_aggregated, attn_output ready
      
      * changed tgt_len to u in attention_mask, need to fix the size error
      
      * comment attention_mask for encoder, and fix if cond for v_agg
      
      * added ProbMask support (wip), removed old original code
      
      * finished ProbMask 😃
      
      * Revert "remove unused libs for this PR for creating the env"
      
      This reverts commit 11a081e09e92771e51a5d2758d53a9afb59547f0.
      
      * fixes
      
      * make style
      
      * fix initial tests
      
      * fix more tests
      
      * dry
      
      * make style
      
      * remove unused files
      
      * style
      
      * added integration tests
      
      * fix num_static_real_features
      
      * fix header
      
      * remove unused function
      
      * fix example
      
      * fix docs
      
      * Update src/transformers/models/informer/configuration_informer.py
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/models/informer/modeling_informer.py
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/models/informer/configuration_informer.py
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/models/informer/configuration_informer.py
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/models/informer/configuration_informer.py
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/models/informer/configuration_informer.py
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * fixes for reviewer
      
      * use prediction_length from model
      
      * fix style
      
      * fixed informer.mdx
      
      * added to index
      
      * updated readme
      
      * undo
      
      * make fix-copies
      
      * typo
      
      * fix copy
      
      * added Informer to toctree
      
      * in order
      
      * fixed comments
      
      * remove unneeded new lines in docs
      
      * make static real and cat optional
      
      * fix use of distil conv layers
      
      * fixed integration test
      
      * added checkpoint for convlayer
      
      * make fix-copies
      
      * updated from time series model
      
      * make fix-copies
      
      * copy decoder
      
      * fix unit tests
      
      * updated scaling config
      
      * fix integration tests
      
      * IGNORE_NON_TESTED
      
      * IGNORE_NON_AUTO_CONFIGURED
      
      * IGNORE_NON_AUTO_CONFIGURED
      
      * updated check configs
      
      * fix formatting
      
      * undo change from time series
      
      * prediction_length should not be None
      
      * align with the blog: prettify ProbSparse and change attention_factor to sampling_factor
      
      * make style
      
      * make fix-copies
      
      * niels CR: update contributed by
      
      * niels CR: update configuration_informer.py
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * niels CR: update kashif -> huggingface
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * niels CR: `sampling_factor` only relevant when `attention_type`=prob
      
      * make style
      
      * fixed U_part: added multiplication by `L_Q`
      
      * fixed bug: remove `is not None` from `if config.distil`
      
      * fixed test: `decoder_seq_length` to `encoder_seq_length` in cross_attentions check
      
      * fix integration tests
      
      * updated model hub
      
      * do not shift as in training
      
      * undo
      
      * fix make-copies
      
      * make fix-copies
      
      * added `if prediction_length is None`
      
      * changed `ProbSparseAttention` to `InformerProbSparseAttention`
      
      * changed `V_sum` -> `v_mean_dim_time`
      
      * changed `ConvLayer` to `InformerConvLayer` and fixed `super()`
      
      * TimeSeriesTransformer->Informer in decoder's Copied from
      
      * more descriptive in ProbSparse
      
      * make style
      
      * fix copied from
      
      * Revert "added `if prediction_length is None`"
      
      This reverts commit b4cbddfa05e3bd739b79569cd3c3b89e316f2451.
      
      * fixed indent
      
      * use InformerSinusoidalPositionalEmbedding
      
      * make fix-style
      
      * fix from #21860
      
      * fix name
      
      * make fix-copies
      
      * use time series utils
      
      * fix dec num_heads
      
      * docstring
      
      * added time series util doc
      
      * _import_structure
      
      * formatting
      
      * changes from review
      
      * make style
      
      * fix docs
      
      * fix doc
      
      * removed NegativeLogLikelihood
      
      ---------
      Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      8abe4930
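
      A minimal instantiation sketch for the classes this commit adds (`InformerConfig` and `InformerForPrediction`). The hyperparameter values are toy choices for illustration, and the commented checkpoint name is an assumption rather than something mandated by this PR:

          from transformers import InformerConfig, InformerForPrediction

          # Toy configuration -- illustrative values, not the library defaults.
          config = InformerConfig(
              prediction_length=12,
              context_length=24,
              attention_type="prob",   # ProbSparse attention from the Informer paper
              sampling_factor=5,       # only relevant when attention_type="prob"
              distil=True,             # use the distilling InformerConvLayer stack in the encoder
          )
          model = InformerForPrediction(config)
          print(sum(p.numel() for p in model.parameters()))

          # Pretrained weights load the usual way (checkpoint name assumed for illustration):
          # model = InformerForPrediction.from_pretrained("huggingface/informer-tourism-monthly")
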
  3. 01 Mar, 2023 2 commits
  4. 27 Feb, 2023 2 commits
  5. 17 Feb, 2023 1 commit
  6. 13 Feb, 2023 1 commit
  7. 10 Feb, 2023 1 commit
  8. 08 Feb, 2023 1 commit
  9. 07 Feb, 2023 1 commit
    • Add inverse sqrt learning rate scheduler (#21495) · a3034c70
      Adrian Sager La Ganga authored
      * added inverse sqrt lr scheduler
      
      * Updated get_scheduler in src/transformers/optimization.py
      
      * Updated src/transformers/__init__.py
      
      * Added inverse sqrt lr scheduler test
      
      * Updated docs/source/en/main_classes/optimizer_schedules.mdx
      
      * Ran style and quality scripts
      
      * Fix get_inverse_sqrt_schedule docstring
      
      * Comment implementation URL
      a3034c70
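
      A short usage sketch of the new schedule through the helper this PR exposes from transformers.optimization; the stand-in model and the step counts are arbitrary:

          import torch
          from transformers import get_inverse_sqrt_schedule

          model = torch.nn.Linear(10, 2)  # stand-in module, just to have parameters
          optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)

          # Linear warmup for num_warmup_steps, then decay proportional to 1/sqrt(step).
          scheduler = get_inverse_sqrt_schedule(optimizer, num_warmup_steps=100)

          for _ in range(1000):
              optimizer.step()   # the actual forward/backward pass is elided
              scheduler.step()

          # The schedule is also reachable by name via get_scheduler("inverse_sqrt", ...).
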
  10. 06 Feb, 2023 1 commit
  11. 03 Feb, 2023 1 commit
    • [WIP] add SpeechT5 model (#18922) · e4bacf66
      Matthijs Hollemans authored
      * make SpeechT5 model by copying Wav2Vec2
      
      * add paper to docs
      
      * whoops added docs in wrong file
      
      * remove SpeechT5Tokenizer + put CTC back in the name
      
      * remove deprecated class
      
      * remove unused docstring
      
      * delete SpeechT5FeatureExtractor, use Wav2Vec2FeatureExtractor instead
      
      * remove classes we don't need right now
      
      * initial stab at speech encoder prenet
      
      * add more speech encoder prenet stuff
      
      * improve SpeechEncoderPrenet
      
      * add encoder (not finished yet)
      
      * add relative position bias to self-attention
      
      * add encoder CTC layers
      
      * fix formatting
      
      * add decoder from BART, doesn't work yet
      
      * make it work with generate loop
      
      * wrap the encoder into a speech encoder class
      
      * wrap the decoder in a text decoder class
      
      * changed my mind
      
      * changed my mind again ;-)
      
      * load decoder weights, make it work
      
      * add weights for text decoder postnet
      
      * add SpeechT5ForCTC model that uses only the encoder
      
      * clean up EncoderLayer and DecoderLayer
      
      * implement _init_weights in SpeechT5PreTrainedModel
      
      * cleanup config + Encoder and Decoder
      
      * add head + cross attention masks
      
      * improve doc comments
      
      * fixup
      
      * more cleanup
      
      * more fixup
      
      * TextDecoderPrenet works now, thanks Kendall
      
      * add CTC loss
      
      * add placeholders for other pre/postnets
      
      * add type annotation
      
      * fix freeze_feature_encoder
      
      * set padding tokens to 0 in decoder attention mask
      
      * encoder attention mask downsampling
      
      * remove features_pen calculation
      
      * disable the padding tokens thing again
      
      * fixup
      
      * more fixup
      
      * code review fixes
      
      * rename encoder/decoder wrapper classes
      
      * allow checkpoints to be loaded into SpeechT5Model
      
      * put encoder into wrapper for CTC model
      
      * clean up conversion script
      
      * add encoder for TTS model
      
      * add speech decoder prenet
      
      * add speech decoder post-net
      
      * attempt to reconstruct the generation loop
      
      * add speech generation loop
      
      * clean up generate_speech
      
      * small tweaks
      
      * fix forward pass
      
      * enable always dropout on speech decoder prenet
      
      * sort declaration
      
      * rename models
      
      * fixup
      
      * fix copies
      
      * more fixup
      
      * make consistency checker happy
      
      * add Seq2SeqSpectrogramOutput class
      
      * doc comments
      
      * quick note about loss and labels
      
      * add HiFi-GAN implementation (from Speech2Speech PR)
      
      * rename file
      
      * add vocoder to TTS model
      
      * improve vocoder
      
      * working on tokenizer
      
      * more better tokenizer
      
      * add CTC tokenizer
      
      * fix decode and batch_decode in CTC tokenizer
      
      * fix processor
      
      * two processors and feature extractors
      
      * use SpeechT5WaveformFeatureExtractor instead of Wav2Vec2
      
      * cleanup
      
      * more cleanup
      
      * even more fixup
      
      * notebooks
      
      * fix log-mel spectrograms
      
      * support reduction factor
      
      * fixup
      
      * shift spectrograms to right to create decoder inputs
      
      * return correct labels
      
      * add labels for stop token prediction
      
      * fix doc comments
      
      * fixup
      
      * remove SpeechT5ForPreTraining
      
      * more fixup
      
      * update copyright headers
      
      * add usage examples
      
      * add SpeechT5ProcessorForCTC
      
      * fixup
      
      * push unofficial checkpoints to hub
      
      * initial version of tokenizer unit tests
      
      * add slow test
      
      * fix failing tests
      
      * tests for CTC tokenizer
      
      * finish CTC tokenizer tests
      
      * processor tests
      
      * initial test for feature extractors
      
      * tests for spectrogram feature extractor
      
      * fixup
      
      * more fixup
      
      * add decorators
      
      * require speech for tests
      
      * modeling tests
      
      * more tests for ASR model
      
      * fix imports
      
      * add fake tests for the other models
      
      * fixup
      
      * remove jupyter notebooks
      
      * add missing SpeechT5Model tests
      
      * add missing tests for SpeechT5ForCTC
      
      * add missing tests for SpeechT5ForTextToSpeech
      
      * sort tests by name
      
      * fix Hi-Fi GAN tests
      
      * fixup
      
      * add speech-to-speech model
      
      * refactor duplicate speech generation code
      
      * add processor for SpeechToSpeech model
      
      * add usage example
      
      * add tests for speech-to-speech model
      
      * fixup
      
      * enable gradient checkpointing for SpeechT5FeatureEncoder
      
      * code review
      
      * push_to_hub now takes repo_id
      
      * improve doc comments for HiFi-GAN config
      
      * add missing test
      
      * add integration tests
      
      * make number of layers in speech decoder prenet configurable
      
      * rename variable
      
      * rename variables
      
      * add auto classes for TTS and S2S
      
      * REMOVE CTC!!!
      
      * S2S processor does not support save/load_pretrained
      
      * fixup
      
      * these models are now in an auto mapping
      
      * fix doc links
      
      * rename HiFiGAN to HifiGan, remove separate config file
      
      * REMOVE auto classes
      
      * there can be only one
      
      * fixup
      
      * replace assert
      
      * reformat
      
      * feature extractor can process input and target at same time
      
      * update checkpoint names
      
      * fix commit hash
      e4bacf66
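
      A text-to-speech sketch using the classes added in this PR. The microsoft/speecht5_tts and microsoft/speecht5_hifigan checkpoint names and the zero speaker embedding are assumptions for illustration; a real x-vector should be supplied in practice:

          import torch
          from transformers import SpeechT5Processor, SpeechT5ForTextToSpeech, SpeechT5HifiGan

          processor = SpeechT5Processor.from_pretrained("microsoft/speecht5_tts")
          model = SpeechT5ForTextToSpeech.from_pretrained("microsoft/speecht5_tts")
          vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

          inputs = processor(text="Hello, my dog is cute.", return_tensors="pt")
          speaker_embeddings = torch.zeros((1, 512))  # placeholder x-vector speaker embedding

          # generate_speech runs the autoregressive spectrogram loop, then the HiFi-GAN vocoder.
          speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
          print(speech.shape)  # 1-D waveform tensor
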
  12. 24 Jan, 2023 1 commit
  13. 20 Jan, 2023 1 commit
  14. 17 Jan, 2023 1 commit
  15. 15 Dec, 2022 1 commit
  16. 08 Dec, 2022 1 commit
    • Add video classification pipeline (#20151) · 9e56aff5
      Nathan Raw authored
      * 🚧 wip video classification pipeline
      
      * 🚧 wip - add is_decord_available check
      
      * 🐛 add missing import
      
      *  add tests
      
      * 🔧 add decord to setup extras
      
      * 🚧 add is_decord_available
      
      *  add video-classification pipeline
      
      * 📝 add video classification pipe to docs
      
      * 🐛 add missing VideoClassificationPipeline import
      
      * 📌 add decord install in test runner
      
      *  fix url inputs to video-classification pipeline
      
      *  updates from review
      
      * 📝 add video cls pipeline to docs
      
      * 📝 add docstring
      
      * 🔥 remove unused import
      
      * 🔥 remove some code
      
      * 📝 docfix
      9e56aff5
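
      A usage sketch of the new pipeline; decord must be installed for video decoding, and the checkpoint name and file path below are illustrative rather than mandated by the PR:

          from transformers import pipeline

          video_classifier = pipeline(
              "video-classification",
              model="MCG-NJU/videomae-base-finetuned-kinetics",  # example checkpoint
          )

          # Local file paths and URLs are both accepted as inputs.
          predictions = video_classifier("path/to/video.mp4", top_k=5)
          print(predictions)  # list of {"label": ..., "score": ...} dicts
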
  17. 06 Dec, 2022 1 commit
  18. 30 Nov, 2022 1 commit
  19. 21 Nov, 2022 1 commit
  20. 18 Nov, 2022 1 commit
  21. 15 Nov, 2022 1 commit
  22. 09 Nov, 2022 1 commit
  23. 08 Nov, 2022 1 commit
    • AutoImageProcessor (#20111) · 4eb918e6
      amyeroberts authored
      * AutoImageProcessor skeleton
      
      * Update references
      
      * Add mapping in init
      
      * Add model image processors to __init__ for importing
      
      * Add AutoImageProcessor tests
      
      * Fix up
      
      * Image Processor documentation
      
      * Remove pdb
      
      * Update docs/source/en/model_doc/mobilevit.mdx
      
      * Update docs
      
      * Don't add whitespace on json files
      
      * Remove fixtures
      
      * Move checking model config down
      
      * Fix up
      
      * Add check for image processor
      
      * Remove FeatureExtractorMixin in docstrings
      
      * Rename model_tmpfile to config_tmpfile
      
      * Don't make None if not in image processor map
      4eb918e6
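
      A short sketch of the new auto class; the checkpoint and image URL are examples only:

          import requests
          from PIL import Image
          from transformers import AutoImageProcessor

          # Resolves the model-specific image processor from the checkpoint's configuration.
          image_processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224")

          url = "http://images.cocodataset.org/val2017/000000039769.jpg"
          image = Image.open(requests.get(url, stream=True).raw)

          inputs = image_processor(images=image, return_tensors="pt")
          print(inputs["pixel_values"].shape)
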
  24. 07 Nov, 2022 1 commit
  25. 20 Oct, 2022 1 commit
  26. 19 Oct, 2022 1 commit
    • Adding the state-of-the-art contrastive search decoding methods for the codebase of generation_utils.py (#19477) · 71786b10
      GMFTBY authored
      
      * add: the contrastive search for generation_utils
      
      * add: testing scripts for contrastive search under examples/text-generation
      
      * update the quality of codes
      
      * revise the docstring; make the generation_contrastive_search.py scripts;
      
      * revise the examples/pytorch/text-generation/run_generation_contrastive_search.py to the auto-APIs format
      
      * revise the necessary documents
      
      * fix: revise the docstring of generation_contrastive_search.py
      
      * Fix the code indentation
      
      * fix: revise the nits and examples in contrastive_search docstring.
      
      * fix the copyright
      
      * delete generation_contrastive_search.py
      
      * revise the logic in contrastive_search
      
      * update the integration test and the docstring
      
      * run the tests over
      
      * add the slow decorator to the contrastive_search integration test
      
      * add more test
      
      * do the style, quality, consistency checks
      71786b10
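
      Contrastive search is driven through generate() by combining penalty_alpha with top_k; a minimal sketch, with an arbitrary checkpoint and prompt:

          from transformers import AutoModelForCausalLM, AutoTokenizer

          tokenizer = AutoTokenizer.from_pretrained("gpt2")
          model = AutoModelForCausalLM.from_pretrained("gpt2")

          input_ids = tokenizer("DeepMind Company is", return_tensors="pt").input_ids

          # penalty_alpha > 0 together with top_k > 1 selects contrastive search decoding.
          output = model.generate(input_ids, penalty_alpha=0.6, top_k=4, max_new_tokens=64)
          print(tokenizer.decode(output[0], skip_special_tokens=True))
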
  27. 12 Oct, 2022 1 commit
  28. 07 Oct, 2022 1 commit
    • [WIP] Add ZeroShotObjectDetectionPipeline (#18445) (#18930) · e9a49bab
      Amrit Sahu authored
      * Add ZeroShotObjectDetectionPipeline (#18445)
      
      * Add AutoModelForZeroShotObjectDetection task
      
      This commit also adds the following
      
      - Add explicit _processor method for ZeroShotObjectDetectionPipeline.
        This is necessary as pipelines don't auto infer processors yet and
        `OwlVitProcessor` wraps tokenizer and feature_extractor together, to
        process multiple images at once
      
      - Add auto tests and other tests for ZeroShotObjectDetectionPipeline
      
      * Add batching for ZeroShotObjectDetectionPipeline
      
      * Fix doc-string ZeroShotObjectDetectionPipeline
      
      * Fix output format: ZeroShotObjectDetectionPipeline
      e9a49bab
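
      A usage sketch of the new pipeline with an OWL-ViT checkpoint; the image URL and candidate labels are placeholders:

          from transformers import pipeline

          detector = pipeline("zero-shot-object-detection", model="google/owlvit-base-patch32")

          predictions = detector(
              "http://images.cocodataset.org/val2017/000000039769.jpg",
              candidate_labels=["cat", "remote control", "couch"],
          )
          # Each prediction is a dict with "score", "label" and a "box" of pixel coordinates.
          print(predictions[0])
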
  29. 14 Sep, 2022 1 commit
  30. 09 Sep, 2022 1 commit
  31. 07 Sep, 2022 1 commit
    • Add DocumentQuestionAnswering pipeline (#18414) · 2ef77421
      Ankur Goyal authored
      
      * [WIP] Skeleton of VisualQuestionAnsweringPipeline extended to support LayoutLM-like models
      
      * Fixup
      
      * Use the full encoding
      
      * Basic refactoring to DocumentQuestionAnsweringPipeline
      
      * Cleanup
      
      * Improve args, docs, and implement preprocessing
      
      * Integrate OCR
      
      * Refactor question_answering pipeline
      
      * Use refactored QA code in the document qa pipeline
      
      * Fix tests
      
      * Some small cleanups
      
      * Use a string type annotation for Image.Image
      
      * Update encoding with image features
      
      * Wire through the basic docs
      
      * Handle invalid response
      
      * Handle empty word_boxes properly
      
      * Docstring fix
      
      * Integrate Donut model
      
      * Fixup
      
      * Incorporate comments
      
      * Address comments
      
      * Initial incorporation of tests
      
      * Address Comments
      
      * Change assert to ValueError
      
      * Comments
      
      * Wrap `score` in float to make it JSON serializable
      
      * Incorporate AutoModelForDocumentQuestionAnswering changes
      
      * Fixup
      
      * Rename postprocess function
      
      * Fix auto import
      
      * Applying comments
      
      * Improve docs
      
      * Remove extra assets and add copyright
      
      * Address comments
      Co-authored-by: Ankur Goyal <ankur@impira.com>
      2ef77421
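
      A sketch of the new pipeline. LayoutLM-style checkpoints rely on pytesseract to produce OCR word boxes, while Donut-style checkpoints work from the image alone; the checkpoint name, file path and question are assumptions for illustration:

          from transformers import pipeline

          document_qa = pipeline(
              "document-question-answering",
              model="impira/layoutlm-document-qa",  # example checkpoint
          )

          result = document_qa(
              image="path/to/invoice.png",          # path, URL, or PIL.Image
              question="What is the invoice number?",
          )
          print(result)  # e.g. [{"score": ..., "answer": ..., "start": ..., "end": ...}]
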
  32. 02 Sep, 2022 1 commit
  33. 01 Sep, 2022 2 commits
  34. 29 Aug, 2022 1 commit
  35. 16 Aug, 2022 2 commits
  36. 10 Aug, 2022 1 commit
    • `bitsandbytes` - `Linear8bitLt` integration into `transformers` models (#17901) · 4a51075a
      Younes Belkada authored
      
      * first commit
      
      * correct replace function
      
      * add final changes
      
      - works like charm!
      - cannot implement tests yet
      - tested
      
      * clean up a bit
      
      * add bitsandbytes dependencies
      
      * working version
      
      - added import function
      - added bitsandbytes utils file
      
      * small fix
      
      * small fix
      
      - fix import issue
      
      * fix import issues
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * refactor a bit
      
      - move bitsandbytes utils to utils
      - change comments on functions
      
      * reformat docstring
      
      - reformat docstring on init_empty_weights_8bit
      
      * Update src/transformers/__init__.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * revert bad formatting
      
      * change to bitsandbytes
      
      * refactor a bit
      
      - remove init8bit since it is useless
      
      * more refactoring
      
      - fixed init empty weights issue
      - added threshold param
      
      * small hack to make it work
      
      * Update src/transformers/modeling_utils.py
      
      * Update src/transformers/modeling_utils.py
      
      * remove the small hack
      
      * modify utils file
      
      * make style + refactor a bit
      
      * correctly create device map
      
      * add correct dtype for device map creation
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * apply suggestions
      
      - remove with torch.grad
      - do not rely on Python bool magic!
      
      * add docstring
      
       - add docstring for new kwargs
      
      * add docstring
      
      - comment `replace_8bit_linear` function
      - fix weird formatting
      
      * - added more documentation
      - added new utility function for memory footprint tracking
      - colab demo to add
      
      * few modifs
      
      - typo doc
      - force cast into float16 when load_in_8bit is enabled
      
      * added colab link
      
      * add test architecture + docstring a bit
      
      * refactor a bit testing class
      
      * make style + refactor a bit
      
      * enhance checks
      
      - add more checks
      - start writing saving test
      
      * clean up a bit
      
      * make style
      
      * add more details on doc
      
      * add more tests
      
      - still needs to fix 2 tests
      
      * replace by "or"
      
      - could not fix it from GitHub GUI
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * refactor a bit testing code + add readme
      
      * make style
      
      * fix import issue
      
      * Update src/transformers/modeling_utils.py
      Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
      
      * add few comments
      
      * add more doctring + make style
      
      * more docstring
      
      * raise error when loaded in 8bit
      
      * make style
      
      * add warning if loaded on CPU
      
      * add small sanity check
      
      * fix small comment
      
      * add bitsandbytes on dockerfile
      
      * Improve documentation
      
      - improve documentation from comments
      
      * add few comments
      
      * slow tests pass on the VM but not on the CI VM
      
      * Fix merge conflict
      
      * make style
      
      * another test should pass on a multi gpu setup
      
      * fix bad import in testing file
      
      * Fix slow tests
      
      - remove dummy batches
      - no more CUDA illegal memory errors
      
      * modify dockerfile
      
      * Update docs/source/en/main_classes/model.mdx
      
      * Update Dockerfile
      
      * Update model.mdx
      
      * Update Dockerfile
      
      * Apply suggestions from code review
      
      * few modifications
      
      - lm head can stay on disk/cpu
      - change model name so that test pass
      
      * change test value
      
      - change test value to the correct output
      - torch bmm changed to baddbmm in bloom modeling when merging
      
      * modify installation guidelines
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * replace `n` by `name`
      
      * merge `load_in_8bit` and `low_cpu_mem_usage`
      
      * first try - keep the lm head in full precision
      
      * better check
      
      - check the attribute `base_model_prefix` instead of computing the number of parameters
      
      * added more tests
      
      * Update src/transformers/utils/bitsandbytes.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Merge branch 'integration-8bit' of https://github.com/younesbelkada/transformers
      
       into integration-8bit
      
      * improve documentation
      
      - fix typos for installation
      - change title in the documentation
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
      4a51075a
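
      A loading sketch for the 8-bit integration; it needs a CUDA GPU with bitsandbytes and accelerate installed, and the BLOOM checkpoint is only an example:

          from transformers import AutoModelForCausalLM, AutoTokenizer

          # load_in_8bit swaps nn.Linear modules for bitsandbytes Linear8bitLt layers,
          # keeping the lm_head in full precision as described in the commits above.
          model = AutoModelForCausalLM.from_pretrained(
              "bigscience/bloom-1b7",
              load_in_8bit=True,
              device_map="auto",
          )
          tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-1b7")

          print(model.get_memory_footprint())  # memory-tracking utility added in this PR
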