- 12 Aug, 2022 8 commits
-
-
Younes Belkada authored
* Support seq2seq models for `bitsandbytes` integration - the `bitsandbytes` integration now supports seq2seq models - check whether a model has tied weights as an additional check * small modification - tie the weights before looking for tied weights!
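For context, a minimal sketch of what this enables (assuming `bitsandbytes` and `accelerate` are installed and a CUDA GPU is available; the checkpoint is only illustrative):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Illustrative checkpoint; any seq2seq architecture should follow the same path.
name = "t5-small"
tokenizer = AutoTokenizer.from_pretrained(name)
# load_in_8bit quantizes the linear layers with bitsandbytes;
# device_map="auto" lets accelerate place the weights on the available GPU(s).
model = AutoModelForSeq2SeqLM.from_pretrained(name, device_map="auto", load_in_8bit=True)

inputs = tokenizer("translate English to German: The house is wonderful.", return_tensors="pt").to(0)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0], skip_special_tokens=True))
```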
-
Joao Gante authored
* validate generate model_kwargs * generate tests -- not all models have an attn mask
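A rough illustration of the new behaviour (a sketch, not taken from the PR's tests): a misspelled or unused keyword passed to `generate` should now be rejected instead of silently ignored.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("Hello", return_tensors="pt")

model.generate(**inputs, max_new_tokens=5)  # valid kwargs still work as before
model.generate(**inputs, num_beans=4)       # typo for num_beams -> raises a ValueError about unused model_kwargs
```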
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Sourab Mangrulkar authored
-
Stas Bekman authored
-
Wang, Yi authored
* update doc for perf_train_cpu_many, add mpi introduction Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* Update docs/source/en/perf_train_cpu_many.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docs/source/en/perf_train_cpu_many.mdx Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Ian Castillo authored
* Add type hints for Vilt models * Add missing return type for TokenClassification class
-
Arthur authored
* initial commit * add small test * add cross pt tf flag to test * fix quality * style * update test with new repo * fix failing test * update * fix wrong param ordering * style * update based on review * update related to recent new caching mechanism * quality
* Update based on review Co-authored-by: sgugger <sylvain.gugger@gmail.com> * quality and style
* Update src/transformers/modeling_flax_utils.py Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 11 Aug, 2022 13 commits
-
-
amyeroberts authored
-
Alara Dirik authored
-
dependabot[bot] authored
Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.0.1 to 6.3.0. - [Release notes](https://github.com/jupyter/nbconvert/releases) - [Commits](https://github.com/jupyter/nbconvert/compare/6.0.1...6.3.0) --- updated-dependencies: - dependency-name: nbconvert dependency-type: direct:production ...
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
dependabot[bot] authored
Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.0.1 to 6.3.0. - [Release notes](https://github.com/jupyter/nbconvert/releases) - [Commits](https://github.com/jupyter/nbconvert/compare/6.0.1...6.3.0) --- updated-dependencies: - dependency-name: nbconvert dependency-type: direct:production ...
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Sylvain Gugger authored
* Fix docstrings with last version of hf-doc-builder styler * Remove empty Parameter block
-
Michael Benayoun authored
* Support audio classification architectures for label generation, and add a flag to enable or suppress warnings * Use ENV_VARS_TRUE_VALUES
-
iiLaurens authored
* Fix critical trace warnings to allow ONNX export * Force input to `sqrt` to be float type * Cleanup code * Remove unused import statement * Update model sew * Small refactor Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
* Use broadcasting instead of repeat * Implement suggestion Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
* Match deberta v2 changes in sew_d * Improve code quality * Update code quality * Consistency of small refactor * Match changes in sew_d
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
-
flozi00 authored
* Create _config.py * Create _toctree.yml * Create index.mdx not sure about "du / ihr" or "sie" * Create quicktour.mdx * Update _toctree.yml * Update build_documentation.yml * Update build_pr_documentation.yml * fix build * Update index.mdx * Update quicktour.mdx * Create installation.mdx * Update _toctree.yml
-
Dan Jones authored
Change BartLearnedPositionalEmbedding's forward method signature to support Opacus training (#18486) * changing BartLearnedPositionalEmbedding forward signature and references to it * removing debugging dead code (thanks style checker) * blackened modeling_bart file * removing copy inconsistencies via make fix-copies * changing references to copied signatures in Bart variants * make fix-copies once more * using expand over repeat (thanks @michaelbenayoun) * expand instead of repeat for all model copies Co-authored-by: Daniel Jones <jonesdaniel@microsoft.com>
-
Sylvain Gugger authored
-
Wonseok Lee (Jack) authored
* fix typos * fix sequence_length docs of LayoutLMv3Model * delete trailing white spaces * fix layoutlmv3 docs more * apply make fixup & quality * change to two versions of input docstring * apply make fixup & quality
-
Alara Dirik authored
* Fixes resizing bug in OWL-ViT * Defaults to square resize if size is set to an int * Sets do_center_crop default value to False
-
Maxime G authored
* Segformer TF: fix output size in doc * Segformer pytorch: fix output size in doc Co-authored-by: Maxime Gardoni <maxime.gardoni@ecorobotix.com>
-
- 10 Aug, 2022 11 commits
-
-
Michael Wyatt authored
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Dhruv Karan authored
* onnx config for clip * default opset as 14 * changes from the original repo * input values order fix * outputs fix * remove unused import * ran make fix-copies * black format
* review comments: forward ref, import fix, model change revert, .to cleanup * make style * formatting fixes * revert groupvit * comment for cast to int32 * comment fix * make .T as .t() for onnx conversion * ran make fix-copies * remove unneeded comment Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix copies * remove comment
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
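One way to exercise the new CLIP ONNX config from Python might look like this (a sketch only; the checkpoint and output path are illustrative, and the `onnx`/`onnxruntime` extras must be installed):

```python
from pathlib import Path

from transformers import AutoProcessor, CLIPModel
from transformers.onnx import FeaturesManager, export

ckpt = "openai/clip-vit-base-patch32"  # illustrative checkpoint
model = CLIPModel.from_pretrained(ckpt)
processor = AutoProcessor.from_pretrained(ckpt)

# Look up the ONNX config registered for CLIP and export at its default opset (14 per this PR).
onnx_config_cls = FeaturesManager.get_config("clip", feature="default")
onnx_config = onnx_config_cls(model.config)
onnx_inputs, onnx_outputs = export(processor, model, onnx_config, onnx_config.default_onnx_opset, Path("clip.onnx"))
print(onnx_inputs, onnx_outputs)
```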
-
Sylvain Gugger authored
-
Steven Liu authored
* 📝 update philosophy to include other preprocessing classes * 🖍️ apply feedback
-
Julien Chaumond authored
* `pipeline` support for `device="mps"` (or any other string) * Simplify `if` nesting * Update src/transformers/pipelines/base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Fix? @sgugger * passing `attr=None` is not the same as not passing `attr` 🤯
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
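For reference, the kind of call this unlocks (a sketch; it assumes a machine with an MPS backend, e.g. Apple Silicon with a recent PyTorch build):

```python
from transformers import pipeline

# The device can now be given as any torch device string ("mps", "cuda:0", "cpu", ...).
generator = pipeline("text-generation", model="gpt2", device="mps")
print(generator("Hello, my name is", max_new_tokens=10)[0]["generated_text"])
```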
-
Sylvain Gugger authored
* Use commit hash to look in cache instead of calling head * Add tests * Add attr for local configs too * Stupid typos * Fix tests * Update src/transformers/utils/hub.py Co-authored-by: Julien Chaumond <julien@huggingface.co>
* Address Julien's comments
Co-authored-by: Julien Chaumond <julien@huggingface.co>
-
Matt authored
* Finished QA example * Dodge a merge conflict * Update text classification and LM examples * Update NER example * New Keras metrics WIP, fix NER example * Update NER example * Update MC, summarization and translation examples * Add XLA warnings when shapes are variable * Make sure batch_size is consistently scaled by num_replicas * Add PushToHubCallback to all models * Add docs links for KerasMetricCallback * Add docs links for prepare_tf_dataset and jit_compile * Correct inferred model names * Don't assume the dataset has 'lang' * Don't assume the dataset has 'lang' * Write metrics in text classification * Add 'framework' to TrainingArguments and TFTrainingArguments * Export metrics in all examples and add tests * Fix training args for Flax * Update command line args for translation test * make fixup * Fix accidentally running other tests in fp16 * Remove do_train/do_eval from run_clm.py * Remove do_train/do_eval from run_mlm.py * Add tensorflow tests to circleci * Fix circleci
* Update examples/tensorflow/language-modeling/run_mlm.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update examples/tensorflow/test_tensorflow_examples.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update examples/tensorflow/translation/run_translation.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update examples/tensorflow/token-classification/run_ner.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Fix save path for tests * Fix some model card kwargs * Explain the magical -1000 * Actually enable tests this time * Skip text classification PR until we fix shape inference * make fixup
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
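A condensed sketch of the pattern these examples converge on; the checkpoint, dataset, and hyperparameters below are illustrative, not taken from the PR:

```python
import numpy as np
from datasets import load_dataset
from transformers import (AutoTokenizer, KerasMetricCallback, PushToHubCallback,
                          TFAutoModelForSequenceClassification)

checkpoint = "distilbert-base-uncased"  # illustrative
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

raw = load_dataset("glue", "sst2")
tokenized = raw.map(lambda batch: tokenizer(batch["sentence"], truncation=True), batched=True)

# prepare_tf_dataset builds batched, collated tf.data pipelines straight from the tokenized dataset.
train_set = model.prepare_tf_dataset(tokenized["train"], batch_size=16, shuffle=True, tokenizer=tokenizer)
eval_set = model.prepare_tf_dataset(tokenized["validation"], batch_size=16, shuffle=False, tokenizer=tokenizer)

def compute_metrics(eval_predictions):
    logits, labels = eval_predictions
    if isinstance(logits, dict):  # some models return a dict of outputs
        logits = logits["logits"]
    return {"accuracy": float((np.argmax(logits, axis=-1) == labels).mean())}

callbacks = [
    KerasMetricCallback(metric_fn=compute_metrics, eval_dataset=eval_set),
    # PushToHubCallback("out_dir", tokenizer=tokenizer),  # optional: stream checkpoints to the Hub
]
model.compile(optimizer="adam")  # no loss given: the model falls back to its internal loss
model.fit(train_set, validation_data=eval_set, epochs=1, callbacks=callbacks)
```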
-
Sylvain Gugger authored
* Preserve hub-related kwargs in AutoModel.from_pretrained * Fix tests * Remove debug statement
-
Joao Gante authored
* fix deberta issues * add different code paths for gpu and tpu * shorter gpu take along axis * Stable Dropout without tf cond * variable must be float
-
Younes Belkada authored
* first commit * correct replace function * add final changes - works like a charm! - cannot implement tests yet - tested * clean up a bit * add bitsandbytes dependencies * working version - added import function - added bitsandbytes utils file * small fix * small fix - fix import issue * fix import issues * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* refactor a bit - move bitsandbytes utils to utils - change comments on functions * reformat docstring - reformat docstring on init_empty_weights_8bit * Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* revert bad formatting * change to bitsandbytes * refactor a bit - remove init8bit since it is useless * more refactoring - fixed init empty weights issue - added threshold param * small hack to make it work * Update src/transformers/modeling_utils.py * Update src/transformers/modeling_utils.py * remove the small hack * modify utils file * make style + refactor a bit * correctly create device map * add correct dtype for device map creation * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* apply suggestions - remove with torch.grad - do not rely on Python bool magic! * add docstring - add docstring for new kwargs * add docstring - comment `replace_8bit_linear` function - fix weird formatting * - added more documentation - added new utility function for memory footprint tracking - colab demo to add * few modifs - typo doc - force cast into float16 when load_in_8bit is enabled * added colab link * add test architecture + docstring a bit * refactor testing class a bit * make style + refactor a bit * enhance checks - add more checks - start writing saving test * clean up a bit * make style * add more details on doc * add more tests - still needs to fix 2 tests * replace by "or" - could not fix it from GitHub GUI Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* refactor testing code a bit + add readme * make style * fix import issue * Update src/transformers/modeling_utils.py Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
* add few comments * add more docstring + make style * more docstring * raise error when loaded in 8bit * make style * add warning if loaded on CPU * add small sanity check * fix small comment * add bitsandbytes on dockerfile * Improve documentation - improve documentation from comments * add few comments * slow tests pass on the VM but not on the CI VM * Fix merge conflict * make style * another test should pass on a multi gpu setup * fix bad import in testing file * Fix slow tests - remove dummy batches - no more CUDA illegal memory errors * modify dockerfile * Update docs/source/en/main_classes/model.mdx * Update Dockerfile * Update model.mdx * Update Dockerfile * Apply suggestions from code review * few modifications - lm head can stay on disk/cpu - change model name so that test pass * change test value - change test value to the correct output - torch bmm changed to baddbmm in bloom modeling when merging * modify installation guidelines * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* replace `n` by `name` * merge `load_in_8bit` and `low_cpu_mem_usage` * first try - keep the lm head in full precision * better check - check the attribute `base_model_prefix` instead of computing the number of parameters * added more tests * Update src/transformers/utils/bitsandbytes.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Merge branch 'integration-8bit' of https://github.com/younesbelkada/transformers into integration-8bit * improve documentation - fix typos for installation - change title in the documentation Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
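Roughly, the workflow the PR enables (a sketch; it assumes `bitsandbytes`, `accelerate`, and a CUDA GPU, and the checkpoint name is only an example):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "bigscience/bloom-1b7"  # illustrative checkpoint
tokenizer = AutoTokenizer.from_pretrained(name)
# Linear layers are loaded as 8-bit bitsandbytes modules; accelerate dispatches them across devices.
model = AutoModelForCausalLM.from_pretrained(name, device_map="auto", load_in_8bit=True)

# get_memory_footprint() is the memory-tracking utility mentioned in the commit message (size in bytes).
print(f"{model.get_memory_footprint() / 2**30:.2f} GiB")

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(0)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```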
-
- 09 Aug, 2022 8 commits
-
-
Steven Liu authored
-
Sylvain Gugger authored
-
YouJiacheng authored
* Recover _init_weights value in no_init_weights, for potential nested use. In addition, users might modify the private no_init_weights as well. * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Remove private variable change check
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Steven Liu authored
-
Nicolas Patry authored
* Adding a new `align_to_words` param to qa pipeline. * Update src/transformers/pipelines/question_answering.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Import protection.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
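To illustrate the new parameter (a sketch; the checkpoint and inputs are just examples):

```python
from transformers import pipeline

qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")
answer = qa(
    question="Where do I live?",
    context="My name is Wolfgang and I live in Berlin.",
    align_to_words=False,  # keep raw token spans instead of snapping answers to word boundaries
)
print(answer)
```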
-
Younes Belkada authored
* attempt to fix attn mask device * fix bart `_prepare_decoder_attention_mask` - add correct device - run `make fix-copies` to propagate the fix
-
Yih-Dar authored
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Thomas Chaigneau authored
* update features * MT5OnnxConfig added and updated with tests and docs * fix imports * fix onnx_config_cls for mt5 Co-authored-by: Thomas Chaigneau <thomas.deeptools.ai>
-