- 16 Mar, 2022 3 commits
-
-
Anton Lozhkov authored
* Minor fixes * Fix vocab union * Update examples/research_projects/xtreme-s/README.md Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * Update README * unused import Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
-
Sanchit Gandhi authored
* Replace all deprecated `jax.ops` operations with jnp's `at` * np to jnp scores * suggested changes
-
Patrick von Platen authored
-
- 15 Mar, 2022 1 commit
-
-
Anton Lozhkov authored
* CTC+classification draft * CTC+classification draft * style * multilingual runs * Fix race condition during processor.from_pretrained * Merge covost experiments * Add README * Quality * Switch to .all configs * Fix typos
-
- 12 Mar, 2022 1 commit
-
-
Stas Bekman authored
* [WIP] add support for bf16 mode * prep for bf16 * prep for bf16 * fix; zero2/bf16 is ok * check bf16 is available * test fixes * enable zero3_bf16 * config files * docs * split stage_dtype; merge back to non-dtype-specific config file * fix doc * cleanup * cleanup * bfloat16 => bf16 to match the PR changes * s/zero_gather_fp16_weights_on_model_save/zero_gather_16bit_weights_on_model_save/; s/save_fp16_model/save_16bit_model/ * test fixes/skipping * move * fix * Update docs/source/main_classes/deepspeed.mdx Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * backticks * cleanup * cleanup * cleanup * new version * add note about grad accum in bf16 Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 10 Mar, 2022 1 commit
-
-
Sanchit Gandhi authored
-
- 04 Mar, 2022 1 commit
-
-
Sanchit Gandhi authored
-
- 02 Mar, 2022 1 commit
-
-
Ross Johnstone authored
-
- 21 Feb, 2022 1 commit
-
-
Ivan Agarský authored
-
- 15 Feb, 2022 1 commit
-
-
Shamane Siri authored
-
- 11 Feb, 2022 1 commit
-
-
Stas Bekman authored
* [research_projects] deal with security alerts * add a note of the original PL ver and warning
-
- 09 Feb, 2022 1 commit
-
-
Lysandre Debut authored
* Upgrade black to version ~=22.0 * Check copies * Fix code
-
- 07 Feb, 2022 1 commit
-
-
Anton Lozhkov authored
* Single-epoch run * Apply suggestions from code review Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * Infinite dataset * Trainer fix + distributed benchmark * Benchmark fix * unused import * interleaved splits * interleaved splits * has_length util * Move to research projects * Leftover Sized checks * Bump min version * Unused import * Revert trainer changes Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
-
- 31 Jan, 2022 2 commits
-
-
Jonatas Grosman authored
-
Julien Plu authored
* Add Luke training * Fix true label tags * Fix true label tags * Fix true label tags * Update the data collator for Luke * Some training refactor for Luke * Improve data collator for Luke * Fix import * Fix datasets concatenation * Add the --max_entity_length argument for Luke models * Remove unused code * Fix style issues * Fix style issues * Move the Luke training into a separate folder * Fix style * Fix naming * Fix filtering * Fix filtering * Fix filter * Update some preprocessing * Move luke to research_projects * Checkstyle * Address comments * Fix style
-
- 27 Jan, 2022 4 commits
-
-
dependabot[bot] authored
Bumps [numpy](https://github.com/numpy/numpy) from 1.19.2 to 1.21.0. - [Release notes](https://github.com/numpy/numpy/releases) - [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst.txt) - [Commits](https://github.com/numpy/numpy/compare/v1.19.2...v1.21.0) --- updated-dependencies: - dependency-name: numpy dependency-type: direct:production ... Signed-off-by:
dependabot[bot] <support@github.com> Co-authored-by:
dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
dependabot[bot] authored
Bumps [notebook](http://jupyter.org) from 6.1.5 to 6.4.1. --- updated-dependencies: - dependency-name: notebook dependency-type: direct:production ... Signed-off-by:
dependabot[bot] <support@github.com> Co-authored-by:
dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
dependabot[bot] authored
Bumps [numpy](https://github.com/numpy/numpy) from 1.19.2 to 1.21.0. - [Release notes](https://github.com/numpy/numpy/releases) - [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst.txt) - [Commits](https://github.com/numpy/numpy/compare/v1.19.2...v1.21.0) --- updated-dependencies: - dependency-name: numpy dependency-type: direct:production ... Signed-off-by:
dependabot[bot] <support@github.com> Co-authored-by:
dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Anton Lozhkov authored
* Device argument for the eval script * Default to none * isort
-
- 24 Jan, 2022 1 commit
-
-
Patrick von Platen authored
-
- 21 Jan, 2022 2 commits
-
-
Patrick von Platen authored
-
lewtun authored
* Move BART + ONNX example to research_projects * Add author information
-
- 20 Jan, 2022 2 commits
-
-
Anton Lozhkov authored
Clarify OVH instruction
-
Anton Lozhkov authored
Add an OVHcloud tutorial URL for the Robust Speech Challenge
-
- 19 Jan, 2022 5 commits
-
-
Patrick von Platen authored
-
Suraj Patil authored
-
Patrick von Platen authored
-
Patrick von Platen authored
-
Patrick von Platen authored
-
- 18 Jan, 2022 1 commit
-
-
Patrick von Platen authored
* up * improve readme * up * up * more info * up * up * Apply suggestions from code review Co-authored-by:
Anton Lozhkov <aglozhkov@gmail.com> * add more stuff for eval * update * up * Update README.md * Update examples/research_projects/xls_r/README.md Co-authored-by:
Omar Sanseviero <osanseviero@users.noreply.github.com> * apply omar's suggestions Co-authored-by:
Anton Lozhkov <aglozhkov@gmail.com> Co-authored-by:
Omar Sanseviero <osanseviero@users.noreply.github.com>
-
- 12 Jan, 2022 1 commit
-
-
Leandro von Werra authored
-
- 10 Jan, 2022 1 commit
-
-
Patrick von Platen authored
* up * up * up * up * up * up * improve * up * up * Update src/transformers/trainer.py * up * up * up
-
- 23 Dec, 2021 1 commit
-
-
Leandro von Werra authored
-
- 13 Dec, 2021 1 commit
-
-
Nathan Cooper authored
* Add some nicety flags for better controlling evaluation. * Fix dependency issue with outdated requirement * Add additional flag to example to ensure eval is done * Wrap code into main function for accelerate launcher to find * Fix valid batch size flag in readme * Add note to install git-lfs when initializing/training the model * Update examples/research_projects/codeparrot/scripts/arguments.py Co-authored-by:
Leandro von Werra <lvwerra@users.noreply.github.com> * Update examples/research_projects/codeparrot/README.md Co-authored-by:
Leandro von Werra <lvwerra@users.noreply.github.com> * Revert "Wrap code into main function for accelerate launcher to find" This reverts commit ff11df1c810d4df198d04b827538eb4572147ba3. * Fix formatting issue * Move git-lfs instructions to installation section * Add a quick check before code generation for code evaluation * Fix styling issue * Update examples/research_projects/codeparrot/scripts/human_eval.py Co-authored-by:
Leandro von Werra <lvwerra@users.noreply.github.com> * Make iterable dataset use passed in tokenizer rather than globally defined one Co-authored-by:
Leandro von Werra <lvwerra@users.noreply.github.com> Co-authored-by:
ncoop57 <nac33@students.uwf.edu>
-
- 06 Dec, 2021 1 commit
-
-
Julien Chaumond authored
* Replace outdated model tags with their now-canonical pipeline types * spam the CI till it's green
-
- 02 Dec, 2021 1 commit
-
-
Leandro von Werra authored
* add readme skeleton * update readme * add initialization script * add deduplication script * add codeparrot training script * add code generation evaluation * add validation loss script * add requirements * update readme * tweak readme * make style * add highlights to readme * add CLIs to scripts * add tokenizer training script * add docstring to constant length dataset * fix defaults in arguments * update readme with cli * move image to hub * tweaks of readme * fix cli commands * add author * explain env variables * fix formatting * Update examples/research_projects/codeparrot/README.md Co-authored-by:
lewtun <lewis.c.tunstall@gmail.com> * Apply suggestions from code review Co-authored-by:
lewtun <lewis.c.tunstall@gmail.com> * replace generic with gpt2 tokenizer Co-authored-by:
lewtun <lewis.c.tunstall@gmail.com>
-
- 30 Nov, 2021 1 commit
-
-
Thomas Viehmann authored
* use functional interface instead of instantiating module and immediately calling it * fix torch.nn.functional to nn.functional. Thank you Stas!
-
- 22 Nov, 2021 1 commit
-
-
Nicholas Broad authored
* remove sum for list flattening * change to chain(*) * make chain object a list * delete empty lines per sgugger's suggestions Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Nicholas Broad <nicholas@nmbroad.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 19 Nov, 2021 1 commit
-
-
Shang Zhang authored
* clean up branch for add-qdqbert-model * README update for QAT example; update docstrings in modeling_qdqbert.py * Update qdqbert.rst * Update README.md * Update README.md * calibration data using training set; QAT example runs in fp32 * re-use BERTtokenizer for qdqbert * Update docs/source/model_doc/qdqbert.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/qdqbert.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/qdqbert.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * remove qdqbert tokenizer * Update qdqbert.rst * update evaluate-hf-trt-qa.py * update configuration_qdqbert.py * update modeling_qdqbert.py: add copied statement; replace assert with ValueError * update copied from statement * add is_quantization_available; run make fix-copies * unittest add require_quantization * add backend dependency to qdqbert model * update README; update evaluate script; make style * lint * docs qdqbert update * circleci build_doc add pytorch-quantization for qdqbert * update README * update example readme with instructions to upgrade TensorRT to 8.2 * Update src/transformers/models/qdqbert/configuration_qdqbert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/qdqbert/configuration_qdqbert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/qdqbert/configuration_qdqbert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/qdqbert/configuration_qdqbert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * change quantization to pytorch_quantization for backend requirement * feed_forward_chunking not supported in QDQBert * make style * update model docstrings and comments in testing scripts * rename example to quantization-qdqbert; rename example scripts from qat to quant * Update src/transformers/models/qdqbert/modeling_qdqbert.py Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * rm experimental functions in quant_trainer * qa cleanup * make fix-copies for docs index.rst * fix doctree; use post_init() for qdqbert * fix early device assignment for qdqbert * fix CI:Model templates runner Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
-
- 17 Nov, 2021 1 commit
-
-
Antonio Carlos Falcão Petri authored
Co-authored-by: Stas Bekman <stas@stason.org>
-