- 12 Apr, 2021 9 commits
-
-
Philipp Schmid authored
* increased train_runtime for model parallelism * added documentation for framework upgrade
-
Lysandre Debut authored
-
cronoik authored
-
NielsRogge authored
* First draft of deit * More improvements * Remove DeiTTokenizerFast from init * Conversion script works * Add DeiT to ViT conversion script * Add tests, add head model, add support for deit in vit conversion script * Update model checkpoint names * Update image_mean and image_std, set resample to bicubic * Improve docs * Docs improvements * Add DeiTForImageClassificationWithTeacher to init * Address comments by @sgugger * Improve feature extractors * Make fix-copies * Minor fixes * Address comments by @patil-suraj * All models uploaded * Fix tests * Remove labels argument from DeiTForImageClassificationWithTeacher * Fix-copies, style and quality * Fix tests * Fix typo * Multiple docs improvements * More docs fixes
-
Takuya Makino authored
-
fghuman authored
* Added documentation for data collator. * Update docs/source/data_collator.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Added documentation for data collator. * Added documentation for the data collator. * Merge branch 'doc_DataCollator' of C:\Users\mahii\PycharmProjects\transformers with conflicts. * Update documentation for the data collator. * Update documentation for the data collator. Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Amna <A.A.Ahmad@student.tudelft.nl>
-
Masatoshi TSUCHIYA authored
* model_path is refered as the path of the trainer, and should be ignored as the checkpoint path. * Improved according to Sgugger's comment.
-
Sylvain Gugger authored
-
cronoik authored
-
- 09 Apr, 2021 13 commits
-
-
Sylvain Gugger authored
-
Lysandre authored
-
Philipp Schmid authored
* added json dump and extraction of train run time * make style happy
-
Stas Bekman authored
* fix _LazyModule hasher error * reword
-
Suraj Patil authored
* keep a list of multilingual tokenizers * add forced_bos_token argument
-
Kevin Canwen Xu authored
* Add a special tokenizer for CPM model * make style * fix * Add docs * styles * cpm doc * fix ci * fix the overview * add test * make style * typo * Custom tokenizer flag * Add REAMDE.md Co-authored-by:Lysandre <lysandre.debut@reseau.eseo.fr>
-
Sylvain Gugger authored
-
Saviour Owolabi authored
Corrected a typo ('Downlowd' to 'Download') -
Keisuke Hirota authored
* Change duplicated LogitsProcessor to LogitsWarper in LogitsProcessorList document * Write more detailed information about LogitsProcessor's scores argument * apply suggestion from review * style Co-authored-by:Suraj Patil <surajp815@gmail.com>
-
Niklas Muennighoff authored
* Add Wav2Vec Inference notebook * Update docs/source/community.md Co-authored-by:Suraj Patil <surajp815@gmail.com>
-
Stas Bekman authored
* typo * style
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
- 08 Apr, 2021 14 commits
-
-
Stas Bekman authored
* make fairscale and deepspeed setup extras * fix default * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * no reason not to ask for the good version * update the CIs Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Sylvain Gugger authored
* Add support for multiple models for one config in auto classes * Use get_values everywhere * Prettier doc
-
Stas Bekman authored
* extras[doc] must include 'all' * fix * better * regroup
-
Stas Bekman authored
* relocate core integration tests * add sys.path context manager * cleanup * try * try2 * fix path * doc * style * add dep * add 2 more deps
-
Andrea Cappelli authored
* Add mlm collator pad to multiple option (#10627) * Use padding to 8x in run mlm (#10627)
-
Sylvain Gugger authored
-
Philipp Schmid authored
-
Lysandre Debut authored
* Add fairscale and deepspeed back to the CI * Add deepspeed to single GPU tests
-
Stas Bekman authored
* solve "scheduler before optimizer step" warning * style * correct the state evaluation test
-
Julien Demouth authored
* Add support for NVIDIA Megatron models * Add support for NVIDIA Megatron GPT2 and BERT Add the megatron_gpt2 model. That model reuses the existing GPT2 model. This commit includes a script to convert a Megatron-GPT2 checkpoint downloaded from NVIDIA GPU Cloud. See examples/megatron-models/README.md for details. Add the megatron_bert model. That model is implemented as a modification of the existing BERT model in Transformers. This commit includes a script to convert a Megatron-BERT checkpoint downloaded from NVIDIA GPU Cloud. See examples/megatron-models/README.md for details. * Update src/transformers/models/megatron_bert/configuration_megatron_bert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/megatron_bert/configuration_megatron_bert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/megatron_bert/configuration_megatron_bert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Remove model.half in tests + add "# Copied ..." Remove the model.half() instruction which makes tests fail on the CPU. Add a comment "# Copied ..." before many classes in the model to enable automatic tracking in CI between the new Megatron classes and the original Bert ones. * Fix issues * Fix Flax/TF tests * Fix copyright * Update src/transformers/models/megatron_bert/configuration_megatron_bert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/megatron_bert/configuration_megatron_bert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update docs/source/model_doc/megatron_bert.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/megatron_gpt2.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/__init__.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Resolve most of 'sgugger' comments * Fix conversion issue + Run make fix-copies/quality/docs * Apply suggestions from code review * Causal LM & merge * Fix init * Add CausalLM to last auto class Co-authored-by:
Julien Demouth <jdemouth@nvidia.com> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Lysandre <lysandre.debut@reseau.eseo.fr>
-
Stas Bekman authored
* synced gpus * fix * fix * need to use t5-small for quality tests * notes * complete merge * fix a disappearing std stream problem * start zero3 tests * wip * tune params * sorting out the pre-trained model loading * reworking generate loop wip * wip * style * fix tests * split the tests * refactor tests * wip * parameterized * fix * workout the resume from non-ds checkpoint pass + test * cleanup * remove no longer needed code * split getter/setter functions * complete the docs * suggestions * gpus and their compute capabilities link * Apply suggestions from code review Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * style * remove invalid paramgd * automatically configure zero3 params that rely on hidden size * make _get_resized_embeddings zero3-aware * add test exercising resize_token_embeddings() * add docstring Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
Stas Bekman authored
* clarify why we get the warning here * Update examples/language-modeling/run_clm.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * wording * style Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Yusuke Mori authored
-
Jannis Born authored
* fix: docstrings in prediction_step * ci: Satisfy line length requirements * ci: character length requirements
-
- 07 Apr, 2021 4 commits
-
-
Sylvain Gugger authored
-
Philipp Schmid authored
* added model_kwargs to infer_framework_from_model * added model_kwargs to tokenizer * added use_auth_token as named parameter * added dynamic get for use_auth_token
-
Stas Bekman authored
* handle version requirement ranges * add mixed requirement test * cleanup
-
Vasudev Gupta authored
-