- 14 Jun, 2021 5 commits
-
-
Nicholas Broad authored
* Use text_column_name variable instead of "text"
  `text_column_name` was already defined above where I made the changes and it was also used below where I made changes. This is a very minor change. If a dataset does not use "text" as the column name, then the `tokenize_function` will now use whatever column is assigned to `text_column_name`. `text_column_name` is just the first column name if "text" is not a column name. It makes the function a little more robust, though I would assume that 90%+ of datasets use "text" anyway.
* black formatting
* make style
Co-authored-by: Nicholas Broad <nicholas@nmbroad.com>
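The column fallback described in this commit message can be sketched as follows. This is a minimal illustration, not the example script's code: `pick_text_column`, `make_tokenize_function`, and the whitespace tokenizer are hypothetical stand-ins introduced here.

```python
def pick_text_column(column_names):
    """Return "text" when it is a column, otherwise the first column name."""
    return "text" if "text" in column_names else column_names[0]


def make_tokenize_function(text_column_name, tokenizer=str.split):
    # `tokenizer` is a hypothetical stand-in (whitespace split), not a real
    # library tokenizer; it only keeps the sketch self-contained.
    def tokenize_function(examples):
        # Read from the selected column instead of a hard-coded "text" key.
        return {"tokens": [tokenizer(t) for t in examples[text_column_name]]}

    return tokenize_function


# A batch whose text lives in a column that is not called "text".
batch = {"sentence": ["hello world", "a b c"]}
text_column_name = pick_text_column(list(batch))
tokenize_function = make_tokenize_function(text_column_name)
tokenized = tokenize_function(batch)
```

With a hard-coded `examples["text"]` lookup, the batch above would raise a `KeyError`; reading from `text_column_name` avoids that.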
-
Sylvain Gugger authored
* Don't log anything before logging is set up in examples
* Last example
-
Patrick von Platen authored
* fix_torch_device_generate_test
* remove @
* add colab links
-
Suraj Patil authored
* add readme for flax clm
* use section link for tokenizer
* Apply suggestions from code review
  Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* update metrics
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
Patrick von Platen authored
* fix_torch_device_generate_test
* remove @
* upload
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* Update examples/flax/language-modeling/README.md
* add more info
* finish
* fix
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
-
- 11 Jun, 2021 1 commit
-
-
Suraj Patil authored
* first draft
* max_seq_length => block_size
* fix arg names
* fix typos
* fix loss calculation
* add max examples, fix train eval steps, metrics
* optimizer mask
* fix perplexity, metric logging
* fix logging
* data_collator => data_loader
* refactor loss_fn
* support single GPU
* pass distributed to write_metric
* fix jitting
* fix single device training
* fix single device metrics
* close inner progress bars once finished
* add overwrite_cache arg
* fix dataset caching issue
* add more logs
* few small fixes
* address nicholas suggestions
* fix docstr
* address patricks suggestions
* make flake happy
* pass new new_dropout_rng to apply_gradients
* reset train metrics after every epoch
* remove distributed logic, small fixes
-
- 10 Jun, 2021 7 commits
-
-
Bhavitvya Malik authored
* add relevant `desc` in examples
* require_version datasets>=1.8.0
-
Matt authored
-
Matt authored
-
Matt authored
-
Sylvain Gugger authored
-
Matt authored
* Pushing partially-complete new GLUE example
* First draft of the new TF GLUE example! Needs a little more testing to be sure but it's almost ready.
* Fix to the fit() call
* Bugfixes, making sure TPU and multi-GPU support is ready
* Remove logger line that depends on Pytorch
* Style pass
* Deleting old TF GLUE example
* Include label2id and id2label in the saved model config
* Don't clobber the existing model.config.label2id
* Style fixes
* Update examples/tensorflow/text-classification/run_glue.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
kumapo authored
* Add text_column_name and label_column_name to run_ner args
* Minor fix: grouping for text and label column name
-
- 09 Jun, 2021 5 commits
-
-
Stas Bekman authored
-
Suraj Patil authored
-
Anton Lozhkov authored
* Working quantizer forward
* Working quantizer forward
* Clean up unused model parts, test reproducibility
* Working quantizer forward
* Clean up unused model parts, test reproducibility
* Remove custom outputs from the shared ones
* correct conversion
* correct bug
* add first pretrain script
* save intermediate
* static shapes
* save intermediate
* finish first pretrain script version
* more refactor
* remove wandb
* refactor more
* improve test
* correct perplexity compute bug
* finish model implementation
* add to docs
* finish docs
* finish pretraining script
* finish pretraining script
* remove wandb
* finish PR for merge
* finish config
* finish
* make deepspeed work
* Apply suggestions from code review
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* apply suggestions
* fix flaky test
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Stas Bekman authored
-
Koichi Yasuoka authored
-
- 08 Jun, 2021 6 commits
-
-
Stas Bekman authored
* wip
* wip - but working with https://github.com/microsoft/DeepSpeed/pull/1044
* cleanup
* workaround
* working 5/8 modes
* solve fp32 distributed zero3
* style
* sync
* sync
* rework
* deprecation
* cleanup
* https://github.com/microsoft/DeepSpeed/pull/1044 pr was merged
* clean up
* add a guide
* more prose
* more prose
* fix
* more prose
* sub_group_size was too big
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* refactor
* bug fix
* make the true check explicit
* new deepspeed release
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Sylvain Gugger authored
-
cdleong authored
* Add torch to requirements.txt in language-modeling
* Update examples/pytorch/language-modeling/requirements.txt
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Mario Šaško authored
* Replace legacy torch.Tensor constructor with torch.{tensor, empty}
* Remove torch.Tensor in examples
-
Shamane Siri authored
* updated the original RAG implementation to be compatible with the latest PL version
* updated the requirements.txt file
* execute make style
* code quality test
* code quality
* conflict resolved in requirements.txt
* code quality
* changed the MyDDP class name to CustomDDP
-
Russell Klopfer authored
* adds metric prefix.
* update tests to include prefix
-
- 03 Jun, 2021 2 commits
-
-
Patrick von Platen authored
* fix_torch_device_generate_test
* remove @
* finish refactor
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
-
Nicholas Vadivelu authored
* Fix weight decay masking in `run_flax_glue.py`
  Issues with the previous implementation:
  - The `dict` from `traverse_util.flatten_dict` has keys which are tuples of strings, not one long string with the path separated by periods.
  - `optax.masked` applies the transformation wherever the mask is True, so the masks are flipped.
  - Flax's LayerNorm calls the scale parameter `scale` not `weight`
* Fix formatting with black
* adapt results
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
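The three points in this commit message can be illustrated with a small pure-Python sketch. It assumes a nested parameter dict; the local `flatten_dict` is a hand-written stand-in that only mimics the tuple-keyed output the message describes, and `decay_mask` is a hypothetical helper name, not the script's function.

```python
def flatten_dict(tree, prefix=()):
    """Flatten a nested dict into {tuple_of_keys: leaf}.

    Hand-written stand-in for illustration only; keys are tuples of
    strings, as the commit message describes.
    """
    flat = {}
    for key, value in tree.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            flat.update(flatten_dict(value, path))
        else:
            flat[path] = value
    return flat


def decay_mask(params):
    # True means "apply weight decay here", following the statement that
    # the transformation is applied wherever the mask is True.
    # Biases and the LayerNorm `scale` parameter are excluded.
    return {
        path: path[-1] != "bias" and path[-2:] != ("LayerNorm", "scale")
        for path in flatten_dict(params)
    }


# Toy parameter tree; the leaf values are placeholders, not real weights.
params = {
    "encoder": {
        "dense": {"kernel": 0.0, "bias": 0.0},
        "LayerNorm": {"scale": 0.0, "bias": 0.0},
    }
}
mask = decay_mask(params)
```

Comparing tuple suffixes such as `path[-2:]` avoids the original mistake of matching against a period-joined string that never appears as a key.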
-
- 02 Jun, 2021 1 commit
-
-
dependabot[bot] authored
Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.25.8 to 1.26.5.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](https://github.com/urllib3/urllib3/compare/1.25.8...1.26.5)
---
updated-dependencies:
- dependency-name: urllib3
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
- 01 Jun, 2021 2 commits
-
-
Fan Zhang authored
* modify qa-trainer
* fix flax model
-
Shamane Siri authored
* initial
* code quality test
* code quality
* added test functions in test_modeling_rag.py and test_retrieval_rag.py to test end2end retriever
* minor change in test_modeling_rag
* fixed tests
* Update examples/research_projects/rag-end2end-retriever/README.md
  typo corrected as suggested by lhoestq
  Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
* Update examples/research_projects/rag-end2end-retriever/finetune_rag.py
  type change suggested by lhoestq
  Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
* Update src/transformers/models/rag/retrieval_rag.py
  Adding this change as mentioned by lhoestq.
  Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
* completed the minor changes suggested by the reviewers
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
-
- 31 May, 2021 2 commits
-
-
Philip May authored
* Add MT5ForConditionalGeneration as supported arch.
* Update README.md
-
Nicholas Vadivelu authored
* Remove redundant `nn.log_softmax` in `run_flax_glue.py`
  `optax.softmax_cross_entropy` expects unnormalized logits, and so it already calls `nn.log_softmax`, so I believe it is not needed here. `nn.log_softmax` is idempotent so mathematically it shouldn't have made a difference.
* Remove unused 'flax.linen' import
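The idempotence claim in this commit message can be checked numerically. The sketch below uses plain Python rather than the libraries named in the message; `log_softmax` and `cross_entropy_from_logits` are hand-written illustrations of the math, not the library implementations.

```python
import math


def log_softmax(xs):
    """Numerically stable log-softmax over a list of floats."""
    m = max(xs)
    log_sum = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - log_sum for x in xs]


def cross_entropy_from_logits(logits, one_hot):
    # Normalizes internally, so callers may pass unnormalized logits.
    return -sum(t * lp for t, lp in zip(one_hot, log_softmax(logits)))


logits = [2.0, -1.0, 0.5]
labels = [0.0, 1.0, 0.0]

once = log_softmax(logits)
twice = log_softmax(once)  # applying it again should change nothing

loss_raw = cross_entropy_from_logits(logits, labels)
loss_pre_normalized = cross_entropy_from_logits(once, labels)
```

Because the exponentials of log-softmax outputs already sum to one, the second normalization subtracts `log(1) = 0`, which is why the extra call was redundant rather than wrong.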
-
- 26 May, 2021 1 commit
-
-
Avital Oliver authored
-
- 25 May, 2021 4 commits
-
-
Stas Bekman authored
* create custom model on the fly
* better wording
* add update_from_string
* cleanup
* cleanup
* Update src/transformers/configuration_utils.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* more bool options
* style
* fix logger
* add test
* add the doc
* assert on conflict of options
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Stas Bekman authored
* fix overflow in perplexity calc
* use inf
* fix
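The commit message only says that the overflow was fixed and that `inf` is used. One plausible shape of such a guard, assuming perplexity is computed as the exponential of the evaluation loss, is sketched below; `safe_perplexity` is a hypothetical name and not taken from the commit.

```python
import math


def safe_perplexity(eval_loss):
    """Return exp(eval_loss), falling back to infinity on overflow."""
    try:
        return math.exp(eval_loss)
    except OverflowError:
        # math.exp raises for large inputs instead of returning inf.
        return float("inf")


small = safe_perplexity(0.0)
huge = safe_perplexity(1000.0)
```

Returning infinity keeps metric logging running on a diverged model instead of crashing at the end of evaluation.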
-
Sylvain Gugger authored
* Add option to log only once in multinode training
* Use an alternate property
-
Wang Ran (汪然) authored
-
- 24 May, 2021 1 commit
-
-
Patrick von Platen authored
* fix_torch_device_generate_test
* remove @
* change pytorch import to flax import
-
- 21 May, 2021 3 commits
-
-
Patrick von Platen authored
* fix_torch_device_generate_test
* remove @
* add flax glue link
-
Patrick von Platen authored
* fix_torch_device_generate_test
* remove @
* correct best seed for flax fine-tuning
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
-
Patrick von Platen authored
* speed up flax glue
* remove unnecessary line
* remove folder
* remove run in loop
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
-