"vscode:/vscode.git/clone" did not exist on "4f403ea8994ee8785aa73957c827938e74cf0fe3"
- 02 Feb, 2022 4 commits
-
-
Patrick von Platen authored
-
NielsRogge authored
* Add torchvision's resize
* Rename torch_resize to default_to_square
* Apply suggestions from code review
* Add support for default_to_square and tuple of length 1
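As a rough illustration of the resize semantics this change describes (a plain int becoming a square unless default_to_square is off, and a tuple of length 1 treated like an int), here is a hedged sketch; the function name and signature are hypothetical, not the actual transformers feature-extractor code:

```python
# Hypothetical sketch of the described resize behavior, built on
# torchvision; NOT the actual transformers implementation.
from PIL import Image
import torchvision.transforms.functional as F

def resize(image: Image.Image, size, default_to_square: bool = True):
    if isinstance(size, (list, tuple)) and len(size) == 1:
        size = size[0]  # a tuple of length 1 is treated like a plain int
    if isinstance(size, int):
        if default_to_square:
            size = (size, size)  # int -> square output by default
        else:
            # torchvision matches the shorter edge when given a single int
            return F.resize(image, size)
    return F.resize(image, list(size))
```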
-
Steven Liu authored
* first draft of pipeline, autoclass, preprocess tutorials
* apply review feedback
* 🖍 apply feedback from patrick/niels
* 📝 add output image to preprocessed image
* 🖍 apply feedback from patrick
-
Steven Liu authored
* add fine-tune tutorial
* make edits, fix style
* 📝 make edits
* 🖍 fix code format links to external libraries
* 🔄 revert code formatting
* 🖍 use DefaultDataCollator instead of DataCollatorWithPadding
-
- 01 Feb, 2022 11 commits
-
-
Sylvain Gugger authored
* Harder check for IndexErrors in QA scripts
* Make test stronger
-
Sylvain Gugger authored
-
Suraj Patil authored
* refactor bart tokenizers
* doc
* replace assert with ValueError
-
Yih-Dar authored
* use mean instead of elementwise_mean
* make style

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
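If "elementwise_mean" here is the deprecated PyTorch reduction string (it was renamed to "mean" with identical averaging behavior), the change amounts to the following sketch; the surrounding loss code is assumed, since the log does not show it:

```python
# Sketch only: PyTorch renamed reduction="elementwise_mean" to "mean";
# both average the per-element losses over all elements.
import torch
from torch import nn

logits = torch.randn(4, 10)
labels = torch.randint(0, 10, (4,))

# before (deprecated): nn.CrossEntropyLoss(reduction="elementwise_mean")
loss_fct = nn.CrossEntropyLoss(reduction="mean")
loss = loss_fct(logits, labels)
```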
-
SaulLu authored
fix the `tokenizer_config.json` file for the slow tokenizer when a fast version is available (#15319)
* add new test
* update test
* remove `tokenizer_file` from `additional_files_names` in `tokenization_utils_base.py`
* add `tokenizer_file` for the fast only tokenizer
* change global variables layoutxml
* remove `"tokenizer_file"` from DPR tokenizer's Global variables
* remove `tokenizer_file` from herbert slow tokenizer init
* `"tokenizer_file"` from LED tokenizer's Global variables
* remove `tokenizer_file` from mbart slow tokenizer init
* remove `tokenizer_file` from slow tokenizer template
* adapt to versioning
* adapt the `test_tokenizer_mismatch_warning` test
* clean test
* clarify `VOCAB_FILES_NAMES` in tokenization_utils_fast.py
* Revert "remove `tokenizer_file` from mbart slow tokenizer init"
  This reverts commit 0dbb723fa9c7599d4640fe30b3647a74eb4a64e1.
* Revert "`"tokenizer_file"` from LED tokenizer's Global variables"
  This reverts commit 5a3f879bdd651233f3d74a3d1146c34cde82b0c2.
* Revert "remove `tokenizer_file` from herbert slow tokenizer init"
  This reverts commit f5e10007b7b0ec5345e015b9de7ffec72c5407fd.
* Revert "remove `"tokenizer_file"` from DPR tokenizer's Global variables"
  This reverts commit da0895330bedfafc81ae3073470a9348c669f032.
* set `tokenizer_file` in super `__init__` of mbart
-
SaulLu authored
* replace assert with exception for `padding_side` arg in `PreTrainedTokenizerBase` `__init__`
* add test
* fix kwargs
* reformat test
* format
* format
* fix typo to render the documentation
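A minimal sketch of the assert-to-exception swap described above, assuming the check lives in `__init__` and accepts only "right" or "left"; the class body is abbreviated and not the verbatim transformers source:

```python
# Illustrative only: raise ValueError instead of relying on assert.
class TokenizerBaseSketch:
    def __init__(self, padding_side: str = "right", **kwargs):
        # before: assert padding_side in ["right", "left"]
        if padding_side not in ("right", "left"):
            raise ValueError(
                f"padding_side should be 'right' or 'left', got {padding_side!r}"
            )
        self.padding_side = padding_side
```

The practical difference: assert statements are stripped when Python runs with `-O`, while a ValueError always fires and gives the caller an actionable message.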
-
Kamal Raj authored
fix typo
-
Suraj Patil authored
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* Fix TF Causal LM models' returned logits
* Fix expected shape in the tests

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
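One plausible reading of this fix, offered only as a hedged sketch: causal LM loss is computed on shifted logits and labels, but the model should still return the full, unshifted logits. Names and shapes below are illustrative, not the actual modeling code:

```python
import tensorflow as tf

def causal_lm_loss_and_logits(logits: tf.Tensor, labels: tf.Tensor):
    # position t predicts token t+1, so shift both tensors for the loss...
    shifted_logits = logits[:, :-1, :]
    shifted_labels = labels[:, 1:]
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    loss = loss_fn(shifted_labels, shifted_logits)
    # ...but return the full [batch, seq_len, vocab] logits to the caller
    return loss, logits
```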
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 31 Jan, 2022 25 commits
-
-
Stas Bekman authored
-
Suraj Patil authored
-
Sylvain Gugger authored
-
peregilk authored
* Update modeling_wav2vec2.py
  With very tiny sound files (less than 0.1 seconds) the num_masked_span can be too long. The issue is described in issue #15366 and discussed with @patrickvonplaten.
* correct errors with mask time indices
* remove bogus file
* make fix-copies

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
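A hedged sketch of the guard this fix implies: when the input is very short, cap the number of masked spans so the spans never overrun the sequence. Variable names mirror the description above, not the exact modeling_wav2vec2.py code:

```python
import numpy as np

def compute_num_masked_span(sequence_length: int, mask_prob: float, mask_length: int) -> int:
    """Illustrative guard: keep num_masked_span * mask_length <= sequence_length."""
    num_masked_span = int(mask_prob * sequence_length / mask_length + np.random.rand())
    # tiny inputs (< 0.1 s of audio) can make this exceed what fits
    return min(num_masked_span, sequence_length // mask_length)
```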
-
Tavin Turner authored
* Add 'with torch.no_grad()' to BEiT integration test forward pass
* Fix inconsistent use of tabs and spaces in indentation
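For readers unfamiliar with the idiom, this is what wrapping a test's forward pass in torch.no_grad() looks like; the tiny linear layer stands in for the BEiT model under test:

```python
import torch
from torch import nn

model = nn.Linear(4, 2)      # stand-in for the model under test
inputs = torch.randn(1, 4)

model.eval()
with torch.no_grad():        # no autograd graph is built, saving memory
    outputs = model(inputs)

assert not outputs.requires_grad  # outputs carry no gradient history
```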
-
Matt authored
* Fix spurious warning in TF TokenClassification models
* Fixing one last spurious warning
* Removing outdated warning altogether
-
Suraj Patil authored
* refactor roberta tokenizer
* refactor fast tokenizer
* remove old comment
-
Suraj Patil authored
-
Yih-Dar authored
* fix tf led
* fix
* fix
* Add test_pt_tf_model_equivalence_extra for TFLED
* add a (temporary) test

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Suraj Patil authored
* add a section about GPUs
* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
Patrick von Platen authored
* [Trainer] suppress warning for length-related columns
* improve message
* Update src/transformers/trainer.py
-
Sylvain Gugger authored
* Change REALM checkpoint to new ones
* Last checkpoint missing
-
Matt authored
-
Yih-Dar authored
* Fix loss calculation in TFFunnelForTokenClassification
* revert the change in TFFunnelForTokenClassification
* fix FunnelForTokenClassification loss
* fix other TokenClassification loss
* fix more
* fix more
* add num_labels to ElectraForTokenClassification
* revert the change to research projects

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
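The usual token-classification loss pattern these fixes converge on flattens the logits with num_labels before cross-entropy; a sketch under that assumption, not the verbatim Funnel/Electra code:

```python
import torch
from torch import nn

batch_size, seq_len, num_labels = 2, 8, 5
logits = torch.randn(batch_size, seq_len, num_labels)
labels = torch.randint(0, num_labels, (batch_size, seq_len))

loss_fct = nn.CrossEntropyLoss()
# flatten so each token position is one classification row of num_labels scores
loss = loss_fct(logits.view(-1, num_labels), labels.view(-1))
```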
-
Stas Bekman authored
* [deepspeed doc] fix import, extra notes
* typo
-
NielsRogge authored
-
Sylvain Gugger authored
-
Ogundepo Odunayo authored
-
NielsRogge authored
* Fix Swin model outputs
* Rename pooler
-
Suraj Patil authored
-
Jonatas Grosman authored
-
Kamal Raj authored
fix typo
-
Julien Plu authored
* Add Luke training
* Fix true label tags
* Fix true label tags
* Fix true label tags
* Update the data collator for Luke
* Some training refactor for Luke
* Improve data collator for Luke
* Fix import
* Fix datasets concatenation
* Add the --max_entity_length argument for Luke models
* Remove unused code
* Fix style issues
* Fix style issues
* Move the Luke training into a separate folder
* Fix style
* Fix naming
* Fix filtering
* Fix filtering
* Fix filter
* Update some preprocessing
* Move luke to research_projects
* Checkstyle
* Address comments
* Fix style
-
François REMY authored
(This is an editorial change only)
-
NielsRogge authored
-