- 03 Apr, 2020 1 commit
-
-
Yohei Tamura authored
* BertJapaneseTokenizer accepts options for mecab
* black
* fix mecab_option to Optional[str]
-
- 02 Apr, 2020 2 commits
-
-
Patrick von Platen authored
-
Patrick von Platen authored
* solve conflicts
* improve comments
-
- 01 Apr, 2020 4 commits
-
-
Patrick von Platen authored
* change tf t5 argument naming for TF 2.2
* correct bug in testing
-
Patrick von Platen authored
-
Anirudh Srinivasan authored
-
Patrick von Platen authored
[T5, Tests] Add extensive hard-coded integration tests and make sure PT and TF give equal results (#3550)
* add some t5 integration tests
* finish summarization and translation integration tests for T5 - results look good
* add tf test
* fix == vs is bug
* fix tf beam search error and make tf t5 tests pass
-
- 31 Mar, 2020 2 commits
-
-
Patrick von Platen authored
* add bad words list
* make style
* add bad_words_tokens
* make style
* better naming
* make style
* fix typo
-
Patrick von Platen authored
-
- 30 Mar, 2020 7 commits
-
-
dougian authored
Co-authored-by: Ioannis Douratsos <ioannisd@amazon.com>
-
Julien Chaumond authored
-
Julien Plu authored
* Update the NER TF script to remove the softmax and set the pad token label id to -1
* Reformat the quality and style
Co-authored-by: Julien Plu <julien.plu@adevinta.com>
-
LysandreJik authored
-
Patrick von Platen authored
-
Patrick von Platen authored
* make decoder input ids optional for t5 training
* lm_labels should not be shifted in t5
* add tests
* finish shift right functionality for PT T5
* move shift right to correct class
* cleaner code
* replace -100 values with pad token id
* add assert statement
* remove unnecessary for loop
* make style
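The "shift right" step named in this commit can be sketched roughly as follows. This is a minimal pure-Python illustration, not the code from the commit: the function name `shift_right`, its signature, and the use of plain lists instead of tensors are all assumptions made for the example. It shows decoder inputs being built by prepending a start token, dropping the last label, and replacing any `-100` values with the pad token id.

```python
from typing import List

# -100 is the label value the commit message says is replaced by the pad token id.
IGNORE_INDEX = -100


def shift_right(
    labels: List[List[int]],
    decoder_start_token_id: int,
    pad_token_id: int,
) -> List[List[int]]:
    """Hypothetical helper: build decoder inputs from labels.

    Each row is shifted one position to the right, the start token is
    placed in front, and IGNORE_INDEX entries become pad_token_id.
    """
    shifted = []
    for row in labels:
        assert len(row) > 0, "each label sequence must be non-empty"
        # Prepend the start token and drop the final label.
        new_row = [decoder_start_token_id] + row[:-1]
        # -100 is only meaningful to the loss, so map it to the pad id.
        new_row = [pad_token_id if tok == IGNORE_INDEX else tok for tok in new_row]
        shifted.append(new_row)
    return shifted


decoder_inputs = shift_right([[5, 6, 1, -100]], decoder_start_token_id=0, pad_token_id=0)
```

With this sketch the labels themselves stay unshifted, which matches the "lm_labels should not be shifted" item above; only the derived decoder inputs move.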
-
Patrick von Platen authored
* Add clear description of how to train T5
* correct docstring in T5
* correct typo
* correct docstring format
* update t5 model docs
* implement collins feedback
* fix typo and add more explanation for sentinel tokens
* delete unnecessary todos
-
- 29 Mar, 2020 1 commit
-
-
Sam Shleifer authored
-
- 27 Mar, 2020 3 commits
-
-
Patrick von Platen authored
* add t5 docs basis
* improve docs
* add t5 docs
* improve t5 docstring
* add t5 tokenizer docstring
* finish docstring
* make style
* add pretrained models
* correct typo
* make examples work
* finalize docs
-
LysandreJik authored
For some reason Sphinx extremely dislikes this and crashes.
-
Sam Shleifer authored
-
- 26 Mar, 2020 10 commits
-
-
Sam Shleifer authored
* trim seq_len below 1024 if there are columns full of pad_token_id
* Centralize trim_batch so SummarizationDataset can use it too
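The trimming described here, dropping columns that contain only `pad_token_id`, might look like the sketch below. Only the name `trim_batch` comes from the commit message; the signature and the use of nested Python lists rather than tensors are assumptions for illustration, so this should not be read as the actual implementation.

```python
from typing import List


def trim_batch(input_ids: List[List[int]], pad_token_id: int) -> List[List[int]]:
    """Illustrative version: remove columns made up entirely of pad_token_id.

    Assumes every row in the batch has the same length.
    """
    if not input_ids:
        return []
    num_cols = len(input_ids[0])
    # A column is kept if at least one row has a non-pad token in it.
    keep = [
        any(row[col] != pad_token_id for row in input_ids)
        for col in range(num_cols)
    ]
    return [[tok for tok, k in zip(row, keep) if k] for row in input_ids]


batch = [
    [1, 2, 0, 0],
    [3, 0, 0, 0],
]
trimmed = trim_batch(batch, pad_token_id=0)
```

Padding inside a kept column is preserved, so shorter rows still line up with the longest row in the batch.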
-
Sam Shleifer authored
-
Sam Shleifer authored
* Dummy inputs to model.device
* Move self.device to ModuleUtilsMixin
-
Sam Shleifer authored
-
Sam Shleifer authored
* delete lm_head, skips weight tying
* Fixed s3
-
sakares saengkaew authored
* Add the missing token classification for XLM
* fix styling
* Add XLMForTokenClassification to AutoModelForTokenClassification class
* Fix docstring typo for non-existing class
* Add the missing token classification for XLM
* fix styling
* fix styling
* Add XLMForTokenClassification to AutoModelForTokenClassification class
* Fix docstring typo for non-existing class
* Add missing description for AlbertForTokenClassification
* fix styling
* Add missing docstring for AlBert
* Slow tests should be slow
Co-authored-by: Sakares Saengkaew <s.sakares@gmail.com>
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
-
Patrick von Platen authored
-
Patrick von Platen authored
* fix merge conflicts
* add t5 summarization example
* change parameters for t5 summarization
* make style
* add first code snippet for translation
* only add prefixes
* add prefix patterns
* make style
* renaming
* fix conflicts
* remove unused patterns
* solve conflicts
* fix merge conflicts
* remove translation example
* remove summarization example
* make sure tensors are in numpy for float comparison
* re-add t5 config
* fix t5 import config typo
* make style
* remove unused numpy statements
* update docstring
* import translation pipeline
-
Patrick von Platen authored
* solve conflicts
* move warnings below
* incorporate changes
* add pad_to_max_length to pipelines
* add bug fix for T5 beam search
* add prefix patterns
* make style
* fix conflicts
* adapt pipelines for task specific parameters
* improve docstring
* remove unused patterns
-
Lysandre Debut authored
-
- 25 Mar, 2020 2 commits
-
-
Patrick von Platen authored
* add new default configs
* change prefix default to None
-
Julien Chaumond authored
* [ci] Also run test_examples in py37 (will revert at the end of the experiment)
* InputExample: use immutable dataclass
* [deps] Install dataclasses for Py<3.7
* [skip ci] Revert "[ci] Also run test_examples in py37" This reverts commit d29afd9959786b77759b0b8fa4e6b4335b952015.
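For readers unfamiliar with the "immutable dataclass" pattern mentioned in this commit, here is a small sketch using the standard-library `dataclasses` module. The field names (`guid`, `text_a`, `text_b`, `label`) are chosen for illustration and are an assumption; the commit message does not list them.

```python
import dataclasses
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InputExample:
    """Illustrative immutable record; the field names are assumed."""

    guid: str
    text_a: str
    text_b: Optional[str] = None
    label: Optional[str] = None


example = InputExample(guid="train-1", text_a="hello world", label="greeting")

# Assigning to a field of a frozen dataclass raises FrozenInstanceError.
try:
    example.label = "other"
except dataclasses.FrozenInstanceError:
    mutated = False
else:
    mutated = True

# To "change" a field, build a new instance instead of mutating the old one.
relabeled = dataclasses.replace(example, label="other")
```

The `dataclasses` module is part of the standard library from Python 3.7, which is presumably why the same commit installs a `dataclasses` package for older interpreters.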
-
- 24 Mar, 2020 3 commits
-
-
Julien Chaumond authored
-
LysandreJik authored
-
Julien Chaumond authored
-
- 23 Mar, 2020 1 commit
-
-
Julien Chaumond authored
see #3359 cc @lysandrejik
-
- 20 Mar, 2020 1 commit
-
-
Patrick von Platen authored
* make style
* fix conflicts
-
- 19 Mar, 2020 3 commits
-
-
Julien Chaumond authored
-
Patrick von Platen authored
* fix conflicts
* update bart max length test
* correct spelling mistakes
* implemented model specific encode function
* fix merge conflicts
* better naming
* save intermediate state -> need to rethink structure a bit
* leave tf problem as it is for now
* current version
* add layers.pop
* remove ipdb
* make style
* clean return cut decoding
* remove ipdbs
* Fix restoring layers in the decoders that don't exist.
* push good intermediate solution for now
* fix conflicts
* always good to refuse to merge conflicts when rebasing
* fix small bug
* improve function calls
* remove unused file
* add correct scope behavior for t5_generate
Co-authored-by: Morgan Funtowicz <funtowiczmo@gmail.com>
-
Lysandre Debut authored
-