- 07 May, 2020 15 commits
-
-
Savaş Yıldırım authored
-
Julien Chaumond authored
-
Julien Chaumond authored
* README * Update README.md
-
Lysandre authored
-
Lysandre authored
-
Julien Chaumond authored
* Created using Colaboratory * [examples] reorganize files * remove run_tpu_glue.py as superseded by TPU support in Trainer * Bugfix: int, not tuple * move files around
-
Julien Chaumond authored
cc @patrickvonplaten @thomwolf
-
Lysandre Debut authored
* wip * wip * one last wip * Better logging when using TPUs * Correct argument name * Tests * fix * Metrics in evaluation * Update src/transformers/training_args.py * [tpu] Use launcher script instead * [tpu] lots of tweaks * Fix formatting Co-authored-by: Julien Chaumond <chaumond@gmail.com>
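A rough sketch of driving a training script through the TPU launcher mentioned above; the launcher path, flag names, and GLUE script path are assumptions based on the examples layout, not taken verbatim from this commit:

```python
import subprocess

# Hypothetical invocation of the TPU launcher script; xla_spawn.py,
# --num_cores, and the run_glue.py arguments are all assumptions here.
subprocess.run([
    "python", "examples/xla_spawn.py", "--num_cores", "8",
    "examples/run_glue.py",
    "--model_name_or_path", "bert-base-cased",
    "--output_dir", "./glue_out",
], check=True)
```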
-
Funtowicz Morgan authored
Ensure the fast tokenizer can construct a tensor without a pad token if only one sample is provided. (#4201)
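As a rough illustration of the case being fixed (the model choice and call are illustrative, assuming a model that defines no pad token, such as GPT-2):

```python
from transformers import AutoTokenizer

# GPT-2 defines no pad token; with a single sample there is nothing to
# pad, so the fast tokenizer can still return a tensor directly.
tokenizer = AutoTokenizer.from_pretrained("gpt2", use_fast=True)
encoded = tokenizer.encode_plus("one sample, no padding needed", return_tensors="pt")
print(encoded["input_ids"].shape)  # torch.Size([1, n])
```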
-
Funtowicz Morgan authored
* Rewritten batch support in pipelines. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Fix imports sorting 🔧 Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Set pad_to_max_length=True by default on Pipeline. * Set pad_to_max_length=False for generation pipelines. Most generation models don't have a padding token. * Address @joeddav review comment: Uniformized *args. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Address @joeddav review comment: Uniformized *args (second). Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
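A short usage sketch of the batched pipeline behavior described above (task and inputs are illustrative):

```python
from transformers import pipeline

# Classification pipelines now pad batches to the longest sequence by
# default; generation pipelines skip padding, since most generation
# models have no padding token.
classifier = pipeline("sentiment-analysis")
print(classifier("Batching in pipelines just got simpler."))
print(classifier([
    "Batching in pipelines just got simpler.",
    "Short input.",
]))
```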
-
Patrick von Platen authored
-
Patrick von Platen authored
* fix reformer example * fix error message and example docstring * improved error message
-
Patrick von Platen authored
-
Patrick von Platen authored
* first copy & paste commit from Bert and Morgan's LSH code * add easy way to compare to trax original code * translate most of the functions * make trax lsh self attention deterministic with numpy seed + copy paste code * add same config * add same config * make layer init work * implemented hash_vectors function for lsh attention * continue reformer translation * hf LSHSelfAttentionLayer gives same output as trax layer * refactor code * refactor code * refactor code * refactor * refactor + add reformer config * delete bogus file * split reformer attention layer into two layers * save intermediate step * save intermediate step * make test work * add complete reformer block layer * finish reformer layer * implement causal and self mask * clean reformer test and refactor code * fix merge conflicts * fix merge conflicts * update init * fix device for GPU * fix chunk length init for tests * include Morgan's optimization * improve memory a bit * improve comment * factorize num_buckets * better testing parameters * make whole model work * make lm model work * add t5 copy paste tokenizer * add chunking feed forward * clean config * add improved assert statements * make tokenizer work * improve test * correct typo * extend config * add more complex test * add new axial position embeddings * add local block attention layer * clean tests * refactor * better testing * save intermediate progress * clean test file * make shorter input length work for model * allow variable input length * refactor * make forward pass for pretrained model work * add generation possibility * finish dropout and init * make style * refactor * add first version of RevNet Layers * make forward pass work and add convert file * make uploaded model forward pass work * make uploaded model forward pass work * refactor code * add namedtuples and cache buckets * correct head masks * refactor * made reformer more flexible * make style * remove set max length * add attention masks * fix up tests * fix lsh attention mask * make random seed optional for the moment * improve memory in reformer * add tests * make style * make sure masks work correctly * detach gradients * save intermediate * correct backprop through gather * make style * change back num hashes * rename to labels * fix rotation shape * fix detach * update * fix trainer * fix backward dropout * make reformer more flexible * fix conflict * fix * fix * add tests for fixed seed in reformer layer * fix trainer typo * fix typo in activations * add fp16 tests * add fp16 training * support fp16 * correct gradient bug in reformer * add fast gelu * re-add dropout for embedding dropout * better naming * better naming * renaming * finalize test branch * finalize tests * add more tests * finish tests * fix * fix type trainer * fix fp16 tests * fix tests * fix tests * fix tests * fix issue with dropout * fix dropout seeds * correct random seed on gpu * finalize random seed for dropout * finalize random seed for dropout * remove duplicate line * correct half precision bug * make style * refactor * refactor * docstring * remove sinusoidal position encodings for reformer * move chunking to modeling_utils * make style * clean config * make style * fix tests * fix auto tests * pretrained models * fix docstring * update conversion file * Update pretrained_models.rst * fix rst * fix rst * update copyright * fix test path * fix test path * fix small issue in test * include reformer in generation tests * add docs for axial position encoding * finish docs * Update convert_reformer_trax_checkpoint_to_pytorch.py * remove isort * include Sam's comments * remove wrong comment in utils * correct typos * fix typo * Update reformer.rst * applied Morgan's optimization * make style * make gpu compatible * remove bogus file * big test refactor * add example for chunking * fix typo * add to README
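For orientation, here is a toy sketch of the angular-LSH bucketing idea behind the hash_vectors function referenced above; it follows the Reformer paper's scheme and is not the exact code added in this commit:

```python
import torch

def hash_vectors(vectors: torch.Tensor, num_buckets: int, num_hashes: int) -> torch.Tensor:
    """Toy angular LSH: nearby vectors land in the same bucket with high
    probability. vectors has shape (seq_len, dim)."""
    seq_len, dim = vectors.shape
    # One random rotation per hash round; argmax over [x, -x] picks one
    # of num_buckets half-spaces, as in the Reformer paper.
    rotations = torch.randn(dim, num_hashes, num_buckets // 2)
    rotated = torch.einsum("sd,dhb->shb", vectors, rotations)
    return torch.argmax(torch.cat([rotated, -rotated], dim=-1), dim=-1)

buckets = hash_vectors(torch.randn(16, 64), num_buckets=8, num_hashes=2)
print(buckets.shape)  # torch.Size([16, 2]): one bucket id per hash round
```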
-
- 06 May, 2020 7 commits
-
-
Clement authored
-
Julien Plu authored
* First commit to add a TF version of the trainer. * Make the TF trainer closer to the PT trainer * Refactoring common code between the PT and TF trainer into a utility file. * Some bugfixes + better alignment with the PT trainer * Add missing class in transformers init * Bugfix for prediction + use classification report instead of simple metrics * Fix name error * Fix optimization tests + style * Apply style * Several bugfixes for multi-GPU training * Apply style * Apply style * Add glue example for the TF trainer * Several bugfixes + address the reviews * Fix on the TF training args file * Add a debug mode * Bugfix in utils_ner.py when segment_ids is None * Apply style * Apply style * Add TPU strategy * Fix selection strategy
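A minimal usage sketch of the TF trainer introduced here; the dataset variables are placeholders, and the exact argument surface may differ from this version:

```python
import tensorflow as tf
from transformers import (TFAutoModelForSequenceClassification,
                          TFTrainer, TFTrainingArguments)

# train_ds / eval_ds are assumed to be pre-tokenized tf.data.Dataset
# objects yielding (features, labels) pairs.
training_args = TFTrainingArguments(output_dir="./tf_trainer_out")
model = TFAutoModelForSequenceClassification.from_pretrained("bert-base-cased")
trainer = TFTrainer(model=model, args=training_args,
                    train_dataset=train_ds, eval_dataset=eval_ds)
trainer.train()
print(trainer.evaluate())
```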
-
Simone Primarosa authored
-
kumapo authored
-
martindh authored
Model card describing the camembert-large-fquad model.
-
Julien Plu authored
-
Manuel Romero authored
-
- 05 May, 2020 4 commits
-
-
Patrick von Platen authored
-
-
Lysandre Debut authored
* Standard deviation can no longer be set to 0 * Remove torch pinned version * 9th instead of 10th, silly me
-
Boris Dayma authored
* feat: add logging through Weights & Biases * feat(wandb): make logging compatible with all scripts * style(trainer.py): fix formatting * [Trainer] Tweak wandb integration Co-authored-by: Julien Chaumond <chaumond@gmail.com>
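A sketch of how the new Weights & Biases logging is picked up; the project name is an assumption, and it presumes the wandb package is installed and logged in:

```python
import os

# Assumed project name; wandb reads it from the environment.
os.environ["WANDB_PROJECT"] = "transformers-demo"

from transformers import Trainer
# With wandb installed, the Trainer detects it and reports training loss
# and eval metrics to W&B during trainer.train(); no extra code is needed.
```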
-
- 04 May, 2020 2 commits
-
-
jaymody authored
-
Patrick von Platen authored
* Hoist bert model tester for Patrick * indent * make tests work * Update tests/test_modeling_bert.py Co-authored-by: Julien Chaumond <chaumond@gmail.com> Co-authored-by: sshleifer <sshleifer@gmail.com> Co-authored-by: Julien Chaumond <chaumond@gmail.com>
-
- 03 May, 2020 1 commit
-
-
Lorenzo Ampil authored
-
- 02 May, 2020 11 commits
-
-
Zhiyu Lin authored
* Fix for issue #2941: reshaped the score array to avoid a `numpy` ValueError. * Update src/transformers/pipelines.py * Update src/transformers/pipelines.py Co-authored-by: Julien Chaumond <chaumond@gmail.com>
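An illustration of the kind of reshape applied; the array and shapes are made up, the point is giving single-sample scores a batch axis before the softmax:

```python
import numpy as np

scores = np.array([1.2, -0.3])   # 1-D scores for a single sample
scores = scores.reshape(1, -1)   # (batch, num_labels): avoids the ValueError
probs = np.exp(scores) / np.exp(scores).sum(-1, keepdims=True)
print(probs.shape)  # (1, 2)
```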
-
Manuel Romero authored
* Create model card for distilroberta-base-finetuned-sentiment * Update model_cards/mrm8488/distilroberta-base-finetuned-sentiment/README.md * Update model_cards/mrm8488/distilroberta-base-finetuned-sentiment/README.md Co-authored-by: Julien Chaumond <chaumond@gmail.com>
-
Suraj Parmar authored
* Create README.md * Update model_cards/surajp/albert-base-sanskrit/README.md Co-authored-by: Julien Chaumond <chaumond@gmail.com>
-
Zhen Wang authored
-
HUSEIN ZOLKEPLI authored
-
William Falcon authored
-
William Falcon authored
-
Stefan Schweter authored
* ner: parse args from .args file or JSON * examples: mention json-based configuration file support for run_ner script
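A sketch of the json-based configuration pattern referenced above, using HfArgumentParser; the surrounding scaffolding is assumed, and run_ner wires in its own dataclasses:

```python
import sys
from transformers import HfArgumentParser, TrainingArguments

# If the only CLI argument is a .json file, hydrate the argument
# dataclasses from it instead of parsing flags.
parser = HfArgumentParser(TrainingArguments)
if len(sys.argv) == 2 and sys.argv[1].endswith(".json"):
    (training_args,) = parser.parse_json_file(json_file=sys.argv[1])
else:
    (training_args,) = parser.parse_args_into_dataclasses()
print(training_args.output_dir)
```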
-
Patrick von Platen authored
-
Patrick von Platen authored
-
Patrick von Platen authored
* correct model card * remove model card from Patrick von Platen
-