1. 07 May, 2020 17 commits
    • Add AlbertForPreTraining and TFAlbertForPreTraining models. (#4057) · 8bf73126
      Jared T Nielsen authored
      * Add AlbertForPreTraining and TFAlbertForPreTraining models.
      * PyTorch conversion
      * TensorFlow conversion
      * style
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
    • Julien Chaumond · c99fe038
    • Create README.md (#4202) · 66113bd6
      Savaş Yıldırım authored
    • Julien Chaumond · 6669915b
    • Examples readme.md (#4215) · 612fa1b1
      Julien Chaumond authored
      * README
      * Update README.md
    • Pin isort and tf <= 2.1.0 · 2e578243
      Lysandre authored
    • Release: v2.9.0 · e7cfc1a3
      Lysandre authored
    • BIG Reorganize examples (#4213) · 0ae96ff8
      Julien Chaumond authored
      * Created using Colaboratory
      * [examples] reorganize files
      * remove run_tpu_glue.py as superseded by TPU support in Trainer
      * Bugfix: int, not tuple
      * move files around
    • [Trainer] Ability to specify optimizer/scheduler at init · cafa6a9e
      Julien Chaumond authored
      cc @patrickvonplaten @thomwolf
    • Bram Vanroy
    • Tpu trainer (#4146) · ebf80e2e
      Lysandre Debut authored
      * wip
      * wip
      * a last wip
      * Better logging when using TPUs
      * Correct argument name
      * Tests
      * fix
      * Metrics in evaluation
      * Update src/transformers/training_args.py
      * [tpu] Use launcher script instead
      * [tpu] lots of tweaks
      * Fix formatting
      Co-authored-by: Julien Chaumond <chaumond@gmail.com>
    • Ensure fast tokenizer can construct tensor without pad token if only one sample is provided. (#4201) · 026097b9
      Funtowicz Morgan authored
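The tokenizer fix above can be illustrated with a minimal, hypothetical sketch (plain Python lists standing in for tensors; `batch_encode` and `pad_id` are illustrative names, not the library's API): a single sample is always rectangular on its own, so a missing pad token should only be an error when a multi-sample batch actually has uneven lengths.

```python
def batch_encode(sequences, pad_id=None):
    """Stack pre-tokenized id sequences into a rectangular batch.

    A single sample (or equal-length samples) never needs padding, so a
    missing pad token is only fatal for uneven multi-sample batches.
    """
    lengths = {len(s) for s in sequences}
    if len(lengths) <= 1:
        # All sequences share one length: already rectangular, no pad needed.
        return [list(s) for s in sequences]
    if pad_id is None:
        raise ValueError("Cannot pad a batch of uneven lengths without a pad token")
    max_len = max(lengths)
    # Right-pad every sequence up to the longest one.
    return [list(s) + [pad_id] * (max_len - len(s)) for s in sequences]
```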
    • Rewritten batch support in pipelines. (#4154) · 0a6cbea0
      Funtowicz Morgan authored
      * Rewritten batch support in pipelines.
      * Fix imports sorting 🔧
      * Set pad_to_max_length=True by default on Pipeline.
      * Set pad_to_max_length=False for generation pipelines.
        Most generation models don't have a padding token.
      * Address @joeddav review comment: Uniformized *args.
      * Address @joeddav review comment: Uniformized *args (second).
      Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
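The "Uniformized *args" items can be sketched with a hypothetical helper (not the actual pipelines code): whether the caller passes one text, several texts, or a single list of texts, inputs are normalized to one flat list before batching.

```python
def normalize_pipeline_inputs(*args):
    """Uniform *args handling for a pipeline-style call.

    Accepts f("a"), f("a", "b"), or f(["a", "b"]) and always returns
    a flat list of inputs ready for batching.
    """
    if len(args) == 1 and isinstance(args[0], (list, tuple)):
        # A single list/tuple argument is already the batch.
        return list(args[0])
    # Otherwise treat each positional argument as one input.
    return list(args)
```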
    • fix examples (#4192) · 99d1a694
      Patrick von Platen authored
    • [Reformer] Fix example and error message (#4191) · 74ffc9ea
      Patrick von Platen authored
      * fix example reformer
      * fix error message and example docstring
      * improved error message
    • fix docstring reformer (#4190) · 96c78396
      Patrick von Platen authored
    • Reformer (#3351) · dca34695
      Patrick von Platen authored
      * first copy & paste commit from Bert and Morgan's LSH code
      * add easy way to compare to trax original code
      * translate most of function
      * make trax lsh self attention deterministic with numpy seed + copy paste code
      * add same config
      * add same config
      * make layer init work
      * implemented hash_vectors function for lsh attention
      * continue reformer translation
      * hf LSHSelfAttentionLayer gives same output as trax layer
      * refactor code
      * refactor code
      * refactor code
      * refactor
      * refactor + add reformer config
      * delete bogus file
      * split reformer attention layer into two layers
      * save intermediate step
      * save intermediate step
      * make test work
      * add complete reformer block layer
      * finish reformer layer
      * implement causal and self mask
      * clean reformer test and refactor code
      * fix merge conflicts
      * fix merge conflicts
      * update init
      * fix device for GPU
      * fix chunk length init for tests
      * include Morgan's optimization
      * improve memory a bit
      * improve comment
      * factorize num_buckets
      * better testing parameters
      * make whole model work
      * make lm model work
      * add t5 copy paste tokenizer
      * add chunking feed forward
      * clean config
      * add improved assert statements
      * make tokenizer work
      * improve test
      * correct typo
      * extend config
      * add more complex test
      * add new axial position embeddings
      * add local block attention layer
      * clean tests
      * refactor
      * better testing
      * save intermediate progress
      * clean test file
      * make shorter input length work for model
      * allow variable input length
      * refactor
      * make forward pass for pretrained model work
      * add generation possibility
      * finish dropout and init
      * make style
      * refactor
      * add first version of RevNet Layers
      * make forward pass work and add convert file
      * make uploaded model forward pass work
      * make uploaded model forward pass work
      * refactor code
      * add namedtuples and cache buckets
      * correct head masks
      * refactor
      * made reformer more flexible
      * make style
      * remove set max length
      * add attention masks
      * fix up tests
      * fix lsh attention mask
      * make random seed optional for the moment
      * improve memory in reformer
      * add tests
      * make style
      * make sure masks work correctly
      * detach gradients
      * save intermediate
      * correct backprop through gather
      * make style
      * change back num hashes
      * rename to labels
      * fix rotation shape
      * fix detach
      * update
      * fix trainer
      * fix backward dropout
      * make reformer more flexible
      * fix conflict
      * fix
      * fix
      * add tests for fixed seed in reformer layer
      * fix trainer typo
      * fix typo in activations
      * add fp16 tests
      * add fp16 training
      * support fp16
      * correct gradient bug in reformer
      * add fast gelu
      * re-add dropout for embedding dropout
      * better naming
      * better naming
      * renaming
      * finalize test branch
      * finalize tests
      * add more tests
      * finish tests
      * fix
      * fix type trainer
      * fix fp16 tests
      * fix tests
      * fix tests
      * fix tests
      * fix issue with dropout
      * fix dropout seeds
      * correct random seed on gpu
      * finalize random seed for dropout
      * finalize random seed for dropout
      * remove duplicate line
      * correct half precision bug
      * make style
      * refactor
      * refactor
      * docstring
      * remove sinusoidal position encodings for reformer
      * move chunking to modeling_utils
      * make style
      * clean config
      * make style
      * fix tests
      * fix auto tests
      * pretrained models
      * fix docstring
      * update conversion file
      * Update pretrained_models.rst
      * fix rst
      * fix rst
      * update copyright
      * fix test path
      * fix test path
      * fix small issue in test
      * include reformer in generation tests
      * add docs for axial position encoding
      * finish docs
      * Update convert_reformer_trax_checkpoint_to_pytorch.py
      * remove isort
      * include Sam's comments
      * remove wrong comment in utils
      * correct typos
      * fix typo
      * Update reformer.rst
      * applied Morgan's optimization
      * make style
      * make gpu compatible
      * remove bogus file
      * big test refactor
      * add example for chunking
      * fix typo
      * add to README
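Among the commits above, "add first version of RevNet Layers" refers to reversible residual layers, which let Reformer recompute activations during the backward pass instead of storing them, trading compute for memory. A minimal sketch of the idea, using scalar inputs and hypothetical function names rather than the actual implementation:

```python
def rev_block_forward(x1, x2, f, g):
    """Reversible residual block: the outputs determine the inputs
    exactly, so intermediate activations need not be stored."""
    y1 = x1 + f(x2)
    y2 = x2 + g(y1)
    return y1, y2

def rev_block_inverse(y1, y2, f, g):
    """Recompute the block's inputs from its outputs, inverting the
    two residual additions in reverse order."""
    x2 = y2 - g(y1)
    x1 = y1 - f(x2)
    return x1, x2
```

Because the inverse only re-runs `f` and `g`, stacking many such blocks keeps activation memory roughly constant in depth.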
  2. 06 May, 2020 7 commits
  3. 05 May, 2020 4 commits
  4. 04 May, 2020 2 commits
  5. 03 May, 2020 1 commit
  6. 02 May, 2020 9 commits