1. 17 Jul, 2020 1 commit
    • [Reformer] - Cache hidden states and buckets to speed up inference (#5578) · 9d37c56b
      Patrick von Platen authored
      * fix merge rebase
      
      * add intermediate reformer code
      
      * save intermediate caching results
      
      * save intermediate
      
      * save intermediate results
      
      * save intermediate
      
      * upload next step
      
      * fix generate tests
      
      * make tests work
      
      * add named tuple output
      
      * Apply suggestions from code review
      
      * fix use_cache for False case
      
      * fix tensor to gpu
      
      * fix tensor to gpu
      
      * refactor
      
      * refactor and make style
      9d37c56b
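The caching idea behind this commit — store each generated position's hidden state and LSH bucket in a named-tuple output so generation does not recompute the whole prefix at every step — can be sketched very loosely as follows. `ReformerCache`, `toy_bucket`, and `step` are invented names for illustration, not the actual transformers API, and the "hashing" here is a trivial stand-in for real LSH:

```python
from typing import List, NamedTuple

class ReformerCache(NamedTuple):
    # hypothetical structure; the real named-tuple output differs
    hidden_states: List[List[float]]  # one cached state per generated position
    buckets: List[int]                # one cached LSH bucket id per position

def toy_bucket(vec: List[float]) -> int:
    # stand-in for LSH hashing: bucket by the sign of the first component
    return 0 if vec[0] >= 0.0 else 1

def step(new_hidden: List[float], cache: ReformerCache = None) -> ReformerCache:
    """Append only the new position's state and bucket, instead of
    recomputing states and buckets for the whole prefix each step."""
    if cache is None:
        cache = ReformerCache(hidden_states=[], buckets=[])
    cache.hidden_states.append(new_hidden)
    cache.buckets.append(toy_bucket(new_hidden))
    return cache

cache = None
for h in ([0.5, 1.0], [-0.3, 0.2], [0.1, -0.9]):
    cache = step(h, cache)

print(len(cache.hidden_states))  # 3
print(cache.buckets)             # [0, 1, 0]
```

Each generation step does O(1) new work on the cache rather than O(sequence length), which is the speed-up the commit title refers to.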
  2. 14 Jul, 2020 1 commit
    • [Reformer classification head] Implement the reformer model classification head for text classification (#5198) · f867000f
      as-stevens authored
      * Reformer model head classification implementation for text classification
      
      * Reformat the reformer model classification code
      
      * PR review comments, and test case implementation for the reformer classification head changes
      
      * CI/CD: fixed the test import error for the reformer classification head
      
      * CI/CD test case implementation: added ReformerForSequenceClassification to all_model_classes
      
      * Code formatting fixed
      
      * Normal test cases added for reformer classification head
      
      * Fix test cases implementation for the reformer classification head
      
      * removed token_type_id parameter from the reformer classification head
      
      * fixed the test case for reformer classification head
      
      * merge conflict with master fixed
      
      * merge conflict: changed reformer classification to accept the choice_label parameter added in the latest code
      
      * refactored the reformer classification head test code
      
      * reformer classification head: common transform test cases fixed
      
      * final set of review comments: rearranged the reformer classes and added a docstring to the classification forward method
      
      * fixed the compilation error and test cases for the reformer classification head
      
      * Apply suggestions from code review
      
      Remove unnecessary dup
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      f867000f
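The head added in this commit takes the encoder's per-position hidden states, pools one position, and projects to per-label logits. A loose pure-Python sketch of that shape — `ToyClassificationHead` is a hypothetical stand-in, and the real ReformerForSequenceClassification differs in its pooling, activation, and dropout details:

```python
import random

random.seed(0)

class ToyClassificationHead:
    """Loose sketch of a sequence classification head: pool one position's
    hidden state, then apply a linear projection to num_labels logits."""

    def __init__(self, hidden_size: int, num_labels: int):
        # randomly initialized projection weights and zero biases
        self.w = [[random.uniform(-0.1, 0.1) for _ in range(hidden_size)]
                  for _ in range(num_labels)]
        self.b = [0.0] * num_labels

    def __call__(self, sequence_hidden_states):
        pooled = sequence_hidden_states[0]  # pool: take the first position's state
        return [sum(wi * xi for wi, xi in zip(row, pooled)) + bi
                for row, bi in zip(self.w, self.b)]

head = ToyClassificationHead(hidden_size=4, num_labels=2)
logits = head([[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]])
print(len(logits))  # 2
```

The bullets above about removing token_type_id and accepting choice_label track how this head's forward signature was aligned with the rest of the library during review.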
  3. 07 Jul, 2020 1 commit
  4. 01 Jul, 2020 3 commits
  5. 02 Jun, 2020 2 commits
  6. 19 May, 2020 1 commit
    • Fix nn.DataParallel compatibility in PyTorch 1.5 (#4300) · 4c068936
      Julien Chaumond authored
      * Test case for #3936
      
      * multigpu tests pass on pytorch 1.4.0
      
      * Fixup
      
      * multigpu tests pass on pytorch 1.5.0
      
      * Update src/transformers/modeling_utils.py
      
      * Update src/transformers/modeling_utils.py
      
      * rename multigpu to require_multigpu
      
      * mode doc
      4c068936
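The "rename multigpu to require_multigpu" bullet refers to a test decorator that skips a test unless more than one GPU is present. A loose sketch of such a skip-unless-multi-GPU decorator — the real helper checks torch.cuda.device_count(), whereas the `gpu_count_fn` hook here is an invented parameter so the sketch stays runnable on any machine:

```python
import functools
import unittest

def require_multigpu(test_fn, gpu_count_fn=lambda: 0):
    """Skip the wrapped test unless at least two GPUs are reported.
    gpu_count_fn defaults to a CPU-only stand-in for illustration."""
    @functools.wraps(test_fn)
    def wrapper(*args, **kwargs):
        if gpu_count_fn() < 2:
            raise unittest.SkipTest("test requires multiple GPUs")
        return test_fn(*args, **kwargs)
    return wrapper

@require_multigpu
def test_needs_gpus():
    return "ran"

try:
    test_needs_gpus()
except unittest.SkipTest as err:
    print(err)  # test requires multiple GPUs
```

Gating multi-GPU tests this way lets the nn.DataParallel regression test for #3936 run in CI only on machines that can actually exercise it.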
  7. 07 May, 2020 1 commit
    • Reformer (#3351) · dca34695
      Patrick von Platen authored
      * first copy & paste commit from Bert and Morgan's LSH code
      
      * add easy way to compare to trax original code
      
      * translate most of the functions
      
      * make trax lsh self attention deterministic with numpy seed + copy paste code
      
      * add same config
      
      * add same config
      
      * make layer init work
      
      * implemented hash_vectors function for lsh attention
      
      * continue reformer translation
      
      * hf LSHSelfAttentionLayer gives same output as trax layer
      
      * refactor code
      
      * refactor code
      
      * refactor code
      
      * refactor
      
      * refactor + add reformer config
      
      * delete bogus file
      
      * split reformer attention layer into two layers
      
      * save intermediate step
      
      * save intermediate step
      
      * make test work
      
      * add complete reformer block layer
      
      * finish reformer layer
      
      * implement causal and self mask
      
      * clean reformer test and refactor code
      
      * fix merge conflicts
      
      * fix merge conflicts
      
      * update init
      
      * fix device for GPU
      
      * fix chunk length init for tests
      
      * include Morgan's optimization
      
      * improve memory a bit
      
      * improve comment
      
      * factorize num_buckets
      
      * better testing parameters
      
      * make whole model work
      
      * make lm model work
      
      * add t5 copy paste tokenizer
      
      * add chunking feed forward
      
      * clean config
      
      * add improved assert statements
      
      * make tokenizer work
      
      * improve test
      
      * correct typo
      
      * extend config
      
      * add more complex test
      
      * add new axial position embeddings
      
      * add local block attention layer
      
      * clean tests
      
      * refactor
      
      * better testing
      
      * save intermediate progress
      
      * clean test file
      
      * make shorter input length work for model
      
      * allow variable input length
      
      * refactor
      
      * make forward pass for pretrained model work
      
      * add generation possibility
      
      * finish dropout and init
      
      * make style
      
      * refactor
      
      * add first version of RevNet Layers
      
      * make forward pass work and add convert file
      
      * make uploaded model forward pass work
      
      * make uploaded model forward pass work
      
      * refactor code
      
      * add namedtuples and cache buckets
      
      * correct head masks
      
      * refactor
      
      * made reformer more flexible
      
      * make style
      
      * remove set max length
      
      * add attention masks
      
      * fix up tests
      
      * fix lsh attention mask
      
      * make random seed optional for the moment
      
      * improve memory in reformer
      
      * add tests
      
      * make style
      
      * make sure masks work correctly
      
      * detach gradients
      
      * save intermediate
      
      * correct backprop through gather
      
      * make style
      
      * change back num hashes
      
      * rename to labels
      
      * fix rotation shape
      
      * fix detach
      
      * update
      
      * fix trainer
      
      * fix backward dropout
      
      * make reformer more flexible
      
      * fix conflict
      
      * fix
      
      * fix
      
      * add tests for fixed seed in reformer layer
      
      * fix trainer typo
      
      * fix typo in activations
      
      * add fp16 tests
      
      * add fp16 training
      
      * support fp16
      
      * correct gradient bug in reformer
      
      * add fast gelu
      
      * re-add dropout for embedding dropout
      
      * better naming
      
      * better naming
      
      * renaming
      
      * finalize test branch
      
      * finalize tests
      
      * add more tests
      
      * finish tests
      
      * fix
      
      * fix typo in trainer
      
      * fix fp16 tests
      
      * fix tests
      
      * fix tests
      
      * fix tests
      
      * fix issue with dropout
      
      * fix dropout seeds
      
      * correct random seed on gpu
      
      * finalize random seed for dropout
      
      * finalize random seed for dropout
      
      * remove duplicate line
      
      * correct half precision bug
      
      * make style
      
      * refactor
      
      * refactor
      
      * docstring
      
      * remove sinusoidal position encodings for reformer
      
      * move chunking to modeling_utils
      
      * make style
      
      * clean config
      
      * make style
      
      * fix tests
      
      * fix auto tests
      
      * pretrained models
      
      * fix docstring
      
      * update conversion file
      
      * Update pretrained_models.rst
      
      * fix rst
      
      * fix rst
      
      * update copyright
      
      * fix test path
      
      * fix test path
      
      * fix small issue in test
      
      * include reformer in generation tests
      
      * add docs for axial position encoding
      
      * finish docs
      
      * Update convert_reformer_trax_checkpoint_to_pytorch.py
      
      * remove isort
      
      * include Sam's comments
      
      * remove wrong comment in utils
      
      * correct typos
      
      * fix typo
      
      * Update reformer.rst
      
      * applied Morgan's optimization
      
      * make style
      
      * make gpu compatible
      
      * remove bogus file
      
      * big test refactor
      
      * add example for chunking
      
      * fix typo
      
      * add to README
      dca34695
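Among the techniques this commit lands, the "move chunking to modeling_utils" and "add example for chunking" bullets refer to chunked feed forward: applying the feed-forward layer to slices of the sequence and concatenating the results, which lowers peak memory because the feed-forward output is position-wise. A loose sketch of the idea — `chunked_feed_forward` mirrors the intent of transformers' apply_chunking_to_forward but not its exact signature:

```python
def chunked_feed_forward(forward_fn, hidden_states, chunk_size):
    """Apply a position-wise feed-forward function chunk by chunk.
    Since forward_fn acts on each position independently, concatenating
    per-chunk outputs equals the unchunked result, at lower peak memory."""
    if chunk_size <= 0:
        return forward_fn(hidden_states)  # chunking disabled
    out = []
    for start in range(0, len(hidden_states), chunk_size):
        out.extend(forward_fn(hidden_states[start:start + chunk_size]))
    return out

# toy position-wise feed-forward: scale every position's (scalar) state
ff = lambda chunk: [2 * x for x in chunk]
print(chunked_feed_forward(ff, [1, 2, 3, 4, 5], chunk_size=2))  # [2, 4, 6, 8, 10]
```

The same trade-off (recompute or slice to save memory) motivates the commit's other memory features, such as the reversible (RevNet) layers and LSH attention.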