1. 05 Jan, 2021 7 commits
    • LED (#9278) · 189387e9
      Patrick von Platen authored
      * create model
      
      * add integration
      
      * save current state
      
      * make integration tests pass
      
      * add one more test
      
      * add explanation to tests
      
      * remove from bart
      
      * add padding
      
      * remove unnecessary test
      
      * make all tests pass
      
      * re-add cookie cutter tests
      
      * finish PyTorch
      
      * fix attention test
      
      * Update tests/test_modeling_common.py
      
      * revert change
      
      * remove unused file
      
      * add string to doc
      
      * save intermediate
      
      * make tf integration tests pass
      
      * finish tf
      
      * fix doc
      
      * fix docs again
      
      * add led to doctree
      
      * add to auto tokenizer
      
      * added tips for led
      
      * make style
      
      * apply jplus statements
      
      * correct tf longformer
      
      * apply lysandres suggestions
      
      * apply sylvains suggestions
      
      * Apply suggestions from code review
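
      Once merged, the model is usable through the standard tokenizer/model classes; a minimal sketch of long-document summarization with LED (the `allenai/led-base-16384` checkpoint and the generation settings are illustrative assumptions, not part of this PR):

      ```
      from transformers import LEDTokenizer, LEDForConditionalGeneration

      # illustrative checkpoint name
      tokenizer = LEDTokenizer.from_pretrained("allenai/led-base-16384")
      model = LEDForConditionalGeneration.from_pretrained("allenai/led-base-16384")

      inputs = tokenizer("A very long document goes here ...", return_tensors="pt")
      summary_ids = model.generate(inputs["input_ids"], num_beams=2, max_length=64)
      print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
      ```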
    • Fix TF Funnel (#9300) · 52d62e68
      Julien Plu authored
      * Fix Funnel
      
      * Apply Patrick's comment
      
      * Remove comment
      
      * Fix dummy value
      
      * Apply style
    • [trainer] --model_parallel hasn't been implemented for most models (#9347) · 748006c0
      Stas Bekman authored
      * --model_parallel hasn't been implemented for most models
      
      * make the help clear as well
      
      * implement is_parallelizable; use it
      
      * oops
      
      * remove property
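
      The resulting check is roughly the following (a sketch, not the exact diff; GPT-2 is one of the few architectures that implements `parallelize()`):

      ```
      from transformers import GPT2LMHeadModel

      model = GPT2LMHeadModel.from_pretrained("gpt2")

      # only a handful of models support naive model parallelism;
      # the new `is_parallelizable` flag lets the trainer check before trying
      if getattr(model, "is_parallelizable", False):
          model.parallelize()  # spreads the layers across the available GPUs
      else:
          raise ValueError("--model_parallel is not implemented for this architecture")
      ```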
    • Use stable functions (#9369) · 4225740a
      Julien Plu authored
    • [logging] autoflush (#9385) · 4aa8f6ad
      Stas Bekman authored
      This PR proposes to:
      
      * auto-flush `transformers` logging 
      
      When logging is used to trace signals from different parts of the code and may be mixed with print-style debugging, this keeps all logging events synchronized.

      I don't think this change will introduce any performance impact.
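
      For context, "auto-flush" just means flushing the handler's stream after every record so library log lines interleave correctly with plain `print()` output; a minimal, hypothetical illustration of the idea (not the actual diff in this PR):

      ```
      import logging
      import sys

      class FlushingStreamHandler(logging.StreamHandler):
          # flush after every record so log output is never left buffered
          def emit(self, record):
              super().emit(record)
              self.flush()

      logging.getLogger("transformers").addHandler(FlushingStreamHandler(sys.stderr))
      ```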
      
      If it helps anyone, here is the code I used to sync `transformers` logging with various other debug prints.

      I was porting BART to MP and needed to verify that the device switching happens correctly, so I added a bunch of `logger.info` calls inside `modeling_bart.py`; I also had some other helpers that emitted `print` debug messages which weren't logger-based:
      
      ```
      
      # auto flush std streams
      from sys import stdout, stderr
      def stdout_write_flush(args, w=stdout.write): w(args); stdout.flush()
      def stderr_write_flush(args, w=stderr.write): w(args); stderr.flush()
      stdout.write = stdout_write_flush
      stderr.write = stderr_write_flush
      
      from transformers import BartTokenizer, BartForConditionalGeneration, BartConfig
      
      import logging
      import transformers.utils.logging
      import transformers.models.bart.modeling_bart
      
      # I wanted a shorter simpler format
      handlers = transformers.utils.logging._get_library_root_logger().handlers
      for handler in handlers:
          formatter = logging.Formatter("[%(funcName)s] %(message)s")
          handler.setFormatter(formatter)
      
      transformers.models.bart.modeling_bart.logger.setLevel(transformers.logging.INFO)
      ```
      
      @LysandreJik, @sgugger, @patrickvonplaten
    • Fix TF Longformer (#9348) · 83eec97e
      Julien Plu authored
      * Fix longformer
      
      * Apply style
      
      * Remove serving content
      
      * Forgot a condition
      
      * Apply style
      
      * Address Patrick's comments
      
      * Fix dtype
    • feat(wandb): save model as artifact (#8119) · 30fa0b78
      Boris Dayma authored
      * feat(wandb): log artifacts
      
      * fix: typo
      
      * feat(wandb): ensure name is allowed
      
      * feat(wandb): log artifact
      
      * feat(wandb): saving logic
      
      * style: improve formatting
      
      * fix: unrelated typo
      
      * feat: use a fake trainer

      * fix: simplify
      
      * feat(wandb): log model files as artifact
      
      * style: fix style
      
      * docs(wandb): correct description
      
      * feat: unpack model + allow env truthy values
      
      * feat: TrainerCallback can access tokenizer
      
      * style: fix style
      
      * feat(wandb): log more interesting metadata
      
      * feat: unpack tokenizer
      
      * feat(wandb): metadata with load_best_model_at_end
      
      * feat(wandb): more robust metadata
      
      * style(wandb): fix formatting
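
      A minimal usage sketch, assuming the feature is toggled through the `WANDB_LOG_MODEL` environment variable (truthy values accepted) and a standard `Trainer` run:

      ```
      import os

      # opt in to uploading the saved model files as a W&B artifact
      os.environ["WANDB_LOG_MODEL"] = "true"

      from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

      model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
      args = TrainingArguments(output_dir="out")
      trainer = Trainer(model=model, args=args)
      # with wandb installed and a train_dataset supplied, trainer.train()
      # uploads the saved model files as an artifact at the end of training
      ```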
  2. 04 Jan, 2021 6 commits
  3. 02 Jan, 2021 3 commits
  4. 30 Dec, 2020 1 commit
  5. 29 Dec, 2020 1 commit
    • [prophetnet] wrong import (#9349) · 8217d4e3
      Stas Bekman authored
      ```
      python -c "from apex.normalization import FusedProphetNetLayerNorm"
      Traceback (most recent call last):
        File "<string>", line 1, in <module>
      ImportError: cannot import name 'FusedProphetNetLayerNorm' from 'apex.normalization' (/home/stas/anaconda3/envs/main-38/lib/python3.8/site-packages/apex/normalization/__init__.py)
      ```
      It looks like this code has never been tested, so it fails silently inside the try/except.
      
      Discovered this by accident in https://github.com/huggingface/transformers/issues/9338#issuecomment-752217708
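
      One possible correction is to guard an import that actually exists in apex (its fused layer norm class is `FusedLayerNorm`); the actual patch may instead simply drop the dead code. A sketch of a working guard:

      ```
      # fall back to the plain PyTorch layer norm when apex isn't installed
      try:
          from apex.normalization import FusedLayerNorm as LayerNorm
      except ImportError:
          from torch.nn import LayerNorm
      ```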
  6. 28 Dec, 2020 2 commits
  7. 27 Dec, 2020 1 commit
  8. 25 Dec, 2020 1 commit
  9. 24 Dec, 2020 6 commits
  10. 23 Dec, 2020 3 commits
    • Add caching mechanism to BERT, RoBERTa (#9183) · 88ef8893
      Suraj Patil authored
      * add past_key_values
      
      * add use_cache option
      
      * make mask before cutting ids
      
      * adjust position_ids according to past_key_values
      
      * flatten past_key_values
      
      * fix positional embeds
      
      * fix _reorder_cache
      
      * set use_cache to false when not decoder, fix attention mask init
      
      * add test for caching
      
      * add past_key_values for Roberta
      
      * fix position embeds
      
      * add caching test for roberta
      
      * add doc
      
      * make style
      
      * doc, fix attention mask, test
      
      * small fixes
      
      * address patrick's comments
      
      * input_ids shouldn't start with pad token
      
      * use_cache only when decoder
      
      * make consistent with bert
      
      * make copies consistent
      
      * add use_cache to encoder
      
      * add past_key_values to tapas attention
      
      * apply suggestions from code review
      
      * make copies consistent
      
      * add attn mask in tests
      
      * remove copied from longformer
      
      * apply suggestions from code review
      
      * fix bart test
      
      * nit
      
      * simplify model outputs
      
      * fix doc
      
      * fix output ordering
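
      A minimal sketch of what the new cache enables for a BERT decoder (the checkpoint and decoding step are illustrative, not taken from the PR):

      ```
      from transformers import BertConfig, BertLMHeadModel, BertTokenizer

      tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
      config = BertConfig.from_pretrained("bert-base-uncased", is_decoder=True)
      model = BertLMHeadModel.from_pretrained("bert-base-uncased", config=config)

      inputs = tokenizer("Caching speeds up generation", return_tensors="pt")
      # the first forward pass returns past_key_values when use_cache=True
      outputs = model(**inputs, use_cache=True)
      past = outputs.past_key_values

      # subsequent steps only feed the newest token; cached keys/values cover the rest
      next_token = outputs.logits[:, -1:].argmax(-1)
      outputs = model(input_ids=next_token, past_key_values=past, use_cache=True)
      ```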
    • Fix param error (#9273) · 4bafc43b
      Xu Song authored
      TypeError: forward() got an unexpected keyword argument 'token_type_ids'
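
      The underlying issue is example code passing `token_type_ids` to a model whose `forward()` does not accept them; a generic illustration of the safe pattern (DistilBERT here is just an example, not necessarily the model from this PR):

      ```
      from transformers import AutoTokenizer, DistilBertModel

      tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
      model = DistilBertModel.from_pretrained("distilbert-base-uncased")

      inputs = tokenizer("Hello world", return_tensors="pt")
      # DistilBERT's forward() has no token_type_ids argument, so drop it if present
      inputs.pop("token_type_ids", None)
      outputs = model(**inputs)
      ```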
    • Fix gpt2 document (#9272) · 58e8a761
      Xu Song authored
  11. 22 Dec, 2020 5 commits
  12. 21 Dec, 2020 4 commits