"vscode:/vscode.git/clone" did not exist on "16913b3c9215f592f1240511f8da271dc07c3552"
  1. 13 Apr, 2021 2 commits
  2. 08 Apr, 2021 2 commits
  3. 07 Apr, 2021 2 commits
  4. 06 Apr, 2021 1 commit
    • Sylvain Gugger's avatar
      Auto feature extractor (#11097) · 403d530e
      Sylvain Gugger authored
      * AutoFeatureExtractor
      
      * Init and first tests
      
      * Tests
      
      * Damn you gitignore
      
      * Quality
      
      * Defensive test for when not all backends are here
      
      * Use pattern for Speech2Text models
      403d530e
  5. 05 Apr, 2021 1 commit
  6. 31 Mar, 2021 1 commit
  7. 26 Mar, 2021 1 commit
  8. 19 Mar, 2021 1 commit
  9. 17 Mar, 2021 1 commit
    • Sylvain Gugger's avatar
      Check copies blackify (#10775) · 40b049c7
      Sylvain Gugger authored
      * Apply black before checking copies
      
      * Fix for class methods
      
      * Deal with lonely brackets
      
      * Remove debug and add forward changes
      
      * Separate copies and fix test
      
      * Add black as a test dependency
      40b049c7
  10. 16 Mar, 2021 1 commit
    • Sylvain Gugger's avatar
      Release utils (#10735) · 813d730c
      Sylvain Gugger authored
      * Examples version update
      
      * Refactor a bit
      
      * All version updates
      
      * Fixes
      
      * README cleanup
      
      * Post-release/patch
      
      * Fixes
      
      * More fixes
      
      * Tests
      
      * More fixes
      
      * Moar fixes
      
      * Make commands and update setup
      
      * Replace spaces with weird tabs
      
      * Fix test
      
      * Style
      813d730c
  11. 15 Mar, 2021 1 commit
  12. 10 Mar, 2021 1 commit
    • Suraj Patil's avatar
      Speech2TextTransformer (#10175) · d26b37e7
      Suraj Patil authored
      
      
      * s2t
      
      * fix config
      
      * conversion script
      
      * fix import
      
      * add tokenizer
      
      * fix tok init
      
      * fix tokenizer
      
      * first version working
      
      * fix embeds
      
      * fix lm head
      
      * remove extra heads
      
      * fix convert script
      
      * handle encoder attn mask
      
      * style
      
      * better enc attn mask
      
      * override _prepare_attention_mask_for_generation
      
      * handle attn_maks in encoder and decoder
      
      * input_ids => input_features
      
      * enable use_cache
      
      * remove old code
      
      * expand embeddings if needed
      
      * remove logits bias
      
      * masked_lm_loss => loss
      
      * hack tokenizer to support feature processing
      
      * fix model_input_names
      
      * style
      
      * fix error message
      
      * doc
      
      * remove inputs_embeds
      
      * remove input_embeds
      
      * remove unnecessary docstring
      
      * quality
      
      * SpeechToText => Speech2Text
      
      * style
      
      * remove shared_embeds
      
      * subsample => conv
      
      * remove Speech2TextTransformerDecoderWrapper
      
      * update output_lengths formula
      
      * fix table
      
      * remove max_position_embeddings
      
      * update conversion scripts
      
      * add possibility to do upper case for now
      
      * add FeatureExtractor and Processor
      
      * add tests for extractor
      
      * require_torch_audio => require_torchaudio
      
      * add processor test
      
      * update import
      
      * remove classification head
      
      * attention mask is now 1D
      
      * update docstrings
      
      * attention mask should be of type long
      
      * handle attention mask from generate
      
      * alwyas return attention_mask
      
      * fix test
      
      * style
      
      * doc
      
      * Speech2TextTransformer => Speech2Text
      
      * Speech2TextTransformerConfig => Speech2TextConfig
      
      * remove dummy_inputs
      
      * nit
      
      * style
      
      * multilinguial tok
      
      * fix tokenizer
      
      * add tgt_lang setter
      
      * save lang_codes
      
      * fix tokenizer
      
      * add forced_bos_token_id to tokenizer
      
      * apply review suggestions
      
      * add torchaudio to extra deps
      
      * add speech deps to CI
      
      * fix dep
      
      * add libsndfile to ci
      
      * libsndfile1
      
      * add speech to extras all
      
      * libsndfile1 -> libsndfile1
      
      * libsndfile
      
      * libsndfile1-dev
      
      * apt update
      
      * add sudo to install
      
      * update deps table
      
      * install libsndfile1-dev on CI
      
      * tuple to list
      
      * init conv layer
      
      * add model tests
      
      * quality
      
      * add integration tests
      
      * skip_special_tokens
      
      * add speech_to_text_transformer in toctree
      
      * fix tokenizer
      
      * fix fp16 tests
      
      * add tokenizer tests
      
      * fix copyright
      
      * input_values => input_features
      
      * doc
      
      * add model in readme
      
      * doc
      
      * change checkpoint names
      
      * fix copyright
      
      * fix code example
      
      * add max_model_input_sizes in tokenizer
      
      * fix integration tests
      
      * add do_lower_case to tokenizer
      
      * remove clamp trick
      
      * fix "Add modeling imports here"
      
      * fix copyrights
      
      * fix tests
      
      * SpeechToTextTransformer => SpeechToText
      
      * fix naming
      
      * fix table formatting
      
      * fix typo
      
      * style
      
      * fix typos
      
      * remove speech dep from extras[testing]
      
      * fix copies
      
      * rename doc file,
      
      * put imports under is_torch_available
      
      * run feat extract tests when torch is available
      
      * dummy objects for processor and extractor
      
      * fix imports in tests
      
      * fix import in modeling test
      
      * fxi imports
      
      * fix torch import
      
      * fix imports again
      
      * fix positional embeddings
      
      * fix typo in import
      
      * adapt new extractor refactor
      
      * style
      
      * fix torchscript test
      
      * doc
      
      * doc
      
      * Apply suggestions from code review
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * fix docs, copied from, style
      
      * fix docstring
      
      * handle imports
      
      * remove speech from all extra deps
      
      * remove s2t from seq2seq lm mapping
      
      * better names
      
      * skip training tests
      
      * add install instructions
      
      * List => Tuple
      
      * doc
      
      * fix conversion script
      
      * fix urls
      
      * add instruction for libsndfile
      
      * fix fp16 test
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      d26b37e7
  13. 08 Mar, 2021 1 commit
    • Ratthachat (Jung)'s avatar
      Add TFRag (#9002) · 696e8a43
      Ratthachat (Jung) authored
      * Create modeling_tf_dpr.py
      
      * Add TFDPR
      
      * Add back TFPegasus, TFMarian, TFMBart, TFBlenderBot
      
      last commit accidentally deleted these 4 lines, so I recover them back
      
      * Add TFDPR
      
      * Add TFDPR
      
      * clean up some comments, add TF input-style doc string
      
      * Add TFDPR
      
      * Make return_dict=False as default
      
      * Fix return_dict bug (in .from_pretrained)
      
      * Add get_input_embeddings()
      
      * Create test_modeling_tf_dpr.py
      
      The current version is already passed all 27 tests!
      Please see the test run at : 
      https://colab.research.google.com/drive/1czS_m9zy5k-iSJbzA_DP1k1xAAC_sdkf?usp=sharing
      
      
      
      * fix quality
      
      * delete init weights
      
      * run fix copies
      
      * fix repo consis
      
      * del config_class, load_tf_weights
      
      They shoud be 'pytorch only'
      
      * add config_class back
      
      after removing it, test failed ... so totally only removing "use_tf_weights = None" on Lysandre suggestion
      
      * newline after .. note::
      
      * import tf, np (Necessary for ModelIntegrationTest)
      
      * slow_test from_pretrained with from_pt=True
      
      At the moment we don't have TF weights (since we don't have official official TF model)
      Previously, I did not run slow test, so I missed this bug
      
      * Add simple TFDPRModelIntegrationTest
      
      Note that this is just a test that TF and Pytorch gives approx. the same output.
      However, I could not test with the official DPR repo's output yet
      
      * upload correct tf model
      
      * remove position_ids as missing keys
      
      * create modeling_tf_rag
      
      * add tests for tf
      
      * add tf tests
      
      * revert wrong pt commit
      
      * further refactor
      
      * further refactor
      
      * refactor
      
      * Update modeling_tf_rag.py
      
      - input_processing
      - fix prepare_input_for_generation (mostly fix generate bug)
      - bring back from_pretrained hack in order to test generate
      
      * delete colab pieces of code
      
      * Show case of greedy "generate"
      
      Temporarily change from beam_search test to greedy_search test to show case that TF and PT do get equivalent output.
      
      * cosmetic update
      
      * correct typos
      
      * update
      
      * push some progress
      
      * make easy check
      
      * fix rag save from pretrained
      
      * Update src/transformers/modeling_tf_utils.py
      
      * remove commented out lines
      
      * delete unnecessary lines
      
      * add simple test case for nq_checkpoint
      
      Add nq_checkpoint test to show that current version without hack still fails
      
      * temporarily put ugly hack back again
      
      * Add TFRagSequenceForGeneration!!
      
      * __init__.py , import TFRagSequenceForGeneration
      
      * Add TFRagSequence tests!
      
      * rag init.py - add TFRagSequenceForGeneration
      
      * fix from_pretrained
      
      * fix prepare_inputs_for_generation
      
      * Beam search for RagToken!
      
      * minor clean up
      
      * add tf.cast in TFRagModel
      
      * More tf.cast
      
      * Add all remaining tests (still have issues)
      
      * delete all T5 related
      
      * make style
      
      * fix load weight prefix
      
      * fix bart
      
      * fix return_dict for tf_rag
      
      make all tests pass .. Hooray
      
      * fix some tests
      
      * fix code quality
      
      * fix qualtiy check
      
      * finish tests tf rag
      
      * add tf rag to docs
      
      * remove TFT5 from docstring
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * remove TFT5 from docstring
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * Delete outdated comments
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * improve doc strings
      
      * add generative model classes
      
      * fix adjust token logic
      
      * refactor generate for TFRag
      
      * using shape_list, not _get_shape
      Co-authored-by: default avatarJulien Plu <plu.julien@gmail.com>
      
      * axis=[1]->axis=1
      
      * delete NEED_HELP comment
      
      * improve readability
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * improve readability
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * improve readability
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Indicating model is in a developing state in docstrings
      
      As suggested by Julien
      
      * small last changes
      
      * apply sylvains suggestions
      
      * finish tf rag
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: default avatarpatrickvonplaten <patrick@huggingface.co>
      Co-authored-by: default avatarJulien Plu <plu.julien@gmail.com>
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      696e8a43
  14. 06 Mar, 2021 1 commit
    • Suraj Patil's avatar
      Add m2m100 (#10236) · f6e74a63
      Suraj Patil authored
      * m2m_100
      
      * no layernorm_embedding
      
      * sinusoidal positional embeddings
      
      * update pos embeddings
      
      * add default config values
      
      * tokenizer
      
      * add conversion script
      
      * fix config
      
      * fix pos embed
      
      * remove _float_tensor
      
      * update tokenizer
      
      * update lang codes
      
      * handle lang codes
      
      * fix pos embeds
      
      * fix spm key
      
      * put embedding weights on device
      
      * remove qa and seq classification heads
      
      * fix convert script
      
      * lang codes pn one line
      
      * fix embeds
      
      * fix tokenizer
      
      * fix tokenizer
      
      * add fast tokenizer
      
      * style
      
      * M2M100MT => M2M100
      
      * fix copyright, style
      
      * tokenizer converter
      
      * vocab file
      
      * remove fast tokenizer
      
      * fix embeds
      
      * fix tokenizer
      
      * fix tests
      
      * add tokenizer tests
      
      * add integration test
      
      * quality
      
      * fix model name
      
      * fix test
      
      * doc
      
      * doc
      
      * fix doc
      
      * add copied from statements
      
      * fix tokenizer tests
      
      * apply review suggestions
      
      * fix urls
      
      * fix shift_tokens_right
      
      * apply review suggestions
      
      * fix
      
      * fix doc
      
      * add lang code to id
      
      * remove unused function
      
      * update checkpoint names
      
      * fix copy
      
      * fix tokenizer
      
      * fix checkpoint names
      
      * fix merge issue
      
      * style
      f6e74a63
  15. 03 Mar, 2021 2 commits
  16. 25 Feb, 2021 1 commit
    • Patrick von Platen's avatar
      [PretrainedFeatureExtractor] + Wav2Vec2FeatureExtractor, Wav2Vec2Processor,... · cb38ffcc
      Patrick von Platen authored
      [PretrainedFeatureExtractor] + Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer (#10324)
      
      * push to show
      
      * small improvement
      
      * small improvement
      
      * Update src/transformers/feature_extraction_utils.py
      
      * Update src/transformers/feature_extraction_utils.py
      
      * implement base
      
      * add common tests
      
      * make all tests pass for wav2vec2
      
      * make padding work & add more tests
      
      * finalize feature extractor utils
      
      * add call method to feature extraction
      
      * finalize feature processor
      
      * finish tokenizer
      
      * finish general processor design
      
      * finish tests
      
      * typo
      
      * remove bogus file
      
      * finish docstring
      
      * add docs
      
      * finish docs
      
      * small fix
      
      * correct docs
      
      * save intermediate
      
      * load changes
      
      * apply changes
      
      * apply changes to doc
      
      * change tests
      
      * apply surajs recommend
      
      * final changes
      
      * Apply suggestions from code review
      
      * fix typo
      
      * fix import
      
      * correct docstring
      cb38ffcc
  17. 15 Feb, 2021 1 commit
    • Julien Plu's avatar
      Check TF ops for ONNX compliance (#10025) · c8d3fa0d
      Julien Plu authored
      
      
      * Add check-ops script
      
      * Finish to implement check_tf_ops and start the test
      
      * Make the test mandatory only for BERT
      
      * Update tf_ops folder
      
      * Remove useless classes
      
      * Add the ONNX test for GPT2 and BART
      
      * Add a onnxruntime slow test + better opset flexibility
      
      * Fix test + apply style
      
      * fix tests
      
      * Switch min opset from 12 to 10
      
      * Update src/transformers/file_utils.py
      Co-authored-by: default avatarLysandre Debut <lysandre@huggingface.co>
      
      * Fix GPT2
      
      * Remove extra shape_list usage
      
      * Fix GPT2
      
      * Address Morgan's comments
      Co-authored-by: default avatarLysandre Debut <lysandre@huggingface.co>
      c8d3fa0d
  18. 09 Feb, 2021 1 commit
  19. 08 Feb, 2021 1 commit
  20. 04 Feb, 2021 1 commit
    • demSd's avatar
      BartForCausalLM analogs to `ProphetNetForCausalLM` (#9128) · 00031785
      demSd authored
      
      
      * initiliaze bart4causalLM
      
      * create BartDecoderWrapper, setters/getters
      
      * delete spaces
      
      * forward and additional methods
      
      * update cache function, loss function, remove ngram* params in data class.
      
      * add bartcausallm, bartdecoder testing
      
      * correct bart for causal lm
      
      * remove at
      
      * add mbart as well
      
      * up
      
      * fix typo
      
      * up
      
      * correct
      
      * add pegasusforcausallm
      
      * add blenderbotforcausallm
      
      * add blenderbotsmallforcausallm
      
      * add marianforcausallm
      
      * add test for MarianForCausalLM
      
      * add Pegasus test
      
      * add BlenderbotSmall test
      
      * add blenderbot test
      
      * fix a fail
      
      * fix an import fail
      
      * a fix
      
      * fix
      
      * Update modeling_pegasus.py
      
      * fix models
      
      * fix inputs_embeds setting getter
      
      * adapt tests
      
      * correct repo utils check
      
      * finish test improvement
      
      * fix tf models as well
      
      * make style
      
      * make fix-copies
      
      * fix copies
      
      * run all tests
      
      * last changes
      
      * fix all tests
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      00031785
  21. 02 Feb, 2021 1 commit
  22. 27 Jan, 2021 1 commit
  23. 11 Jan, 2021 2 commits
  24. 07 Jan, 2021 2 commits
  25. 06 Jan, 2021 3 commits
  26. 05 Jan, 2021 2 commits
    • Patrick von Platen's avatar
      [PyTorch Bart] Split Bart into different models (#9343) · eef66035
      Patrick von Platen authored
      * first try
      
      * remove old template
      
      * finish bart
      
      * finish mbart
      
      * delete unnecessary line
      
      * init pegasus
      
      * save intermediate
      
      * correct pegasus
      
      * finish pegasus
      
      * remove cookie cutter leftover
      
      * add marian
      
      * finish blenderbot
      
      * replace in file
      
      * correctly split blenderbot
      
      * delete "old" folder
      
      * correct "add statement"
      
      * adapt config for tf comp
      
      * correct configs for tf
      
      * remove ipdb
      
      * fix more stuff
      
      * fix mbart
      
      * push pegasus fix
      
      * fix mbart
      
      * more fixes
      
      * fix research projects code
      
      * finish docs for bart, mbart, and marian
      
      * delete unnecessary file
      
      * correct attn typo
      
      * correct configs
      
      * remove pegasus for seq class
      
      * correct peg docs
      
      * correct peg docs
      
      * finish configs
      
      * further improve docs
      
      * add copied from statements to mbart
      
      * fix copied from in mbart
      
      * add copy statements to marian
      
      * add copied from to marian
      
      * add pegasus copied from
      
      * finish pegasus
      
      * finish copied from
      
      * Apply suggestions from code review
      
      * make style
      
      * backward comp blenderbot
      
      * apply lysandres and sylvains suggestions
      
      * apply suggestions
      
      * push last fixes
      
      * fix docs
      
      * fix tok tests
      
      * fix imports code style
      
      * fix doc
      eef66035
    • Patrick von Platen's avatar
      LED (#9278) · 189387e9
      Patrick von Platen authored
      * create model
      
      * add integration
      
      * save current state
      
      * make integration tests pass
      
      * add one more test
      
      * add explanation to tests
      
      * remove from bart
      
      * add padding
      
      * remove unnecessary test
      
      * make all tests pass
      
      * re-add cookie cutter tests
      
      * finish PyTorch
      
      * fix attention test
      
      * Update tests/test_modeling_common.py
      
      * revert change
      
      * remove unused file
      
      * add string to doc
      
      * save intermediate
      
      * make tf integration tests pass
      
      * finish tf
      
      * fix doc
      
      * fix docs again
      
      * add led to doctree
      
      * add to auto tokenizer
      
      * added tips for led
      
      * make style
      
      * apply jplus statements
      
      * correct tf longformer
      
      * apply lysandres suggestions
      
      * apply sylvains suggestions
      
      * Apply suggestions from code review
      189387e9
  27. 04 Jan, 2021 2 commits
  28. 23 Dec, 2020 1 commit
  29. 22 Dec, 2020 1 commit
  30. 11 Dec, 2020 1 commit