  1. 19 Jan, 2022 4 commits
    • Make chunking smartly (long files) work on asr ctc_with_lm. (#15219) · 3fefee99
      Nicolas Patry authored
      * [WIP] Make chunking smartly (long files) work on asr ctc_with_lm.
      
      * Slow test with functionality.
      
      * Fixing regular test.
      
      * fix for batch size 1
      
      * Handling batch outside `rescale_Stride`.
      
      - Renamed to `rescale_stride`.
      
      * Disable equality in the test.
      
      * Remove print.
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    • Add ViLT (#14895) · ac227093
      NielsRogge authored
      * First commit
      
      * Add conversion script
      
      * Make conversion script work for base model
      
      * More improvements
      
      * Update conversion script, works for vqa
      
      * Add indexing argument to meshgrid
      
      * Make conversion script work for ViltForPreTraining
      
      * Add ViltForPreTraining to docs
      
      * Fix device issue
      
      * Add processor
      
      * Add MinMaxResize to feature extractor
      
      * Implement call method of ViltProcessor
      
      * Fix tests
      
      * Add integration test
      
      * Add loss calculation for VQA
      
      * Improve tests
      
      * Improve some more tests
      
      * Debug tests
      
      * Small improvements
      
      * Add support for attention_mask
      
      * Remove mask_it
      
      * Add pixel_mask
      
      * Add tests for ViltFeatureExtractor
      
      * Improve tests
      
      * Add ViltForNaturalLanguageVisualReasoning
      
      * Add ViltForNaturalLanguageVisualReasoning to conversion script
      
      * Minor fixes
      
      * Add support for image_embeds, update docstrings to markdown
      
      * Update docs to markdown
      
      * Improve conversion script
      
      * Rename ViltForPreTraining to ViltForMaskedLM
      
      * Improve conversion script
      
      * Convert docstrings to markdown
      
      * Fix code example of retrieval model
      
      * Properly convert masked language model
      
      * Add integration test for nlvr
      
      * Fix code quality
      
      * Apply suggestions from code review
      
      * Add copied from statements
      
      * Fix pretrained_config_archive_map
      
      * Fix docs
      
      * Add model to README
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Apply more suggestions from code review
      
      * Make code more readable
      
      * Add ViltForNaturalLanguageVisualReasoning to the tests
      
      * Rename ViltForVisualQuestionAnswering to ViltForQuestionAnswering
      
      * Replace pixel_values_2 by single tensor
      
      * Add hidden_states and attentions
      
      * Fix one more test
      
      * Fix all tests
      
      * Update year
      
      * Fix rebase issues
      
      * Fix another rebase issue
      
      * Remove ViltForPreTraining from auto mapping
      
      * Rename ViltForImageRetrievalTextRetrieval to ViltForImageAndTextRetrieval
      
      * Make it possible to use BertTokenizerFast in the processor
      
      * Use BertTokenizerFast by default
      
      * Rename ViltForNaturalLanguageVisualReasoning, define custom model output
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
    • Add FastTokenizer to REALM (#15211) · 841d9791
      Li-Huai (Allan) Lin authored
      * Remove BertTokenizer abstraction
      
      * Add FastTokenizer to REALM
      
      * Fix config archive map
      
      * Fix copies
      
      * Update realm.mdx
      
      * Apply suggestions from code review
    • Rename compute_loss in TF models (#15207) · 2708bfa1
      Matt authored
      * Rename compute_loss to hf_compute_loss to avoid conflicts with the new Keras method
      
      * make style
      
      * Adding deprecation warning to `compute_loss`
      
      * Fix sneaky reference to compute_loss
      
      * Replace logger.warning with warnings.warn
      
      * Clarifying warning and deprecation timeline
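The rename-with-deprecation pattern described in this commit can be sketched as follows. This is a minimal illustration, not the library's code: the `LossMixin` class and its toy loss are hypothetical, and the choice of `FutureWarning` is an assumption. Only the method names `compute_loss` and `hf_compute_loss` and the use of `warnings.warn` come from the commit message.

```python
import warnings


class LossMixin:
    """Hypothetical stand-in for a model's loss mixin (not the library's class)."""

    def hf_compute_loss(self, labels, logits):
        # Toy loss (mean absolute error), only here to make the sketch runnable.
        return sum(abs(l - p) for l, p in zip(labels, logits)) / len(labels)

    def compute_loss(self, *args, **kwargs):
        # The old name survives as a thin wrapper: warn, then delegate.
        warnings.warn(
            "compute_loss is deprecated; use hf_compute_loss instead.",
            FutureWarning,  # assumed category; the commit does not name one
            stacklevel=2,   # point the warning at the caller, not this wrapper
        )
        return self.hf_compute_loss(*args, **kwargs)
```

Calling `LossMixin().compute_loss(...)` still returns the loss but emits a warning, so existing callers keep working while the new name stays clear of the Keras method.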
  2. 18 Jan, 2022 8 commits
    • Enable tqdm toggling (#15167) · fe78fe98
      Jake Tae authored
      * feature: enable tqdm toggle
      
      * test: add tqdm unit test
      
      * style: run linter
      
      * Update tests/test_tqdm_utils.py
      Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
      
      * refactor: use tiny model, run linter
      
      * docs: add tqdm to logging
      
      * docs: add tqdm reference to `http_get`
      
      * style: run linter
      
      * Update docs/source/main_classes/logging.mdx
      Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
      
      * fix: use `AutoConfig` for framework agnostic testing
      
      * chore: mv tqdm test to `test_logging.py`
      
      * feature: implement enable/disable functions
      
      * docs: mv docstring to comment
      
      * chore: mv tqdm functions to `logging.py`
      
      * docs: update docs to reference `enable/disable` funcs
      
      * test: update test to use `enable/disable` func
      
      * chore: update function reference in comment
      Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
    • matt's avatar
      1a354d53
    • Fix a sneaky reference to compute_loss in the tests · 2085f209
      matt authored
    • Add MAE (#15120) · 74bec986
      NielsRogge authored
      * First draft
      
      * More improvements
      
      * More improvements
      
      * More improvements
      
      * Fix embeddings
      
      * Add conversion script
      
      * Finish conversion script
      
      * More improvements
      
      * Fix forward pass
      
      * Remove print statements
      
      * Add weights initialization
      
      * Add initialization of decoder weights
      
      * Add support for other models in the conversion script
      
      * Fix patch_size for huge model
      
      * Fix most of the tests
      
      * Fix integration test
      
      * Fix docs
      
      * Fix archive_list
      
      * Apply suggestions from code review
      
      * Improve documentation
      
      * Apply more suggestions
      
      * Skip some tests due to non-deterministic behaviour
      
      * Fix test_initialization
      
      * Remove unnecessary initialization of nn.Embedding
      
      * Improve docs
      
      * Fix dummies
      
      * Remove ViTMAEFeatureExtractor from docs
      
      * Add model to README and table of contents
      
      * Delete inference file
    • [ASR pipeline] correct with lm pipeline (#15200) · 497346d0
      Patrick von Platen authored
      * [ASR pipeline] correct with lm pipeline
      
      * improve error
    • Fix deprecation warnings for int div (#15180) · 531336bb
      Sylvain Gugger authored
      * Fix deprecation warnings for int div
      Co-authored-by: mgoldey <matthew.goldey@gmail.com>
      
      * Fix import
      
      * ensure that tensor output is python scalar
      
      * make backward compatible
      
      * make code more readable
      
      * adapt test functions
      Co-authored-by: mgoldey <matthew.goldey@gmail.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    • Add REALM (#13292) · 22454ae4
      Li-Huai (Allan) Lin authored
      * REALM initial commit
      
      * Retriever OK (Update new_gelu).
      
      * Encoder prediction score OK
      
      * Encoder pretrained model OK
      
      * Update retriever comments
      
      * Update docs, tests, and imports
      
      * Prune unused models
      
      * Make embedder as a module `RealmEmbedder`
      
      * Add RealmRetrieverOutput
      
      * Update tokenization
      
      * Pass all tests in test_modeling_realm.py
      
      * Prune RealmModel
      
      * Update docs
      
      * Add training test.
      
      * Remove completed TODO
      
      * Style & Quality
      
      * Prune `RealmModel`
      
      * Fixup
      
      * Changes:
      1. Remove RealmTokenizerFast
      2. Update docstrings
      3. Add a method to RealmTokenizer to handle candidates tokenization.
      
      * Fix up
      
      * Style
      
      * Add tokenization tests
      
      * Update `from_pretrained` tests
      
      * Apply suggestions
      
      * Style & Quality
      
      * Copy BERT model
      
      * Fix comment to avoid docstring copying
      
      * Make RealmBertModel private
      
      * Fix bug
      
      * Style
      
      * Basic QA
      
      * Save
      
      * Complete reader logits
      
      * Add searcher
      
      * Complete searcher & reader
      
      * Move block records init to constructor
      
      * Fix training bug
      
      * Add some outputs to RealmReader
      
      * Add finetuned checkpoint variable names parsing
      
      * Fix bug
      
      * Update REALM config
      
      * Add RealmForOpenQA
      
      * Update convert_tfrecord logits
      
      * Fix bugs
      
      * Complete imports
      
      * Update docs
      
      * Update naming
      
      * Add brute-force searcher
      
      * Pass realm model tests
      
      * Style
      
      * Exclude RealmReader from common tests
      
      * Fix
      
      * Fix
      
      * convert docs
      
      * up
      
      * up
      
      * more make style
      
      * up
      
      * upload
      
      * up
      
      * Fix
      
      * Update src/transformers/__init__.py
      
      * adapt testing
      
      * change modeling code
      
      * fix test
      
      * up
      
      * up
      
      * up
      
      * correct more
      
      * make retriever work
      
      * update
      
      * make style
      
      * finish main structure
      
      * Resolve merge conflict
      
      * Make everything work
      
      * Style
      
      * Fixup
      
      * Fixup
      
      * Update training test
      
      * fix retriever
      
      * remove hardcoded path
      
      * Fix
      
      * Fix modeling test
      
      * Update model links
      
      * Initial retrieval test
      
      * Fix modeling test
      
      * Complete retrieval tests
      
      * Fix
      
      * style
      
      * Fix tests
      
      * Fix docstring example
      
      * Minor fix of retrieval test
      
      * Update license headers and docs
      
      * Apply suggestions from code review
      
      * Style
      
      * Apply suggestions from code review
      
      * Add an example to RealmEmbedder
      
      * Fix
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
    • `is_ctc` needs to be updated to `self.type == "ctc"`. (#15194) · dea563c9
      Nicolas Patry authored
      * `is_ctc` needs to be updated to `self.type == "ctc"`.
      
      * Adding fast test for this functionality.
  3. 17 Jan, 2022 1 commit
  4. 16 Jan, 2022 1 commit
  5. 14 Jan, 2022 5 commits
  6. 13 Jan, 2022 2 commits
  7. 12 Jan, 2022 2 commits
  8. 11 Jan, 2022 5 commits
  9. 10 Jan, 2022 2 commits
    • Add TFVisionEncoderDecoderModel (#14148) · b67fd797
      Yih-Dar authored
      * Start the work on TFVisionEncoderDecoderModel
      
      * Expose TFVisionEncoderDecoderModel
      
      * fix import
      
      * Add modeling_tf_vision_encoder_decoder to _ignore_modules in get_model_modules()
      
      * reorder
      
      * Apply the fix for checkpoint loading as in #14016
      
      * remove attention_mask + fix VISION_DUMMY_INPUTS
      
      * A minimal change to make TF generate() work for vision models as encoder in encoder-decoder setting
      
      * fix wrong condition: shape_list(input_ids) == 2
      
      * add tests
      
      * use personal TFViTModel checkpoint (for now)
      
      * Add equivalence tests + projection layer
      
      * style
      
      * make sure projection layer can run
      
      * Add examples
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Clean comments (need to work on TODOs for PyTorch models)
      
      * Remove TF -> PT in check_pt_tf_equivalence for TFVisionEncoderDecoderModel
      
      * fixes
      
      * Revert changes in PT code.
      
      * Update tests/test_modeling_tf_vision_encoder_decoder.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Add test_inference_coco_en for TF test
      
      * fix quality
      
      * fix name
      
      * build doc
      
      * add main_input_name
      
      * Fix ckpt name in test
      
      * fix diff between master and this PR
      
      * fix doc
      
      * fix style and quality
      
      * fix missing doc
      
      * fix labels handling
      
      * Delete auto.rst
      
      * Add the changes done in #14016
      
      * fix prefix
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * make style
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    • Make OpenAIGPTTokenizer work with SpaCy 2.x and 3.x (#15019) · a54961c5
      cody-moveworks authored
      * Make OpenAIGPTTokenizer work with SpaCy 3.x
      
      SpaCy 3.x introduced an API change to creating the tokenizer that
      breaks OpenAIGPTTokenizer. The old API for creating the tokenizer in
      SpaCy 2.x no longer works under SpaCy 3.x, but the new API for creating
      the tokenizer in SpaCy 3.x DOES work under SpaCy 2.x. Switching to the
      new API should allow OpenAIGPTTokenizer to work under both SpaCy 2.x and
      SpaCy 3.x versions.
      
      * Add is_spacy_available and is_ftfy_available methods to file utils
      
      * Add spacy and ftfy unittest decorator to testing utils
      
      * Add tests for OpenAIGPTTokenizer that require spacy and ftfy
      
      * Modify CircleCI config to run tests that require spacy and ftfy
      
      * Remove unneeded unittest decorators and reuse test code
      
      * Run make fixup
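The commit message does not name the API calls involved. As a hedged sketch, one way to obtain a tokenizer that works under both SpaCy 2.x and 3.x is to read the `tokenizer` attribute of a blank `English` pipeline. The helper name `build_spacy_tokenizer` is hypothetical, and the assumption that this is the "new API" the commit refers to is not confirmed by the source.

```python
def build_spacy_tokenizer():
    """Return a SpaCy English tokenizer, or None if SpaCy is not installed.

    Hypothetical helper for illustration; not the tokenizer code from the commit.
    """
    try:
        from spacy.lang.en import English
    except ImportError:
        # Mirrors the optional-dependency handling the commit mentions.
        return None
    nlp = English()
    # `nlp.tokenizer` is available on Language objects in SpaCy 2.x and 3.x.
    return nlp.tokenizer
```

A caller can treat a `None` result as the signal to fall back to another tokenizer, which keeps SpaCy an optional dependency.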
  10. 06 Jan, 2022 5 commits
  11. 05 Jan, 2022 3 commits
  12. 04 Jan, 2022 2 commits