1. 10 Jan, 2022 1 commit
    • cody-moveworks's avatar
      Make OpenAIGPTTokenizer work with SpaCy 2.x and 3.x (#15019) · a54961c5
      cody-moveworks authored
      * Make OpenAIGPTTokenizer work with SpaCy 3.x
      
      SpaCy 3.x introduced an API change to creating the tokenizer that
      breaks OpenAIGPTTokenizer. The old API for creating the tokenizer in
      SpaCy 2.x no longer works under SpaCy 3.x, but the new API for creating
      the tokenizer in SpaCy 3.x DOES work under SpaCy 2.x. Switching to the
      new API should allow OpenAIGPTTokenizer to work under both SpaCy 2.x and
      SpaCy 3.x versions.
      
      * Add is_spacy_available and is_ftfy_available methods to file utils
      
      * Add spacy and ftfy unittest decorator to testing utils
      
      * Add tests for OpenAIGPTTokenizer that require spacy and ftfy
      
      * Modify CircleCI config to run tests that require spacy and ftfy
      
      * Remove unneeded unittest decorators are reuse test code
      
      * Run make fixup
      a54961c5
  2. 27 Dec, 2021 1 commit
    • Sylvain Gugger's avatar
      Doc styler v2 (#14950) · 87e6e4fe
      Sylvain Gugger authored
      * New doc styler
      
      * Fix issue with args at the start
      
      * Code sample fixes
      
      * Style code examples in MDX
      
      * Fix more patterns
      
      * Typo
      
      * Typo
      
      * More patterns
      
      * Do without black for now
      
      * Get more info in error
      
      * Docstring style
      
      * Re-enable check
      
      * Quality
      
      * Fix add_end_docstring decorator
      
      * Fix docstring
      87e6e4fe
  3. 21 Dec, 2021 1 commit
    • Sylvain Gugger's avatar
      Convert docstrings of modeling files (#14850) · 7af80f66
      Sylvain Gugger authored
      * Convert file_utils docstrings to Markdown
      
      * Test on BERT
      
      * Return block indent
      
      * Temporarily disable doc styler
      
      * Remove from quality checks as well
      
      * Remove doc styler mess
      
      * Remove check from circleCI
      
      * Fix typo
      
      * Convert file_utils docstrings to Markdown
      
      * Test on BERT
      
      * Return block indent
      
      * Temporarily disable doc styler
      
      * Remove from quality checks as well
      
      * Remove doc styler mess
      
      * Remove check from circleCI
      
      * Fix typo
      
      * Let's go on all other model files
      
      * Add templates too
      
      * Styling and quality
      7af80f66
  4. 17 Dec, 2021 1 commit
  5. 09 Dec, 2021 1 commit
  6. 08 Dec, 2021 1 commit
  7. 06 Dec, 2021 2 commits
  8. 02 Dec, 2021 1 commit
  9. 01 Dec, 2021 1 commit
    • Sylvain Gugger's avatar
      Doc new front (#14590) · 4df7d05a
      Sylvain Gugger authored
      
      
      * Convert PretrainedConfig doc to Markdown
      
      * Use syntax
      
      * Add necessary doc files (#14496)
      
      * Doc fixes (#14499)
      
      * Fixes for the new front
      
      * Convert DETR file for table
      
      * Title is needed
      
      * Simplify a bit
      
      * Even simpler
      
      * Remove imports
      
      * Fix typo in toctree (#14516)
      
      * Fix checkpoints badge
      
      * Update versions.yml format (#14517)
      
      * Doc new front github actions (#14512)
      
      * Doc new front github actions
      
      * Fix docstring
      
      * Fix feature extraction utils import (#14515)
      
      * Address Julien's comments
      
      * Push to doc-builder
      
      * Ready for merge
      
      * Remove old build and deploy
      
      * Doc misc fixes (#14583)
      
      * Rm versions.yml from doc
      
      * Fix converting.rst
      
      * Rm pretrained_models from toctree
      
      * Fix index links (#14567)
      
      * Fix links in README
      
      * Localized READMEs
      
      * Fix copy script
      
      * Fix find doc script
      
      * Update README_ko.md
      Co-authored-by: default avatarJulien Chaumond <julien@huggingface.co>
      Co-authored-by: default avatarJulien Chaumond <julien@huggingface.co>
      
      * Adapt build command to new CLI tools (#14578)
      
      * Fix typo
      
      * Fix doc interlinks (#14589)
      
      * Convert PretrainedConfig doc to Markdown
      
      * Use syntax
      
      * Rm pattern <[a-z]+(.html).*>
      
      * Rm huggingface.co/transformers/master
      
      * Rm .html
      
      * Rm .html from index.mdx
      
      * Rm .html from model_summary.rst
      
      * Update index.mdx rm html
      
      * Update remove .html
      
      * Fix inner doc links
      
      * Fix interlink in preprocssing.rst
      
      * Update pr_checks
      Co-authored-by: default avatarSylvain Gugger <sylvain.gugger@gmail.com>
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Convert PretrainedConfig doc to Markdown
      
      * Use syntax
      
      * Add necessary doc files (#14496)
      
      * Doc fixes (#14499)
      
      * Fixes for the new front
      
      * Convert DETR file for table
      
      * Title is needed
      
      * Simplify a bit
      
      * Even simpler
      
      * Remove imports
      
      * Fix checkpoints badge
      
      * Fix typo in toctree (#14516)
      
      * Update versions.yml format (#14517)
      
      * Doc new front github actions (#14512)
      
      * Doc new front github actions
      
      * Fix docstring
      
      * Fix feature extraction utils import (#14515)
      
      * Address Julien's comments
      
      * Push to doc-builder
      
      * Ready for merge
      
      * Remove old build and deploy
      
      * Doc misc fixes (#14583)
      
      * Rm versions.yml from doc
      
      * Fix converting.rst
      
      * Rm pretrained_models from toctree
      
      * Fix index links (#14567)
      
      * Fix links in README
      
      * Localized READMEs
      
      * Fix copy script
      
      * Fix find doc script
      
      * Update README_ko.md
      Co-authored-by: default avatarJulien Chaumond <julien@huggingface.co>
      Co-authored-by: default avatarJulien Chaumond <julien@huggingface.co>
      
      * Adapt build command to new CLI tools (#14578)
      
      * Fix typo
      
      * Fix doc interlinks (#14589)
      
      * Convert PretrainedConfig doc to Markdown
      
      * Use syntax
      
      * Rm pattern <[a-z]+(.html).*>
      
      * Rm huggingface.co/transformers/master
      
      * Rm .html
      
      * Rm .html from index.mdx
      
      * Rm .html from model_summary.rst
      
      * Update index.mdx rm html
      
      * Update remove .html
      
      * Fix inner doc links
      
      * Fix interlink in preprocssing.rst
      
      * Update pr_checks
      Co-authored-by: default avatarSylvain Gugger <sylvain.gugger@gmail.com>
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Styling
      Co-authored-by: default avatarMishig Davaadorj <mishig.davaadorj@coloradocollege.edu>
      Co-authored-by: default avatarLysandre Debut <lysandre@huggingface.co>
      Co-authored-by: default avatarJulien Chaumond <julien@huggingface.co>
      4df7d05a
  10. 30 Nov, 2021 1 commit
    • Kamal Raj's avatar
      Tapas tf (#13393) · c468a87a
      Kamal Raj authored
      * TF Tapas first commit
      
      * updated docs
      
      * updated logger message
      
      * updated pytorch weight conversion
      script to support scalar array
      
      * added use_cache to tapas model config to
      work properly with tf input_processing
      
      * 1. rm embeddings_sum
      2. added # Copied
      3. + TFTapasMLMHead
      4. and lot other small fixes
      
      * updated docs
      
      * + test for tapas
      
      * updated testing_utils to check
      is_tensorflow_probability_available
      
      * converted model logits post processing using
      numpy to work with both PT and TF models
      
      * + TFAutoModelForTableQuestionAnswering
      
      * added TF support
      
      * added test for
      TFAutoModelForTableQuestionAnswering
      
      * added test for
      TFAutoModelForTableQuestionAnswering pipeline
      
      * updated auto model docs
      
      * fixed typo in import
      
      * added tensorflow_probability to run tests
      
      * updated MLM head
      
      * updated tapas.rst with TF  model docs
      
      * fixed optimizer import in docs
      
      * updated convert to np
      data from pt model is not
      `transformers.tokenization_utils_base.BatchEncoding`
      after pipeline upgrade
      
      * updated pipeline:
      1. with torch.no_gard removed, pipeline forward handles
      2. token_type_ids converted to numpy
      
      * updated docs.
      
      * removed `use_cache` from config
      
      * removed floats_tensor
      
      * updated code comment
      
      * updated Copyright Year and
      logits_aggregation Optional
      
      * updated docs and comments
      
      * updated docstring
      
      * fixed model weight loading
      
      * make fixup
      
      * fix indentation
      
      * added tf slow pipeline test
      
      * pip upgrade
      
      * upgrade python to 3.7
      
      * removed from_pt from tests
      
      * revert commit f18cfa9
      c468a87a
  11. 24 Nov, 2021 1 commit
  12. 19 Nov, 2021 2 commits
  13. 17 Nov, 2021 1 commit
  14. 16 Nov, 2021 1 commit
  15. 03 Nov, 2021 1 commit
  16. 29 Oct, 2021 4 commits
  17. 28 Oct, 2021 1 commit
  18. 14 Oct, 2021 1 commit
  19. 06 Oct, 2021 1 commit
  20. 30 Sep, 2021 1 commit
  21. 29 Sep, 2021 1 commit
  22. 27 Sep, 2021 1 commit
  23. 25 Sep, 2021 1 commit
  24. 16 Sep, 2021 1 commit
  25. 10 Sep, 2021 1 commit
  26. 01 Sep, 2021 2 commits
  27. 31 Aug, 2021 3 commits
  28. 30 Aug, 2021 2 commits
    • Li-Huai (Allan) Lin's avatar
      Correct wrong function signatures on the docs website (#13198) · ffecfea9
      Li-Huai (Allan) Lin authored
      * Correct outdated function signatures on website.
      
      * Upgrade sphinx to 3.5.4 (latest 3.x)
      
      * Test
      
      * Test
      
      * Test
      
      * Test
      
      * Test
      
      * Test
      
      * Revert unnecessary changes.
      
      * Change sphinx version to 3.5.4"
      
      * Test python 3.7.11
      ffecfea9
    • NielsRogge's avatar
      Add LayoutLMv2 + LayoutXLM (#12604) · b6ddb08a
      NielsRogge authored
      
      
      * First commit
      
      * Make style
      
      * Fix dummy objects
      
      * Add Detectron2 config
      
      * Add LayoutLMv2 pooler
      
      * More improvements, add documentation
      
      * More improvements
      
      * Add model tests
      
      * Add clarification regarding image input
      
      * Improve integration test
      
      * Fix bug
      
      * Fix another bug
      
      * Fix another bug
      
      * Fix another bug
      
      * More improvements
      
      * Make more tests pass
      
      * Make more tests pass
      
      * Improve integration test
      
      * Remove gradient checkpointing and add head masking
      
      * Add integration test
      
      * Add LayoutLMv2ForSequenceClassification to the tests
      
      * Add LayoutLMv2ForQuestionAnswering
      
      * More improvements
      
      * More improvements
      
      * Small improvements
      
      * Fix _LazyModule
      
      * Fix fast tokenizer
      
      * Move sync_batch_norm to a separate method
      
      * Replace dummies by requires_backends
      
      * Move calculation of visual bounding boxes to separate method + update README
      
      * Add models to main init
      
      * First draft
      
      * More improvements
      
      * More improvements
      
      * More improvements
      
      * More improvements
      
      * More improvements
      
      * Remove is_split_into_words
      
      * More improvements
      
      * Simply tesseract - no use of pandas anymore
      
      * Add LayoutLMv2Processor
      
      * Update is_pytesseract_available
      
      * Fix bugs
      
      * Improve feature extractor
      
      * Fix bug
      
      * Add print statement
      
      * Add truncation of bounding boxes
      
      * Add tests for LayoutLMv2FeatureExtractor and LayoutLMv2Tokenizer
      
      * Improve tokenizer tests
      
      * Make more tokenizer tests pass
      
      * Make more tests pass, add integration tests
      
      * Finish integration tests
      
      * More improvements
      
      * More improvements - update API of the tokenizer
      
      * More improvements
      
      * Remove support for VQA training
      
      * Remove some files
      
      * Improve feature extractor
      
      * Improve documentation and one more tokenizer test
      
      * Make quality and small docs improvements
      
      * Add batched tests for LayoutLMv2Processor, remove fast tokenizer
      
      * Add truncation of labels
      
      * Apply suggestions from code review
      
      * Improve processor tests
      
      * Fix failing tests and add suggestion from code review
      
      * Fix tokenizer test
      
      * Add detectron2 CI job
      
      * Simplify CI job
      
      * Comment out non-detectron2 jobs and specify number of processes
      
      * Add pip install torchvision
      
      * Add durations to see which tests are slow
      
      * Fix tokenizer test and make model tests smaller
      
      * Frist draft
      
      * Use setattr
      
      * Possible fix
      
      * Proposal with configuration
      
      * First draft of fast tokenizer
      
      * More improvements
      
      * Enable fast tokenizer tests
      
      * Make more tests pass
      
      * Make more tests pass
      
      * More improvements
      
      * Addd padding to fast tokenizer
      
      * Mkae more tests pass
      
      * Make more tests pass
      
      * Make all tests pass for fast tokenizer
      
      * Make fast tokenizer support overflowing boxes and labels
      
      * Add support for overflowing_labels to slow tokenizer
      
      * Add support for fast tokenizer to the processor
      
      * Update processor tests for both slow and fast tokenizers
      
      * Add head models to model mappings
      
      * Make style & quality
      
      * Remove Detectron2 config file
      
      * Add configurable option to label all subwords
      
      * Fix test
      
      * Skip visual segment embeddings in test
      
      * Use ResNet-18 backbone in tests instead of ResNet-101
      
      * Proposal
      
      * Re-enable all jobs on CI
      
      * Fix installation of tesseract
      
      * Fix failing test
      
      * Fix index table
      
      * Add LayoutXLM doc page, first draft of code examples
      
      * Improve documentation a lot
      
      * Update expected boxes for Tesseract 4.0.0 beta
      
      * Use offsets to create labels instead of checking if they start with ##
      
      * Update expected boxes for Tesseract 4.1.1
      
      * Fix conflict
      
      * Make variable names cleaner, add docstring, add link to notebooks
      
      * Revert "Fix conflict"
      
      This reverts commit a9b46ce9afe47ebfcfe7b45e6a121d49e74ef2c5.
      
      * Revert to make integration test pass
      
      * Apply suggestions from @LysandreJik's review
      
      * Address @patrickvonplaten's comments
      
      * Remove fixtures DocVQA in favor of dataset on the hub
      Co-authored-by: default avatarLysandre <lysandre.debut@reseau.eseo.fr>
      b6ddb08a
  29. 13 Aug, 2021 1 commit
  30. 10 Aug, 2021 1 commit
  31. 09 Aug, 2021 1 commit