- 04 Oct, 2023 1 commit
-
-
Sylvain Gugger authored
* Fix number of minimal calls to the Hub with peft integration * Alternate design * And this way? * Revert * Nits to fix * Add util * Print when changes are made * Add list to ignore * Add more rules * Manual fixes * deal with kwargs * deal with enum defaults * avoid many digits for floats * Manual fixes * Fix regex * Fix regex * Auto fix * Style * Apply script * Add ignored list * Add check that templates are filled * Adding to CI checks * Add back semi-fix * Ignore more objects * More auto-fixes * Ignore missing objects * Remove temp semi-fix * Fixes * Update src/transformers/models/pvt/configuration_pvt.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update utils/check_docstrings.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/utils/quantization_config.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Deal with float defaults * Fix small defaults * Address review comment * Treat * Post-rebase cleanup * Address review comment * Update src/transformers/models/deprecated/mctct/configuration_mctct.py Co-authored-by:
Lysandre Debut <lysandre.debut@reseau.eseo.fr> * Address review comment --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by:
Lysandre Debut <lysandre.debut@reseau.eseo.fr>
-
- 28 Jun, 2023 1 commit
-
-
amyeroberts authored
Make sure feature_extractor is defined in all cases
-
- 06 Feb, 2023 1 commit
-
-
Sylvain Gugger authored
* Result of black 23.1 * Update target to Python 3.7 * Switch flake8 to ruff * Configure isort * Configure isort * Apply isort with line limit * Put the right black version * adapt black in check copies * Fix copies
-
- 09 Dec, 2022 1 commit
-
-
amyeroberts authored
* Replace FE references with IPs * Update processor tests * Update src/transformers/models/clip/processing_clip.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/clip/processing_clip.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update warning messages v4.27 -> v5 * Fixup * Update Chinese CLIP processor * Add feature_extractor property * Add attributes * Add tests Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 10 Nov, 2022 1 commit
-
-
Sanchit Gandhi authored
* [processor] Add 'model input names' property * add test * no f string * add generic property method to mixin * copy to multimodal * copy to vision * tests for all audio * remove ad-hoc tests * style * fix flava test * fix test * fix processor code
-
- 21 Oct, 2022 1 commit
-
-
Sylvain Gugger authored
* Use None to detect if truncation was unset * Fix repo consistency
-
- 24 May, 2022 1 commit
-
-
NielsRogge authored
* Make forward pass work * More improvements * Remove unused imports * Remove timm dependency * Improve loss calculation of token classifier * Fix most tests * Add docs * Add model integration test * Make all tests pass * Add LayoutLMv3FeatureExtractor * Improve integration test + make fixup * Add example script * Fix style * Add LayoutLMv3Processor * Fix style * Add option to add visual labels * Make more tokenizer tests pass * Fix more tests * Make more tests pass * Fix bug and improve docs * Fix import of processors * Improve docstrings * Fix toctree and improve docs * Fix auto tokenizer * Move tests to model folder * Move tests to model folder * change default behavior add_prefix_space * add prefix space for fast * add_prefix_spcae set to True for Fast * no space before `unique_no_split` token * add test to hightligh special treatment of added tokens * fix `test_batch_encode_dynamic_overflowing` by building a long enough example * fix `test_full_tokenizer` with add_prefix_token * Fix tokenizer integration test * Make the code more readable * Add tests for LayoutLMv3Processor * Fix style * Add model to README and update init * Apply suggestions from code review * Replace asserts by value errors * Add suggestion by @ducviet00 * Add model to doc tests * Simplify script * Improve README * a step ahead to fix * Update pair_input_test * Make all tokenizer tests pass - phew * Make style * Add LayoutLMv3 to CI job * Fix auto mapping * Fix CI job name * Make all processor tests pass * Make tests of LayoutLMv2 and LayoutXLM consistent * Add copied from statements to fast tokenizer * Add copied from statements to slow tokenizer * Remove add_visual_labels attribute * Fix tests * Add link to notebooks * Improve docs of LayoutLMv3Processor * Fix reference to section Co-authored-by:
SaulLu <lucilesaul.com@gmail.com> Co-authored-by:
Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
-
- 09 May, 2022 1 commit
-
-
ghlai9665 authored
LayoutLMv2Processor: ensure 1-to-1 mapping between images and samples in case of overflowing tokens (#17092) * add get_overflowing_images function to ensure 1-to-1 mapping between samples and images in LayoutLMv2Processor * make style * add test for overflowing_tokens, change assert to ValueError, avoiding unrelated formatting changes * change line length by passing --preview into black
-
- 23 Mar, 2022 1 commit
-
-
Sylvain Gugger authored
* Split file_utils in several submodules * Fixes * Add back more objects * More fixes * Who exactly decided to import that from there? * Second suggestion to code with code review * Revert wront move * Fix imports * Adapt all imports * Adapt all imports everywhere * Revert this import, will fix in a separate commit
-
- 09 Feb, 2022 1 commit
-
-
Sylvain Gugger authored
* PoC for a ProcessorMixin class * Documentation * Apply suggestions from code review Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by:
Suraj Patil <surajp815@gmail.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * Roll out to other processors * Add base feature extractor class in init * Use args and kwargs Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by:
Suraj Patil <surajp815@gmail.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
-
- 30 Dec, 2021 1 commit
-
-
Patrick von Platen authored
* [AutoProcessor] Correct AutoProcessor and automatically add processor class * up * up * up * up * up * up * up * up * continue tomorrow * up * up * up * make processor class private * fix loop
-
- 27 Dec, 2021 1 commit
-
-
Sylvain Gugger authored
* New doc styler * Fix issue with args at the start * Code sample fixes * Style code examples in MDX * Fix more patterns * Typo * Typo * More patterns * Do without black for now * Get more info in error * Docstring style * Re-enable check * Quality * Fix add_end_docstring decorator * Fix docstring
-
- 21 Dec, 2021 1 commit
-
-
Sylvain Gugger authored
* Convert docstrings of all configurations and tokenizers * Processors and fixes * Last modeling files and fixes to models * Pipeline modules * Utils files * Data submodule * All the other files * Style * Missing examples * Style again * Fix copies * Say bye bye to rst docstrings forever
-
- 30 Aug, 2021 1 commit
-
-
NielsRogge authored
* First commit * Make style * Fix dummy objects * Add Detectron2 config * Add LayoutLMv2 pooler * More improvements, add documentation * More improvements * Add model tests * Add clarification regarding image input * Improve integration test * Fix bug * Fix another bug * Fix another bug * Fix another bug * More improvements * Make more tests pass * Make more tests pass * Improve integration test * Remove gradient checkpointing and add head masking * Add integration test * Add LayoutLMv2ForSequenceClassification to the tests * Add LayoutLMv2ForQuestionAnswering * More improvements * More improvements * Small improvements * Fix _LazyModule * Fix fast tokenizer * Move sync_batch_norm to a separate method * Replace dummies by requires_backends * Move calculation of visual bounding boxes to separate method + update README * Add models to main init * First draft * More improvements * More improvements * More improvements * More improvements * More improvements * Remove is_split_into_words * More improvements * Simply tesseract - no use of pandas anymore * Add LayoutLMv2Processor * Update is_pytesseract_available * Fix bugs * Improve feature extractor * Fix bug * Add print statement * Add truncation of bounding boxes * Add tests for LayoutLMv2FeatureExtractor and LayoutLMv2Tokenizer * Improve tokenizer tests * Make more tokenizer tests pass * Make more tests pass, add integration tests * Finish integration tests * More improvements * More improvements - update API of the tokenizer * More improvements * Remove support for VQA training * Remove some files * Improve feature extractor * Improve documentation and one more tokenizer test * Make quality and small docs improvements * Add batched tests for LayoutLMv2Processor, remove fast tokenizer * Add truncation of labels * Apply suggestions from code review * Improve processor tests * Fix failing tests and add suggestion from code review * Fix tokenizer test * Add detectron2 CI job * Simplify CI job * Comment out non-detectron2 jobs and specify number of processes * Add pip install torchvision * Add durations to see which tests are slow * Fix tokenizer test and make model tests smaller * Frist draft * Use setattr * Possible fix * Proposal with configuration * First draft of fast tokenizer * More improvements * Enable fast tokenizer tests * Make more tests pass * Make more tests pass * More improvements * Addd padding to fast tokenizer * Mkae more tests pass * Make more tests pass * Make all tests pass for fast tokenizer * Make fast tokenizer support overflowing boxes and labels * Add support for overflowing_labels to slow tokenizer * Add support for fast tokenizer to the processor * Update processor tests for both slow and fast tokenizers * Add head models to model mappings * Make style & quality * Remove Detectron2 config file * Add configurable option to label all subwords * Fix test * Skip visual segment embeddings in test * Use ResNet-18 backbone in tests instead of ResNet-101 * Proposal * Re-enable all jobs on CI * Fix installation of tesseract * Fix failing test * Fix index table * Add LayoutXLM doc page, first draft of code examples * Improve documentation a lot * Update expected boxes for Tesseract 4.0.0 beta * Use offsets to create labels instead of checking if they start with ## * Update expected boxes for Tesseract 4.1.1 * Fix conflict * Make variable names cleaner, add docstring, add link to notebooks * Revert "Fix conflict" This reverts commit a9b46ce9afe47ebfcfe7b45e6a121d49e74ef2c5. * Revert to make integration test pass * Apply suggestions from @LysandreJik's review * Address @patrickvonplaten's comments * Remove fixtures DocVQA in favor of dataset on the hub Co-authored-by:Lysandre <lysandre.debut@reseau.eseo.fr>
-