- 10 Jan, 2022 4 commits
-
-
cody-moveworks authored
* Make OpenAIGPTTokenizer work with SpaCy 3.x
  SpaCy 3.x changed the API for creating the tokenizer, which breaks OpenAIGPTTokenizer. The old SpaCy 2.x API no longer works under SpaCy 3.x, but the new SpaCy 3.x API DOES work under SpaCy 2.x. Switching to the new API should allow OpenAIGPTTokenizer to work under both SpaCy 2.x and SpaCy 3.x.
* Add is_spacy_available and is_ftfy_available methods to file utils
* Add spacy and ftfy unittest decorator to testing utils
* Add tests for OpenAIGPTTokenizer that require spacy and ftfy
* Modify CircleCI config to run tests that require spacy and ftfy
* Remove unneeded unittest decorators and reuse test code
* Run make fixup
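A minimal sketch of the kind of optional-dependency check the message above describes. It is not the code from the commit: `is_package_available` and `build_word_tokenizer` are hypothetical names, the `str.split` fallback is an assumption made for illustration, and `nlp.tokenizer` is assumed here to be the creation path that works under both SpaCy 2.x and 3.x.

```python
import importlib.util


def is_package_available(name: str) -> bool:
    """Return True if a top-level package can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


def build_word_tokenizer():
    """Return a callable mapping a string to a list of word strings.

    Hypothetical helper: uses SpaCy when it is installed, otherwise
    falls back to whitespace splitting (an assumption for this sketch).
    """
    if is_package_available("spacy"):
        from spacy.lang.en import English

        nlp = English()
        # Read the tokenizer off the pipeline object instead of building
        # it through the language defaults.
        spacy_tokenizer = nlp.tokenizer
        return lambda text: [token.text for token in spacy_tokenizer(text)]
    return str.split
```

The check uses `importlib.util.find_spec` so that probing for an optional package does not pay the cost of importing it.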
-
Kamal Raj authored
added new line
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Patrick von Platen authored
* up * up * up * up * up * up * improve * up * up * Update src/transformers/trainer.py * up * up * up
-
- 08 Jan, 2022 1 commit
-
-
yoquankara authored
* Fix convert for newer megatron-lm models
* Save megatron-bert config in a proper way
* Fix code style
-
- 07 Jan, 2022 3 commits
-
-
Yih-Dar authored
* fix doc example - TypeError: get_text_features() got an unexpected keyword argument 'token_type_ids'
* add token_type_ids param
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* fix doc example - ValueError: Parameter config should be an instance of class `PretrainedConfig`
* Update src/transformers/models/segformer/modeling_segformer.py
  Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
-
K.C. Tung authored
-
- 06 Jan, 2022 8 commits
-
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
flozi00 authored
-
Tavin Turner authored
-
Nicolas Patry authored
-
NielsRogge authored
-
Matt Churgin authored
-
Nicolas Patry authored
-
Yih-Dar authored
* add image captioning example
* update README
* fix style & quality
* simplify
* apply review suggestions
* Apply suggestions from code review
  Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Apply suggestions from code review
  Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Apply review suggestions
* add comments about using np instead jax array
* remove unused lines
* add model creation script
* only support from_pretrained
* fix style
* fix
* not use cache_dir when creating model
* fix tokenizer creation
* update README
* fix quality
* apply suggestion
* simplify some blocks
* Update examples/flax/image-captioning/README.md
* Update examples/flax/image-captioning/run_image_captioning_flax.py
  Co-authored-by: Suraj Patil <surajp815@gmail.com>
* apply suggestion
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
-
- 05 Jan, 2022 6 commits
-
-
Suraj Patil authored
-
Patrick von Platen authored
-
Patrick von Platen authored
-
Nicolas Patry authored
* Adding QoL for `batch_size` arg (like others enabled everywhere).
* Typo.
-
Yih-Dar authored
* fix doc example - AttributeError: 'numpy.ndarray' object has no attribute 'to'
* fix more
* Apply suggestions from code review
* Update src/transformers/models/unispeech/modeling_unispeech.py
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
Stas Bekman authored
* [megatron convert] PYTHONPATH requirements
* more info
-
- 04 Jan, 2022 5 commits
-
-
Kevin Ko authored
* Update parallelism.mdx * Update parallelism.mdx
-
Nicolas Patry authored
* Hotfix `chunk_length_s` instead of `_ms`.
* Adding fix of `pad_token` which should be last/previous token for CTC proper decoding
* Fixing ChunkPipeline unwrapping.
* Adding a PackIterator specific test.
-
Daniel Stancl authored
* Add FlaxRoFormer
* Clean code + make quality
* Fix output pooling for FlaxRoFormerForMultipleChoiceModule
* Apply suggestions from code review
* add flax model to repos
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
milyiyo authored
-
flozi00 authored
-
- 03 Jan, 2022 7 commits
-
-
Kevin Ko authored
* Update parallelism.mdx * Update parallelism.mdx * Update parallelism.mdx * Update parallelism.mdx * Update parallelism.mdx * Update parallelism.mdx * Update parallelism.mdx * Update parallelism.mdx
-
Patrick von Platen authored
* up * up * up
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Anton Lozhkov authored
* Naive ASR chunking
* Fixing batching for ASR.
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
-
Nicolas Patry authored
* Enabling `truncation_side` for Slow and Fast tokenizer.
  Co-Authored-by: Niels Rogge <48327001+NielsRogge@users.noreply.github.com>
* Disable failing tests.
* Layout xlm.
* assert -> assertEqual.
Co-authored-by: Niels Rogge <48327001+NielsRogge@users.noreply.github.com>
-
Nicolas Patry authored
Backward compatibility broken in https://github.com/huggingface/transformers/pull/14988
-
Sylvain Gugger authored
* Map model_type and doc pages names
* Add script
* Fix typo
* Quality
* Manual check for Auto
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
-
- 30 Dec, 2021 6 commits
-
-
Sylvain Gugger authored
* Allow training to resume even if RNG states are not properly loaded
* Proper f-string
-
Nicolas Patry authored
* Enabling `tokenizers` upgrade.
* Moved ugly comment.
* Tokenizers==0.11.1 needs an update to keep borrow checker happy in highly contiguous calls.
* Support both 0.11.1 and 0.11.0
-
Nicolas Patry authored
* Adding `num_return_sequences` support for text2text generation.
  Co-Authored-By: Enze <pu.miao@foxmail.com>
* Update tests/test_pipelines_text2text_generation.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update tests/test_pipelines_text2text_generation.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Enze <pu.miao@foxmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Patrick von Platen authored
* [Generate] correct encoder_outputs are passed without attention_mask
* Apply suggestions from code review
* up
-
Patrick von Platen authored
* [AutoProcessor] Correct AutoProcessor and automatically add processor class * up * up * up * up * up * up * up * up * continue tomorrow * up * up * up * make processor class private * fix loop
-
Nicolas Patry authored
* Fixing a pathological case for slow tokenizers
* Update src/transformers/tokenization_utils.py
-