- 04 Aug, 2022 5 commits
-
Sylvain Gugger authored
-
Kian Sierra McGettigan authored
* swag_no_trainer updated with gather_metrics
* Removed unused variable samples_seen
-
Michael Benayoun authored
* Enable HFTracer to trace with custom dummy inputs instead of pre-computed ones
* Add HFTracer.trace docstring, and make it possible to handle callable and torch.nn.Module in general
* Remove pdb comment
* Apply suggestions
-
nlpcat authored
* change shape to support dynamic batch input in tf.generate
* add tests

Co-authored-by: nlpcatcode <nlpcodecat@gmail.com>
-
Thomas Wang authored
* Cleanup some code
* Improve signatures
* Try to reduce the number of reshape/copies
* I don't think we actually need the layer_num scaling trick
* No need for duplication
* Try to fix beam_search
* Fix beam search
* Removing layer num normalization seems to be breaking
* Not sure self.layer_number normalization actually matters
* Try and be backward compatible
* Try to fix beam_search
* Revert attempt to be backward compatible
* Improve documentation on past_key_values format
* Optimize the device allocation in case of hidden_states in multiple devices
* No need to manually cast the values to a specific device
* Rename with long version of variables
* Improve type hinting
* Add comment that explains that some methods return views
* Actually I think the attention casting only makes sense when we use torch.float16
* We don't actually need layer_number to be passed anymore
* Fix FX test
* Bypass torch.baddbmm
* Apply suggestions from code review
* Add comment about support for TorchScript v1.11
* fix ONNX support for bloom (#18456)

Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>
Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>
-
- 03 Aug, 2022 10 commits
-
LSinev authored
Comparisons like `version.parse(torch.__version__) > version.parse("1.6")` are True for torch==1.6.0+cu101 or torch==1.6.0+cpu; `version.parse(version.parse(torch.__version__).base_version)` is preferred (and available in pytorch_utils.py).
-
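The pitfall described above can be sketched with `packaging.version` (a minimal illustration; the version string is just an example of a local-version build):

```python
from packaging import version

v = "1.6.0+cu101"  # e.g. torch.__version__ for a CUDA build

# Naive comparison: the "+cu101" local segment sorts *above* the plain
# release, so a 1.6.0 build is wrongly reported as newer than 1.6.
print(version.parse(v) > version.parse("1.6"))  # True

# Stripping to base_version compares only the release "1.6.0".
base = version.parse(version.parse(v).base_version)
print(base > version.parse("1.6"))   # False
print(base >= version.parse("1.6"))  # True
```

This is why a shared helper built on `base_version` is preferable to ad-hoc `version.parse(torch.__version__)` comparisons scattered across the codebase.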
Sayak Paul authored
* fix: keras fit tests for segformer tf and minor refactors.
* refactor: test_keras_fit to make it simpler using the existing one.
* fix: styling issues.
-
Alara Dirik authored
-
Daniel Suess authored
* Fix failing test_xla_generate_slow tests
* Fix failing speech-to-text xla_generate tests
-
Omar Sanseviero authored
* Update pinned hhub version
* Make style
-
Ritik Nandwal authored
* Update no_trainer script for image-classification
* Update no_trainer scripts for language-modeling examples
* Remove unused variable
* Removing truncation from losses array for language modeling examples
-
Ian Castillo authored
* Add file in spanish docs to be translated
* Translate first two sections to Spanish
* Translate four additional sections to Spanish
* Finish translation to Spanish
* Improve writing style in Spanish
* Add suggested changes from reviewer
-
Gary Miguel authored
* support ONNX export of XDropout in deberta{,_v2}
* black
* copy to sew_d
* add test
* isort
* use pytest.mark.filterwarnings
* review comments
-
Steven Liu authored
This PR moves GroupViT and LXMert to their correct sections. As pointed out by @NielsRogge and @LysandreJik, GroupViT and LXMert are both multimodal models.
-
Sourab Mangrulkar authored
-
- 02 Aug, 2022 10 commits
-
Christopher Akiki authored
The current wording makes it sound as if the programming languages are part of the 46 natural languages.
-
David authored
* Update pipeline word heuristic to work with whitespace in token offsets. This change checks for whitespace in the input string at either the character preceding the token or in the first character of the token. It works with tokenizers that return offsets excluding whitespace between words or with offsets including whitespace. Fixes #18111
* Use smaller model, ensure expected tokenization
* Re-run CI (please squash)
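The heuristic described above can be illustrated with a small standalone function (the name `starts_new_word` is hypothetical; the actual pipeline code differs):

```python
def starts_new_word(text: str, start: int) -> bool:
    """A token starts a new word if there is whitespace either at the
    character preceding its offset or at its first character. This covers
    tokenizers whose offsets exclude the space between words as well as
    tokenizers whose offsets include it."""
    if start == 0:
        return True  # beginning of the input always starts a word
    return text[start - 1].isspace() or text[start].isspace()

text = "hello world"
print(starts_new_word(text, 0))  # True  (start of string)
print(starts_new_word(text, 6))  # True  ("w" is preceded by a space)
print(starts_new_word(text, 5))  # True  (offset points at the space itself)
print(starts_new_word(text, 3))  # False (mid-word "l")
```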
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
João Lages authored
* improve generate docstring
* Remove 'defaults to None' comment
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Alara Dirik authored
* update maskformer docs
* fix typo
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Piotr Dabkowski authored
`torch.Tensor` creates an uninitialized tensor (as via `torch.empty`); this leads to nondeterministic behavior, poor initialization, and NaNs if you have an unlucky init. The paper does not specify the initialization for bias terms, so zero seems like a good choice: no bias initially. `torch.Tensor` memory is usually populated with zeros, so this fix will be close to the intended behavior:
```
>>> torch.Tensor(100, 100).sum()
tensor(0.)
>>> torch.Tensor(100, 100).sum()
tensor(nan)
>>> torch.Tensor(100, 100).sum()
tensor(0.)
```
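A minimal sketch of the deterministic alternative (the parameter name and size are illustrative, not the actual model code):

```python
import torch
from torch import nn

hidden_size = 100  # illustrative size

# Problematic pattern: torch.Tensor allocates uninitialized memory,
# so parameter values (and any downstream loss) are nondeterministic.
# bias = nn.Parameter(torch.Tensor(hidden_size))

# Deterministic alternative in the spirit of the fix: start from zeros.
bias = nn.Parameter(torch.zeros(hidden_size))
assert torch.isfinite(bias).all()  # never NaN, regardless of allocator state
```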
-
- 01 Aug, 2022 15 commits
-
Yassine authored
-
Kelvin Kong authored
* Added option for users to modify config parameter used by pytesseract during feature extraction
  - Added optional 'tess_config' kwarg when setting up LayoutLMV2 processor that is used by pytesseract during feature extraction
  - E.g. can be used to modify psm values by setting tess_config to '--psm 7'
  - Different psm values significantly influence the output of layoutlmv2
* Update src/transformers/models/layoutlmv2/feature_extraction_layoutlmv2.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/layoutlmv2/feature_extraction_layoutlmv2.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Updated variable names to be more explicit
* Fixed styles
* Added option for users to modify config parameter when calling pytesseract during feature extraction
  - Added option to set "tesseract_config" parameter during LayoutLMV3 processor initialization
  - Can be used to modify PSM values, e.g. by setting tesseract_config="--psm 6"
* Removed from function signature

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
-
Steven Liu authored
* 📝 split up model list
* Adapt script to reorg
* apply niels feedback

Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
-
Sylvain Gugger authored
* Rewrite push_to_hub to use upload_files
* Adapt the doc a bit
* Address review comments and clean doc
-
Duong A. Nguyen authored
* add bart pretraining flax script
* fixup
* add bart pretraining flax script
* add BART to README
* add BART to README
* add BART to README
* add BART to README
* add BART to README
* add bos eos document
* Update README.md
* Update README.md
* Update examples/flax/language-modeling/run_bart_dlm_flax.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* final
* final
* final
* remove use_auth_token in from_config

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
-
Sylvain Gugger authored
* Fix ROUGE add example check and update README
* Stay consistent in values
-
Ikuya Yamada authored
* add LUKE models for downstream tasks
* add new LUKE models to docs
* fix typos
* remove commented lines
* exclude None items from tuple return values
-
NielsRogge authored
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
-
Sylvain Gugger authored
* Add balanced strategies for device_map in from_pretrained
* Add safeguards for Accelerate version
* Update src/transformers/modeling_utils.py

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

* Style

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
-
NielsRogge authored
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
-
Arthur authored
-
Sylvain Gugger authored
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
YouJiacheng authored
Fix #18385. I don't know whether `use_auth_token`, `cache_dir` and `local_files_only` should be passed to `(cls.slow_tokenizer_class)._from_pretrained`, but I guess they should.
-
amyeroberts authored
-