- 03 Aug, 2022 1 commit
-
-
Sourab Mangrulkar authored
-
- 02 Aug, 2022 10 commits
-
-
Christopher Akiki authored
The current wording makes it sound as if the programming languages are part of the 46 natural languages.
-
David authored
* Update pipeline word heuristic to work with whitespace in token offsets.
  This change checks for whitespace in the input string at either the character preceding the token or the first character of the token, so it works both with tokenizers whose offsets exclude whitespace between words and with those whose offsets include it. Fixes #18111.
* Use smaller model, ensure expected tokenization
* Re-run CI (please squash)
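For context, a minimal sketch of the heuristic described in this change (not the actual pipeline code; the helper name is made up for illustration):

```python
def is_word_start(text: str, start: int) -> bool:
    """Return True if the token whose offset begins at `start` starts a new word."""
    if start == 0:
        return True
    # Whitespace just before the token: covers offsets that exclude inter-word spaces.
    if text[start - 1].isspace():
        return True
    # Whitespace as the token's first character: covers offsets that include the space.
    return text[start].isspace()


print(is_word_start("hello world", 6))  # True: preceded by a space
print(is_word_start("hello world", 5))  # True: token begins with the space
print(is_word_start("hello world", 3))  # False: mid-word
```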
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
João Lages authored
* Improve generate docstring * Remove 'defaults to None' comment
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Alara Dirik authored
* update maskformer docs * fix typo
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Piotr Dabkowski authored
`torch.Tensor` creates an uninitialized tensor (as with `torch.empty`); this leads to non-deterministic behavior, poor initialization, and NaNs if you get an unlucky init. The paper does not specify the initialization for the bias terms, so zero seems like a good choice: no bias initially. `torch.Tensor` is usually populated with zeros in practice, so this fix stays close to the previously observed behavior:
```
>>> torch.Tensor(100, 100).sum()
tensor(0.)
>>> torch.Tensor(100, 100).sum()
tensor(nan)
>>> torch.Tensor(100, 100).sum()
tensor(0.)
```
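A minimal sketch of the kind of fix described here, assuming the bias parameter was previously built with `torch.Tensor(...)`; the actual module and parameter names in the patch may differ:

```python
import torch
from torch import nn

hidden_size = 768  # illustrative size

# Before: uninitialized memory that may contain garbage or NaNs.
bias_before = nn.Parameter(torch.Tensor(hidden_size))

# After: explicit zeros, i.e. "no bias initially", and fully deterministic.
bias_after = nn.Parameter(torch.zeros(hidden_size))

print(bias_after.sum())  # always tensor(0., grad_fn=<SumBackward0>)
```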
-
- 01 Aug, 2022 19 commits
-
-
Yassine authored
-
Kelvin Kong authored
* Added option for users to modify the config parameter used by pytesseract during feature extraction
  - Added an optional 'tess_config' kwarg when setting up the LayoutLMv2 processor that is passed to pytesseract during feature extraction
  - E.g. it can be used to modify PSM values by setting tess_config to '--psm 7'
  - Different PSM values significantly influence the output of LayoutLMv2
* Update src/transformers/models/layoutlmv2/feature_extraction_layoutlmv2.py
  Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/layoutlmv2/feature_extraction_layoutlmv2.py
  Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Updated variable names to be more explicit
* Fixed styles
* Added option for users to modify the config parameter when calling pytesseract during feature extraction
  - Added the option to set the "tesseract_config" parameter during LayoutLMv3 processor initialization
  - Can be used to modify PSM values, e.g. by setting tesseract_config="--psm 6"
* Removed from function signature

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
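A hedged usage sketch of the option described above, assuming the LayoutLMv3 feature extractor exposes the final `tesseract_config` keyword; applying OCR to real images additionally requires pytesseract:

```python
from transformers import LayoutLMv3FeatureExtractor

# Pass a custom Tesseract config string to the OCR step, e.g. PSM 6
# ("assume a single uniform block of text").
feature_extractor = LayoutLMv3FeatureExtractor(
    apply_ocr=True,
    tesseract_config="--psm 6",
)
```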
-
Steven Liu authored
* 📝 Split up model list
* Adapt script to reorg
* Apply Niels' feedback

Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
-
Sylvain Gugger authored
* Rewrite push_to_hub to use upload_files * Adapt the doc a bit * Address review comments and clean doc
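For reference, a hedged sketch of the user-facing call this rewrite touches; the repository name is illustrative and Hub authentication is assumed:

```python
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")
# Uploads the model files to the Hub (now backed by huggingface_hub's
# file-upload utilities); the repo name is illustrative.
model.push_to_hub("my-username/my-bert-copy")
```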
-
Duong A. Nguyen authored
* add bart pretraining flax script
* fixup
* add bart pretraining flax script
* add BART to README
* add BART to README
* add BART to README
* add BART to README
* add BART to README
* add bos eos document
* Update README.md
* Update README.md
* Update examples/flax/language-modeling/run_bart_dlm_flax.py
  Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* final
* final
* final
* remove use_auth_token in from_config

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
-
Sylvain Gugger authored
* Fix ROUGE, add example check and update README * Stay consistent in values
-
Ikuya Yamada authored
* add LUKE models for downstream tasks * add new LUKE models to docs * fix typos * remove commented lines * exclude None items from tuple return values
-
NielsRogge authored
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
-
Sylvain Gugger authored
* Add balanced strategies for device_map in from_pretrained
* Add safeguards for Accelerate version
* Update src/transformers/modeling_utils.py
  Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
* Style

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
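A hedged usage sketch of the balanced strategies mentioned above, assuming a multi-GPU machine with `accelerate` installed; the checkpoint name is illustrative:

```python
from transformers import AutoModelForCausalLM

# "balanced" splits the weights evenly across the available GPUs;
# "balanced_low_0" keeps GPU 0 lighter, which helps when generate() gathers
# outputs on that device. Checkpoint name is illustrative.
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-1b7",
    device_map="balanced",
)
```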
-
NielsRogge authored
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
-
Arthur authored
-
Sylvain Gugger authored
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
YouJiacheng authored
Fix #18385. I don't know whether `use_auth_token`, `cache_dir` and `local_files_only` should be passed to `(cls.slow_tokenizer_class)._from_pretrained`, but I guess they should.
-
amyeroberts authored
-
Ogundepo Odunayo authored
-
atturaioe authored
* Migrate metric to Evaluate in pytorch examples * Remove unused imports
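A hedged before/after sketch of the migration described above; the metric name is illustrative:

```python
# Before: metrics loaded through the Datasets library.
# from datasets import load_metric
# metric = load_metric("accuracy")

# After: the same metric loaded through the Evaluate library.
import evaluate

metric = evaluate.load("accuracy")
metric.add_batch(predictions=[0, 1, 1], references=[0, 1, 0])
print(metric.compute())  # {'accuracy': 0.666...}
```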
-
dependabot[bot] authored
Bumps [mistune](https://github.com/lepture/mistune) from 0.8.4 to 2.0.3.
- [Release notes](https://github.com/lepture/mistune/releases)
- [Changelog](https://github.com/lepture/mistune/blob/master/docs/changes.rst)
- [Commits](https://github.com/lepture/mistune/compare/v0.8.4...v2.0.3)

---
updated-dependencies:
- dependency-name: mistune
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
dependabot[bot] authored
Bumps [mistune](https://github.com/lepture/mistune) from 0.8.4 to 2.0.3.
- [Release notes](https://github.com/lepture/mistune/releases)
- [Changelog](https://github.com/lepture/mistune/blob/master/docs/changes.rst)
- [Commits](https://github.com/lepture/mistune/compare/v0.8.4...v2.0.3)

---
updated-dependencies:
- dependency-name: mistune
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
- 30 Jul, 2022 1 commit
-
-
Sourab Mangrulkar authored
renaming it
-
- 29 Jul, 2022 6 commits
-
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Michael Benayoun authored
* Bloom model can now be traced * Bloom traced model can be torch scripted and serialized * Bloom can be traced with variable keyword arguments * Enable XLNet support * Disable XLNet for now
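A hedged sketch of what the tracing support enables via `transformers.utils.fx.symbolic_trace`; the checkpoint and input names are illustrative:

```python
from transformers import AutoModelForCausalLM
from transformers.utils.fx import symbolic_trace

# Checkpoint name is illustrative; any BLOOM checkpoint should work once
# tracing support is available.
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")
traced = symbolic_trace(model, input_names=["input_ids", "attention_mask"])
print(type(traced))  # a torch.fx GraphModule that can be scripted and serialized
```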
-
Yih-Dar authored
* Fix some doctests

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Sylvain Gugger authored
* Preliminary work on tokenizers
* Quality + fix tests
* Treat processors
* Fix pad
* Remove all uses of in tests, docs and examples
* Replace all as_target_tokenizer
* Fix tests
* Fix quality
* Update examples/flax/image-captioning/run_image_captioning_flax.py
  Co-authored-by: amyeroberts <amy@huggingface.co>
* Style

Co-authored-by: amyeroberts <amy@huggingface.co>
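A hedged sketch of the pattern this refactor replaces: targets are now passed via `text_target` instead of the `as_target_tokenizer` context manager. The checkpoint name is illustrative:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")  # illustrative checkpoint

src = "translate English to German: Hello, how are you?"
tgt = "Hallo, wie geht es dir?"

# Old pattern (context manager):
# with tokenizer.as_target_tokenizer():
#     labels = tokenizer(tgt)

# New pattern: the target text goes straight into the same call.
batch = tokenizer(src, text_target=tgt, return_tensors="pt")
print(batch["input_ids"].shape, batch["labels"].shape)
```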
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Sanchit Gandhi authored
* [Docs] Fix Speech Encoder Decoder doc sample * improve pre-processing comment * make style
-
- 28 Jul, 2022 3 commits
-
-
Vijay S Kalmath authored
Currently, the TensorFlow examples use the `load_metric` function from the Datasets library; this commit migrates the call to the `load` function from the Evaluate library.
-
Vijay S Kalmath authored
* Migrate metric to Evaluate library in tf examples
  Currently the TensorFlow examples use the `load_metric` function from the Datasets library; this commit migrates the call to the `load` function from the Evaluate library. Fix for #18306.
* Migrate metric to Evaluate library in tf examples
  Currently the TensorFlow examples use the `load_metric` function from the Datasets library; this commit migrates the call to the `load` function from the Evaluate library. Fix for #18306.
* Migrate `metric` to Evaluate for all tf examples
  Currently the TensorFlow examples use the `load_metric` function from the Datasets library; this commit migrates the call to the `load` function from the Evaluate library.
-
Thomas Wang authored
-