- 29 Mar, 2021 6 commits
-
-
Sylvain Gugger authored
-
Daniel Stancl authored
* Add NER example with accelerate library * This commit contains the first (yet really unfinished) version of a script for showing how to train a HuggingFace model with their new accelerate library. * Fix metric calculation * make style quality * mv ner_no_trainer to token-classification dir * Delete --debug flag from running script * hf_datasets -> raw_datasets * Make a few slight adjustments * Add an informative comment + rewrite a help comment * Change header * Fix a few things * Enforce to use fast tokenizers only * DataCollatorWithPadding -> DataCollatorForTokenClassification * Change bash script: python3 -> accelerate launch * make style * Add a few missing things (see below) * Add a max-length padding to predictions and labels to enable accelerate gather functionality * Add PyTorch no trainer example to the example README.md * Remove --do-train from args as being redundant for now * DataCollatorWithPadding -> DataCollatorForTokenClassification * Remove some obsolete args.do_train conditions from the script * Delete --do_train from bash running script * Delete use_slow_tokenizer from args * Add unintentionally removed flag --label_all_tokens * Delete --debug flag from running script
-
Sylvain Gugger authored
* Instantiate model only once in pipeline * Remove documentation of deprecated method * Add FutureWarning * Update src/transformers/pipelines/base.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
Masatoshi Suzuki authored
-
WybeKoper authored
Co-authored-by: WybeKoper <WybeKoper@users.noreply.github.com>
-
Guillaume Filion authored
-
- 28 Mar, 2021 1 commit
-
-
Bhadresh Savani authored
-
- 26 Mar, 2021 3 commits
-
-
Sylvain Gugger authored
* Add ImageFeatureExtractionMixin * Add dummy vision objects * Add require_vision * Add tests * Fix test
-
Tomy Hsieh authored
* Rename NLP library to Datasets library * Update github template * Fix styling
-
- 25 Mar, 2021 7 commits
-
-
lexhuismans authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Amir Tahmasbi authored
* Added embeddings layer * Added layoutlm layers, main model, maskedlm and token classification classes * Added model classes to tf auto models * Added model to PT to TF conversion script * Added model to doc README * Added tests * Removed unused imports * Added layoutlm model, test, and doc for sequence classification, and fix imports in __init__.py * Made tests pass! * Fixed typos in imports and docs * Fixed a typo in embeddings layer * Removed imports * Fixed formatting issues, imports, tests * Added layoutlm layers, main model, maskedlm and token classification classes * Added model classes to tf auto models * Added model to PT to TF conversion script * Removed unused imports * Added layoutlm model, test, and doc for sequence classification, and fix imports in __init__.py * Made tests pass! * Fixed typos in imports and docs * Removed imports * Fixed small formatting issues * Removed duplicates import from main __init__.py * Changed default arg to true for adding pooling layer to tf layoutlm * Fixed formatting issues * Style * Added copied from to classes copied from bert * Fixed doc strings examples to work with layoutlm inputs * Removed PyTorch reference in doc strings example * Added integration tests * Cleaned up initialization file * Updated model checkpoint identifiers * Fixed imports Co-authored-by:
Amir Tahmasbi <amir@ehsai.ca> Co-authored-by:
Lysandre <lysandre.debut@reseau.eseo.fr>
-
Philipp Schmid authored
-
Jethro Kuan authored
Use the correct variable (raw_datasets) instead of the module (datasets) where appropriate.
-
- 24 Mar, 2021 6 commits
-
-
Sidd Karamcheti authored
-
Sylvain Gugger authored
* Remove version warning in pretrained BART models * Put it at the base model
-
Lysandre Debut authored
* Removes overflowing bad word IDs * Raise warning
-
Eliza Szczechla authored
Co-authored-by: Eliza <eliza@habanero.tiger.com.pl>
-
imzhengzx authored
The original code on line 246 is ``` tokenizer: Optional["PreTrainedTokenizerBase"] = None, ``` and it should be ``` tokenizer: Optional[PreTrainedTokenizerBase] = None, ```
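The commit message does not say why the unquoted form is preferred. As a rough illustration of what differs between the two spellings, the sketch below uses a hypothetical stand-in class named `PreTrainedTokenizerBase` and two hypothetical dataclasses (`QuotedCollator`, `DirectCollator`); none of these are the library's actual definitions. The quoted name is stored as an unresolved forward reference, while the unquoted name is bound to the class object when the annotation is evaluated.

```python
from dataclasses import dataclass
from typing import Optional, get_type_hints


class PreTrainedTokenizerBase:
    """Hypothetical stand-in for the real tokenizer base class."""


@dataclass
class QuotedCollator:
    # Quoted name: kept as a forward reference and resolved only on demand.
    tokenizer: Optional["PreTrainedTokenizerBase"] = None


@dataclass
class DirectCollator:
    # Unquoted name: the class must already be defined or imported here.
    tokenizer: Optional[PreTrainedTokenizerBase] = None


# Raw annotations as stored on each class.
quoted_raw = QuotedCollator.__annotations__["tokenizer"]
direct_raw = DirectCollator.__annotations__["tokenizer"]

# Resolving the forward reference needs a namespace that contains the name.
namespace = {"PreTrainedTokenizerBase": PreTrainedTokenizerBase}
quoted_resolved = get_type_hints(QuotedCollator, globalns=namespace)["tokenizer"]

print("raw annotations equal:", quoted_raw == direct_raw)
print("resolved annotation equals direct one:", quoted_resolved == direct_raw)
```

Under these assumptions, the two raw annotations compare unequal until the forward reference is resolved, so any tool that inspects the raw annotation sees a different object in each case.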
-
Sylvain Gugger authored
-
- 23 Mar, 2021 12 commits
-
-
Sylvain Gugger authored
-
Philipp Schmid authored
* rewrote is_sagemaker_model_parallel_available * added is_sagemaker_model_parallel_available to SageMakerTrainer * removed unnecessary mp_parameters as TrainingArguments * make style happy * added mp_parameters again to parse mp-specific args.
-
RafaelWO authored
-
Bhadresh Savani authored
* added predict stage * added test keyword in exception message * removed example specific saving predictions * fixed f-string error * removed extra line Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
-
Stas Bekman authored
* import refactor * fix the fallback
-
Lysandre authored
-
Philipp Schmid authored
* added finished documentation * changed version from 1.6 to 1.6.0 for distributed * updated versions * updated urls
-
Sylvain Gugger authored
-
Marta Maślankowska authored
-
Bhadresh Savani authored
-
Stas Bekman authored
-
Sylvain Gugger authored
-
- 22 Mar, 2021 5 commits
-
-
Patrick von Platen authored
* push * finish * finish * make fix copies * change name
-
Eliza Szczechla authored
Co-authored-by: Eliza <eliza@habanero.tiger.com.pl>
-
Ruan Chaves authored
* Modify the _hp_search_setup method on the Trainer class to handle the wandb argument passed by Ray Tune to model config. * Reformat single quotes as double quotes.
-
Boris Dayma authored
* feat: ensure unique artifact id * feat: allow manual init * fix: simplify reinit logic * fix: no dropped value + immediate commits * fix: wandb use in sagemaker * docs: improve documentation and formatting * fix: typos * docs: improve formatting
-
Sidd Karamcheti authored
Add simple one character fix so that on_step_begin and on_step_end are called at the right times (#10839)
-