- 01 Apr, 2021 1 commit
-
-
Josh authored
* Update optimization.py Fix documentation to reflect optimal settings for Adafactor * update and expand on the recommendations * style * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * flip scale_parameter to True for the 2nd recommendatoin Co-authored-by:
Stas Bekman <stas@stason.org> Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 31 Mar, 2021 13 commits
-
-
Hemil Desai authored
* Add initial script for finetuning MLM models with accelerate * Add evaluation metric calculation * Fix bugs * Use no_grad on evaluation * update script docstring * Update examples/language-modeling/run_mlm_no_trainer.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * PR feedback * Fix CI failure * Update examples/language-modeling/run_mlm_no_trainer.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
JohnnyC08 authored
In the group by length documentation length is misspelled as legnth
-
Patrick von Platen authored
-
Sylvain Gugger authored
* Replace is_sagemaker_distributed_available * Merge SageMakerTrainer into Trainer * Test with shorter condition * Put back deleted line * Deprecate SageMakerTrainer and SageMakerTrainingArguments * Apply suggestions from code review Co-authored-by:
Philipp Schmid <32632186+philschmid@users.noreply.github.com> Co-authored-by:
Philipp Schmid <32632186+philschmid@users.noreply.github.com>
-
Patrick von Platen authored
-
Sylvain Gugger authored
* First third * Styling and fix mistake * Quality * All the rest * Treat %s and %d * typo * Missing ) * Apply suggestions from code review Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
Sylvain Gugger authored
* Add more metadata to the user agent * Fix typo * Use DISABLE_TELEMETRY * Address review comments * Use global env * Add clean envs on circle CI
-
Suraj Patil authored
-
Lysandre Debut authored
-
Lysandre Debut authored
-
Philipp Schmid authored
* wrong makefile command * ddp test fix
-
WybeKoper authored
* Fixed typos * Removed legacy colab notebook from readme Co-authored-by:WybeKoper <WybeKoper@users.noreply.github.com>
-
Patrick von Platen authored
* add first code structures * add all bert models * add to init and docs * correct docs * make style
-
- 30 Mar, 2021 11 commits
-
-
Yih-Dar authored
-
Philipp Schmid authored
* added py7zr * comment out check_min for sagemaker test * added min version again
-
Nicolas Patry authored
a fully qualified model. We simply forgot to change the call for this one when this landed: https://github.com/huggingface/transformers/pull/10888 It's odd that tests didn't catch that. Should we add some ? (It's a pretty edgy test case, but it does run within the API).
-
Philipp Schmid authored
* improved branch usage * fixed grammar and comma
-
Suraj Patil authored
* fix checkpoint names * auto model * fix doc
-
Patrick von Platen authored
-
Suraj Patil authored
* lets begin * boom boom * fix out proj in attn * fix attention * fix local attention * add tokenizer * fix imports * autotokenizer * fix checkpoint name * cleanup * more clean-up * more cleanup * output attentions * fix attn mask creation * fix imports * config doc * add tests * add slow tests * quality * add conversion script * copyright * typo * another bites the dust * fix attention tests * doc * add embed init in convert function * fix copies * remove tokenizer * enable caching * address review comments * improve config and create attn layer list internally * more consistent naming * init hf config from mesh-tf config json file * remove neo tokenizer from doc * handle attention_mask in local attn layer * attn_layers => attention_layers * add tokenizer_class in config * fix docstring * raise if len of attention_layers is not same as num_layers * remove tokenizer_class from config * more consistent naming * fix doc * fix checkpoint names * fp16 compat * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Philipp Schmid authored
-
Patrick von Platen authored
* save intermediate * finish first version * delete some more * improve import * fix roberta * Update src/transformers/modeling_flax_pytorch_utils.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_flax_pytorch_utils.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * small corrections * apply all comments * fix deterministic * make fix-copies Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Philipp Schmid authored
* init * first working test * added todo for setup.py * working test for single node multi node ddp and smd * added tensorflow single node test * added directory for pytorch and tensorflow due to different requirements.txt * added directory for pytorch and tensorflow * added comment for run_glue until it is available * added output_dir to it * smaller dataset to make test running faster * adjust HP and script * adjusted parameter for tensorflow * refactored test scripts * adjusted make file * init * first working test * added todo for setup.py * working test for single node multi node ddp and smd * added tensorflow single node test * added directory for pytorch and tensorflow due to different requirements.txt * added directory for pytorch and tensorflow * added comment for run_glue until it is available * added output_dir to it * smaller dataset to make test running faster * adjust HP and script * adjusted parameter for tensorflow * refactored test scripts * adjusted make file * updated dlc container * commented in all tests * added both ecr images * added new master branches * debug * added new datasets version * init * strange rebase bug * removed changes * changed min version for tests to work * updated DLC * added model parallel test * removed test files * removed test files * tested with ned dlc * added correct sagemaker sdk version * adjust DLCs for official one * reworked tests * quality * removed default profile added documentation to it * added step in release for sagemaker tests * reverted version for example script removed duplicated script and added install from master to requirements.txt * removed mistaken .DS_Stores from mac * fixed tests * added Sylvains feedback * make style * added lysandre's feedback
-
Vasudev Gupta authored
* init bigbird * model.__init__ working, conversion script ready, config updated * add conversion script * BigBirdEmbeddings working :) * slightly update conversion script * BigBirdAttention working :) ; some bug in layer.output.dense * add debugger-notebook * forward() working for BigBirdModel :) ; replaced gelu with gelu_fast * tf code adapted to torch till rand_attn in bigbird_block_sparse_attention ; till now everything working :) * BigBirdModel working in block-sparse attention mode :) * add BigBirdForPreTraining * small fix * add tokenizer for BigBirdModel * fix config & hence modeling * fix base prefix * init testing * init tokenizer test * pos_embed must be absolute, attn_type=original_full when add_cross_attn=True , nsp loss is optional in BigBirdForPreTraining, add assert statements * remove position_embedding_type arg * complete normal tests * add comments to block sparse attention * add attn_probs for sliding & global tokens * create fn for block sparse attn mask creation * add special tests * restore pos embed arg * minor fix * attn probs update * make big bird fully gpu friendly * fix tests * remove pruning * correct tokenzier & minor fixes * update conversion script , remove norm_type * tokenizer-inference test add * remove extra comments * add docs * save intermediate * finish trivia_qa conversion * small update to forward * correct qa and layer * better error message * BigBird QA ready * fix rebased * add triva-qa debugger notebook * qa setup * fixed till embeddings * some issue in q/k/v_layer * fix bug in conversion-script * fixed till self-attn * qa fixed except layer norm * add qa end2end test * fix gradient ckpting ; other qa test * speed-up big bird a bit * hub_id=google * clean up * make quality * speed up einsum with bmm * finish perf improvements for big bird * remove wav2vec2 tok * fix tokenizer * include docs * correct docs * add helper to auto pad block size * make style * remove fast tokenizer for now * fix some * add pad test * finish * fix some bugs * fix another bug * fix buffer tokens * fix comment and merge from master * add comments * make style * commit some suggestions Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix typos * fix some more suggestions * add another patch Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix copies * another path Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * update * update nit suggestions * make style Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
- 29 Mar, 2021 11 commits
-
-
Sylvain Gugger authored
* Fixes in the templates * Define in all cases * Dimensionality -> Dimension Co-authored-by:Lysandre <lysandre.debut@reseau.eseo.fr>
-
Stas Bekman authored
* fix cpu mem metrics; reformat runtime metric * adjust dependency * extend docs * soft dependency * cleanup * fix the runtime metric issue * restore * move docs, cross reference from 2 places, improve * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Daniel Stancl authored
* Initial commit * Another bunch of updates * make style quliaty + delete debug arg from bash script * Use compue_metrics func * Do a few fixes * Add copyright * Fix typos
-
pcuenca authored
A new argument `length_column_name` has been added to `TrainingArguments`, with default value `"length"`. If this column exists and `group_by_length` is `True`, the train sampler will use it for grouping rather than computing it before training starts. This is an optimization that allows the user to prepare data for fast processing, preventing sequential access to the dataset as described in issue #10909.
-
Sylvain Gugger authored
-
Daniel Stancl authored
* Add NER example with accelerate library * This commit contains the first (yet really unfinished) version of a script for showing how to train HuggingFace model with their new accelerate library. * Fix metric calculation * make style quality * mv ner_no_trainer to token-classification dir * Delete --debug flag from running script * hf_datasets -> raw_datasets * Make a few slight adjustments * Add an informative comment + rewrite a help comment * Change header * Fix a few things * Enforce to use fast tokenizers only * DataCollatorWithPadding -> DataCollatorForTokenClassification * Change bash script: python3 -> accelerate launch * make style * Add a few missing things (see below) * Add a max-lenghth padding to predictions and labels to enable accelerate gather functionality * Add PyTorch no trainer example to the example README.md * Remove --do-train from args as being redundant for now * DataCollatorWithPadding -> DataCollatorForTokenClassification * Remove some obsolete args.do_train conditions from the script * Delete --do_train from bash running script * Delete use_slow_tokenizer from args * Add unintentionally removed flag --label_all_tokens * Delete --debug flag from running script
-
Sylvain Gugger authored
* Instantiate model only once in pipeline * Remove documentation of deprecated method * Add FutureWarning * Update src/transformers/pipelines/base.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
Masatoshi Suzuki authored
-
WybeKoper authored
Co-authored-by:WybeKoper <WybeKoper@users.noreply.github.com>
-
Guillaume Filion authored
-
- 28 Mar, 2021 1 commit
-
-
Bhadresh Savani authored
-
- 26 Mar, 2021 3 commits
-
-
Sylvain Gugger authored
* Add ImageFeatureExtractionMixin * Add dummy vision objects * Add require_vision * Add tests * Fix test
-
Tomy Hsieh authored
* Rename NLP library to Datasets library * Update github template * Fix styling
-