"docs/vscode:/vscode.git/clone" did not exist on "fe78fe98ca56c093d0503590cbad0b39ce3326a0"
- 09 Apr, 2021 4 commits
-
-
Kevin Canwen Xu authored
* Add a special tokenizer for CPM model * make style * fix * Add docs * styles * cpm doc * fix ci * fix the overview * add test * make style * typo * Custom tokenizer flag * Add REAMDE.md Co-authored-by:Lysandre <lysandre.debut@reseau.eseo.fr>
-
Sylvain Gugger authored
-
Niklas Muennighoff authored
* Add Wav2Vec Inference notebook * Update docs/source/community.md Co-authored-by:Suraj Patil <surajp815@gmail.com>
-
Stas Bekman authored
* typo * style
-
- 08 Apr, 2021 5 commits
-
-
Stas Bekman authored
* make fairscale and deepspeed setup extras * fix default * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * no reason not to ask for the good version * update the CIs Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Stas Bekman authored
* relocate core integration tests * add sys.path context manager * cleanup * try * try2 * fix path * doc * style * add dep * add 2 more deps
-
Julien Demouth authored
* Add support for NVIDIA Megatron models * Add support for NVIDIA Megatron GPT2 and BERT Add the megatron_gpt2 model. That model reuses the existing GPT2 model. This commit includes a script to convert a Megatron-GPT2 checkpoint downloaded from NVIDIA GPU Cloud. See examples/megatron-models/README.md for details. Add the megatron_bert model. That model is implemented as a modification of the existing BERT model in Transformers. This commit includes a script to convert a Megatron-BERT checkpoint downloaded from NVIDIA GPU Cloud. See examples/megatron-models/README.md for details. * Update src/transformers/models/megatron_bert/configuration_megatron_bert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/megatron_bert/configuration_megatron_bert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/megatron_bert/configuration_megatron_bert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Remove model.half in tests + add "# Copied ..." Remove the model.half() instruction which makes tests fail on the CPU. Add a comment "# Copied ..." before many classes in the model to enable automatic tracking in CI between the new Megatron classes and the original Bert ones. * Fix issues * Fix Flax/TF tests * Fix copyright * Update src/transformers/models/megatron_bert/configuration_megatron_bert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/megatron_bert/configuration_megatron_bert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update docs/source/model_doc/megatron_bert.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/megatron_gpt2.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/__init__.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Resolve most of 'sgugger' comments * Fix conversion issue + Run make fix-copies/quality/docs * Apply suggestions from code review * Causal LM & merge * Fix init * Add CausalLM to last auto class Co-authored-by:
Julien Demouth <jdemouth@nvidia.com> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Lysandre <lysandre.debut@reseau.eseo.fr>
-
Stas Bekman authored
* synced gpus * fix * fix * need to use t5-small for quality tests * notes * complete merge * fix a disappearing std stream problem * start zero3 tests * wip * tune params * sorting out the pre-trained model loading * reworking generate loop wip * wip * style * fix tests * split the tests * refactor tests * wip * parameterized * fix * workout the resume from non-ds checkpoint pass + test * cleanup * remove no longer needed code * split getter/setter functions * complete the docs * suggestions * gpus and their compute capabilities link * Apply suggestions from code review Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * style * remove invalid paramgd * automatically configure zero3 params that rely on hidden size * make _get_resized_embeddings zero3-aware * add test exercising resize_token_embeddings() * add docstring Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
Yusuke Mori authored
-
- 06 Apr, 2021 5 commits
-
-
Sylvain Gugger authored
* AutoFeatureExtractor * Init and first tests * Tests * Damn you gitignore * Quality * Defensive test for when not all backends are here * Use pattern for Speech2Text models
-
Stas Bekman authored
make the example work
-
Lysandre authored
-
Philipp Schmid authored
-
Sylvain Gugger authored
-
- 05 Apr, 2021 5 commits
-
-
Amala Deshmukh authored
* Add example for callback registry Resolves: #9036 * Update callback registry documentation * Added comments for other ways to register callback
-
Lysandre Debut authored
* Documentation about loading a fast tokenizer within Transformers * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * style Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Sylvain Gugger authored
* Refactor AutoModel classes and add Flax Auto classes * Add new objects to the init * Fix hubconf and sort models * Fix TF tests * Missing coma * Update src/transformers/models/auto/auto_factory.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Fix init * Fix dummies * Other init to fix Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
Lysandre Debut authored
-
Eren 艦ahin authored
double : prevents code-block section to be rendered, so made it single :
-
- 01 Apr, 2021 4 commits
-
-
Philipp Schmid authored
* added new notebook and merge of trainer * Update docs/source/sagemaker.md Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
Julien Chaumond authored
-
Joe Davison authored
*negative* log-likelihood
-
NielsRogge authored
* Squash all commits into one * Update ViTFeatureExtractor to use image_utils instead of torchvision * Remove torchvision and add Pillow * Small docs improvement * Address most comments by @sgugger * Fix tests * Clean up conversion script * Pooler first draft * Fix quality * Improve conversion script * Make style and quality * Make fix-copies * Minor docs improvements * Should use fix-copies instead of manual handling * Revert "Should use fix-copies instead of manual handling" This reverts commit fd4e591bce4496d41406425c82606a8fdaf8a50b. * Place ViT in alphabetical order Co-authored-by:
Lysandre <lysandre.debut@reseau.eseo.fr> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 31 Mar, 2021 3 commits
-
-
Patrick von Platen authored
-
Patrick von Platen authored
-
Patrick von Platen authored
* add first code structures * add all bert models * add to init and docs * correct docs * make style
-
- 30 Mar, 2021 4 commits
-
-
Philipp Schmid authored
* improved branch usage * fixed grammar and comma
-
Suraj Patil authored
* fix checkpoint names * auto model * fix doc
-
Suraj Patil authored
* lets begin * boom boom * fix out proj in attn * fix attention * fix local attention * add tokenizer * fix imports * autotokenizer * fix checkpoint name * cleanup * more clean-up * more cleanup * output attentions * fix attn mask creation * fix imports * config doc * add tests * add slow tests * quality * add conversion script * copyright * typo * another bites the dust * fix attention tests * doc * add embed init in convert function * fix copies * remove tokenizer * enable caching * address review comments * improve config and create attn layer list internally * more consistent naming * init hf config from mesh-tf config json file * remove neo tokenizer from doc * handle attention_mask in local attn layer * attn_layers => attention_layers * add tokenizer_class in config * fix docstring * raise if len of attention_layers is not same as num_layers * remove tokenizer_class from config * more consistent naming * fix doc * fix checkpoint names * fp16 compat * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Vasudev Gupta authored
* init bigbird * model.__init__ working, conversion script ready, config updated * add conversion script * BigBirdEmbeddings working :) * slightly update conversion script * BigBirdAttention working :) ; some bug in layer.output.dense * add debugger-notebook * forward() working for BigBirdModel :) ; replaced gelu with gelu_fast * tf code adapted to torch till rand_attn in bigbird_block_sparse_attention ; till now everything working :) * BigBirdModel working in block-sparse attention mode :) * add BigBirdForPreTraining * small fix * add tokenizer for BigBirdModel * fix config & hence modeling * fix base prefix * init testing * init tokenizer test * pos_embed must be absolute, attn_type=original_full when add_cross_attn=True , nsp loss is optional in BigBirdForPreTraining, add assert statements * remove position_embedding_type arg * complete normal tests * add comments to block sparse attention * add attn_probs for sliding & global tokens * create fn for block sparse attn mask creation * add special tests * restore pos embed arg * minor fix * attn probs update * make big bird fully gpu friendly * fix tests * remove pruning * correct tokenzier & minor fixes * update conversion script , remove norm_type * tokenizer-inference test add * remove extra comments * add docs * save intermediate * finish trivia_qa conversion * small update to forward * correct qa and layer * better error message * BigBird QA ready * fix rebased * add triva-qa debugger notebook * qa setup * fixed till embeddings * some issue in q/k/v_layer * fix bug in conversion-script * fixed till self-attn * qa fixed except layer norm * add qa end2end test * fix gradient ckpting ; other qa test * speed-up big bird a bit * hub_id=google * clean up * make quality * speed up einsum with bmm * finish perf improvements for big bird * remove wav2vec2 tok * fix tokenizer * include docs * correct docs * add helper to auto pad block size * make style * remove fast tokenizer for now * fix some * add pad test * finish * fix some bugs * fix another bug * fix buffer tokens * fix comment and merge from master * add comments * make style * commit some suggestions Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix typos * fix some more suggestions * add another patch Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix copies * another path Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * update * update nit suggestions * make style Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
- 29 Mar, 2021 1 commit
-
-
Sylvain Gugger authored
* Instantiate model only once in pipeline * Remove documentation of deprecated method * Add FutureWarning * Update src/transformers/pipelines/base.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
- 26 Mar, 2021 2 commits
-
-
Sylvain Gugger authored
* Add ImageFeatureExtractionMixin * Add dummy vision objects * Add require_vision * Add tests * Fix test
-
Tomy Hsieh authored
* Rename NLP library to Datasets library * Update github template * Fix styling
-
- 25 Mar, 2021 2 commits
-
-
Amir Tahmasbi authored
* Added embeddings layer * Added layoutlm layers, main model, maskedlm and token classification classes * Added model classes to tf auto models * Added model to PT to TF conversion script * Added model to doc README * Added tests * Removed unused imports * Added layoutlm model, test, and doc for sequence classification, and fix imports in __init__.py * Made tests pass! * Fixed typos in imports and docs * Fixed a typo in embeddings layer * Removed imports * Fixed formatting issues, imports, tests * Added layoutlm layers, main model, maskedlm and token classification classes * Added model classes to tf auto models * Added model to PT to TF conversion script * Removed unused imports * Added layoutlm model, test, and doc for sequence classification, and fix imports in __init__.py * Made tests pass! * Fixed typos in imports and docs * Removed imports * Fixed small formatting issues * Removed duplicates import from main __init__.py * Chnaged deafult arg to true for adding pooling layer to tf layoutlm * Fixed formatting issues * Style * Added copied from to classes copied from bert * Fixed doc strings examples to work with layoutlm inputs * Removed PyTorch reference in doc strings example * Added integration tests * Cleaned up initialization file * Updated model checkpoint identifiers * Fixed imports Co-authored-by:
Amir Tahmasbi <amir@ehsai.ca> Co-authored-by:
Lysandre <lysandre.debut@reseau.eseo.fr>
-
Philipp Schmid authored
-
- 24 Mar, 2021 1 commit
-
-
Eliza Szczechla authored
Co-authored-by:Eliza <eliza@habanero.tiger.com.pl>
-
- 23 Mar, 2021 1 commit
-
-
Philipp Schmid authored
* added finished documentation * changed version from 1.6 to 1.6.0 for distributed * updated versions * updated urls
-
- 22 Mar, 2021 1 commit
-
-
Patrick von Platen authored
* push * finish * finish * make fix copies * change name
-
- 21 Mar, 2021 1 commit
-
-
Eric Lam authored
* Add new community notebook - wav2vec2 with GPT * Update:community.md, new nb add * feat: notebook of wav2vec xlsr ctc decoding with gpt logit adjustment * Update: Wav2vec2 CTC decoding with gpt2 adjustment * Update docs/source/community.md Co-authored-by:Suraj Patil <surajp815@gmail.com>
-
- 18 Mar, 2021 1 commit
-
-
Sylvain Gugger authored
-