- 23 Sep, 2020 1 commit
-
-
Stas Bekman authored
* skip decorators: docs, tests, bugs * another important note * style * bloody style * add @pytest.mark.parametrize * add note * no idea what it wants :(
-
- 22 Sep, 2020 5 commits
-
-
Ola Piktus authored
* added rag WIP * path fix * Formatting / renaming prior to actual work * added rag WIP * path fix * Formatting / renaming prior to actual work * added rag WIP * path fix * Formatting / renaming prior to actual work * added rag WIP * Formatting / renaming prior to actual work * First commit * improve comments * Retrieval evaluation scripts * refactor to include modeling outputs + MPI retriever * Fix rag-token model + refactor * Various fixes + finetuning logic * use_bos fix * Retrieval refactor * Finetuning refactoring and cleanup * Add documentation and cleanup * Remove set_up_rag_env.sh file * Fix retrieval with HF index * Fix import errors * Fix quality errors * Refactor as per suggestions in https://github.com/huggingface/transformers/pull/6813#issuecomment-687208867 * fix quality * Fix RAG Sequence generation * minor cleanup plus initial tests * fix test * fix tests 2 * Comments fix * post-merge fixes * Improve readme + post-rebase refactor * Extra dependencies for tests * Fix tests * Fix tests 2 * Refactor test requirements * Fix tests 3 * Post-rebase refactor * rename nlp->datasets * RAG integration tests * add tokenizer to slow integration test and allow retriever to run on cpu * add tests; fix position ids warning * change structure * change structure * add from encoder generator * save working solution * make all integration tests pass * add RagTokenizer.save/from_pretrained and RagRetriever.save/from_pretrained * don't save paths * delete unnecessary imports * pass config to AutoTokenizer.from_pretrained for Rag tokenizers * init wiki_dpr only once * hardcode legacy index and passages paths (todo: add the right urls) * finalize config * finalize retriever api and config api * LegacyIndex index download refactor * add dpr to autotokenizer * make from pretrained more flexible * fix ragfortokengeneration * small name changes in tokenizer * add labels to models * change default index name * add retrieval tests * finish token generate * align test with previous 
version and make all tests pass * add tests * finalize tests * implement thoms suggestions * add first version of test * make first tests work * make retriever platform agnostic * naming * style * add legacy index URL * docstrings + simple retrieval test for distributed * clean model api * add doc_ids to retriever's outputs * fix retrieval tests * finish model outputs * finalize model api * fix generate problem for rag * fix generate for other models * fix some tests * save intermediate * set generate to default * big refactor generate * delete rag_api * correct pip faiss install * fix auto tokenization test * fix faiss install * fix test * move the distributed logic to examples * model page * docs * finish tests * fix dependencies * fix import in __init__ * Refactor eval_rag and finetune scripts * start docstring * add psutil to test * fix tf test * move require torch to top * fix retrieval test * align naming * finish automodel * fix repo consistency * test ragtokenizer save/load * add rag model output docs * fix ragtokenizer save/load from pretrained * fix tokenizer dir * remove torch in retrieval * fix docs * fix finetune scripts * finish model docs * finish docs * remove auto model for now * add require torch * remove solved todos * integrate sylvains suggestions * sams comments * correct mistake on purpose * improve README * Add generation test cases * fix rag token * clean token generate * fix test * add note to test * fix attention mask * add t5 test for rag * Fix handling prefix in finetune.py * don't overwrite index_name Co-authored-by:
Patrick Lewis <plewis@fb.com> Co-authored-by:
Aleksandra Piktus <piktus@devfair0141.h2.fair> Co-authored-by:
Aleksandra Piktus <piktus@learnfair5102.h2.fair> Co-authored-by:
Aleksandra Piktus <piktus@learnfair5067.h2.fair> Co-authored-by:
Your Name <you@example.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Quentin Lhoest <lhoest.q@gmail.com>
-
Lysandre authored
-
Lysandre authored
-
Sylvain Gugger authored
* is_pretokenized -> is_split_into_words * Fix tests
-
Minghao Li authored
* first version * finish test docs readme model/config/tokenization class * apply make style and make quality * fix layoutlm GitHub link * fix conflict in index.rst and add layoutlm to pretrained_models.rst * fix bug in test_parents_and_children_in_mappings * reformat modeling_auto.py and tokenization_auto.py * fix bug in test_modeling_layoutlm.py * Update docs/source/model_doc/layoutlm.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/layoutlm.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * remove inh, add tokenizer fast, and update some doc * copy and rename necessary class from modeling_bert to modeling_layoutlm * Update src/transformers/configuration_layoutlm.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/configuration_layoutlm.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/configuration_layoutlm.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/configuration_layoutlm.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/modeling_layoutlm.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/modeling_layoutlm.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/modeling_layoutlm.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * add mish to activations.py, import ACT2FN and import logging from utils Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
- 20 Sep, 2020 1 commit
-
-
Stas Bekman authored
Found an issue where `@slow` gets ignored when it isn't the last decorator, so this documents why the decorator order matters.
-
- 17 Sep, 2020 1 commit
-
-
Stas Bekman authored
* ready for PR * cleanup * correct FSMT_PRETRAINED_MODEL_ARCHIVE_LIST * fix * perfectionism * revert change from another PR * odd, already committed this one * non-interactive upload workaround * backup the failed experiment * store langs in config * workaround for localizing model path * doc clean up as in https://github.com/huggingface/transformers/pull/6956 * style * back out debug mode * document: run_eval.py --num_beams 10 * remove unneeded constant * typo * re-use bart's Attention * re-use EncoderLayer, DecoderLayer from bart * refactor * send to cuda and fp16 * cleanup * revert (moved to another PR) * better error message * document run_eval --num_beams * solve the problem of tokenizer finding the right files when model is local * polish, remove hardcoded config * add a note that the file is autogenerated to avoid losing changes * prep for org change, remove unneeded code * switch to model4.pt, update scores * s/python/bash/ * missing init (but doesn't impact the finetuned model) * cleanup * major refactor (reuse-bart) * new model, new expected weights * cleanup * cleanup * full link * fix model type * merge porting notes * style * cleanup * have to create a DecoderConfig object to handle vocab_size properly * doc fix * add note (not a public class) * parametrize * - add bleu scores integration tests * skip test if sacrebleu is not installed * cache heavy models/tokenizers * some tweaks * remove tokens that aren't used * more purging * simplify code * switch to using decoder_start_token_id * add doc * Revert "major refactor (reuse-bart)" This reverts commit 226dad15ca6a9ef4e26178526e878e8fc5c85874. 
* decouple from bart * remove unused code #1 * remove unused code #2 * remove unused code #3 * update instructions * clean up * move bleu eval to examples * check import only once * move data+gen script into files * reuse via import * take less space * add prepare_seq2seq_batch (auto-tested) * cleanup * recode test to use json instead of yaml * ignore keys not needed * use the new -y in transformers-cli upload -y * [xlm tok] config dict: fix str into int to match definition (#7034) * [s2s] --eval_max_generate_length (#7018) * Fix CI with change of name of nlp (#7054) * nlp -> datasets * More nlp -> datasets * Woopsie * More nlp -> datasets * One last * extending to support allen_nlp wmt models - allow a specific checkpoint file to be passed - more arg settings - scripts for allen_nlp models * sync with changes * s/fsmt-wmt/wmt/ in model names * s/fsmt-wmt/wmt/ in model names (p2) * s/fsmt-wmt/wmt/ in model names (p3) * switch to a better checkpoint * typo * make non-optional args such - adjust tests where possible or skip when there is no other choice * consistency * style * adjust header * cards moved (model rename) * use best custom hparams * update info * remove old cards * cleanup * s/stas/facebook/ * update scores * s/allen_nlp/allenai/ * url maps aren't needed * typo * move all the doc / build /eval generators to their own scripts * cleanup * Apply suggestions from code review Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Apply suggestions from code review Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * fix indent * duplicated line * style * use the correct add_start_docstrings * oops * resizing can't be done with the core approach, due to 2 dicts * check that the arg is a list * style * style Co-authored-by:
Sam Shleifer <sshleifer@gmail.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
- 16 Sep, 2020 1 commit
-
-
Stas Bekman authored
-
- 15 Sep, 2020 1 commit
-
-
Stas Bekman authored
* [docs] add testing documentation * Update docs/source/testing.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * tweaks as suggested * Update docs/source/testing.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/testing.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/testing.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/testing.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/testing.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/testing.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/testing.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/testing.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/testing.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/testing.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/testing.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * tweaks * Update docs/source/testing.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/testing.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * more tweaks * suggestions from @LysandreJik Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 14 Sep, 2020 3 commits
-
-
sgugger authored
-
Sylvain Gugger authored
-
Bartosz Telenczuk authored
-
- 11 Sep, 2020 2 commits
-
-
Sylvain Gugger authored
-
Sylvain Gugger authored
* More readable dict * More nlp -> datasets * Revert "More nlp -> datasets" This reverts commit 3cd1883d226c63c4a686fc1fed35f2cd586ebe45. * Automate the lists in auto-xxx docs * More readable dict * Revert "More nlp -> datasets" This reverts commit 3cd1883d226c63c4a686fc1fed35f2cd586ebe45. * Automate the lists in auto-xxx docs * nlp -> datasets * Fix new key
-
- 10 Sep, 2020 5 commits
-
-
Patrick von Platen authored
* correct docs for bert generation * upload
-
Patrick von Platen authored
-
Sylvain Gugger authored
* Add TF Funnel Transformer * Proper dummy input * Formatting * Update src/transformers/modeling_tf_funnel.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Address review comments * One review comment forgotten Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
Patrick von Platen authored
* add conversion script * improve conversion script * make style * add tryout files * fix * update * add causal bert * better names * add tokenizer file as well * finish causal_bert * fix small bugs * improve generate * change naming * renaming * renaming * renaming * remove leftover files * clean files * add fix tokenizer * finalize * correct slow test * update docs * small fixes * fix link * adapt check repo * apply sams and sylvains recommendations * fix import * implement Lysandres recommendations * fix logger warn
-
Stas Bekman authored
-
- 09 Sep, 2020 1 commit
-
-
Stas Bekman authored
* introduce TRANSFORMERS_VERBOSITY env var + test + test helpers * cleanup * remove helper function
-
- 08 Sep, 2020 2 commits
-
-
Sam Shleifer authored
-
Sylvain Gugger authored
* Initial model * Fix upsampling * Add special cls token id and test * Formatting * Test and first FunnelTokenizerFast * Common tests * Fix the check_repo script and document Funnel * Doc fixes * Add all models * Write doc * Fix test * Initial model * Fix upsampling * Add special cls token id and test * Formatting * Test and first FunnelTokenizerFast * Common tests * Fix the check_repo script and document Funnel * Doc fixes * Add all models * Write doc * Fix test * Fix copyright * Forgot some layers can be repeated * Apply suggestions from code review Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/modeling_funnel.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Address review comments * Update src/transformers/modeling_funnel.py Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * Address review comments * Update src/transformers/modeling_funnel.py Co-authored-by:
Sam Shleifer <sshleifer@gmail.com> * Slow integration test * Make small integration test * Formatting * Add checkpoint and separate classification head * Formatting * Expand list, fix link and add in pretrained models * Styling * Add the model in all summaries * Typo fixes Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Sam Shleifer <sshleifer@gmail.com>
-
- 03 Sep, 2020 1 commit
-
-
Antonio V Mendoza authored
Adding the LXMERT pretraining model (MultiModal languageXvision) to HuggingFace's suite of models (#5793) * added template files for LXMERT and completed the configuration_lxmert.py * added modeling, tokenization, testing, and finishing touches for lxmert [yet to be tested] * added model card for lxmert * cleaning up lxmert code * Update src/transformers/modeling_lxmert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/modeling_tf_lxmert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/modeling_tf_lxmert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/modeling_lxmert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * tested torch lxmert, changed documentation, updated outputs, and other small fixes * Update src/transformers/convert_pytorch_checkpoint_to_tf2.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/convert_pytorch_checkpoint_to_tf2.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/convert_pytorch_checkpoint_to_tf2.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * renaming, other small issues, did not change TF code in this commit * added lxmert question answering model in pytorch * added capability to edit number of qa labels for lxmert * made answer optional for lxmert question answering * add option to return hidden_states for lxmert * changed default qa labels for lxmert * changed config archive path * squashing 3 commits: merged UI + testing improvements + more UI and testing * changed some variable names for lxmert * TF LXMERT * Various fixes to LXMERT * Final touches to LXMERT * AutoTokenizer order * Add LXMERT to index.rst and README.md * Merge commit test fixes + Style update * TensorFlow 2.3.0 sequential model changes variable names Remove inherited test * Update src/transformers/modeling_tf_pytorch_utils.py * Update docs/source/model_doc/lxmert.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/lxmert.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_tf_lxmert.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * added suggestions * Fixes * Final fixes for TF model * Fix docs Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Lysandre <lysandre.debut@reseau.eseo.fr> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 02 Sep, 2020 2 commits
-
-
Suraj Patil authored
* add Text2TextGenerationPipeline * remove max length warning * remove comments * remove input_length * fix typo * add tests * use TFAutoModelForSeq2SeqLM * doc * typo * add the doc below TextGenerationPipeline * doc nit * style * delete comment
-
Harry Wang authored
-
- 01 Sep, 2020 6 commits
-
-
Patrick von Platen authored
* finish xlm-roberta * finish docs * expose XLMRobertaForCausalLM
-
Lysandre Debut authored
-
Lysandre authored
-
Lysandre authored
-
Patrick von Platen authored
* fix generate for GPT2 Double Head * fix gpt2 double head model * fix bart / t5 * also add for no beam search * fix no beam search * fix encoder decoder * simplify t5 * simplify t5 * fix t5 tests * fix BART * fix transfo-xl * fix conflict * integrating sylvains and sams comments * fix tf past_decoder_key_values * fix enc dec test
-
Sylvain Gugger authored
* Add logging doc * Formatting * Update docs/source/main_classes/logging.rst * Update src/transformers/utils/logging.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
- 27 Aug, 2020 1 commit
-
-
Lysandre Debut authored
-
- 26 Aug, 2020 1 commit
-
-
Patrick von Platen authored
-
- 25 Aug, 2020 1 commit
-
-
Quentin Lhoest authored
* add dpr to models summary * minor * minor * Update docs/source/model_summary.rst qa -> question answering Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_summary.rst qa -> question answering (cont'd) Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 24 Aug, 2020 2 commits
-
-
Sam Shleifer authored
-
Stas Bekman authored
As suggested here: https://github.com/huggingface/transformers/issues/6651#issuecomment-678594233 this removes the generic `generate` doc, whose examples are not relevant to bart.
-
- 21 Aug, 2020 3 commits
-
-
Suraj Patil authored
-
Patrick von Platen authored
* add pegasus to docs * Update docs/source/model_summary.rst
-
Suraj Patil authored
* added CamembertForCausalLM * add in __init__ and auto model * style * doc
-