1. 28 Sep, 2020 1 commit
  2. 24 Sep, 2020 1 commit
  3. 23 Sep, 2020 1 commit
  4. 22 Sep, 2020 2 commits
    • RAG (#6813) · c754c41c
      Ola Piktus authored
      * added rag WIP
      
      * path fix
      
      * Formatting / renaming prior to actual work
      
      * added rag WIP
      
      * path fix
      
      * Formatting / renaming prior to actual work
      
      * added rag WIP
      
      * path fix
      
      * Formatting / renaming prior to actual work
      
      * added rag WIP
      
      * Formatting / renaming prior to actual work
      
      * First commit
      
      * improve comments
      
      * Retrieval evaluation scripts
      
      * refactor to include modeling outputs + MPI retriever
      
      * Fix rag-token model + refactor
      
      * Various fixes + finetuning logic
      
      * use_bos fix
      
      * Retrieval refactor
      
      * Finetuning refactoring and cleanup
      
      * Add documentation and cleanup
      
      * Remove set_up_rag_env.sh file
      
      * Fix retrieval with HF index
      
      * Fix import errors
      
      * Fix quality errors
      
      * Refactor as per suggestions in https://github.com/huggingface/transformers/pull/6813#issuecomment-687208867
      
      
      
      * fix quality
      
      * Fix RAG Sequence generation
      
      * minor cleanup plus initial tests
      
      * fix test
      
      * fix tests 2
      
      * Comments fix
      
      * post-merge fixes
      
      * Improve readme + post-rebase refactor
      
      * Extra dependencies for tests
      
      * Fix tests
      
      * Fix tests 2
      
      * Refactor test requirements
      
      * Fix tests 3
      
      * Post-rebase refactor
      
      * rename nlp->datasets
      
      * RAG integration tests
      
      * add tokenizer to slow integration test and allow retriever to run on cpu
      
      * add tests; fix position ids warning
      
      * change structure
      
      * change structure
      
      * add from encoder generator
      
      * save working solution
      
      * make all integration tests pass
      
      * add RagTokenizer.save/from_pretrained and RagRetriever.save/from_pretrained
      
      * don't save paths
      
      * delete unnecessary imports
      
      * pass config to AutoTokenizer.from_pretrained for Rag tokenizers
      
      * init wiki_dpr only once
      
      * hardcode legacy index and passages paths (todo: add the right urls)
      
      * finalize config
      
      * finalize retriever api and config api
      
      * LegacyIndex index download refactor
      
      * add dpr to autotokenizer
      
      * make from pretrained more flexible
      
      * fix ragfortokengeneration
      
      * small name changes in tokenizer
      
      * add labels to models
      
      * change default index name
      
      * add retrieval tests
      
      * finish token generate
      
      * align test with previous version and make all tests pass
      
      * add tests
      
      * finalize tests
      
      * implement thoms suggestions
      
      * add first version of test
      
      * make first tests work
      
      * make retriever platform agnostic
      
      * naming
      
      * style
      
      * add legacy index URL
      
      * docstrings + simple retrieval test for distributed
      
      * clean model api
      
      * add doc_ids to retriever's outputs
      
      * fix retrieval tests
      
      * finish model outputs
      
      * finalize model api
      
      * fix generate problem for rag
      
      * fix generate for other models
      
      * fix some tests
      
      * save intermediate
      
      * set generate to default
      
      * big refactor generate
      
      * delete rag_api
      
      * correct pip faiss install
      
      * fix auto tokenization test
      
      * fix faiss install
      
      * fix test
      
      * move the distributed logic to examples
      
      * model page
      
      * docs
      
      * finish tests
      
      * fix dependencies
      
      * fix import in __init__
      
      * Refactor eval_rag and finetune scripts
      
      * start docstring
      
      * add psutil to test
      
      * fix tf test
      
      * move require torch to top
      
      * fix retrieval test
      
      * align naming
      
      * finish automodel
      
      * fix repo consistency
      
      * test ragtokenizer save/load
      
      * add rag model output docs
      
      * fix ragtokenizer save/load from pretrained
      
      * fix tokenizer dir
      
      * remove torch in retrieval
      
      * fix docs
      
      * fix finetune scripts
      
      * finish model docs
      
      * finish docs
      
      * remove auto model for now
      
      * add require torch
      
      * remove solved todos
      
      * integrate sylvains suggestions
      
      * sams comments
      
      * correct mistake on purpose
      
      * improve README
      
      * Add generation test cases
      
      * fix rag token
      
      * clean token generate
      
      * fix test
      
      * add note to test
      
      * fix attention mask
      
      * add t5 test for rag
      
      * Fix handling prefix in finetune.py
      
      * don't overwrite index_name
      Co-authored-by: Patrick Lewis <plewis@fb.com>
      Co-authored-by: Aleksandra Piktus <piktus@devfair0141.h2.fair>
      Co-authored-by: Aleksandra Piktus <piktus@learnfair5102.h2.fair>
      Co-authored-by: Aleksandra Piktus <piktus@learnfair5067.h2.fair>
      Co-authored-by: Your Name <you@example.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>
    • Add LayoutLM Model (#7064) · cd9a0585
      Minghao Li authored
      
      
      * first version
      
      * finish test docs readme model/config/tokenization class
      
      * apply make style and make quality
      
      * fix layoutlm GitHub link
      
      * fix conflict in index.rst and add layoutlm to pretrained_models.rst
      
      * fix bug in test_parents_and_children_in_mappings
      
      * reformat modeling_auto.py and tokenization_auto.py
      
      * fix bug in test_modeling_layoutlm.py
      
      * Update docs/source/model_doc/layoutlm.rst
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update docs/source/model_doc/layoutlm.rst
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * remove inh, add tokenizer fast, and update some doc
      
      * copy and rename necessary class from modeling_bert to modeling_layoutlm
      
      * Update src/transformers/configuration_layoutlm.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Update src/transformers/configuration_layoutlm.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Update src/transformers/configuration_layoutlm.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Update src/transformers/configuration_layoutlm.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Update src/transformers/modeling_layoutlm.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Update src/transformers/modeling_layoutlm.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Update src/transformers/modeling_layoutlm.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * add mish to activations.py, import ACT2FN and import logging from utils
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
  5. 17 Sep, 2020 1 commit
    • [ported model] FSMT (FairSeq MachineTranslation) (#6940) · 1eeb206b
      Stas Bekman authored
      * ready for PR
      
      * cleanup
      
      * correct FSMT_PRETRAINED_MODEL_ARCHIVE_LIST
      
      * fix
      
      * perfectionism
      
      * revert change from another PR
      
      * odd, already committed this one
      
      * non-interactive upload workaround
      
      * backup the failed experiment
      
      * store langs in config
      
      * workaround for localizing model path
      
      * doc clean up as in https://github.com/huggingface/transformers/pull/6956
      
      
      
      * style
      
      * back out debug mode
      
      * document: run_eval.py --num_beams 10
      
      * remove unneeded constant
      
      * typo
      
      * re-use bart's Attention
      
      * re-use EncoderLayer, DecoderLayer from bart
      
      * refactor
      
      * send to cuda and fp16
      
      * cleanup
      
      * revert (moved to another PR)
      
      * better error message
      
      * document run_eval --num_beams
      
      * solve the problem of tokenizer finding the right files when model is local
      
      * polish, remove hardcoded config
      
      * add a note that the file is autogenerated to avoid losing changes
      
      * prep for org change, remove unneeded code
      
      * switch to model4.pt, update scores
      
      * s/python/bash/
      
      * missing init (but doesn't impact the finetuned model)
      
      * cleanup
      
      * major refactor (reuse-bart)
      
      * new model, new expected weights
      
      * cleanup
      
      * cleanup
      
      * full link
      
      * fix model type
      
      * merge porting notes
      
      * style
      
      * cleanup
      
      * have to create a DecoderConfig object to handle vocab_size properly
      
      * doc fix
      
      * add note (not a public class)
      
      * parametrize
      
      * - add bleu scores integration tests
      
      * skip test if sacrebleu is not installed
      
      * cache heavy models/tokenizers
      
      * some tweaks
      
      * remove tokens that aren't used
      
      * more purging
      
      * simplify code
      
      * switch to using decoder_start_token_id
      
      * add doc
      
      * Revert "major refactor (reuse-bart)"
      
      This reverts commit 226dad15ca6a9ef4e26178526e878e8fc5c85874.
      
      * decouple from bart
      
      * remove unused code #1
      
      * remove unused code #2
      
      * remove unused code #3
      
      * update instructions
      
      * clean up
      
      * move bleu eval to examples
      
      * check import only once
      
      * move data+gen script into files
      
      * reuse via import
      
      * take less space
      
      * add prepare_seq2seq_batch (auto-tested)
      
      * cleanup
      
      * recode test to use json instead of yaml
      
      * ignore keys not needed
      
      * use the new -y in transformers-cli upload -y
      
      * [xlm tok] config dict: fix str into int to match definition (#7034)
      
      * [s2s] --eval_max_generate_length (#7018)
      
      * Fix CI with change of name of nlp (#7054)
      
      * nlp -> datasets
      
      * More nlp -> datasets
      
      * Woopsie
      
      * More nlp -> datasets
      
      * One last
      
      * extending to support allen_nlp wmt models
      
      - allow a specific checkpoint file to be passed
      - more arg settings
      - scripts for allen_nlp models
      
      * sync with changes
      
      * s/fsmt-wmt/wmt/ in model names
      
      * s/fsmt-wmt/wmt/ in model names (p2)
      
      * s/fsmt-wmt/wmt/ in model names (p3)
      
      * switch to a better checkpoint
      
      * typo
      
      * make non-optional args such - adjust tests where possible or skip when there is no other choice
      
      * consistency
      
      * style
      
      * adjust header
      
      * cards moved (model rename)
      
      * use best custom hparams
      
      * update info
      
      * remove old cards
      
      * cleanup
      
      * s/stas/facebook/
      
      * update scores
      
      * s/allen_nlp/allenai/
      
      * url maps aren't needed
      
      * typo
      
      * move all the doc / build /eval generators to their own scripts
      
      * cleanup
      
      * Apply suggestions from code review
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Apply suggestions from code review
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * fix indent
      
      * duplicated line
      
      * style
      
      * use the correct add_start_docstrings
      
      * oops
      
      * resizing can't be done with the core approach, due to 2 dicts
      
      * check that the arg is a list
      
      * style
      
      * style
      Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
  6. 15 Sep, 2020 1 commit
  7. 10 Sep, 2020 1 commit
    • Add "Leveraging Pretrained Checkpoints for Generation" Seq2Seq models. (#6594) · 7fd1febf
      Patrick von Platen authored
      * add conversion script
      
      * improve conversion script
      
      * make style
      
      * add tryout files
      
      * fix
      
      * update
      
      * add causal bert
      
      * better names
      
      * add tokenizer file as well
      
      * finish causal_bert
      
      * fix small bugs
      
      * improve generate
      
      * change naming
      
      * renaming
      
      * renaming
      
      * renaming
      
      * remove leftover files
      
      * clean files
      
      * add fix tokenizer
      
      * finalize
      
      * correct slow test
      
      * update docs
      
      * small fixes
      
      * fix link
      
      * adapt check repo
      
      * apply sams and sylvains recommendations
      
      * fix import
      
      * implement Lysandres recommendations
      
      * fix logger warn
  8. 08 Sep, 2020 1 commit
  9. 03 Sep, 2020 1 commit
  10. 01 Sep, 2020 1 commit
  11. 17 Aug, 2020 2 commits
  12. 14 Aug, 2020 1 commit
    • MBartForConditionalGeneration (#6441) · 680f1337
      Suraj Patil authored
      * add MBartForConditionalGeneration
      
      * style
      
      * rebase and fixes
      
      * add mbart test in TEST_FILES_WITH_NO_COMMON_TESTS
      
      * fix docs
      
      * don't ignore mbart
      
      * doc
      
      * fix mbart fairseq link
      
      * put mbart before bart
      
      * apply doc suggestions
  13. 11 Aug, 2020 1 commit
  14. 03 Aug, 2020 1 commit
  15. 30 Jul, 2020 1 commit
  16. 29 Jul, 2020 1 commit
  17. 24 Jul, 2020 1 commit
  18. 10 Jul, 2020 1 commit
  19. 07 Jul, 2020 2 commits
    • Guide to fixed-length model perplexity evaluation (#5449) · b4b33fdf
      Joe Davison authored
      * add first draft ppl guide
      
      * upload imgs
      
      * expand on strides
      
      * ref typo
      
      * rm superfluous past var
      
      * add tokenization disclaimer
    • Add DPR model (#5279) · fbd87921
      Quentin Lhoest authored
      
      
      * beginning of dpr modeling
      
      * wip
      
      * implement forward
      
      * remove biencoder + better init weights
      
      * export dpr model to embed model for nlp lib
      
      * add new api
      
      * remove old code
      
      * make style
      
      * fix dumb typo
      
      * don't load bert weights
      
      * docs
      
      * docs
      
      * style
      
      * move the `k` parameter
      
      * fix init_weights
      
      * add pretrained configs
      
      * minor
      
      * update config names
      
      * style
      
      * better config
      
      * style
      
      * clean code based on PR comments
      
      * change Dpr to DPR
      
      * fix config
      
      * switch encoder config to a dict
      
      * style
      
      * inheritance -> composition
      
      * add messages in assert statements
      
      * add dpr reader tokenizer
      
      * one tokenizer per model
      
      * fix base_model_prefix
      
      * fix imports
      
      * typo
      
      * add convert script
      
      * docs
      
      * change tokenizers conf names
      
      * style
      
      * change tokenizers conf names
      
      * minor
      
      * minor
      
      * fix wrong names
      
      * minor
      
      * remove unused convert functions
      
      * rename convert script
      
      * use return_tensors in tokenizers
      
      * remove n_questions dim
      
      * move generate logic to tokenizer
      
      * style
      
      * add docs
      
      * docs
      
      * quality
      
      * docs
      
      * add tests
      
      * style
      
      * add tokenization tests
      
      * DPR full tests
      
      * Stay true to the attention mask building
      
      * update docs
      
      * missing param in bert input docs
      
      * docs
      
      * style
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
  20. 02 Jul, 2020 1 commit
  21. 30 Jun, 2020 1 commit
  22. 27 Jun, 2020 1 commit
  23. 25 Jun, 2020 1 commit
  24. 24 Jun, 2020 2 commits
  25. 22 Jun, 2020 1 commit
  26. 19 Jun, 2020 1 commit
    • Add MobileBert (#4901) · 9a3f9108
      Vasily Shamporov authored
      
      
      * Add MobileBert
      
      * Quality + Conversion script
      
      * style
      
      * Update src/transformers/modeling_mobilebert.py
      
      * Links to S3
      
      * Style
      
      * TFMobileBert
      
      Slight fixes to the pytorch MobileBert
      Style
      
      * MobileBertForMaskedLM (PT + TF)
      
      * MobileBertForNextSentencePrediction (PT + TF)
      
      * MobileFor{MultipleChoice, TokenClassification} (PT + TF)
      
      
      ss
      
      * Tests + Auto
      
      * Doc
      
      * Tests
      
      * Addressing @sgugger's comments
      
      * Addressing @patrickvonplaten's comments
      
      * Style
      
      * Style
      
      * Integration test
      
      * style
      
      * Model card
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
  27. 17 Jun, 2020 2 commits
  28. 16 Jun, 2020 1 commit
    • Eli5 examples (#4968) · 49c52025
      Yacine Jernite authored
      
      
      * add eli5 examples
      
      * add dense query script
      
      * query_di
      
      * merging
      
      * merging
      
      * add_utils
      
      * adds nearest neighbor wikipedia
      
      * batch queries
      
      * training_retriever
      
      * new notebooks
      
      * moved retriever training script
      
      * finished wiki40b
      
      * max_len_fix
      
      * train_s2s
      
      * retriever_batch_checkpointing
      
      * cleanup
      
      * merge
      
      * dim_fix
      
      * fix_indexer
      
      * fix_wiki40b_snippets
      
      * fix_embed_for_r
      
      * fp32 index
      
      * fix_sparse_q
      
      * joint_training
      
      * remove obsolete datasets
      
      * add_passage_nn_results
      
      * add_passage_nn_results
      
      * add_batch_nn
      
      * add_batch_nn
      
      * add_data_scripts
      
      * notebook
      
      * notebook
      
      * notebook
      
      * fix_multi_gpu
      
      * add_app
      
      * full_caching
      
      * full_caching
      
      * notebook
      
      * sparse_done
      
      * images
      
      * notebook
      
      * add_image_gif
      
      * with_Gif
      
      * add_contr_image
      
      * notebook
      
      * notebook
      
      * notebook
      
      * train_functions
      
      * notebook
      
      * min_retrieval_length
      
      * pandas_option
      
      * notebook
      
      * min_retrieval_length
      
      * notebook
      
      * notebook
      
      * eval_Retriever
      
      * notebook
      
      * images
      
      * notebook
      
      * add_example
      
      * add_example
      
      * notebook
      
      * fireworks
      
      * notebook
      
      * notebook
      
      * joe's notebook comments
      
      * app_update
      
      * notebook
      
      * notebook_link
      
      * captions
      
      * notebook
      
      * assing RetriBert model
      
      * add RetriBert to Auto
      
      * change AutoLMHead to AutoSeq2Seq
      
      * notebook downloads from hf models
      
      * style_black
      
      * style_black
      
      * app_update
      
      * app_update
      
      * fix_app_update
      
      * style
      
      * style
      
      * isort
      
      * Delete WikiELI5training.ipynb
      
      * Delete evaluate_eli5.py
      
      * Delete WikiELI5explore.ipynb
      
      * Delete ExploreWikiELI5Support.html
      
      * Delete explainlikeimfive.py
      
      * Delete wiki_snippets.py
      
      * children before parent
      
      * children before parent
      
      * style_black
      
      * style_black_only
      
      * isort
      
      * isort_new
      
      * Update src/transformers/modeling_retribert.py
      Co-authored-by: Julien Chaumond <chaumond@gmail.com>
      
      * typo fixes
      
      * app_without_asset
      
      * cleanup
      
      * Delete ELI5animation.gif
      
      * Delete ELI5contrastive.svg
      
      * Delete ELI5wiki_index.svg
      
      * Delete choco_bis.svg
      
      * Delete fireworks.gif
      
      * Delete huggingface_logo.jpg
      
      * Delete huggingface_logo.svg
      
      * Delete Long_Form_Question_Answering_with_ELI5_and_Wikipedia.ipynb
      
      * Delete eli5_app.py
      
      * Delete eli5_utils.py
      
      * readme
      
      * Update README.md
      
      * unused imports
      
      * moved_info
      
      * default_beam
      
      * ftuned model
      
      * disclaimer
      
      * Update src/transformers/modeling_retribert.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * black
      
      * add_doc
      
      * names
      
      * isort_Examples
      
      * isort_Examples
      
      * Add doc to index
      Co-authored-by: Julien Chaumond <chaumond@gmail.com>
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
  29. 05 Jun, 2020 1 commit
  30. 19 May, 2020 1 commit
  31. 10 May, 2020 1 commit
  32. 07 May, 2020 1 commit
    • Reformer (#3351) · dca34695
      Patrick von Platen authored
      * first copy & paste commit from Bert and morgans LSH code
      
      * add easy way to compare to trax original code
      
      * translate most of function
      
      * make trax lsh self attention deterministic with numpy seed + copy paste code
      
      * add same config
      
      * add same config
      
      * make layer init work
      
      * implemented hash_vectors function for lsh attention
      
      * continue reformer translation
      
      * hf LSHSelfAttentionLayer gives same output as trax layer
      
      * refactor code
      
      * refactor code
      
      * refactor code
      
      * refactor
      
      * refactor + add reformer config
      
      * delete bogus file
      
      * split reformer attention layer into two layers
      
      * save intermediate step
      
      * save intermediate step
      
      * make test work
      
      * add complete reformer block layer
      
      * finish reformer layer
      
      * implement causal and self mask
      
      * clean reformer test and refactor code
      
      * fix merge conflicts
      
      * fix merge conflicts
      
      * update init
      
      * fix device for GPU
      
      * fix chunk length init for tests
      
      * include morgans optimization
      
      * improve memory a bit
      
      * improve comment
      
      * factorize num_buckets
      
      * better testing parameters
      
      * make whole model work
      
      * make lm model work
      
      * add t5 copy paste tokenizer
      
      * add chunking feed forward
      
      * clean config
      
      * add improved assert statements
      
      * make tokenizer work
      
      * improve test
      
      * correct typo
      
      * extend config
      
      * add complexer test
      
      * add new axial position embeddings
      
      * add local block attention layer
      
      * clean tests
      
      * refactor
      
      * better testing
      
      * save intermediate progress
      
      * clean test file
      
      * make shorter input length work for model
      
      * allow variable input length
      
      * refactor
      
      * make forward pass for pretrained model work
      
      * add generation possibility
      
      * finish dropout and init
      
      * make style
      
      * refactor
      
      * add first version of RevNet Layers
      
      * make forward pass work and add convert file
      
      * make uploaded model forward pass work
      
      * make uploaded model forward pass work
      
      * refactor code
      
      * add namedtuples and cache buckets
      
      * correct head masks
      
      * refactor
      
      * made reformer more flexible
      
      * make style
      
      * remove set max length
      
      * add attention masks
      
      * fix up tests
      
      * fix lsh attention mask
      
      * make random seed optional for the moment
      
      * improve memory in reformer
      
      * add tests
      
      * make style
      
      * make sure masks work correctly
      
      * detach gradients
      
      * save intermediate
      
      * correct backprop through gather
      
      * make style
      
      * change back num hashes
      
      * rename to labels
      
      * fix rotation shape
      
      * fix detach
      
      * update
      
      * fix trainer
      
      * fix backward dropout
      
      * make reformer more flexible
      
      * fix conflict
      
      * fix
      
      * fix
      
      * add tests for fixed seed in reformer layer
      
      * fix trainer typo
      
      * fix typo in activations
      
      * add fp16 tests
      
      * add fp16 training
      
      * support fp16
      
      * correct gradient bug in reformer
      
      * add fast gelu
      
      * re-add dropout for embedding dropout
      
      * better naming
      
      * better naming
      
      * renaming
      
      * finalize test branch
      
      * finalize tests
      
      * add more tests
      
      * finish tests
      
      * fix
      
      * fix type trainer
      
      * fix fp16 tests
      
      * fix tests
      
      * fix tests
      
      * fix tests
      
      * fix issue with dropout
      
      * fix dropout seeds
      
      * correct random seed on gpu
      
      * finalize random seed for dropout
      
      * finalize random seed for dropout
      
      * remove duplicate line
      
      * correct half precision bug
      
      * make style
      
      * refactor
      
      * refactor
      
      * docstring
      
      * remove sinusoidal position encodings for reformer
      
      * move chunking to modeling_utils
      
      * make style
      
      * clean config
      
      * make style
      
      * fix tests
      
      * fix auto tests
      
      * pretrained models
      
      * fix docstring
      
      * update conversion file
      
      * Update pretrained_models.rst
      
      * fix rst
      
      * fix rst
      
      * update copyright
      
      * fix test path
      
      * fix test path
      
      * fix small issue in test
      
      * include reformer in generation tests
      
      * add docs for axial position encoding
      
      * finish docs
      
      * Update convert_reformer_trax_checkpoint_to_pytorch.py
      
      * remove isort
      
      * include sams comments
      
      * remove wrong comment in utils
      
      * correct typos
      
      * fix typo
      
      * Update reformer.rst
      
      * applied morgans optimization
      
      * make style
      
      * make gpu compatible
      
      * remove bogus file
      
      * big test refactor
      
      * add example for chunking
      
      * fix typo
      
      * add to README
  33. 28 Apr, 2020 1 commit
    • Clean Encoder-Decoder models with Bart/T5-like API and add generate possibility (#3383) · fa49b9af
      Patrick von Platen authored
      * change encoder decoder style to bart & t5 style
      
      * make encoder decoder generation dummy work for bert
      
      * make style
      
      * clean init config in encoder decoder
      
      * add tests for encoder decoder models
      
      * refactor and add last tests
      
      * refactor and add last tests
      
      * fix attn masks for bert encoder decoder
      
      * make style
      
      * refactor prepare inputs for Bert
      
      * refactor
      
      * finish encoder decoder
      
      * correct typo
      
      * add docstring to config
      
      * finish
      
      * add tests
      
      * better naming
      
      * make style
      
      * fix flake8
      
      * clean docstring
      
      * make style
      
      * rename
  34. 16 Apr, 2020 1 commit
  35. 03 Apr, 2020 1 commit
    • ELECTRA (#3257) · d5d7d886
      Lysandre Debut authored
      * Electra wip
      
      * helpers
      
      * Electra wip
      
      * Electra v1
      
      * ELECTRA may be saved/loaded
      
      * Generator & Discriminator
      
      * Embedding size instead of halving the hidden size
      
      * ELECTRA Tokenizer
      
      * Revert BERT helpers
      
      * ELECTRA Conversion script
      
      * Archive maps
      
      * PyTorch tests
      
      * Start fixing tests
      
      * Tests pass
      
      * Same configuration for both models
      
      * Compatible with base + large
      
      * Simplification + weight tying
      
      * Archives
      
      * Auto + Renaming to standard names
      
      * ELECTRA is uncased
      
      * Tests
      
      * Slight API changes
      
      * Update tests
      
      * wip
      
      * ElectraForTokenClassification
      
      * temp
      
      * Simpler arch + tests
      
      Removed ElectraForPreTraining which will be in a script
      
      * Conversion script
      
      * Auto model
      
      * Update links to S3
      
      * Split ElectraForPreTraining and ElectraForTokenClassification
      
      * Actually test PreTraining model
      
      * Remove num_labels from configuration
      
      * wip
      
      * wip
      
      * From discriminator and generator to electra
      
      * Slight API changes
      
      * Better naming
      
      * TensorFlow ELECTRA tests
      
      * Accurate conversion script
      
      * Added to conversion script
      
      * Fast ELECTRA tokenizer
      
      * Style
      
      * Add ELECTRA to README
      
      * Modeling Pytorch Doc + Real style
      
      * TF Docs
      
      * Docs
      
      * Correct links
      
      * Correct model initialized
      
      * random fixes
      
      * style
      
      * Addressing Patrick's and Sam's comments
      
      * Correct links in docs