- 16 Jun, 2020 3 commits
-
-
Yacine Jernite authored
* add eli5 examples * add dense query script * query_di * merging * merging * add_utils * adds nearest neighbor wikipedia * batch queries * training_retriever * new notebooks * moved retriever traiing script * finished wiki40b * max_len_fix * train_s2s * retriever_batch_checkpointing * cleanup * merge * dim_fix * fix_indexer * fix_wiki40b_snippets * fix_embed_for_r * fp32 index * fix_sparse_q * joint_training * remove obsolete datasets * add_passage_nn_results * add_passage_nn_results * add_batch_nn * add_batch_nn * add_data_scripts * notebook * notebook * notebook * fix_multi_gpu * add_app * full_caching * full_caching * notebook * sparse_done * images * notebook * add_image_gif * with_Gif * add_contr_image * notebook * notebook * notebook * train_functions * notebook * min_retrieval_length * pandas_option * notebook * min_retrieval_length * notebook * notebook * eval_Retriever * notebook * images * notebook * add_example * add_example * notebook * fireworks * notebook * notebook * joe's notebook comments * app_update * notebook * notebook_link * captions * notebook * assing RetriBert model * add RetriBert to Auto * change AutoLMHead to AutoSeq2Seq * notebook downloads from hf models * style_black * style_black * app_update * app_update * fix_app_update * style * style * isort * Delete WikiELI5training.ipynb * Delete evaluate_eli5.py * Delete WikiELI5explore.ipynb * Delete ExploreWikiELI5Support.html * Delete explainlikeimfive.py * Delete wiki_snippets.py * children before parent * children before parent * style_black * style_black_only * isort * isort_new * Update src/transformers/modeling_retribert.py Co-authored-by:
Julien Chaumond <chaumond@gmail.com> * typo fixes * app_without_asset * cleanup * Delete ELI5animation.gif * Delete ELI5contrastive.svg * Delete ELI5wiki_index.svg * Delete choco_bis.svg * Delete fireworks.gif * Delete huggingface_logo.jpg * Delete huggingface_logo.svg * Delete Long_Form_Question_Answering_with_ELI5_and_Wikipedia.ipynb * Delete eli5_app.py * Delete eli5_utils.py * readme * Update README.md * unused imports * moved_info * default_beam * ftuned model * disclaimer * Update src/transformers/modeling_retribert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * black * add_doc * names * isort_Examples * isort_Examples * Add doc to index Co-authored-by:
Julien Chaumond <chaumond@gmail.com> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Lysandre <lysandre.debut@reseau.eseo.fr>
-
Sam Shleifer authored
-
Sylvain Gugger authored
* Convert hans to Trainer * Tick box
-
- 15 Jun, 2020 3 commits
-
-
Anthony MOI authored
[HUGE] Refactoring tokenizers backend - padding - truncation - pre-tokenized pipeline - fast tokenizers - tests (#4510) * Use tokenizers pre-tokenized pipeline * failing pretrokenized test * Fix is_pretokenized in python * add pretokenized tests * style and quality * better tests for batched pretokenized inputs * tokenizers clean up - new padding_strategy - split the files * [HUGE] refactoring tokenizers - padding - truncation - tests * style and quality * bump up requied tokenizers version to 0.8.0-rc1 * switched padding/truncation API - simpler better backward compat * updating tests for custom tokenizers * style and quality - tests on pad * fix QA pipeline * fix backward compatibility for max_length only * style and quality * Various cleans up - add verbose * fix tests * update docstrings * Fix tests * Docs reformatted * __call__ method documented Co-authored-by:
Thomas Wolf <thomwolf@users.noreply.github.com> Co-authored-by:
Lysandre <lysandre.debut@reseau.eseo.fr>
-
Sylvain Gugger authored
* Make DataCollator a callable * Update src/transformers/data/data_collator.py Co-authored-by:Julien Chaumond <chaumond@gmail.com>
-
Stefan Schweter authored
* utils_ner: do not add extra sep token for RoBERTa model * run_pl_ner: do not add extra sep token for RoBERTa model
-
- 13 Jun, 2020 1 commit
-
-
Sylvain Gugger authored
* Update hans data to be able to use Trainer * Fixes * Deal with tokenizer that don't have token_ids * Clean up things * Simplify data use * Fix the input dict * Formatting + proper path in README
-
- 11 Jun, 2020 1 commit
-
-
VictorSanh authored
-
- 10 Jun, 2020 1 commit
-
-
Sylvain Gugger authored
* Remove unused arguments * Formatting * Remove second todo comment
-
- 09 Jun, 2020 3 commits
-
-
songyouwei authored
`is_leaf` may become `False` after `.to(device=device)` function call.
-
Sam Shleifer authored
-
Amil Khare authored
-
- 08 Jun, 2020 1 commit
-
-
daniel-shan authored
Co-authored-by:Daniel Shan <daniel.shan@workday.com>
-
- 06 Jun, 2020 1 commit
-
-
- 05 Jun, 2020 2 commits
-
-
Sam Shleifer authored
-
Julien Chaumond authored
-
- 04 Jun, 2020 3 commits
-
-
Stefan Schweter authored
* ner: add preprocessing script for examples that splits longer sentences * ner: example shell scripts use local preprocessing now * ner: add new example section for WNUT’17 NER task. Remove old English CoNLL-03 results * ner: satisfy black and isort
-
prajjwal1 authored
-
Jason Phang authored
-
- 02 Jun, 2020 4 commits
-
-
Jin Young Sohn authored
* Glue task cleaup * Enable writing cache to cache_dir in case dataset lives in readOnly filesystem. * Differentiate match vs mismatch for MNLI metrics. * Style * Fix pytype * Fix type * Use cache_dir in mnli mismatch eval dataset * Small Tweaks Co-authored-by:Julien Chaumond <chaumond@gmail.com>
-
Julien Chaumond authored
*
🐛 Fix model ids for BART and Flaubert -
Julien Chaumond authored
* Kill model archive maps * Fixup * Also kill model_archive_map for MaskedBertPreTrainedModel * Unhook config_archive_map * Tokenizers: align with model id changes * make style && make quality * Fix CI
-
Lysandre Debut authored
-
- 01 Jun, 2020 15 commits
-
-
Victor SANH authored
-
Victor SANH authored
-
Victor SANH authored
Co-authored-by:Julien Chaumond <chaumond@gmail.com>
-
Victor SANH authored
Co-authored-by:Julien Chaumond <chaumond@gmail.com>
-
Victor SANH authored
-
Victor SANH authored
-
Victor SANH authored
-
Victor SANH authored
-
Victor SANH authored
-
Victor SANH authored
-
Victor SANH authored
-
Victor SANH authored
-
Victor SANH authored
-
Victor SANH authored
-
Victor SANH authored
-
- 27 May, 2020 2 commits
-
-
Patrick von Platen authored
* improve memory benchmarking * correct typo * fix current memory * check torch memory allocated * better pytorch function * add total cached gpu memory * add total gpu required * improve torch gpu usage * update memory usage * finalize memory tracing * save intermediate benchmark class * fix conflict * improve benchmark * improve benchmark * finalize * make style * improve benchmarking * correct typo * make train function more flexible * fix csv save * better repr of bytes * better print * fix __repr__ bug * finish plot script * rename plot file * delete csv and small improvements * fix in plot * fix in plot * correct usage of timeit * remove redundant line * remove redundant line * fix bug * add hf parser tests * add versioning and platform info * make style * add gpu information * ensure backward compatibility * finish adding all tests * Update src/transformers/benchmark/benchmark_args.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/benchmark/benchmark_args_utils.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * delete csv files * fix isort ordering * add out of memory handling * add better train memory handling Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
Lysandre Debut authored
* per_device instead of per_gpu/error thrown when argument unknown * [docs] Restore examples.md symlink * Correct absolute links so that symlink to the doc works correctly * Update src/transformers/hf_argparser.py Co-authored-by:
Julien Chaumond <chaumond@gmail.com> * Warning + reorder * Docs * Style * not for squad Co-authored-by:
Julien Chaumond <chaumond@gmail.com>
-