- 25 Jun, 2021 2 commits
-
-
michal pitr authored
-
Patrick von Platen authored
-
- 24 Jun, 2021 5 commits
-
-
Marc van Zee authored
-
Sylvain Gugger authored
-
Lysandre Debut authored
* Fix torchscript tests * Better test * Remove bogus print
-
Suraj Patil authored
-
Richard Liaw authored
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
-
- 23 Jun, 2021 21 commits
-
-
Sylvain Gugger authored
-
Stas Bekman authored
* document sub_group_size * style * install + issues reporting * style * style * Update docs/source/main_classes/deepspeed.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * indent 4 * restore * style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Sam Havens authored
mention in `save_strategy` param description that `load_best_model_at_end` can override
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Patrick von Platen authored
* finish t5 flax fixes * improve naming
-
Sylvain Gugger authored
-
Michael Benayoun authored
Co-authored-by: Michael Benayoun <michael@huggingface.co>
-
Lysandre authored
-
Lysandre Debut authored
-
Sylvain Gugger authored
* Add all XxxPreTrainedModel to the main init * Add to template * Add to template bis * Add FlaxT5
-
Sylvain Gugger authored
* Clean push to hub API * Create working dir if it does not exist * Different tweak * New API + all models + test Flax * Adds the Trainer clean up * Update src/transformers/file_utils.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Address review comments * (nit) output types * No need to set clone_from when folder exists * Update src/transformers/trainer.py Co-authored-by: Julien Chaumond <julien@huggingface.co> * Add generated_from_trainer tag * Update to new version * Fixes
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
-
chenht2010 authored
* fix error * make style check happy
Co-authored-by: chenhaitao <chenhaitao@qiyi.com>
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @ * finish * make style
-
Lysandre Debut authored
-
Vasudev Gupta authored
* copy pytorch-t5 * init * boom boom * forward pass same * make generation work * add more tests * make test work * finish normal tests * make fix-copies * finish quality * correct slow example * correct slow test * version table * upload models * Update tests/test_modeling_flax_t5.py * correct incorrectly deleted line
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
-
David Fan authored
* Rewrite * [ONNX] rewrite
-
Suraj Patil authored
* add summarization script * fix arguments, preprocessing, metrics * add generation and metrics * auto model, prediction loop * prettify * label smoothing * address Sylvain and Patrick's suggestions * dynamically import shift_tokens_right * fix shift_tokens_right_fn call
-
Daniel Stancl authored
* Add output args to greedy search
* Fix critical typo + make style quality
* Handle generate_beam_search
* Add dict_specific tests and fix the placement of encoder outputs
* Add specific outputs
* Update doc
* Fix typo
* Adjust handling encoder_outputs + Fix generating for T5
* Fix generate for RAG
* Fix handling output_attentions when target_mapping is not None

  Take care of situations when target_mapping is provided, as the attentions are then 2-tuples.

  Change from:

      if inputs["output_attentions"]:
          attentions = tuple(tf.transpose(t, perm=(2, 3, 0, 1)) for t in attentions)

  to:

      if inputs["output_attentions"]:
          if inputs["target_mapping"] is not None:
              # when target_mapping is provided, there are 2-tuple of attentions
              attentions = tuple(
                  tuple(tf.transpose(attn_stream, perm=(2, 3, 0, 1)) for attn_stream in t) for t in attentions
              )
          else:
              attentions = tuple(tf.transpose(t, perm=(2, 3, 0, 1)) for t in attentions)

* Rename kwargs to model_kwargs
* make style quality
* Move imports in test_modeling_tf_common.py

  Move ModelOutput-related imports in test_modeling_tf_common.py into the `is_tf_available():` statement.
* Rewrite nested if-statements
* Fix added tests
-
Nicolas Patry authored
* Optimizing away the `fill-mask` pipeline.
  - Don't send anything to the tokenizer unless needed. Vocab check is much faster.
  - Keep BC by sending data to the tokenizer when needed. Users handling warning messages will see performance benefits again.
  - Make `targets` and `top_k` work together better: `top_k` cannot be higher than `len(targets)` but can still be smaller.
  - Actually simplify the `target_ids` in case of duplicates (it can happen because we're parsing raw strings).
  - Removed useless code to fail on empty strings. It worked only if the empty string was in first position; moved to ignoring them instead.
  - Changed the related tests as only the tests would fail correctly (having incorrect value in first position).
* Make tests compatible for 2 different vocabs... (at the price of a warning). Co-authored-by: @EtaoinWu
* ValueError working globally
* Update src/transformers/pipelines/fill_mask.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* `tokenizer.vocab` -> `tokenizer.get_vocab()` for more compatibility + fallback.
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
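
The `targets` / `top_k` handling described in this commit can be sketched in plain Python. This is only an illustration under stated assumptions, not the code in `fill_mask.py`: the helper names `resolve_target_ids` and `effective_top_k`, the toy vocabulary, and the error message are all hypothetical, and unknown targets are simply skipped here, whereas the commit message says the pipeline falls back to the tokenizer for them.

```python
from typing import Dict, Iterable, List, Optional


def resolve_target_ids(targets: Iterable[str], vocab: Dict[str, int]) -> List[int]:
    """Map raw target strings to unique vocab ids (hypothetical helper).

    Empty strings are ignored rather than rejected, and duplicate ids are
    collapsed while keeping first-seen order.
    """
    seen = set()
    target_ids = []
    for target in targets:
        if not target:
            continue  # ignore empty strings wherever they appear
        token_id = vocab.get(target)  # cheap dictionary lookup, no tokenizer call
        if token_id is None:
            continue  # simplification: the real pipeline falls back to the tokenizer
        if token_id not in seen:
            seen.add(token_id)
            target_ids.append(token_id)
    if not target_ids:
        # Hypothetical message; only the exception type comes from the commit log.
        raise ValueError("no usable targets were provided")
    return target_ids


def effective_top_k(top_k: int, target_ids: Optional[List[int]]) -> int:
    """Clamp top_k to the number of targets; a smaller top_k is kept as-is."""
    if target_ids is None:
        return top_k
    return min(top_k, len(target_ids))


# Toy vocabulary, assumed for illustration only.
toy_vocab = {"cat": 1, "dog": 2, "bird": 3}
ids = resolve_target_ids(["cat", "", "dog", "cat"], toy_vocab)
k = effective_top_k(5, ids)
```

With the toy vocabulary above, the duplicate `"cat"` and the empty string are dropped, so two ids remain and a requested `top_k` of 5 is clamped to 2.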
-
Kevin Canwen Xu authored
* Add optional dependency * Add CodeCarbon integration * Add CodeCarbon integration * Add CodeCarbon integration * typo
-
- 22 Jun, 2021 10 commits
-
-
Stas Bekman authored
* initial performance document * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * rewrites based on suggestions * 8x multiple is for AMP only * add contribute section
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
Sylvain Gugger authored
-
Stas Bekman authored
* bug fixes and a rename * add extended DDP test
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @ * push * finish * some typos * add more info on communication * add suggestions
-
Kilian Kluge authored
* Replace conditional generation example (fixes #12268) * Replace model in summarization example with finetuned checkpoint, adapt example text * Fix typo in new summarization example * Fix docstring formatting, add missing import statement to example
-
Suraj Patil authored
-
Stefan Schweter authored
-
Hamid Shojanazeri authored
* registering a buffer for token_type_ids, to pass the error of device-id getting hardcoded when tracing * style format * adding persistent flag to the registered buffers that prevent from adding them to the state_dict and addresses the Backward compatibility issue * adding the try catch to the fix as persistent flag is only available from PT >1.6 * adding version check * added the condition to only use the token_type_ids buffer when it's autogenerated not passed by user * adding comments and making the condition where token_type_ids are None to use the registered buffer * taking out position-embedding from the if block * adding comments * handling the case if buffer for position_ids was not registered * reverted the changes on position_ids, fix the issue with size of token_type_ids buffer, moved the modification for generated token_type_ids to Bertmodel, instead of Embeddings * reverting the token_type_ids in case of None to the previous version * reverting changes on position_ids adding back the if block * changes added by running make fix-copies * changes added by running make fix-copies and added the import version as it was getting used * changes added by running make fix-copies * changes added by running make fix-copies * fixing the import format * fixing the import format * modified to use temp tensor for trimmed and expanded token_type_ids buffer * changes made by fix-copies after temp tensor modifications * changes made by fix-copies after temp tensor modifications * changes made by fix-copies after temp tensor modifications * clean up * clean up * clean up * clean up * Nit * Nit * Nit * modified according to support device conversion on traced models * modified according to support device conversion on traced models * modified according to support device conversion on traced models * modified according to support device conversion on traced models * changes based on latest in master * Adapt templates * Add version import
Co-authored-by: Ubuntu <ubuntu@ip-172-31-32-81.us-west-2.compute.internal>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
-
Stas Bekman authored
* [tests] multiple improvements * cleanup * style * todo to investigate * fix
-
Stas Bekman authored
* set log level from CLI * add log_level_replica + test + extended docs * cleanup * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * rename datasets objects to allow datasets module * improve the doc * style * doc improve
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 21 Jun, 2021 2 commits
-
-
Stas Bekman authored
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @ * add commands for flax/jax
-