- 16 Nov, 2021 3 commits
-
-
Valentin authored
* stop training when a finite IterableDataset is exhausted. When using an iterable dataset, num_epochs is set to sys.maxsize to make sure all data is consumed; likewise, we want to set max_steps high enough but still stop when all data is consumed. (cherry picked from commit 6f0e1d6363153da9051e93acffe1cbab3a3f3b12) * fix typo flase -> false * add test for stopping training on exhausted finite iterable dataset * remove redundant gradient_accumulation_steps * run make style reformat training_args docstring
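The stopping behaviour described in this commit can be illustrated with a minimal sketch. It is not the Trainer implementation; `finite_stream` and `train` are hypothetical names, and the loop only shows the idea of pairing a very large step budget with a finite data source.

```python
import sys
from typing import Iterable, Iterator


def finite_stream(n: int) -> Iterator[int]:
    """Hypothetical finite iterable dataset that yields n batches."""
    for i in range(n):
        yield i


def train(batches: Iterable[int], max_steps: int = sys.maxsize) -> int:
    """Run until max_steps is reached or the data runs out, whichever comes first."""
    steps_done = 0
    for _batch in batches:
        steps_done += 1  # one optimisation step per batch in this sketch
        if steps_done >= max_steps:
            break
    # Falling out of the loop here means the iterable was exhausted,
    # so training stops even though max_steps was never reached.
    return steps_done
```

With the default budget the loop ends when the data does; an explicit smaller `max_steps` still takes priority.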
-
Sylvain Gugger authored
* Add forward method to dummy models * Fix quality
-
Sylvain Gugger authored
* Fix gradient_checkpointing backward compatibility * Remove needless line * make sure mask prob is big enough and length small enough * Fix tests Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
-
- 15 Nov, 2021 8 commits
-
-
Lysandre Debut authored
* Allow per-version configurations * Update tests/test_configuration_common.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/test_configuration_common.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Patrick von Platen authored
* [Wav2Vec2] Make sure that gradient checkpointing is only run if needed * make fix-copies
-
Eldar Kurtic authored
Running Movement pruning experiments with the newest HuggingFace would crash due to non-existing BertLayerNorm.
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
NielsRogge authored
-
Patrick von Platen authored
* [Speech2Text2] Enable tokenizers * minor fix * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Matt authored
-
Stas Bekman authored
* [doc] performance and parallelism doc update * improve * improve
-
- 14 Nov, 2021 1 commit
-
-
nbertagnolli authored
* Raise exceptions instead of using asserts for control flow in modeling_openai #12789 * reformatted file
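As a rough illustration of the pattern this commit describes (not the actual change in modeling_openai), the sketch below validates an argument with an explicit exception. `assert` statements are removed when Python runs with `-O`, so an explicit `raise` keeps the check active. The helper name and its parameters are hypothetical.

```python
def head_size(n_embd: int, n_head: int) -> int:
    """Hypothetical helper: return the per-head dimension or fail loudly."""
    # Before: assert n_embd % n_head == 0  (silently skipped under `python -O`)
    if n_embd % n_head != 0:
        raise ValueError(
            f"n_embd ({n_embd}) must be divisible by n_head ({n_head})"
        )
    return n_embd // n_head
```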
-
- 13 Nov, 2021 2 commits
-
-
Suraj Patil authored
* add return_tensors parameter * fix test * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * style Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Suraj Patil authored
-
- 12 Nov, 2021 4 commits
-
-
Li-Huai (Allan) Lin authored
* Add normalizer to FNetConverter * Style * Directly use AlbertConverter
-
Patrick von Platen authored
* improve some stuff * finish * correct last
-
Suraj Patil authored
-
Nicolas Patry authored
* Adding support for raw python `generator` in addition to `Dataset`

The main goal is to ease the creation of streaming data to the pipe. `Dataset` is more involved and PyTorch-specific. This PR provides a way to use a Python iterator too. This enabled #14250 but can be proposed as a standalone PR.

```python
from transformers import pipeline


def read_data(filename):
    with open(filename, "r") as f:
        for line in f:
            yield line  # yield each line of text, not the file handle


pipe = pipeline("text-classification")
for classified in pipe(read_data("large_file.txt")):
    print("Success ! ", classified)
```

The main caveat of this is the interaction with `DataLoader` with `num_workers>1`. When you have multiple workers, each receives a copy of the generator (like `IterableDataset`). That means the naive iterator will fail, since all workers iterate on all items of the generator.

There are ways to do clever "skipping", but it could still be bad because all workers still have to pass through all items of the generator (they just ignore items they don't handle). Using `num_workers=1` is the simplest fix, and if the cost of loading your data is small enough it should be good enough. In the above example, for instance, trying to do smart tricks to skip some lines is unlikely to be a net positive.

If there are better ways to do "jumps" on some data, then using `Dataset` is more advised (since then different workers can just jump themselves).

* Adding iterator support for `tf` too.
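The multi-worker caveat and the "skipping" workaround from the message above can be sketched without any `DataLoader`, using only the standard library. This is an illustration under assumptions, not code from the PR: `read_items`, `naive_worker` and `sharded_worker` are hypothetical names, and the workers are simulated by plain function calls.

```python
from itertools import islice
from typing import Callable, Iterator, List


def read_items(n: int) -> Iterator[str]:
    """Hypothetical stand-in for a generator that streams lines from a file."""
    for i in range(n):
        yield f"line-{i}"


def naive_worker(make_iter: Callable[[], Iterator[str]]) -> List[str]:
    # Each worker holds its own copy of the generator and consumes all of it,
    # so every item is processed once per worker.
    return list(make_iter())


def sharded_worker(
    make_iter: Callable[[], Iterator[str]], worker_id: int, num_workers: int
) -> List[str]:
    # "Skipping": the worker still advances through every item,
    # but keeps only every num_workers-th one, starting at worker_id.
    return list(islice(make_iter(), worker_id, None, num_workers))


def make_iter() -> Iterator[str]:
    return read_items(6)


naive = [naive_worker(make_iter) for _ in range(2)]
sharded = [sharded_worker(make_iter, w, 2) for w in range(2)]
```

Here both naive workers return all six items (duplicated work), while the sharded workers split them between themselves. The cost noted in the message still applies: each sharded worker reads the whole stream.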
-
- 11 Nov, 2021 7 commits
-
-
Stas Bekman authored
-
Suraj Patil authored
* fix loading flax bf16 weights in pt * fix clip test * fix t5 test * add logging statement * Update src/transformers/modeling_flax_pytorch_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * switch back to native any * fix check for bf16 weights Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
Matt authored
* Fixing requirements for TF LM models and use correct model mappings * make style
-
Matt authored
* Experimenting with adding proper get_config() and from_config() methods * Adding a test for get/from config * Fix test for get/from config
-
Suraj Patil authored
-
Suraj Patil authored
* fix inits * fix embed dtype * fix embed dtype * add test to check default dtype * quality * add type conversion methods for flax models * more robust casting * cast sinusoidal positions * update pegasus * update albert * update test * make sure dtype is passed to every module * style * fix electra dense * fix t5 * quality * add more tests * better name * use the dtype for lm head computation * fix albert * style * fix albert embed dtype * more tests * fix vision enc-dec * cleanup * fix embed dtype pegasus * fix default param test * doc * update template * fix final_logits_bias dtype * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * fix doc * fix doc * add detailed docstring for dtype parameter * remove un-necessary import Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
Stas Bekman authored
-
- 10 Nov, 2021 6 commits
-
-
Li-Huai (Allan) Lin authored
* Fix index out of range when padding * Apply suggestions from code review * Style
-
Chang Wang authored
-
Ella Charlaix authored
* Add notebook applying Intel Neural Compressor quantization for text classification tasks * Add Optimum notebooks section
-
Li-Huai (Allan) Lin authored
* Fix albert mask token tokenization. * Ensure special tokens sanitized. * Style * Fix * Apply suggestions from code review
-
Nicolas Patry authored
* Adding some quality of life for `pipeline` function. * Update docs/source/main_classes/pipelines.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/pipelines/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Improve the tests. Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Elad Segal authored
`BatchFeature`: Convert `List[np.ndarray]` to `np.ndarray` before converting to pytorch tensors (#14306) * update * style fix * retrigger checks * check first element * fix syntax error * Update src/transformers/feature_extraction_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * remove import Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 09 Nov, 2021 9 commits
-
-
Sylvain Gugger authored
-
Patrick von Platen authored
* [Bert2Bert] allow bert2bert + relative embeddings * up * Update README_ko.md * up * up
-
Steven Liu authored
* rewrite guides for fine-tuning with datasets * simple qa code example * use anonymous rST links * style
-
Suraj Patil authored
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
karthikrangasai authored
* Update postprocessing accordingly to use SQuAD metric. * Update assets accordingly based on SQuAD metrics. * Fix function naming error.
-
Yih-Dar authored
* Start the work for TFViTModel * Convert to TF code - need to check in the follow up commits * Clean up model code * Expose TFViTModel * make style * make quality * Add test * make style & quality * Fix some imports * fix wrong usage - *kwargs => ** kwargs * Fix Conv2D weight loading (PT->TF) issue * Add tests for images with different sizes + fix model * Fix some common tests for TFViTModel * Use inputs instead of input_ids in test_compile_tf_model * Add a comment about transpose and Conv2D in convert_tf_weight_name_to_pt_weight_name * Avoid transpose in TFViT call * Fix Conv2D issue in load_tf2_weights_in_pytorch_model * Use tf.keras.layers.Conv2D instead of tf.nn.conv2d * Using simpler heuristic to detect Conv2D layer * Change convert_tf_weight_name_to_pt_weight_name to return TransposeType * Check tf_weight_shape is not None before using it * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix missing comma * fix input dtype Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Apoorv Garg authored
* correct order of overflowing tokens for LayoutLmV2 tokenizer * test to check order of overflowing_tokens for a seq of input_ids * fix up quality * added suggested changes * check that tests the bbox sequence * pair_input test added * pass quality test * check bbox sequence added * unittest method * comments added * add overflowing bbox test * improved "seq_1" Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com> * improve code quality Co-authored-by: SaulLu <lucilesaul.com@gmail.com> Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
-
Yih-Dar authored
* Start the work on FlaxVisionEncoderDecoderModel * Add FlaxVisionEncoderDecoderModel * Add VisionEncoderDecoderConfig * Make FlaxVisionEncoderDecoderModel visible to transformers * Add test * Fix wrong getattr usage * Fix tests * Add FlaxAutoModelForVision2Seq * Expose FLAX_MODEL_FOR_VISION_2_SEQ_MAPPING * clean-up * add integration test * update expected logits * update expected scores * Add ViT2GPT2ModelIntegrationTest + some cleaning * Add projection layer + PT/Flax equivalence tests * Fix import * minor changes * make test slow again * Apply suggestions * Add modeling_flax_vision_encoder_decoder to _ignore_modules in get_model_modules() * fix copies * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> * split long strings in multiple lines * decoder_input_ids can't be None * Add back test_configuration_tie * Remove attention_mask parameter * fix test - encoder_last_hidden_state should be encoder_outputs.last_hidden_state instead of the projected vector * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Remove more encoder_attention_mask * remove encoder_attention_mask when calling self.decode (in FlaxVisionEncoderDecoderModule) * Fix style + pass 1s instead of None as encoder_attention_mask * fix init_weights * pass None for encoder_attention_mask * pass 1s instead of None as encoder_attention_mask * Fix doc style Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-