- 01 Dec, 2023 7 commits
-
-
Sanchit Gandhi authored
[MusicGen] Fix mono logit test
-
Marc Sun authored
* better error message * fix logic * fix log
-
Nicolas Patry authored
* [WIP] Make using safetensors files automated. If `use_safetensors=True` is used and the file doesn't exist:
  - Don't crash just yet
  - Look for an open PR containing it; if yes, use that instead
  - If not, touch the Space to convert, wait for the conversion to finish and the PR to be opened, then use that new PR
  - Profit.
* Remove the token
* [Auto Safetensors] Websocket -> SSE (#27656)
* Support sharded checkpoints + tests + cleanup
* env var
* Apply suggestions from code review (thanks Simon, thanks Wauplin)
* Cleanup, update tests, apply to other tests
* Extend extension
* Relax requirement on latest hfh
* Revert
* Correct private handling & debug statements
* Skip gated repos as of now
* Address review comments
Co-authored-by: Wauplin <lucainp@gmail.com>
Co-authored-by: ArthurZucker <arthur.zucker@gmail.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>
Co-authored-by: Lysandre <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
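The fallback order described above can be sketched in plain Python. This is a hedged sketch with illustrative names (`resolve_safetensors_revision`, `find_open_pr`, `trigger_conversion` are not the actual transformers API):

```python
# Hypothetical sketch of the safetensors fallback added in this commit;
# all names are illustrative, not the real transformers internals.
def resolve_safetensors_revision(has_safetensors, find_open_pr, trigger_conversion):
    """Return the revision to load safetensors weights from."""
    if has_safetensors:
        return "main"            # weights already exist on the main branch
    pr = find_open_pr()          # an already-open conversion PR, if any
    if pr is not None:
        return pr
    return trigger_conversion()  # convert via the Space, then use the new PR
```

For example, with no safetensors file and no open PR, the conversion is triggered and its PR revision (e.g. `refs/pr/N`) is used.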
-
Wesley Gifford authored
* Remove config reference and pass num_patches for PatchTSTForPrediction * Ensure return_dict is properly set --------- Co-authored-by: Wesley M. Gifford <wmgifford@us.ibm.com>
-
Nolwenn Bernard authored
* Partial translation of installation
* Finish translation of installation
* Update installation.mdx
* Rename installation.mdx to installation.md
* Typos
* Apply review suggestions to docs/source/fr/installation.md (ten rounds)
* Address review comments
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Joshua Lochner authored
Fix typo in README
-
Liangliang-Ma authored
change xpu _n_gpu = 1
-
- 30 Nov, 2023 4 commits
-
-
Yoach Lacombe authored
* Add working conversion script
* First non-working version of modeling code
* Update modeling code (working)
* make style
* make fix-copies
* Add config docstrings
* Add config to ignore docstring formatting checks due to unconventional markdown
* Fix copies
* Fix generation num_return_sequences
* Enrich docs
* Add and fix tests besides integration tests
* Update integration tests
* Update repo id
* Add tie weights and make style
* Correct naming in .md
* Fix imports and so on
* Correct docstrings
* Fix fp16 speech forward
* Fix speech encoder attention
* Fix copied from
* Rename SeamlessM4Tv2-v2 to SeamlessM4Tv2
* Apply suggestions on configuration
* Remove useless public models
* Fix private models + better naming for T2U models
* Clean speech encoder relative position embeddings
* Refactor chunk attention and add docstrings to the chunk attention method
* Improve naming and docstrings
* Rename some attention variables + add temperature sampling in T2U model
* Rename DOCSTRINGS variable names
* make style + remove 2 useless config parameters
* Enrich model card
* Remove any attention_head reference + fix temperature in T2U
* New fmt and make style
* Apply suggestions from code review
* Rename spkr_id -> speaker_id and change docstrings of get_char_input_ids
* Simplify v2 attention
* Update seamless_m4t_v2.md
* Update code and tests with last update
* Update repo ids
* Fill article name, abstract and authors
* Update not_doctested and slow_doc tests
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Joao Gante authored
-
Dave Berenbaum authored
-
Yih-Dar authored
fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 29 Nov, 2023 2 commits
-
-
Kevin Hu authored
* Update modeling_llama.py * Update modeling_open_llama.py * Update modeling_gpt_neox.py * Update modeling_mistral.py * Update modeling_persimmon.py * Update modeling_phi.py * Update modeling_falcon.py * Update modeling_gpt_neox_japanese.py
-
Kashif Rasul authored
* add distribution head to forecasting * formatting * Add generate function for forecasting * Add generate function to prediction task * formatting * use argsort * add past_observed_mask ordering * fix arguments * docs * add back test_model_outputs_equivalence test * formatting * cleanup * formatting * use ACT2CLS * formatting * fix add_start_docstrings decorator * add distribution head and generate function to regression task; also add PatchTSTForForecastingOutput, PatchTSTForRegressionOutput * fix typos * add forecast_masking * fixed tests * use set_seed * fix doc test * formatting * Update docs/source/en/model_doc/patchtst.md Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * better var names * rename PatchTSTTranspose * fix argument names and docs string * remove compute_num_patches and unused class * remove assert * renamed to PatchTSTMasking * use num_labels for classification * use num_labels * use default num_labels from super class * move model_type after docstring * renamed PatchTSTForMaskPretraining * bs -> batch_size * more review fixes * use hidden_state * rename encoder layer and block class * remove commented seed_number * edit docstring * Add docstring * formatting * use past_observed_mask * doc suggestion * make fix-copies * use Args: * add docstring * add docstring * change some variable names and add PatchTST before some class names * formatting * fix argument types * fix tests * change x variable to patch_input * format * formatting * fix-copies * Update tests/models/patchtst/test_modeling_patchtst.py Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * move loss to forward * Apply review suggestions to src/transformers/models/patchtst/modeling_patchtst.py (five rounds) Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * formatting * fix a bug when pre_norm is set to True * output_hidden_states is set to False as default * set pre_norm=True as default * format docstring * format * output_hidden_states is None by default * add missing docs * better var names * docstring: remove default to False in output_hidden_states * change labels name to target_values in regression task * format * fix tests * change to forecast_mask_ratios and random_mask_ratio * change mask names * change future_values to target_values param in the prediction class * remove nn.Sequential and make PatchTSTBatchNorm class * black * fix argument name for prediction * add output_attentions option * add output_attentions to PatchTSTEncoder * formatting * Add attention output option to all classes * Remove PatchTSTEncoderBlock * create PatchTSTEmbedding class * use config in PatchTSTPatchify * Use config in PatchTSTMasking class * add channel_attn_weights * Add PatchTSTScaler class * add output_attentions arg to test function * format * Update doc with image patchtst.md * fix-copies * rename Forecast <-> Prediction * change name of a few parameters to match with PatchTSMixer. * Remove *ForForecasting class to match with other time series models. * make style * Remove PatchTSTForForecasting in the test * remove PatchTSTForForecastingOutput class * change test_forecast_head to test_prediction_head * style * fix docs * fix tests * change num_labels to num_targets * Remove PatchTSTTranspose * remove arguments in PatchTSTMeanScaler * remove arguments in PatchTSTStdScaler * add config as an argument to all the scaler classes * reformat * Add norm_eps for batchnorm and layernorm * reformat. 
* reformat * edit docstring * update docstring * change variable name pooling to pooling_type * fix output_hidden_states as tuple * fix bug when calling PatchTSTBatchNorm * change stride to patch_stride * create PatchTSTPositionalEncoding class and restructure the PatchTSTEncoder * formatting * initialize scalers with configs * edit output_hidden_states * style * fix forecast_mask_patches doc string * doc improvements * move summary to the start * typo * fix docstring * turn off masking when using prediction, regression, classification * return scaled output * adjust output when using distribution head * remove _num_patches function in the config * get config.num_patches from patchifier init * add output_attentions docstring, remove tuple in output_hidden_states * change SamplePatchTSTPredictionOutput and SamplePatchTSTRegressionOutput to SamplePatchTSTOutput * remove print("model_class: ", model_class) * change encoder_attention_heads to num_attention_heads * change norm to norm_layer * change encoder_layers to num_hidden_layers * change shared_embedding to share_embedding, shared_projection to share_projection * add output_attentions * more robust check of norm_type * change dropout_path to path_dropout * edit docstring * remove positional_encoding function and add _init_pe in PatchTSTPositionalEncoding * edit shape of cls_token and initialize it * add a check on the num_input_channels. * edit head_dim in the Prediction class to allow the use of cls_token * remove some positional_encoding_type options, remove learn_pe arg, initalize pe * change Exception to ValueError * format * norm_type is "batchnorm" * make style * change cls_token shape * Change forecast_mask_patches to num_mask_patches. Remove forecast_mask_ratios. * Bring PatchTSTClassificationHead on top of PatchTSTForClassification * change encoder_ffn_dim to ffn_dim and edit the docstring. 
* update variable names to match with the config * add generation tests * change num_mask_patches to num_forecast_mask_patches * Add examples explaining the use of these models * make style * Revert "Revert "[time series] Add PatchTST (#25927)" (#27486)" This reverts commit 78f6ed6c . * make style * fix default std scaler's minimum_scale * fix docstring * close code blocks * Update docs/source/en/model_doc/patchtst.md Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply review suggestions to patchtst modeling, configuration, and test files (several rounds) Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix tests * add add_start_docstrings * move examples to the forward's docstrings * update prepare_batch * update test * fix test_prediction_head * fix generation test * use seed to create generator * add output_hidden_states and config.num_patches * add loc and scale args in PatchTSTForPredictionOutput * edit outputs if not return_dict * use self.share_embedding to check instead of checking type * remove seed * make style * seed is an optional int * fix test * generator device * Fix assertTrue test * swap order of items in outputs when return_dict=False * add mask_type and random_mask_ratio to unittest * Update modeling_patchtst.py * add add_start_docstrings for regression model * make style * update model path * Edit the ValueError comment in forecast_masking * update examples * make style * fix commented code * update examples: remove config from from_pretrained call * Edit example outputs * Set default target_values to None * remove config setting in regression example * Update configuration_patchtst.py * Update configuration_patchtst.py * remove config from examples * change default d_model and ffn_dim * norm_eps default * set has_attentions to True and define self.seq_length = self.num_patches * update docstring * change variable mask_input to do_mask_input * fix blank space * change logger.debug to logger.warning * remove unused PATCHTST_INPUTS_DOCSTRING * remove all_generative_model_classes * set test_missing_keys=True * remove undefined params in the docstring --------- Co-authored-by:
nnguyen <nnguyen@us.ibm.com> Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Nam Nguyen <namctin@gmail.com> Co-authored-by:
Wesley Gifford <79663411+wgifford@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
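The core PatchTST preprocessing step these commits keep renaming (`patch_length`, `patch_stride`, `num_patches`) can be sketched in plain Python. A minimal sketch, assuming a 1-D series; this mirrors what `PatchTSTPatchify` does but is not the actual implementation:

```python
def patchify(series, patch_length, patch_stride):
    # Slice a 1-D series into (possibly overlapping) patches: one patch of
    # patch_length values every patch_stride steps.
    return [series[i:i + patch_length]
            for i in range(0, len(series) - patch_length + 1, patch_stride)]
```

With a series of length 10, `patch_length=4`, and `patch_stride=2`, this yields 4 patches, which is where a derived `config.num_patches` value comes from.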
-
- 28 Nov, 2023 11 commits
-
-
Steven Liu authored
* first draft * benchmarks * feedback
-
Tom Aarsen authored
~transformer. -> ~transformers.
-
Susnato Dhar authored
* fixes * more fixes * style fix * more fix * comments
-
Yih-Dar authored
* fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Quentin Gallouédec authored
if use_cpu: dataloader_pin_memory = False
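A minimal sketch of the guard this commit describes (the function name here is illustrative; in transformers this is a check inside the Trainer's dataloader setup): pinning host memory only speeds up host-to-device copies, so it is pointless for CPU-only runs.

```python
def resolve_pin_memory(use_cpu, dataloader_pin_memory=True):
    # Pinned (page-locked) memory only helps when batches are copied to an
    # accelerator, so force it off when training on CPU.
    if use_cpu:
        return False
    return dataloader_pin_memory
```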
-
Juarez Bochi authored
* Add madlad-400 models
* Add madlad-400 to the doc table
* Apply review suggestions to docs/source/en/model_doc/madlad-400.md
* Fill missing details in documentation
* Do not doctest madlad-400 (tests are timing out)
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Yih-Dar authored
* log * log --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
NielsRogge authored
* First draft * Add backwards compatibility * More improvements * More improvements * Improve error message * Address comment * Add conversion script * Fix style * Update code snippet * Address comment * Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Yih-Dar authored
* fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Charbel Abi Daher authored
* Fix passing scheduler-specific kwargs through TrainingArguments `lr_scheduler_kwargs` * Added test for lr_scheduler_kwargs
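The pass-through being fixed can be sketched as follows. This is a hedged sketch with a hypothetical factory signature; the real path threads `TrainingArguments.lr_scheduler_kwargs` through to the scheduler constructor:

```python
def build_scheduler(scheduler_fn, optimizer, lr_scheduler_kwargs=None):
    # Forward scheduler-specific kwargs instead of silently dropping them.
    return scheduler_fn(optimizer, **(lr_scheduler_kwargs or {}))
```

For instance, a polynomial-decay schedule's `power` parameter would travel via `lr_scheduler_kwargs={"power": 2.0}`.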
-
- 27 Nov, 2023 13 commits
-
-
Rockerz authored
* Add `model_docs`
* Add
* Update Model adoc
* Apply review suggestions to docs/source/ja/model_doc (bark, beit, bit, blenderbot, blenderbot-small, bert, bert-generation)
* Update review-1
* Update toctree.yml
* Translating docs and fixes of PR #27401
* Update the model docs
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
jiaqiw09 authored
* translate work * update * update * update [[autodoc]] * Update callback.md --------- Co-authored-by: jiaqiw <wangjiaqi50@huawei.com>
-
Matt authored
* Update default ChatML template
* Update docs/warnings
* Apply review suggestions to docs/source/en/chat_templating.md
* Slight rework
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
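For reference, the ChatML layout that the default template renders wraps each turn in `<|im_start|>`/`<|im_end|>` markers. A plain-Python sketch of the format (not the actual Jinja template shipped with transformers):

```python
def to_chatml(messages, add_generation_prompt=False):
    # Each turn becomes "<|im_start|>role\ncontent<|im_end|>\n"; optionally
    # open an assistant turn so the model continues from there.
    text = "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )
    if add_generation_prompt:
        text += "<|im_start|>assistant\n"
    return text
```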
-
Peter Pan authored
* docs: replace torch.distributed.run by torchrun. `transformers` now officially supports pytorch >= 1.10; the entrypoint `torchrun` is present from 1.10 onwards.
* Update src/transformers/trainer.py with @ArthurZucker's suggestion
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
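The substitution is mechanical: `torchrun` (shipped with PyTorch >= 1.10) takes the same flags as the module-style invocation. A tiny check of that claim, with the commands shown as strings since no launcher is assumed installed here:

```python
# The legacy module invocation and its torchrun equivalent.
legacy = "python -m torch.distributed.run --nproc_per_node=8 train.py"
modern = "torchrun --nproc_per_node=8 train.py"
# Only the entrypoint changes; every flag and the script stay identical.
assert modern.split()[1:] == legacy.split()[3:]
```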
-
NielsRogge authored
* Fix code snippet * Improve code snippet
-
Yixiao Yuan authored
* fix group_sub_entities bug * add space
-
NielsRogge authored
* Update forward signature * Empty-Commit
-
jiqing-feng authored
* fix assisted decoding attention_cat * fix attention_mask for assisted decoding * fix attention_mask len * fix attn len * Use a more clean way to prepare assistant models inputs * fix param meaning * fix param name * fix assistant model inputs * update token type ids * fix assistant kwargs copy * add encoder-decoder tests of assisted decoding * check if assistant kwargs contains updated keys * revert test * fix whisper tests * fix assistant kwargs * revert whisper test * delete _extend funcs
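The mechanism these fixes target, assisted (speculative) decoding, can be sketched with toy stand-ins for the two models. Names here are illustrative, not the transformers API:

```python
def assisted_decode(verify, draft, prompt, max_new=8, k=3):
    # The draft (assistant) model proposes up to k tokens; the main model
    # verifies them left to right, keeps the agreeing prefix, and always
    # contributes one token of its own, so progress is guaranteed.
    out = list(prompt)
    while len(out) - len(prompt) < max_new:
        accepted = []
        for tok in draft(out, k):
            if verify(out + accepted) == tok:
                accepted.append(tok)
            else:
                break
        out += accepted
        out.append(verify(out))
    return out
```

With a perfect draft model, each round advances `k + 1` tokens for one verification pass; with a useless draft, it degrades to ordinary one-token-per-step decoding.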
-
yhshin11 authored
-
Yanan Xie authored
* Fix mistral generate for long prompt / response * Add unit test * fix linter * fix linter * fix test * add assisted generation test for mistral and load the model in 4 bit + fa2
-
Lysandre Debut authored
Reorder
-
Arthur authored
-
Yih-Dar authored
fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 26 Nov, 2023 1 commit
-
-
Ilya Gusev authored
* Fix sliding_window hasattr in Mistral * hasattr -> getattr for sliding_window in Mistral --------- Co-authored-by: Ilya Gusev <ilya.gusev@booking.com>
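The pattern in this fix, for reference: `getattr` with a default handles configs serialized before the attribute existed, where the `hasattr`-then-access version took two steps. A toy config class is used below, not the real MistralConfig:

```python
class OldConfig:
    pass  # a config serialized before sliding_window was introduced

cfg = OldConfig()
# One step instead of "cfg.sliding_window if hasattr(cfg, 'sliding_window') else None":
sliding_window = getattr(cfg, "sliding_window", None)
assert sliding_window is None
```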
-
- 24 Nov, 2023 2 commits
-
-
Yih-Dar authored
* fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Chi authored
* Successfully resolved the ZeroDivisionError exception in utils/notebook.py * Updated code mentioned by Peter * Used the Black package to reformat the file * Used the ruff library to reformat the file
-