- 05 Dec, 2023 12 commits
-
-
Younes Belkada authored
Update quantization.md
-
Aaron Jimenez authored
* Copy perplexity.md file to es/ folder * Adding perplexity to es/_toctree.yml * Translate first section * Calculating PPL section translate * Example section translate * fix translate of log-likehood * Fix title translate * Fix \ in second paragraph * Change verosimilitud for log-likelihood * Run 'make style'
-
Vedat Baday authored
-
Yih-Dar authored
fix Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
NielsRogge authored
Address test
-
Arindam Jati authored
* patchtsmixer initial commit * x,y->context_values,target_values, unittest addded * cleanup code * minor * return hidden states * model tests, partial integration tests * ettm notebook temporary * minor * config mask bug fix, tests updated * final ETT notebooks * add selfattn * init * added docstrings * PatchTSMixerForPretraining -> PatchTSMixerForMaskPretraining * functionality tests added * add start and input docstrings * docstring edits * testcase edits * minor changes * docstring error fixed * ran make fixup * finalize integration tests and docs * minor * cleaned gitignore * added dataclass decorator, ran black formatter * ran ruff * formatting * add slow decorator * renamed in_Channel to input_size and default to 1 * shorten dataclass names * use smaller model for testing * moved the 3 heads to the modeling file * use scalers instead of revin * support forecast_channel_indices * fix regression scaling * undo reg. scaling * removed unneeded classes * forgot missing * add more layers * add copied positional_encoding * use patchmask from patchtst * removed dependency on layers directory * formatting * set seed * removed unused imports * fixed forward signature test * adding distributional head for PatchTSMixerForecasting * add generate to forecast * testcases for generate * add generate and distributional head for regression * raise Exception for negative values for neg binominal distribution * formatting changes * remove copied from patchtst and add TODO for test passing * make copies * doc edits * minor changes * format issues * minor changes * minor changes * format docstring * change some class names to PatchTSMixer + class name Transpose to PatchTSMixerTranspose GatedAttention to PatchTSMixerGatedAttention * change NormLayer to PatchTSMixerNormLayer * change MLP to PatchTSMixerMLP * change PatchMixer to PatchMixerBlock, FeatureMixer to FeatureMixerBlock * change ChannelFeatureMixer to ChannelFeatureMixerBlock * change PatchMasking to PatchTSMixerMasking * change Patchify to PatchTSMixerPatchify * list to `list` * fix docstrings * formatting * change bs to batch_size, edit forecast_masking * edit random_masking * change variable name and update docstring in PatchTSMixerMasking * change variable name and update docstring in InjectScalerStatistics4D * update forward call in PatchTSMixerTranspose * change variable name and update docstring in PatchTSMixerNormLayer * change variable name and update docstring in PatchTSMixerMLP * change variable name and update docstring in ChannelFeatureMixerBlock * formatting * formatting issues * docstring issue * fixed observed_mask type in docstrings * use FloatTensor type * formatting * fix rescaling issue in forecasting, fixed integration tests * add docstring from decorator * fix docstring * Update README.md Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/patchtsmixer/configuration_patchtsmixer.py Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/patchtsmixer/modeling_patchtsmixer.py Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/patchtsmixer/configuration_patchtsmixer.py Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/patchtsmixer/modeling_patchtsmixer.py Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * PatchTSMixerChannelFeatureMixerBlock * formatting * ForPretraining * use num_labels instead of n_classes * remove commented out code * docstring fixed * nn.functional used instead of one letter F * x_tmp renamed * one letter variable x removed from forward calls * one letter variable y removed * remove commented code * rename patch_size, in_channels, PatchTSMixerBackbone * add config to heads * add config to heads tests * code reafactoring to use config instead of passing individual params * Cdocstring fixes part 1 * docstring fixes part 2 * removed logger.debug * context_values -> past_values * formatting changes * pe -> positional_encoding * removed unused target variable * self.mode logic fixed * formatting change * edit docstring and var name * change n_targets to num_targets * rename input_size to num_input_channels * add head names with prefix PatchTSMixer * edit docstring in PatchTSMixerForRegression * fix var name change in testcases * add PatchTSMixerAttention * return dict for all exposed classes, test cases added * format * move loss function to forward call * make style * adding return dict/tuple * make repo-consistency * remove flatten mode * code refactoring * rename data * remove PatchTSMixer and keep only PatchTSMixerEncoder * docstring fixes * removed unused code * format * format * remove contiguous and formatting changes * remove model description from config * replace asserts with ValueError * remove nn.Sequential from PatchTSMixerNormLayer * replace if-else with map * remove all nn.Sequential * format * formatting * fix gradient_checkpointing error after merge, and formatting * make fix-copies * remove comments * reshape * doesnt support gradient checkpointing * corect Patchify * masking updates * batchnorm copy from * format checks * scaler edits * remove comments * format changes * remove self.config * correct class PatchTSMixerMLP(nn.Module): * makr fix * doc updates * fix-copies * scaler class correction * doc edits * scaler edits * update readme with links * injectstatistics add * fix-copies * add norm_eps option to LayerNorm * format changes * fix copies * correct make copies * use parametrize * fix doc string * add docs to toctree * make style * doc segmenting * docstring edit * change forecast to prediction * edit doc * doc edits * remove PatchTSMixerTranspose * add PatchTSMixerPositionalEncoding and init position_enc * remove positional_encoding * edit forecast_masking, remove forecast_mask_ratios * fix broken code * var rename target_values -> future_values * num_features -> d_model * fix broken code after master merge * repo consistency * use postional embedding * prediction_logits -> prediction_outputs, make fix-copies * uncommented @slow * minor changes * loss first in tuple * tuple and dict same ordering * style edits * minor changes * dict/tuple consistent enablement * Update src/transformers/models/patchtsmixer/modeling_patchtsmixer.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/patchtsmixer/modeling_patchtsmixer.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix formatting * formatting * usage tip * test on cpu only * add sample usage * change PatchTSMixerForClassification to PatchTSMixerForTimeSeriesClassification * push changes * fix copies * std scaling set to default True case * minor changes * stylechanges --------- Co-authored-by:
Arindam Jati <arindam.jati@ibm.com> Co-authored-by:
vijaye12 <vijaye12@in.ibm.com> Co-authored-by:
Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by:
nnguyen <nnguyen@us.ibm.com> Co-authored-by:
vijaye12 <vijaykr.e@gmail.com> Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by:
Nam Nguyen <namctin@gmail.com> Co-authored-by:
Wesley Gifford <79663411+wgifford@users.noreply.github.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Bram Willemsen authored
-
Younes Belkada authored
support accelerate for clip-vision
-
Joao Gante authored
update test result, due to bug fix in decoder-only beam search
-
Younes Belkada authored
* v1 fusing modules * add fused mlp support * up * fix CI * block save_pretrained * fixup * small fix * add new condition * add v1 docs * add some comments * style * fix nit * adapt from suggestion * add check * change arg names * change variables name * Update src/transformers/integrations/awq.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * style * split up into 3 different private methods * more conditions * more checks * add fused tests for custom models * fix * fix tests * final update docs * final fixes * fix importlib metadata * Update src/transformers/utils/quantization_config.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * change it to `do_fuse` * nit * Update src/transformers/utils/quantization_config.py Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/utils/quantization_config.py Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/utils/quantization_config.py Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * few fixes * revert * fix test * fix copies * raise error if model is not quantized * add test * use quantization_config.config when fusing * Update src/transformers/modeling_utils.py --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com>
-
NielsRogge authored
* Make image processors more general * Add backwards compatibility for KOSMOS-2 * Remove use_square_size everywhere * Remove script
-
Yih-Dar authored
fix Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
- 04 Dec, 2023 19 commits
-
-
Rockerz authored
* Create asr.md * Create audio_classification.md * Create document_question_answering.md * Update document_question_answering.md * add * add * ggg * gg * add masked_language_modeling.md * add monocular_depth estimation * new * dd * add * add * cl * add * Add Traslation.md * hgf * Added docs to Toctree file * Update docs/source/ja/tasks/asr.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/tasks/asr.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/tasks/image_classification.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/tasks/idefics.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/tasks/image_captioning.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Fix docs and revert changes * Update docs/source/en/tasks/idefics.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/tasks/language_modeling.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/tasks/language_modeling.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/tasks/language_modeling.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/tasks/prompting.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/tasks/masked_language_modeling.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/tasks/masked_language_modeling.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/tasks/prompting.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/tasks/object_detection.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/tasks/semantic_segmentation.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/tasks/semantic_segmentation.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/tasks/token_classification.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/tasks/translation.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/tasks/visual_question_answering.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/tasks/summarization.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * changes in review 1 and 2 * add * Update docs/source/ja/tasks/asr.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/tasks/translation.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * changes * Update docs/source/ja/_toctree.yml Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/_toctree.yml Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/_toctree.yml Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update _toctree.yml --------- Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
jiaqiw09 authored
* translate * update * update --------- Co-authored-by:jiaqiw <wangjiaqi50@huawei.com>
-
Sanchit Gandhi authored
-
Yih-Dar authored
* fix * fix * Use TRUST_REMOTE_CODE * fix doc * fix --------- Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* fix --------- Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Arthur authored
-
Yeounoh Chung authored
* [XLA] Re-enable broken _tpu_save for XLATensors, by explicitly moving to cpu * linter-fix
-
fxmarty authored
* support FA2 * fix typo * fix broken tests * fix more test errors * left/right * fix bug * more test * typo * fix layout flash attention falcon * do not support this case * use allclose instead of equal * fix various bugs with flash attention * bump * fix test * fix mistral * use skiptest instead of return that may be misleading * add fix causal arg flash attention * fix copies * more explicit comment * still use self.is_causal * fix causal argument * comment * fixes * update documentation * add link * wrong test * simplify FA2 RoCm requirements * update opt * make flash_attn_uses_top_left_mask attribute private and precise comment * better error handling * fix copy & mistral * Update src/transformers/modeling_utils.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/modeling_utils.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/modeling_utils.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/utils/import_utils.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * use is_flash_attn_greater_or_equal_2_10 instead of is_flash_attn_greater_or_equal_210 * fix merge * simplify * inline args --------- Co-authored-by:
Felix Marty <felix@hf.co> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Nilesh authored
* Added test cases for rembert refering to albert and reformer test_tokenization * removed CURL_CA_BUNDLE=' * Added flag test_sentencepiece_ignore_case and space_between_special_tokens to True * Overrided test_added_tokens_serialization * As slow->fast token failed due to the different initialization for [MASK] for slow and fast, Therefore it required to make the initialization for [MASK] token uniform between fast and slow token * Added few more test cases in test_encode_decode_round_trip and modefied the slow token (mask_token) to have AddedToken instance with lstrip=True * Added few test cases in test_encoder_decoder round trip and also modified slow tokenizer of rembert to have mask_token as AddedToken with lstrip = True * Cleaned the code and added fmt: skip to avoid line breaks after make style + added comments to indicate from the copied test cases * Corrected few comments * Fixed quality issue * Ran fix-copies * Fixed few minor issues as (make fix-copies) broke few test cases while stripping the text * Reverted the changes made by repo-consistancy --------- Co-authored-by:Kokane <kokanen@apac.corpdir.net>
-
Sanchit Gandhi authored
-
Sanchit Gandhi authored
-
RogerWhisper authored
Keypoints 0.0 are confusing ../transformers/models/detr/image_processing_detr.py which are fixed (#26250) * Keypoints 0.0 is fixed * fixed keypoints for image_processing_yolos * fixed keypoints for image_processing_deta * fixed keypoints for image_processing_deformable_detr * fixed keypoints for image_processing_conditional_detr * fixed styles * Removed Comments * Removed comment form conditional detr too * Removed Extra code * make fix-copes * Fixed code quality * keypoints changes
-
Yih-Dar authored
* fix * fix --------- Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* fix * fix * fix --------- Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Arthur authored
* mark test as slow for now * style
-
Ilya authored
added param Co-authored-by:Ilya Fedorov <ilyaf@nvidia.com>
-
Noah Siegel authored
-
NielsRogge authored
* First draft * Extend test_forward_signature * Update tests/test_modeling_common.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Revert suggestion --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Roy Hvaara authored
An upcoming change to JAX will include non-local (addressable) CPU devices in jax.devices() when JAX is used multicontroller-style, where there are multiple Python processes. This change preserves the current behavior by replacing uses of jax.devices("cpu"), which previously only returned local devices, with jax.local_devices("cpu"), which will return local devices both now and in the future. This change is always safe (i.e., it should always preserve the previous behavior), but it may sometimes be unnecessary if code is never used in a multicontroller setting. Co-authored-by:Peter Hawkins <phawkins@google.com>
-
- 01 Dec, 2023 7 commits
-
-
Sanchit Gandhi authored
[MusicGen] Fix mono logit test
-
Marc Sun authored
* better error message * fix logic * fix log
-
Nicolas Patry authored
* [WIP] Make using safetensors files automated. If `use_safetensors=True` is used, and it doesn't exist: - Don't crash just yet - Lookup for an open PR containing it. - If yes, use that instead - If not, touch the space to convert, wait for conversion to be finished and the PR to be opened - Use that new PR - Profit. * Remove the token. * [Auto Safetensors] Websocket -> SSE (#27656) * Websocket -> SSE * Support sharded + tests +cleanup a * env var * Apply suggestions from code review * Thanks Simon * Thanks Wauplin Co-authored-by:
Wauplin <lucainp@gmail.com> * Cleanup * Update tests * Tests should pass * Apply to other tests * Extend extension * relax requirement on latest hfh * Revert * Correct private handling & debug statements * Skip gated repos as of now * Address review comments Co-authored-by:
ArthurZucker <arthur.zucker@gmail.com> --------- Co-authored-by:
Lysandre Debut <hi@lysand.re> Co-authored-by:
Lysandre <lysandre@huggingface.co> Co-authored-by:
Wauplin <lucainp@gmail.com> Co-authored-by:
Lysandre <lysandre.debut@reseau.eseo.fr> Co-authored-by:
ArthurZucker <arthur.zucker@gmail.com>
-
Wesley Gifford authored
* Remove config reference and pass num_patches for PatchTSTforPrediction * ensure return_dict is properly set --------- Co-authored-by:Wesley M. Gifford <wmgifford@us.ibm.com>
-
Nolwenn Bernard authored
* partial traduction of installation * Finish translation of installation * Update installation.mdx * Rename installation.mdx to installation.md * Typos * Update docs/source/fr/installation.md Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/fr/installation.md Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/fr/installation.md Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/fr/installation.md Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/fr/installation.md Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/fr/installation.md Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/fr/installation.md Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/fr/installation.md Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/fr/installation.md Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/fr/installation.md Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Address review comments --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Joshua Lochner authored
Fix typo in README
-
Liangliang-Ma authored
change xpu _n_gpu = 1
-
- 30 Nov, 2023 2 commits
-
-
Yoach Lacombe authored
* add working convertion script * first non-working version of modeling code * update modeling code (working) * make style * make fix-copies * add config docstrings * add config to ignore docstrings formatage due to unconventional markdown * fix copies * fix generation num_return_sequences * enrich docs * add and fix tests beside integration tests * update integration tests * update repo id * add tie weights and make style * correct naming in .md * fix imports and so on * correct docstrings * fix fp16 speech forward * fix speechencoder attention * make style * fix copied from * rename SeamlessM4Tv2-v2 to SeamlessM4Tv2 * Apply suggestions on configuration Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * remove useless public models * fix private models + better naming for T2U models * clean speech encoder relative position embeddings * refactor chunk attention * add docstrings to chunk attention method * improve naming and docstrings * rename some attention variables + add temperature sampling in T2U model * rename DOCSTRINGS variable names * make style + remove 2 useless config parameters * enrich model card * remove any attention_head reference + fix temperature in T2U * new fmt and make style * Apply suggestions from code review Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * rename spkr_id->speaker_id and change docstrings of get_char_input_ids * simplify v2attention * make style * Update seamless_m4t_v2.md * update code and tests with last update * update repo ids * fill article name, abstract andauthors * update not_doctested and slow_doc tests --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Joao Gante authored
-