- 09 Jan, 2024 4 commits
-
-
Xuehai Pan authored
* Fix initialization for missing parameters in `from_pretrained` under ZeRO-3
* Test initialization for missing parameters under ZeRO-3
* Add more tests
* Only enable deepspeed context for per-module level parameters
* Enable deepspeed context only once
* Move class definition inside test case body
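Under ZeRO-3 the parameters are partitioned across ranks, so weights that are missing from the checkpoint have to be gathered before they can be initialized. A minimal sketch of the per-module idea using DeepSpeed's `GatheredParameters` context; the helper name and the `_init_weights` call are assumptions for illustration, not the exact code from this commit:

```python
import deepspeed
import torch.distributed as dist


def init_missing_params_under_zero3(model, module, missing_keys):
    """Hypothetical helper: gather one module's partitioned parameters,
    initialize them on rank 0, then let DeepSpeed re-partition on exit."""
    params = [p for name, p in module.named_parameters(recurse=False) if name in missing_keys]
    if not params:
        return
    # Gather only this module's parameters (per-module level), and only once.
    with deepspeed.zero.GatheredParameters(params, modifier_rank=0):
        if not dist.is_initialized() or dist.get_rank() == 0:
            model._init_weights(module)  # assumption: the usual transformers init hook
```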
-
Sangbum Daniel Choi authored
* fix auxiliary loss training in detrSegmentation * add auxiliary_loss testing
-
Patrick von Platen authored
* [SDPA] Make sure attn mask creation is always done on CPU * Update docker to 2.1.1 * revert test change
-
Yih-Dar authored
* info
* update
* Update src/transformers/models/auto/image_processing_auto.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 08 Jan, 2024 8 commits
-
-
NielsRogge authored
* Add first draft
* Use appropriate gelu function
* More improvements
* More improvements
* More improvements
* Convert checkpoint
* More improvements
* Improve docs, remove print statements
* More improvements
* Add link
* remove unused masking function
* begin tokenizer
* do_lower_case
* debug
* set split_special_tokens=True
* Remove script
* Fix style
* Fix rebase
* Use same design as CLIP
* Add fast tokenizer
* Add SiglipTokenizer to init, remove extra_ids
* Improve conversion script
* Use smaller inputs in conversion script
* Update conversion script
* More improvements
* Add processor to conversion script
* Add tests
* Remove print statements
* Add tokenizer tests
* Fix more tests
* More improvements related to weight initialization
* More improvements
* Make more tests pass
* More improvements
* More improvements
* Add copied from
* Add canonicalize_text
* Enable fast tokenizer tests
* More improvements
* Fix most slow tokenizer tests
* Address comments
* Fix style
* Remove script
* Address some comments
* Add copied from to tests
* Add more copied from
* Add more copied from
* Add more copied from
* Remove is_flax_available
* More updates
* Address comment
* Remove SiglipTokenizerFast for now
* Add caching
* Remove umt5 test
* Add canonicalize_text inside _tokenize, thanks Arthur
* Fix image processor tests
* Skip tests which are not applicable
* Skip test_initialization
* More improvements
* Compare pixel values
* Fix doc tests, add integration test
* Add do_normalize
* Remove causal mask and leverage ignore copy
* Fix attention_mask
* Fix remaining tests
* Fix dummies
* Rename temperature and bias
* Address comments
* Add copied from to tokenizer tests
* Add SiglipVisionModel to auto mapping
* Add copied from to image processor tests
* Improve doc
* Remove SiglipVisionModel from index
* Address comments
* Improve docs
* Simplify config
* Add first draft
* Make it like mistral
* More improvements
* Fix attention_mask
* Fix output_attentions
* Add note in docs
* Convert multilingual model
* Convert large checkpoint
* Convert more checkpoints
* Add pipeline support, correct image_mean and image_std
* Use padding=max_length by default
* Make processor like llava
* Add code snippet
* Convert more checkpoints
* Set keep_punctuation_string=None as in OpenCLIP
* Set normalized=False for special tokens
* Fix doc test
* Update integration test
* Add figure
* Update organization
* Happy new year
* Use AutoModel everywhere

Co-authored-by: patil-suraj <surajp815@gmail.com>
-
Rosie Wood authored
* add segmentation map processing to sam image processor
* fixup
* add tests
* reshaped_input_size is shape before padding
* update tests for size/shape outputs
* fixup
* add code snippet to docs
* Update docs/source/en/model_doc/sam.md
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Add missing backticks
* add `segmentation_maps` as arg for SamProcessor.__call__()

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
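The new `segmentation_maps` argument lets ground-truth masks go through the same resize/pad pipeline as the images, so they stay aligned. A rough usage sketch; the checkpoint name and dummy inputs are assumptions, not taken from the docs snippet added in this commit:

```python
import numpy as np
from PIL import Image
from transformers import SamProcessor

processor = SamProcessor.from_pretrained("facebook/sam-vit-base")  # example checkpoint

# Dummy image and matching segmentation map of the same height/width.
image = Image.fromarray(np.zeros((480, 640, 3), dtype=np.uint8))
segmentation_map = np.zeros((480, 640), dtype=np.uint8)

# Both tensors are resized and padded consistently by the image processor.
inputs = processor(images=image, segmentation_maps=segmentation_map, return_tensors="pt")
print(inputs.keys())
```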
-
Avimanyu Bandyopadhyay authored
Remove shell=True from subprocess.Popen to mitigate security risk
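For context, a generic sketch of the pattern (not the exact call site changed here): passing the command as an argument list keeps user-controlled strings away from the shell, so metacharacters cannot trigger command injection.

```python
import subprocess

filename = "clip; rm -rf ~.wav"  # imagine this comes from user input

# Risky: the whole string is interpreted by a shell, so "; rm -rf ~" would execute.
# subprocess.Popen(f"ffmpeg -i {filename} out.wav", shell=True)

# Safer: each argument is passed directly to the program, never through a shell.
proc = subprocess.Popen(
    ["ffmpeg", "-i", filename, "out.wav"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
out, err = proc.communicate()
```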
-
zspo authored
fix tensor device
-
Ondrej Major authored
* fix input audio device for windows.
* ffmpeg audio device Windows
* Fixes wrong input device assignment in Windows
* Fixed getting mic on Windows systems by adding _get_microphone_name() function.
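On Windows, ffmpeg's `dshow` input needs the device's actual display name rather than a generic default, so the helper has to discover it first. A rough sketch of that idea, assuming the usual ffmpeg device-listing invocation; this is not the exact `_get_microphone_name()` implementation:

```python
import subprocess


def get_microphone_name() -> str:
    """List DirectShow devices via ffmpeg and return the first audio device name."""
    command = ["ffmpeg", "-list_devices", "true", "-f", "dshow", "-i", ""]
    # ffmpeg prints the device list to stderr (and exits non-zero, which is fine here).
    result = subprocess.run(command, capture_output=True, text=True)
    for line in result.stderr.splitlines():
        if "(audio)" in line:
            # Lines look roughly like: [dshow @ ...] "Microphone (Realtek Audio)" (audio)
            return line.split('"')[1]
    return "default"

# The discovered name is then passed to ffmpeg as: -f dshow -i audio="<device name>"
```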
-
Hz, Ji authored
-
Mohamed Abu El-Nasr authored
* Fix building alibi tensor when num_heads is not a power of 2 * Remove print function
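For reference, the usual ALiBi slope construction handles a non-power-of-2 head count by computing slopes for the closest lower power of 2 and interleaving extra slopes drawn from the next power of 2. A sketch of that scheme, not necessarily line-for-line what this commit changed:

```python
import math
import torch


def build_alibi_slopes(num_heads: int) -> torch.Tensor:
    closest_power_of_2 = 2 ** math.floor(math.log2(num_heads))
    base = 2 ** (-(2 ** -(math.log2(closest_power_of_2) - 3)))
    slopes = torch.pow(base, torch.arange(1, 1 + closest_power_of_2, dtype=torch.float32))
    if closest_power_of_2 != num_heads:
        # Remaining heads take slopes from the next power of 2, using every other exponent.
        extra_base = 2 ** (-(2 ** -(math.log2(2 * closest_power_of_2) - 3)))
        num_remaining = num_heads - closest_power_of_2
        extra_powers = torch.arange(1, 1 + 2 * num_remaining, 2, dtype=torch.float32)
        slopes = torch.cat([slopes, torch.pow(extra_base, extra_powers)], dim=0)
    return slopes  # shape (num_heads,), multiplied into the relative-distance bias


print(build_alibi_slopes(12))  # 12 heads: 8 base slopes + 4 interleaved extras
```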
-
Chi authored
Enhancing Code Readability and Maintainability with Simplified Activation Function Selection (#28349)

* Slightly change code in get_activation()
* Define gelu_activation() in the proper place in these two files
* Fix GitHub issue
* Fix some typos
* Fix mistaken use of self when calling config
* Reformat the two files
* Update src/transformers/activations.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/electra/modeling_electra.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/convbert/modeling_convbert.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Rename gelu_act to activation

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
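The pattern behind this cleanup: instead of branching on the activation name by hand, look it up once through `transformers.activations.get_activation`. A small sketch, assuming a typical module that receives a `hidden_act` string:

```python
import torch.nn as nn
from transformers.activations import get_activation


class SimpleHead(nn.Module):
    def __init__(self, hidden_size: int, hidden_act: str = "gelu"):
        super().__init__()
        self.dense = nn.Linear(hidden_size, hidden_size)
        # One lookup replaces if/elif chains over "gelu", "relu", "gelu_new", ...
        self.activation = get_activation(hidden_act)

    def forward(self, hidden_states):
        return self.activation(self.dense(hidden_states))
```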
-
- 07 Jan, 2024 1 commit
-
-
Susnato Dhar authored
* modified script and added test for phi2 * changes
-
- 05 Jan, 2024 6 commits
-
-
hugo-syn authored
-
Ella Charlaix authored
* Update vits modeling for onnx export compatibility * fix style * Update src/transformers/models/vits/modeling_vits.py
-
Susnato Dhar authored
* fix fa2 autocasting when using quantization
* Update src/transformers/models/distilbert/modeling_distilbert.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/distilbert/modeling_distilbert.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
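The underlying issue: under autocast, or with quantized layers, hidden states can arrive in float32 while Flash Attention 2 only accepts fp16/bf16, and a quantized layer's weight dtype (e.g. int8) is no longer a valid casting target. A hedged sketch of the dtype-selection logic; attribute names such as `_pre_quantization_dtype` are internal details and may differ from the actual fix:

```python
import torch


def pick_fa2_dtype(layer: torch.nn.Linear, config) -> torch.dtype:
    """Choose the dtype to cast query/key/value to before calling flash attention."""
    if torch.is_autocast_enabled():
        return torch.get_autocast_gpu_dtype()      # honour the autocast context
    if hasattr(config, "_pre_quantization_dtype"):
        return config._pre_quantization_dtype      # original dtype before quantization
    return layer.weight.dtype                      # fall back to the layer's own dtype

# Before the flash-attention call (illustrative):
#   if query.dtype == torch.float32:
#       query, key, value = (t.to(target_dtype) for t in (query, key, value))
```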
-
Sangbum Daniel Choi authored
* [DETA] fix freeze/unfreeze function
* Update src/transformers/models/deta/modeling_deta.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/deta/modeling_deta.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* add freeze/unfreeze test case in DETA
* fix type
* fix typo 2
* fix: enable aux and enc loss in training pipeline
* Add unsynced variables from original DETA for training
* modification for passing CI test
* make style
* make fix
* manual make fix
* change deta_modeling_test of configuration 'two_stage' default to TRUE and minor change of dist checking
* remove print
* divide configuration in DetaModel and DetaForObjectDetection
* image smaller size than 224 will give topk error
* pred_boxes and logits should be equivalent to two_stage_num_proposals
* add missing part in DetaConfig
* Update src/transformers/models/deta/modeling_deta.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add docstring in configure and prettify TO DO part
* change distribute related code to accelerate
* Update src/transformers/models/deta/configuration_deta.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/deta/test_modeling_deta.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* protect importing accelerate
* change variable name to specific value
* wrong import

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Fernando Rodriguez Sanchez authored
* Fix pos_mask application and update tests accordingly
* Fix style
* Adding comments

Co-authored-by: Fernando Rodriguez <fernando.rodriguez@nielseniq.com>
-
yuanwu2017 authored
When running the test case on a multi-card server with `device_map="auto"`, the model will not always be allocated to device 0, because other processes may already be using those cards; it is instead placed on whichever devices can accommodate it.

Signed-off-by: yuanwu <yuan.wu@intel.com>
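For illustration, a small sketch of loading with automatic placement and inspecting where the weights actually landed; the checkpoint name is just an example:

```python
import torch
from transformers import AutoModelForCausalLM

# device_map="auto" spreads the weights over whichever devices have room,
# so device 0 is not guaranteed when other processes already occupy it.
model = AutoModelForCausalLM.from_pretrained(
    "gpt2",                      # example checkpoint
    device_map="auto",
    torch_dtype=torch.float16,
)
print(model.hf_device_map)       # e.g. {"": 1} if device 0 was already full
```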
-
- 04 Jan, 2024 3 commits
-
-
Kevin Herro authored
Switch to the conda-forge channel for installing transformers, as the huggingface channel does not offer the latest version. Fixes #28248
-
Yoach Lacombe authored
* fix M4T FE error when no attention mask * modify logic * add test * go back to initial test situation + add other tests
-
Sangbum Daniel Choi authored
* fix get_num_masks output from [int] to int * fix loss size from torch.Size([1]) to torch.Size([])
-
- 03 Jan, 2024 6 commits
-
-
Aaron Jimenez authored
* Sort es/_toctree.yml like en/_toctree.yml * Run make style * Add -Rendimiento y escalabilidad- section to es/_toctree.yml * Run make style * Add s to section * Add translation of performance.md * Add performance.md to es/_toctree.yml * Run make style * Fix docs links * Run make style
-
Mayfsz authored
* Translate contributing.md into Chinese * Update review comments
-
Apsod authored
* remove token_type_ids from model_input_names (like #24788) * removed test that assumed token_type_ids should be present and updated a model reference so that it points to an available model
-
Connor Henderson authored
* start - docs, SpeechT5 copy and rename
* add relevant code from FastSpeech2 draft, have tests pass
* make it an actual conformer, demo ex.
* matching inference with original repo, includes debug code
* refactor nn.Sequentials, start more desc. var names
* more renaming
* more renaming
* vocoder scratchwork
* matching vocoder outputs
* hifigan vocoder conversion script
* convert model script, rename some config vars
* replace postnet with speecht5's implementation
* passing common tests, file cleanup
* expand testing, add output hidden states and attention
* tokenizer + passing tokenizer tests
* variety of updates and tests
* g2p_en pckg setup
* import structure edits
* docstrings and cleanup
* repo consistency
* deps
* small cleanup
* forward signature param order
* address comments except for masks and labels
* address comments on attention_mask and labels
* address second round of comments
* remove old unneeded line
* address comments part 1
* address comments pt 2
* rename auto mapping
* fixes for failing tests
* address comments part 3 (bart-like, train loss)
* make style
* pass config where possible
* add forward method + tests to WithHifiGan model
* make style
* address arg passing and generate_speech comments
* address Arthur comments
* address Arthur comments pt2
* lint changes
* Sanchit comment
* add g2p-en to doctest deps
* move up self.encoder
* onnx compatible tensor method
* fix is symbolic
* fix paper url
* move models to espnet org
* make style
* make fix-copies
* update docstring
* Arthur comments
* update docstring w/ new updates
* add model architecture images
* header size
* md wording update
* make style
-
lain authored
remove broken space
-
dependabot[bot] authored
Bumps [tj-actions/changed-files](https://github.com/tj-actions/changed-files) from 22.2 to 41.
- [Release notes](https://github.com/tj-actions/changed-files/releases)
- [Changelog](https://github.com/tj-actions/changed-files/blob/main/HISTORY.md)
- [Commits](https://github.com/tj-actions/changed-files/compare/v22.2...v41)

---
updated-dependencies:
- dependency-name: tj-actions/changed-files
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
- 02 Jan, 2024 5 commits
-
-
Daniel Bustamante Ospina authored
-
Marco Carosi authored
[Whisper] Fix errors with MPS backend introduced by new code on word-level timestamps computation (#28288)

* Update modeling_whisper.py to support the MPS backend
  Fixed some issues with the MPS backend. First, torch.std_mean is not implemented (and not scheduled for implementation), while the separate torch.std and torch.mean are. Second, the MPS backend does not support float64, so it cannot cast from float32 to float64; moving the double() cast to when the matrix is on the CPU fixes the issue without changing the logic.
* Found another instruction in modeling_whisper.py not implemented by MPS
  After a load test transcribing a 2-hour audio file, I hit a branch that the previous commit did not cover. Similar fix: torch.std_mean is replaced with torch.std and torch.mean.
* Update modeling_whisper.py: remove trailing white spaces
* Update modeling_whisper.py to use is_torch_mps_available() instead of catching the NotImplementedError
* Update modeling_whisper.py: sort the utils import block
* Update src/transformers/models/whisper/modeling_whisper.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/whisper/modeling_whisper.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/whisper/modeling_whisper.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
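A compact sketch of the workaround described above: split torch.std_mean into separate calls on MPS and keep the float64 cast off the MPS device. This is a generic illustration of the pattern, not the exact Whisper timestamp code.

```python
import torch
from transformers.utils import is_torch_mps_available


def standardize(matrix: torch.Tensor) -> torch.Tensor:
    """Normalize a matrix to zero mean / unit std, working around MPS gaps."""
    if is_torch_mps_available() and matrix.device.type == "mps":
        # torch.std_mean is not implemented on MPS; call std and mean separately.
        std, mean = torch.std(matrix), torch.mean(matrix)
    else:
        std, mean = torch.std_mean(matrix)
    normalized = (matrix - mean) / (std + 1e-9)
    # float64 is unsupported on MPS, so only upcast once the tensor is on CPU.
    if normalized.device.type == "mps":
        return normalized.cpu().double()
    return normalized.double()
```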
-
frankenliu authored
Co-authored-by: liujizhong1 <liujizhong1@xiaomi.com>
-
hoshi-hiyouga authored
* Update trainer.py * format
-
Dean Wyatte authored
update docs around mixing hf scheduler with deepspeed optimizer
-
- 26 Dec, 2023 2 commits
-
-
Stas Bekman authored
Update modeling_utils.py
-
Sourab Mangrulkar authored
-
- 25 Dec, 2023 1 commit
-
-
Younes Belkada authored
* v1 * add docstring * add tests * add awq 0.1.8 * oops * fix test
-
- 22 Dec, 2023 4 commits
-
-
Younes Belkada authored
* fix llava index errors * forward contrib credits from original implementation and fix * better fix * final fixes and fix all tests * fix * fix nit * fix tests * add regression tests

Co-authored-by: gullalc <gullalc@users.noreply.github.com>
-
lin yudong authored
Co-authored-by: yudong.lin <yudong.lin@funplus.com>
-
Anindyadeep authored
* fix: minor enhancement and fix in bounding box visualization example
  The example visualizing the bounding box did not consider the edge case where the bounding box is un-normalized, so the same code could not produce results on a different dataset with un-normalized boxes. This commit fixes that.
* run make clean
* add an additional note on the scenarios where the box viz code works

Co-authored-by: Anindyadeep <anindya@pop-os.localdomain>
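The gist of the edge case: some datasets store boxes normalized to [0, 1] while others use absolute pixel coordinates, so a visualization helper has to detect which convention applies before drawing. A hedged sketch of that check, not the exact example from the docs:

```python
from PIL import Image, ImageDraw


def draw_box(image: Image.Image, box) -> Image.Image:
    """Draw one (x_min, y_min, x_max, y_max) box, whether normalized or in pixels."""
    x_min, y_min, x_max, y_max = box
    if max(box) <= 1.0:
        # Values in [0, 1] are assumed normalized; scale them to pixel coordinates.
        x_min, x_max = x_min * image.width, x_max * image.width
        y_min, y_max = y_min * image.height, y_max * image.height
    ImageDraw.Draw(image).rectangle([x_min, y_min, x_max, y_max], outline="red", width=3)
    return image
```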
-
Yoach Lacombe authored
* fix frames
* use smaller chunk length
* correct beam search + tentative stride
* fix whisper word timestamp in batch
* add test batch generation with return token timestamps
* Apply suggestions from code review
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
  Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* clean a test
* make style + correct typo
* write clearer comments
* explain test in comment

Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
-