- 23 Oct, 2023 9 commits
-
-
Arthur authored
fix copies on newly merged model
-
Younes Belkada authored
fix NLLB MoE 4bit
-
Yoach Lacombe authored
* first raw commit * still POC * tentative convert script * almost working speech encoder conversion scripts * intermediate code for encoder/decoders * add modeling code * first version of speech encoder * make style * add new adapter layer architecture * add adapter block * add first tentative config * add working speech encoder conversion * base model convert works now * make style * remove unnecessary classes * remove unecessary functions * add modeling code speech encoder * rework logics * forward pass of sub components work * add modeling codes * some config modifs and modeling code modifs * save WIP * new edits * same output speech encoder * correct attention mask * correct attention mask * fix generation * new generation logics * erase comments * make style * fix typo * add some descriptions * new state * clean imports * add tests * make style * make beam search and num_return_sequences>1 works * correct edge case issue * correct SeamlessM4TConformerSamePadLayer copied from * replace ACT2FN relu by nn.relu * remove unecessary return variable * move back a class * change name conformer_attention_mask ->conv_attention_mask * better nit code * add some Copied from statements * small nits * small nit in dict.get * rename t2u model -> conditionalgeneration * ongoing refactoring of structure * update models architecture * remove SeamlessM4TMultiModal classes * add tests * adapt tests * some non-working code for vocoder * add seamlessM4T vocoder * remove buggy line * fix some hifigan related bugs * remove hifigan specifc config * change * add WIP tokenization * add seamlessM4T working tokenzier * update tokenization * add tentative feature extractor * Update converting script * update working FE * refactor input_values -> input_features * update FE * changes in generation, tokenizer and modeling * make style and add t2u_decoder_input_ids * add intermediate outputs for ToSpeech models * add vocoder to speech models * update valueerror * update FE with languages * add vocoder convert * update config docstrings and names * update generation code and configuration * remove todos and update config.pad_token_id to generation_config.pad_token_id * move block vocoder * remove unecessary code and uniformize tospeech code * add feature extractor import * make style and fix some copies from * correct consistency + make fix-copies * add processor code * remove comments * add fast tokenizer support * correct pad_token_id in M4TModel * correct config * update tests and codes + make style * make some suggested correstion - correct comments and change naming * rename some attributes * rename some attributes * remove unecessary sequential * remove option to use dur predictor * nit * refactor hifigan * replace normalize_mean and normalize_var with do_normalize + save lang ids to generation config * add tests * change tgt_lang logic * update generation ToSpeech * add support import SeamlessM4TProcessor * fix generate * make tests * update integration tests, add option to only return text and update tokenizer fast * fix wrong function call * update import and convert script * update integration tests + update repo id * correct paths and add first test * update how new attention masks are computed * update tests * take first care of batching in vocoder code * add batching with the vocoder * add waveform lengths to model outputs * make style * add generate kwargs + forward kwargs of M4TModel * add docstrings forward methods * reformate docstrings * add docstrings t2u model * add another round of modeling docstrings + reformate speaker_id -> spkr_id * make style * fix check_repo * make style * add seamlessm4t to toctree * correct check_config_attributes * write config docstrings + some modifs * make style * add docstrings tokenizer * add docstrings to processor, fe and tokenizers * make style * write first version of model docs * fix FE + correct FE test * fix tokenizer + add correct integration tests * fix most tokenization tests * make style * correct most processor test * add generation tests and fix num_return_sequences > 1 * correct integration tests -still one left * make style * correct position embedding * change numbeams to 1 * refactor some modeling code and correct one test * make style * correct typo * refactor intermediate fnn * refactor feedforward conformer * make style * remove comments * make style * fix tokenizer tests * make style * correct processor tests * make style * correct S2TT integration * Apply suggestions from Sanchit code review Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * correct typo * replace torch.nn->nn + make style * change Output naming (waveforms -> waveform) and ordering * nit renaming and formating * remove return None when not necessary * refactor SeamlessM4TConformerFeedForward * nit typo * remove almost copied from comments * add a copied from comment and remove an unecessary dropout * remove inputs_embeds from speechencoder * remove backward compatibiliy function * reformate class docstrings for a few components * remove unecessary methods * split over 2 lines smthg hard to read * make style * replace two steps offset by one step as suggested * nice typo * move warnings * remove useless lines from processor * make generation non-standard test more robusts * remove torch.inference_mode from tests * split integration tests * enrich md * rename control_symbol_vocoder_offset->vocoder_offset * clean convert file * remove tgt_lang and src_lang from FE * change generate docstring of ToText models * update generate docstring of tospeech models * unify how to deal withtext_decoder_input_ids * add default spkr_id * unify tgt_lang for t2u_model * simplify tgt_lang verification * remove a todo * change config docstring * make style * simplify t2u_tgt_lang_id * make style * enrich/correct comments * enrich .md * correct typo in docstrings * add torchaudio dependency * update tokenizer * make style and fix copies * modify SeamlessM4TConverter with new tokenizer behaviour * make style * correct small typo docs * fix import * update docs and add requirement to tests * add convert_fairseq2_to_hf in utils/not_doctested.txt * update FE * fix imports and make style * remove torchaudio in FE test * add seamless_m4t.md to utils/not_doctested.txt * nits and change the way docstring dataset is loaded * move checkpoints from ylacombe/ to facebook/ orga * refactor warning/error to be in the 119 line width limit * round overly precised floats * add stereo audio behaviour * refactor .md and make style * enrich docs with more precised architecture description * readd undocumented models * make fix-copies * apply some suggestions * Apply suggestions from code review Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * correct bug from previous commit * refactor a parameter allowing to clean the code + some small nits * clean tokenizer * make style and fix * make style * clean tokenizers arguments * add precisions for some tests * move docs from not_tested to slow * modify tokenizer according to last comments * add copied from statements in tests * correct convert script * correct parameter docstring style * correct tokenization * correct multi gpus * make style * clean modeling code * make style * add copied from statements * add copied statements * add support with ASR pipeline * remove file added inadvertently * fix docstrings seamlessM4TModel * add seamlessM4TConfig to OBJECTS_TO_IGNORE due of unconventional markdown * add seamlessm4t to assisted generation ignored models --------- Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Younes Belkada authored
* Update modeling_utils.py * fixup * let's change it to 5GB * fix
-
Omar Sanseviero authored
Update llama2.md
-
Arthur authored
* skip two tests * skip torch as well * fixup
-
Gema Parre帽o authored
git python falcon typo
-
Lysandre Debut authored
Pin fsspec
-
YQ authored
* fix logit to multi-hot converstion * add comments * typo
-
- 20 Oct, 2023 5 commits
-
-
Akhil authored
* Create index.md * Create _toctree.yml * Updated index.md in telugu * Update _toctree.yml * Create quicktour.md * Update quicktour.md * Create index.md * Update quicktour.md * Update docs/source/te/quicktour.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Delete docs/source/hi/index.md * Update docs/source/te/quicktour.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/te/quicktour.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/te/quicktour.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/te/quicktour.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/te/quicktour.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/te/quicktour.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/te/quicktour.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/te/quicktour.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update build_documentation.yml Added telugu [te] * Update build_pr_documentation.yml Added Telugu [te] * Update _toctree.yml --------- Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
Biswa Baibhab Subudhi authored
* Update README_hd.md - Fixed broken links I hope this small contribution adds value to this project. * Update README_hd.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
Pedro Cuenca authored
* Fix Fuyu image scaling bug It could produce negative padding and hence inference errors for certain image sizes. * Fix aspect ratio scaling test
-
Diego Machado authored
* fix set_transform link * Update docs/source/en/preprocessing.md Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * use doc-builder sintax --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Adam Ross authored
* Fix docstring for speech-to-text config * Refactor doc line len <= 119 char * Remove Speech2TextConfig from OBJECTS_TO_IGNORE * Fix Speech2TextConfig doc str * Fix Speech2TextConfig doc using doc-builder * Refactor Speech2TextConfig doc
-
- 19 Oct, 2023 9 commits
-
-
letohx authored
Update README_ru.md Corrected modalities description in README
-
Joao Gante authored
-
Younes Belkada authored
supprot fa-2 + right padding + forward
-
Matt authored
* Pin Keras for now out of paranoia * Add the keras pin to _tests_requirements.txt too * Make sure the Keras version matches the TF one * make fixup
-
Mohamed Aymane Farhi authored
-
Daniil authored
* remove docstrings CodeGen from objects_to_ignore * autofix codegen docstrings * fill in the missing types and docstrings * fixup * change descriptions to be in a separate line * apply docstring suggestions from code review Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com> * update n_ctx description in CodeGenConfig --------- Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com>
-
Matt authored
* Fix and re-enable conversationalpipeline tests * Fix the batch test so the change only applies to conversational pipeline
-
Patrick von Platen authored
better docstrings whisper
-
Sparty authored
* Remove ChineseCLIPImageProcessor, ChineseCLIPTextConfig, ChineseCLIPVisionConfig from check_docstrings * Run fix_and_overwrite for ChineseCLIPImageProcessor, ChineseCLIPTextConfig, ChineseCLIPVisionConfig * Replace <fill_type> and <fill_docstring> in configuration_chinese_clip.py, image_processing_chinese_clip.py with type and docstring values --------- Co-authored-by:vignesh-raghunathan <vignesh_raghunathan@intuit.com>
-
- 18 Oct, 2023 13 commits
-
-
Younes Belkada authored
revert
-
Pablo Montalvo authored
* initial commit * add processor, add fuyu naming * add draft processor * fix processor * remove dropout to fix loading of weights * add image processing fixes from Pedro * fix * fix processor * add basic processing fuyu test * add documentation and TODO * address comments, add tests, add doc * replace assert with torch asserts * add Mixins and fix tests * clean imports * add model tester, clean imports * fix embedding test * add updated tests from pre-release model * Processor: return input_ids used for inference * separate processing and model tests * relax test tolerance for embeddings * add test for logit comparison * make sure fuyu image processor is imported in the init * fix formattingh * more formatting issues * and more * fixups * remove some stuff * nits * update init * remove the fuyu file * Update integration test with release model * Update conversion script. The projection is not used, as confirmed by the authors. * improve geenration * Remove duplicate function * Trickle down patches to model call * processing fuyu updates * remove things * fix prepare_inputs_for_generation to fix generate() * remove model_input * update * add generation tests * nits * draft leverage automodel and autoconfig * nits * fix dtype patch * address comments, update READMEs and doc, include tests * add working processing test, remove refs to subsequences * add tests, remove Sequence classification * processing * update * update the conversion script * more processing cleanup * safe import * take out ModelTesterMixin for early release * more cl;eanup * more cleanup * more cleanup * and more * register a buffer * nits * add postprocessing of generate output * nits * updates * add one working test * fix test * make fixup works * fixup * Arthur's updates * nits * update * update * fix processor * update tests * passe more fixups * fix * nits * don't import torch * skip fuyu config for now * fixup done * fixup * update * oups * nits * Use input embeddings * no buffer * update * styling processing fuyu * fix test * update licence * protect torch import * fixup and update not doctested * kwargs should be passed * udpates * update the impofixuprts in the test * protect import * protecting imports * protect imports in type checking * add testing decorators * protect top level import structure * fix typo * fix check init * move requires_backend to functions * Imports * Protect types --------- Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> Co-authored-by:
ArthurZucker <arthur.zucker@gmail.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by:
Lysandre <lysandre@huggingface.co>
-
Younes Belkada authored
* final fix for FA2 dtype * try * oops * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * apply fix everywhere --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Yeyang authored
docs: translate fast_tokenizers into Chinese
-
Rockerz authored
Refactor code in documentation
-
Matt authored
* Add default template warnings * make fixup * Move warnings to FutureWarning * Move warnings to FutureWarning * fix make fixup * Remove futurewarning
-
Matt authored
-
Arthur authored
* fix * last attempt * current work * fix forward compatibility * save all special tokens * current state * revert additional changes * updates * remove tokenizer.model * add a test and the fix * nit * revert one more break * fix typefield issue * quality * more tests * fix fields for FC * more nits? * new additional changes * how * some updates * simplify all * more nits * revert some things to original * nice * nits * a small hack * more nits * ahhaha * fixup * update * make test run on ci * use subtesting * update * Update .circleci/create_circleci_config.py * updates * fixup * nits * replace typo * fix the test * nits * update * None max dif pls * a partial fix * had to revert one thing * test the fast * updates * fixup * and more nits * more fixes * update * Oupsy
馃憗 * nits * fix marian * on our way to heaven * Update src/transformers/models/t5/tokenization_t5.py Co-authored-by:Lysandre Debut <hi@lysand.re> * fixup * Update src/transformers/tokenization_utils_fast.py Co-authored-by:
Leo Tronchon <leo.tronchon@gmail.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by:
Leo Tronchon <leo.tronchon@gmail.com> * fix phobert * skip some things, test more * nits * fixup * fix deberta * update * update * more updates * skip one test * more updates * fix camembert * can't test this one * more good fixes * kind of a major update - seperate what is only done in fast in fast init and refactor - add_token(AddedToken(..., speicla = True)) ignores it in fast - better loading * fixup * more fixups * fix pegasus and mpnet * remove skipped tests * fix phoneme tokenizer if self.verbose * fix individual models * update common tests * update testing files * all over again * nits * skip test for markup lm * fixups * fix order of addition in fast by sorting the added tokens decoder * proper defaults for deberta * correct default for fnet * nits on add tokens, string initialized to special if special * skip irrelevant herbert tests * main fixes * update test added_tokens_serialization * the fix for bart like models and class instanciating * update bart * nit! * update idefix test * fix whisper! * some fixup * fixups * revert some of the wrong chanegs * fixup * fixup * skip marian * skip the correct tests * skip for tf and flax as well --------- Co-authored-by:
Lysandre Debut <hi@lysand.re> Co-authored-by:
Leo Tronchon <leo.tronchon@gmail.com>
-
Matt authored
Don't drop decoder_input_ids without also dropping decoder_attention_mask
-
Merve Noyan authored
* Knowledge distillation for vision guide * Update knowledge_distillation_for_image_classification.md * Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md Co-authored-by:
Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com> * Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md Co-authored-by:
Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com> * Iterated on Rafael's comments * Added to toctree * Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md Co-authored-by:
Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com> * Addressed comments * Update knowledge_distillation_for_image_classification.md * Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md Co-authored-by:
Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com> * Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update knowledge_distillation_for_image_classification.md * Update knowledge_distillation_for_image_classification.md * Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> * Address comments * Update knowledge_distillation_for_image_classification.md * Explain KL Div --------- Co-authored-by:
Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com> Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by:
Maria Khalusova <kafooster@gmail.com>
-
dependabot[bot] authored
Bump urllib3 in /examples/research_projects/decision_transformer Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.26.17 to 1.26.18. - [Release notes](https://github.com/urllib3/urllib3/releases) - [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst) - [Commits](https://github.com/urllib3/urllib3/compare/1.26.17...1.26.18 ) --- updated-dependencies: - dependency-name: urllib3 dependency-type: direct:production ... Signed-off-by:
dependabot[bot] <support@github.com> Co-authored-by:
dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
dependabot[bot] authored
Bump urllib3 in /examples/research_projects/visual_bert Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.26.17 to 1.26.18. - [Release notes](https://github.com/urllib3/urllib3/releases) - [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst) - [Commits](https://github.com/urllib3/urllib3/compare/1.26.17...1.26.18 ) --- updated-dependencies: - dependency-name: urllib3 dependency-type: direct:production ... Signed-off-by:
dependabot[bot] <support@github.com> Co-authored-by:
dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Joao Gante authored
improve docstrings
-
- 17 Oct, 2023 4 commits
-
-
jayfurmanek authored
Add tf-nightly-rocm to _is_tf_available check
-
Rockerz authored
* Add translation to fitst 3 file of internal folder * Update Toctree.md and add files * Update docs/source/ja/internal/generation_utils Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Rename generation_utils file * rename pipelines_utils.md * Change file names --------- Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
Bingchen Zhao authored
Fix a typo in mistral.md
-
louietouie authored
* Deleted LukeConfig and ran check_docstrings.py * Filled docstring information --------- Co-authored-by:louie <louisparizeau@Chicken.local>
-