- 03 Apr, 2023 3 commits
-
-
Eli Simhayev authored
added > 0.5 to `past_observed_mask`
-
amyeroberts authored
* Add out_indices to backbones, deprecate out_features * Update - can specify both out_features and out_indices but not both * Can specify both * Fix copies * Add out_indices to convnextv2 configuration
-
kevinpro authored
-
- 31 Mar, 2023 6 commits
-
-
Sylvain Gugger authored
* Test fetcher v2 * Fix regexes * Remove sanity check * Fake modification to OPT * Fixes some .sep issues * Remove fake OPT change * Fake modif for BERT * Fake modif for init * Exclude SageMaker tests * Fix test and remove fake modif * Fake setup modif * Fake pipeline modif * Remove all fake modifs * Adds options to skip/force tests * [test-all-models] Fake modif for BERT * Try this way * Does the command actually work? * [test-all-models] Try again! * [skip circleci] Remove fake modif * Remove debug statements * Add the list of important models * Quality * Update utils/tests_fetcher.py Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> * Address review comments * Address review comments * Fix and add test * Apply suggestions from code review Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Address review comments --------- Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
-
Sabine authored
* update NeptuneCallback docstring * formatting * apply make style --------- Co-authored-by: Aleksander Wojnarowicz <alwojnarowicz@gmail.com>
-
dependabot[bot] authored
Bump redis in /examples/research_projects/decision_transformer Bumps [redis](https://github.com/redis/redis-py) from 4.5.3 to 4.5.4. - [Release notes](https://github.com/redis/redis-py/releases) - [Changelog](https://github.com/redis/redis-py/blob/master/CHANGES) - [Commits](https://github.com/redis/redis-py/compare/v4.5.3...v4.5.4) --- updated-dependencies: - dependency-name: redis dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Nicolas Patry authored
* Making sure we can use safetensors to serialize all the time. * Expanding the tests for increased coverage. * Update the test. * Getting current state of affairs. * Tentative fix. * Fixing black version. * Fixing the worst offenders. * Try to modify less files. * Fixing blip_2 (Weird solution right now). * Fixing deta. * Fix blip ? * Missing extra newline. * No deta modification. * Adding some comments. * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Addressing comments. * Addressing comments. * creating warn_once. * Warning_once ! --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Yih-Dar authored
fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
lewtun authored
* Relax checks from to warning * Fix style * Replace warnings with logger * Use warning vs warn
-
- 30 Mar, 2023 10 commits
-
-
Yih-Dar authored
* Enable Nightly + Past CI * put schedule --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Manuel de Prada authored
Docs fix: Multinomial sampling decoding needs "num_beams=1", since by default it is usually not 1. (#22473) Fix: Multinomial sampling needs "num_beams=1", since by default is 5.
-
Joao Gante authored
* Llama now supports max_position_embeddings * Save config; Cosmetic edits
-
Arthur authored
edit default model type and testing path set to hf-internal-testing
-
Roy Hvaara authored
Guard imports that use the tokenizers library
-
amyeroberts authored
Fix ordering of height,width for BLIP
-
Joao Gante authored
* haha tokens go brrrr
-
amyeroberts authored
Skip flaky test for now
-
amyeroberts authored
* Rescale image back if it was scaled during PIL conversion * do_rescale is defined if PIL image passed in
-
amyeroberts authored
* Move common properties to BackboneMixin * Fix failing tests * Update ConvNextV2 backbone
-
- 29 Mar, 2023 14 commits
-
-
Stefan Heng authored
* Update: ignore padding support for TransfoXL training when n_clusters==0 * Update: transformer XL always pad * Update: drop doc
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Sabine authored
-
jeffhataws authored
This reverts commit fd81746dbec5f17c8285a0fdc72ca4b4c025cc33.
-
Younes Belkada authored
fix slow test
-
Sylvain Gugger authored
Revert "Error (also in original) model, scaling only q matrix not qk.T dot product (qk.T/sqrt(dim_per_head))" (#22444) Revert "Error (also in original) model, scaling only q matrix not qk.T dot product (qk.T/sqrt(dim_per_head)) (#21627)" This reverts commit bad83008.
-
Yih-Dar authored
Fix some tiny model creation issues Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Sylvain Gugger authored
-
Younes Belkada authored
* add conditional generation * add comments
-
Younes Belkada authored
* fix bnb failing test * fix * fix * fixup
-
Nolwenn Bernard authored
Fixes #22429
-
Arthur authored
* add draft changes * fix failing wav2vec * style * make sure that the argument is saved + add tests * style * fixup * update test * default clean_up_tokenization_spaces to False for Bloom and Llama * Update code based on review Co-authored-by: Nicolas Patry <patry.nicolas@gmail.com> * style * quality --------- Co-authored-by: Nicolas Patry <patry.nicolas@gmail.com>
-
- 28 Mar, 2023 4 commits
-
-
Joao Gante authored
Fix docs and doctests
-
Jeff Rasley authored
* ensure causal_mask is created directly on device * add copy tag to opt, update bart implementation * add device to all _make_causal_mask copies * formatting fixes * more manual fixes due to unlinked versions of _prepare_decoder_attention_mask
-
fpgaminer authored
Fix bug in perplexity guide calculations and update perplexity numbers.
-
dependabot[bot] authored
Bump redis in /examples/research_projects/decision_transformer Bumps [redis](https://github.com/redis/redis-py) from 4.1.4 to 4.5.3. - [Release notes](https://github.com/redis/redis-py/releases) - [Changelog](https://github.com/redis/redis-py/blob/master/CHANGES) - [Commits](https://github.com/redis/redis-py/compare/v4.1.4...v4.5.3) --- updated-dependencies: - dependency-name: redis dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
- 27 Mar, 2023 3 commits
-
-
Kshiteej K authored
* [neptune] fix checkpoint bug with relative out_dir * update imports * reformat with black * check neptune without imports * fix typing-related issue * run black on code * use os.path.sep instead of raw \ * simplify imports and remove type annotation * make ruff happy * apply review suggestions --------- Co-authored-by: Aleksander Wojnarowicz <alwojnarowicz@gmail.com>
-
Arthur authored
* Initial commit * update modeling code * update doc * add functions necessary * fix imports * revert changes * fixup * more styling to get going * remove standalone encoder * update code * styling * fix config and model * update code and some refactoring * make more tests pass * Adding NLLB-200 - MoE - 54.5B for no language left behind Fixes #21300 * fix more common tests * style * update testing file * update * update * Router2 doc * update check config with sparse layer * add dummy router * update current conversion script * create on the fly conversion script * Fixup * style * style 2 * fix empty return * fix return * Update default config sparse layers * easier to create sparse layers * update * update conversion script * update modeling * add to toctree * styling * make ruff happy * update docstring * update conversion script * update, will break tests but implementing top2 * update * ❗ local groups are supported here * ⚠️ Support for local groups is now removed ⚠️ This is because it has to work with model parallelism that we do not support * finish simplification * Fix forward * style * fixup * Update modelling and test, refactoring * update tests * remove final layer)norm as it is done in the FF * routing works! Logits test added * nit in test * remove top1router * style * make sure sparse are tested. Had to change route_tokens a little bit * add support for unslip models when converting * fixup * style * update tests * update test * REFACTOR * encoder outputs match! * style * update testing * 🎉 encoder and decoder logits match 🎉 * styling * update tests * cleanup tests * fix router test and CIs * cleanup * cleanup test styling * fix tests * Finally the generation tests match! * cleanup * update test * style testing file * remove script * cleanup * more cleanup * nits * update * NLLB tokenizer is wrong and will be fixed soon * use LongTensors * update tests * revert some small changes * fix second expert sampling and batch prioritized routing * update tests * finish last tests * make ruff happy * update * ruff again * style * Update docs/source/en/model_doc/nllb-moe.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Updates based on review * style and fix import issue * nit * more nits * cleanup * styling * update test_seconde_expert_policy * fix name * last nit on the markdown examples --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Sylvain Gugger authored
-